Repository: CGuangyan-BIT/PointGPT
Branch: V1.2
Commit: 9ba3d40e3aaa
Files: 102
Total size: 341.7 KB
Directory structure:
gitextract_eruklvu8/
├── DATASET.md
├── LICENSE
├── README.md
├── cfgs/
│ ├── PointGPT-B/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ ├── post_pretrain.yaml
│ │ └── pretrain.yaml
│ ├── PointGPT-L/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ ├── post_pretrain.yaml
│ │ └── pretrain.yaml
│ ├── PointGPT-S/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ └── pretrain.yaml
│ └── dataset_configs/
│ ├── LabeledHybrid.yaml
│ ├── ModelNet40.yaml
│ ├── ModelNet40FewShot.yaml
│ ├── ScanObjectNN_hardest.yaml
│ ├── ScanObjectNN_objectbg.yaml
│ ├── ScanObjectNN_objectonly.yaml
│ ├── ShapeNet-55.yaml
│ └── UnlabeledHybrid.yaml
├── datasets/
│ ├── LabeledHybrid.py
│ ├── ModelNetDataset.py
│ ├── ModelNetDatasetFewShot.py
│ ├── ScanObjectNNDataset.py
│ ├── ShapeNet55Dataset.py
│ ├── UnlabeledHybrid.py
│ ├── __init__.py
│ ├── build.py
│ ├── data_transforms.py
│ ├── generate_few_shot_data.py
│ └── io.py
├── extensions/
│ ├── chamfer_dist/
│ │ ├── __init__.py
│ │ ├── chamfer.cu
│ │ ├── chamfer_cuda.cpp
│ │ ├── setup.py
│ │ └── test.py
│ └── emd/
│ ├── README.md
│ ├── __init__.py
│ ├── cuda/
│ │ ├── emd.cpp
│ │ └── emd_kernel.cu
│ ├── emd.py
│ ├── setup.py
│ └── test_emd_loss.py
├── figures/
│ └── a
├── main.py
├── main_vis.py
├── models/
│ ├── GPT.py
│ ├── PointGPT.py
│ ├── __init__.py
│ ├── build.py
│ └── z_order.py
├── requirements.txt
├── segmentation/
│ ├── __init__.py
│ ├── dataset.py
│ ├── extensions/
│ │ ├── chamfer_dist/
│ │ │ ├── __init__.py
│ │ │ ├── chamfer.cu
│ │ │ ├── chamfer_cuda.cpp
│ │ │ ├── setup.py
│ │ │ └── test.py
│ │ └── emd/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── cuda/
│ │ │ ├── emd.cpp
│ │ │ └── emd_kernel.cu
│ │ ├── emd.py
│ │ ├── setup.py
│ │ └── test_emd_loss.py
│ ├── logger.py
│ ├── main.py
│ ├── misc.py
│ ├── models/
│ │ ├── gpt2_seg.py
│ │ ├── pointnet2_utils.py
│ │ ├── pt.py
│ │ └── z_order.py
│ ├── pointnet_util.py
│ └── provider.py
├── tools/
│ ├── __init__.py
│ ├── builder.py
│ ├── runner.py
│ ├── runner_finetune.py
│ └── runner_pretrain.py
└── utils/
├── AverageMeter.py
├── checkpoint.py
├── config.py
├── dist_utils.py
├── logger.py
├── misc.py
├── parser.py
└── registry.py
================================================
FILE CONTENTS
================================================
================================================
FILE: DATASET.md
================================================
## Dataset
The overall directory structure should be:
```
│PointGPT/
├──cfgs/
├──data/
│ ├──ModelNet/
│ ├──ModelNetFewshot/
│ ├──ScanObjectNN/
│ ├──ShapeNet55-34/
│ ├──shapenetcore_partanno_segmentation_benchmark_v0_normal/
├──datasets/
├──.......
```
### ModelNet40 Dataset:
```
│ModelNet/
├──modelnet40_normal_resampled/
│ ├── modelnet40_shape_names.txt
│ ├── modelnet40_train.txt
│ ├── modelnet40_test.txt
│ ├── modelnet40_train_8192pts_fps.dat
│ ├── modelnet40_test_8192pts_fps.dat
```
Download: You can download the processed data from [Point-BERT repo](https://github.com/lulutang0608/Point-BERT/blob/49e2c7407d351ce8fe65764bbddd5d9c0e0a4c52/DATASET.md), or download from the [official website](https://modelnet.cs.princeton.edu/#) and process it by yourself.
### ModelNet Few-shot Dataset:
```
│ModelNetFewshot/
├──5way10shot/
│ ├── 0.pkl
│ ├── ...
│ ├── 9.pkl
├──5way20shot/
│ ├── ...
├──10way10shot/
│ ├── ...
├──10way20shot/
│ ├── ...
```
Download: Please download the data from [Point-BERT repo](https://github.com/lulutang0608/Point-BERT/blob/49e2c7407d351ce8fe65764bbddd5d9c0e0a4c52/DATASET.md). We use the same data split as theirs.
### ScanObjectNN Dataset:
```
│ScanObjectNN/
├──main_split/
│ ├── training_objectdataset_augmentedrot_scale75.h5
│ ├── test_objectdataset_augmentedrot_scale75.h5
│ ├── training_objectdataset.h5
│ ├── test_objectdataset.h5
├──main_split_nobg/
│ ├── training_objectdataset.h5
│ ├── test_objectdataset.h5
```
Download: Please download the data from the [official website](https://hkust-vgd.github.io/scanobjectnn/).
### ShapeNet55/34 Dataset:
```
│ShapeNet55-34/
├──shapenet_pc/
│ ├── 02691156-1a04e3eab45ca15dd86060f189eb133.npy
│ ├── 02691156-1a6ad7a24bb89733f412783097373bdc.npy
│ ├── .......
├──ShapeNet-55/
│ ├── train.txt
│ └── test.txt
```
Download: Please download the data from [Point-BERT repo](https://github.com/lulutang0608/Point-BERT/blob/49e2c7407d351ce8fe65764bbddd5d9c0e0a4c52/DATASET.md).
### ShapeNetPart Dataset:
```
|shapenetcore_partanno_segmentation_benchmark_v0_normal/
├──02691156/
│ ├── 1a04e3eab45ca15dd86060f189eb133.txt
│ ├── .......
│── .......
│──train_test_split/
│──synsetoffset2category.txt
```
Download: Please download the data from [here](https://shapenet.cs.stanford.edu/media/shapenetcore_partanno_segmentation_benchmark_v0_normal.zip).
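Each per-object `.txt` file in this benchmark stores one point per line. In the standard release the columns are, to the best of our knowledge, `x y z nx ny nz seg_label` (coordinates, surface normal, per-point part label). A small parsing sketch under that assumption (`load_shapenetpart_txt` is an illustrative helper, not part of this repo):

```python
import numpy as np

def load_shapenetpart_txt(path):
    """Parse one ShapeNetPart sample file.

    Each line is assumed to hold 7 floats: x y z nx ny nz seg_label.
    """
    data = np.loadtxt(path).astype(np.float32)
    points = data[:, 0:3]                  # xyz coordinates
    normals = data[:, 3:6]                 # surface normals
    labels = data[:, 6].astype(np.int64)   # per-point part labels
    return points, normals, labels
```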
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2022 PANG-Yatian, YUAN-Li
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# PointGPT
## PointGPT: Auto-regressively Generative Pre-training from Point Clouds [ArXiv](https://arxiv.org/abs/2305.11487)
In this work, we present PointGPT, a novel approach that extends the GPT concept to point clouds, using an auto-regressive point cloud generation task to pre-train transformer models. On object classification, PointGPT achieves 94.9% accuracy on ModelNet40 and 93.4% on ScanObjectNN, outperforming all other transformer models. On few-shot learning, our method also attains new SOTA performance on all four benchmarks.
<div align="center">
<img src="./figures/net.png" width = "666" align=center />
</div>
## News
[2023.09.22] PointGPT has been accepted by NeurIPS 2023!
[2023.09.08] Unlabeled hybrid dataset and labeled hybrid dataset have been released!
[2023.08.19] Code has been updated; PointGPT-B and PointGPT-L models have been released!
[2023.06.20] Code and the PointGPT-S models have been released!
## 1. Requirements
PyTorch >= 1.7.0;
Python >= 3.7;
CUDA >= 9.0;
GCC >= 4.9;
torchvision;
```
pip install -r requirements.txt
```
```
# Chamfer Distance & emd
cd ./extensions/chamfer_dist
python setup.py install --user
cd ./extensions/emd
python setup.py install --user
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
```
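The compiled extensions above provide GPU implementations of the Chamfer distance and EMD losses used during pre-training. For intuition (and as a quick CPU sanity check), the symmetric L2 Chamfer distance between two point sets can be written in a few lines of NumPy. This is an illustrative reimplementation, not the repo's CUDA kernel, and is far too slow for training:

```python
import numpy as np

def chamfer_l2(a, b):
    """Symmetric L2 Chamfer distance between point sets a (N, 3) and b (M, 3).

    For every point in a, take its squared distance to the nearest point
    in b, and vice versa; average each direction and sum the two terms.
    """
    # (N, M) matrix of pairwise squared distances
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

Identical clouds give a distance of exactly 0, which is a convenient smoke test for the compiled CUDA version.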
## 2. Datasets
Our training data for the PointGPT-S model encompasses ShapeNet, ScanObjectNN, ModelNet40, and ShapeNetPart datasets. For detailed information, please refer to [DATASET.md](./DATASET.md).
To pretrain the PointGPT-B and PointGPT-L models, we employ both unlabeled hybrid dataset and labeled hybrid dataset, available for download [here](https://drive.google.com/file/d/1TWgd3eJX1HDruFfU9JrGnBfcVhzJIXqT/view?usp=sharing).
## 3. PointGPT Models
### PointGPT-S Models
| Task | Dataset | Config | Acc. | Download |
| ----------------- | -------------- | --------------------------------------------------------------- | ---------- | --------------------------------------------------------------------------------------------- |
| Pre-training | ShapeNet | [pretrain.yaml](./cfgs/PointGPT-S/pretrain.yaml) | N.A. | [here](https://drive.google.com/file/d/1gTFI327kXVDFQ90JfYX0zIS4opM1EkqX/view?usp=drive_link) |
| Classification | ScanObjectNN | [finetune_scan_hardest.yaml](./cfgs/PointGPT-S/finetune_scan_hardest.yaml) | 86.9% | [here](https://drive.google.com/file/d/12Tj2OFKsEPT5zd5nQQ2VNEZlCKHncdGh/view?usp=drive_link) |
| Classification | ScanObjectNN | [finetune_scan_objbg.yaml](./cfgs/PointGPT-S/finetune_scan_objbg.yaml) | 91.6% | [here](https://drive.google.com/file/d/1s4RrBkfwVr8r0H2FxwiHULcyMe_EAJ9D/view?usp=drive_link) |
| Classification | ScanObjectNN | [finetune_scan_objonly.yaml](./cfgs/PointGPT-S/finetune_scan_objonly.yaml) | 90.0% | [here](https://drive.google.com/file/d/173yfDAlqqed-oRHaogX6DC4Uj1b8Rvxt/view?usp=drive_link) |
| Classification | ModelNet40(1k) | [finetune_modelnet.yaml](./cfgs/PointGPT-S/finetune_modelnet.yaml) | 94.0% | [here](https://drive.google.com/file/d/17uoJchAzwapTNHVxOWNH4HLNZz9kbGoo/view?usp=drive_link) |
| Classification | ModelNet40(8k) | [finetune_modelnet_8k.yaml](./cfgs/PointGPT-S/finetune_modelnet_8k.yaml) | 94.2% | [here](https://drive.google.com/file/d/1XocTFSsKZgKHx2cLqZJi2rcF74hQ-1nx/view?usp=drive_link) |
| Part segmentation | ShapeNetPart | [segmentation](./segmentation) | 86.2% mIoU | [here](https://drive.google.com/file/d/1WVMTtIq4vPQOOnlDsymVA5541lNL-hm3/view?usp=drive_link) |
| Task | Dataset | Config | 5w10s Acc. (%) | 5w20s Acc. (%) | 10w10s Acc. (%) | 10w20s Acc. (%) |
| ----------------- | ---------- | ----------------------------------- | -------------- | -------------- | --------------- | --------------- |
| Few-shot learning | ModelNet40 | [fewshot.yaml](./cfgs/PointGPT-S/fewshot.yaml) | 96.8 ± 2.0 | 98.6 ± 1.1 | 92.6 ± 4.6 | 95.2 ± 3.4 |
### PointGPT-B Models
| Task | Dataset | Config | Acc. | Download |
| ----------------- | -------------- | --------------------------------------------------------------- | ---------- | --------------------------------------------------------------------------------------------- |
| Pre-training | UnlabeledHybrid | [pretrain.yaml](./cfgs/PointGPT-B/pretrain.yaml) | N.A. | [here](https://drive.google.com/file/d/1Gyf9ZR8MCPg1XOCALjJR9VJepV7iAi5S/view?usp=sharing) |
| Post-pre-training | LabeledHybrid | [post_pretrain.yaml](./cfgs/PointGPT-B/post_pretrain.yaml) | N.A. | [here](https://drive.google.com/file/d/1Gc7thuU-D1Sq4NIMTV6-U1LhVN0E2z9l/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_hardest.yaml](./cfgs/PointGPT-B/finetune_scan_hardest.yaml) | 91.9% | [here](https://drive.google.com/file/d/1tHi7W935DxVttXHG0Mgb0HSfYWUqXLwB/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_objbg.yaml](./cfgs/PointGPT-B/finetune_scan_objbg.yaml) | 95.8% | [here](https://drive.google.com/file/d/1te8DuC_-cOzt4JayyaNWvxHcRztjDlGF/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_objonly.yaml](./cfgs/PointGPT-B/finetune_scan_objonly.yaml) | 95.2% | [here](https://drive.google.com/file/d/17c8KvDrAuY0GgcO7SGE-4zlMArjzkjLX/view?usp=sharing) |
| Classification | ModelNet40(1k) | [finetune_modelnet.yaml](./cfgs/PointGPT-B/finetune_modelnet.yaml) | 94.4% | [here](https://drive.google.com/file/d/1l5zhy52erSp5gigbhYaT0nyMrV_lbh-C/view?usp=sharing) |
| Classification | ModelNet40(8k) | [finetune_modelnet_8k.yaml](./cfgs/PointGPT-B/finetune_modelnet_8k.yaml) | 94.6% | [here](https://drive.google.com/file/d/1FzM7ULPUAOk_J0BRHFvv0nS_Xd65oWbV/view?usp=sharing) |
| Part segmentation | ShapeNetPart | [segmentation](./segmentation) | 86.5% mIoU | [here](https://drive.google.com/file/d/1P6hELhX6Yr-rN04q6N71wZfvW2HnLhqD/view?usp=sharing) |
| Task | Dataset | Config | 5w10s Acc. (%) | 5w20s Acc. (%) | 10w10s Acc. (%) | 10w20s Acc. (%) |
| ----------------- | ---------- | ----------------------------------- | -------------- | -------------- | --------------- | --------------- |
| Few-shot learning | ModelNet40 | [fewshot.yaml](./cfgs/PointGPT-B/fewshot.yaml) | 97.5 ± 2.0 | 98.8 ± 1.0 | 93.5 ± 4.0 | 95.8 ± 3.0 |
### PointGPT-L Models
| Task | Dataset | Config | Acc. | Download |
| ----------------- | -------------- | --------------------------------------------------------------- | ---------- | --------------------------------------------------------------------------------------------- |
| Pre-training | UnlabeledHybrid | [pretrain.yaml](./cfgs/PointGPT-L/pretrain.yaml) | N.A. | [here](https://drive.google.com/file/d/1nzCwriFbC2QoDbRpGhWvf_DbFIkFU6zV/view?usp=sharing) |
| Post-pre-training | LabeledHybrid | [post_pretrain.yaml](./cfgs/PointGPT-L/post_pretrain.yaml) | N.A. | [here](https://drive.google.com/file/d/1Kh6f6gFR12Y86FAeBtMU9NbNpB5vZnpu/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_hardest.yaml](./cfgs/PointGPT-L/finetune_scan_hardest.yaml) | 93.4% | [here](https://drive.google.com/file/d/1e_qIfZCqQmq0eRpYhf9xrIxl6TkzsaZ9/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_objbg.yaml](./cfgs/PointGPT-L/finetune_scan_objbg.yaml) | 97.2% | [here](https://drive.google.com/file/d/1gd8gn0ffK0zfWv7AAUbygzIPSeeRU8fD/view?usp=sharing) |
| Classification | ScanObjectNN | [finetune_scan_objonly.yaml](./cfgs/PointGPT-L/finetune_scan_objonly.yaml) | 96.6% | [here](https://drive.google.com/file/d/1F2MnPmQGKnYUgmS5uz3PNInU23jWsNj1/view?usp=sharing) |
| Classification | ModelNet40(1k) | [finetune_modelnet.yaml](./cfgs/PointGPT-L/finetune_modelnet.yaml) | 94.7% | [here](https://drive.google.com/file/d/1ntWwZCvD_Tqykq9F7QrDKXH7aL-dcCsQ/view?usp=sharing) |
| Classification | ModelNet40(8k) | [finetune_modelnet_8k.yaml](./cfgs/PointGPT-L/finetune_modelnet_8k.yaml) | 94.9% | [here](https://drive.google.com/file/d/1gKgdbtIuRinJY-NElSHwrKAL5OhBjrGD/view?usp=sharing) |
| Part segmentation | ShapeNetPart | [segmentation](./segmentation) | 86.6% mIoU | [here](https://drive.google.com/file/d/1d3fXLBkXvzl9YjX5DDMdm7rUtCvfwgUL/view?usp=sharing) |
| Task | Dataset | Config | 5w10s Acc. (%) | 5w20s Acc. (%) | 10w10s Acc. (%) | 10w20s Acc. (%) |
| ----------------- | ---------- | ----------------------------------- | -------------- | -------------- | --------------- | --------------- |
| Few-shot learning | ModelNet40 | [fewshot.yaml](./cfgs/PointGPT-L/fewshot.yaml) | 98.0 ± 1.9 | 99.0 ± 1.0 | 94.1 ± 3.3 | 96.1 ± 2.8 |
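The variants above differ mainly in transformer width and depth (the configs in this repo use `trans_dim: 768, depth: 12` for PointGPT-B and `trans_dim: 1024, depth: 24` for PointGPT-L). A rough encoder size can be read off those two numbers with the standard ~12·d² weights-per-block rule (4·d² for the Q/K/V/output attention projections plus 8·d² for the 4×-expansion MLP), ignoring biases, layer norms, and embeddings; `approx_encoder_params` is an illustrative helper, not code from this repo:

```python
def approx_encoder_params(trans_dim, depth):
    """Rough transformer encoder size: ~4*d^2 attention weights plus
    ~8*d^2 MLP weights per block; biases, layer norms, and embedding
    layers are ignored, so this understates the true count slightly."""
    return depth * 12 * trans_dim ** 2

# PointGPT-B-style encoder (trans_dim=768, depth=12): ~85M weights
print(approx_encoder_params(768, 12))    # 84934656
# PointGPT-L-style encoder (trans_dim=1024, depth=24): ~302M weights
print(approx_encoder_params(1024, 24))   # 301989888
```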
## 4. PointGPT Pre-training
To pretrain PointGPT, run the following command.
```
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/<MODEL_NAME>/pretrain.yaml --exp_name <output_file_name>
```
To post-pretrain PointGPT, run the following command.
```
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/<MODEL_NAME>/post_pretrain.yaml --exp_name <output_file_name> --finetune_model
```
## 5. PointGPT Fine-tuning
To fine-tune on ScanObjectNN, run the following command:
```
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/finetune_scan_hardest.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model>
```
To fine-tune on ModelNet40, run the following command:
```
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/finetune_modelnet.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model>
```
For voting-based evaluation on ModelNet40, run the following command:
```
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/<MODEL_NAME>/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>
```
For few-shot learning, run the following command:
```
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/fewshot.yaml --finetune_model \
--ckpts <path/to/pre-trained/model> --exp_name <output_file_name> --way <5 or 10> --shot <10 or 20> --fold <0-9>
```
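Reported few-shot numbers are averaged over 10 folds per setting, so the command above is typically repeated for every (way, shot, fold) combination. A small bash sweep over all 40 runs is sketched below as a dry run that only prints each command (drop the `echo` to actually launch training); the GPU id, config path, and checkpoint name are placeholders for your setup:

```shell
#!/usr/bin/env bash
# Sweep every few-shot setting: 2 ways x 2 shots x 10 folds = 40 runs.
for WAY in 5 10; do
  for SHOT in 10 20; do
    for FOLD in $(seq 0 9); do
      echo CUDA_VISIBLE_DEVICES=0 python main.py \
        --config cfgs/PointGPT-S/fewshot.yaml --finetune_model \
        --ckpts ckpt-last.pth --exp_name fewshot_${WAY}w${SHOT}s_f${FOLD} \
        --way ${WAY} --shot ${SHOT} --fold ${FOLD}
    done
  done
done
```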
For part segmentation on ShapeNetPart, run the following command:
```
cd segmentation
python main.py --ckpts <path/to/pre-trained/model> --root path/to/data --learning_rate 0.0002 --epoch 300 --model_name <MODEL_NAME>
```
## 6. Visualization
To visualize the predictions of a pre-trained model on the validation set, run:
```
python main_vis.py --test --ckpts <path/to/pre-trained/model> --config cfgs/<MODEL_NAME>/pretrain.yaml --exp_name <name>
```
<div align="center">
<img src="./figures/vis.png" width = "900" align=center />
</div>
## 7. Ablation studies on post-pre-training stage
<table>
<thead>
<tr>
<th rowspan="2">Methods</th>
<th colspan="3"><u>ScanObjectNN</u></th>
<th colspan="2"><u>ModelNet40</u></th>
<th colspan="2">ShapeNetPart</th>
</tr>
<tr>
<th>OBJ_BG</th>
<th>OBJ_ONLY</th>
<th>PB_T50_RS</th>
<th>1k P</th>
<th>8k P</th>
<th>Cls.mIoU</th>
<th>Inst.mIoU</th>
</tr>
</thead>
<tbody>
<tr>
<th colspan="8">without post-pre-training</th>
</tr>
<tr>
<td><i>PointGPT-B</i></td>
<td>93.6</td>
<td>92.5</td>
<td>89.6</td>
<td>94.2</td>
<td>94.4</td>
<td>84.5</td>
<td>86.4</td>
</tr>
<tr>
<td><i>PointGPT-L</i></td>
<td>95.7</td>
<td>94.1</td>
<td>91.1</td>
<td>94.5</td>
<td>94.7</td>
<td>84.7</td>
<td>86.5</td>
</tr>
<tr>
<th colspan="8">with post-pre-training</th>
</tr>
<tr>
<td><i>PointGPT-B</i></td>
<td>95.8 <span style="color:green">(+2.2)</span></td>
<td>95.2 <span style="color:green">(+2.7)</span></td>
<td>91.9 <span style="color:green">(+2.3)</span></td>
<td>94.4 <span style="color:green">(+0.2)</span></td>
<td>94.6 <span style="color:green">(+0.2)</span></td>
<td>84.5 <span style="color:green">(+0.0)</span></td>
<td>86.5 <span style="color:green">(+0.1)</span></td>
</tr>
<tr>
<td><i>PointGPT-L</i></td>
<td>97.2 <span style="color:green">(+1.5)</span></td>
<td>96.6 <span style="color:green">(+2.5)</span></td>
<td>93.4 <span style="color:green">(+2.3)</span></td>
<td>94.7 <span style="color:green">(+0.2)</span></td>
<td>94.9 <span style="color:green">(+0.2)</span></td>
<td>84.8 <span style="color:green">(+0.1)</span></td>
<td>86.6 <span style="color:green">(+0.1)</span></td>
</tr>
</tbody>
</table>
## Acknowledgements
Our code is built upon [Point-MAE](https://github.com/Pang-Yatian/Point-MAE), [Point-BERT](https://github.com/lulutang0608/Point-BERT), [Pointnet2_PyTorch](https://github.com/erikwijmans/Pointnet2_PyTorch) and [Pointnet_Pointnet2_pytorch](https://github.com/yanx27/Pointnet_Pointnet2_pytorch).
The unlabeled hybrid dataset and labeled hybrid dataset are built upon [ModelNet40](https://3dshapenets.cs.princeton.edu/), [PartNet](https://partnet.cs.stanford.edu/), [ShapeNet](http://www.shapenet.org), [S3DIS](http://buildingparser.stanford.edu/), [ScanObjectNN](https://hkust-vgd.github.io/scanobjectnn/), [SUN RGB-D](https://rgbd.cs.princeton.edu/), and [Semantic3D](http://semantic3d.net/).
## Reference
```
@article{chen2024pointgpt,
title={{PointGPT}: Auto-regressively generative pre-training from point clouds},
author={Chen, Guangyan and Wang, Meiling and Yang, Yi and Yu, Kai and Yuan, Li and Yue, Yufeng},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}
```
If you use the unlabeled hybrid dataset or the labeled hybrid dataset, please also cite the following works.
```
@inproceedings{wu20153d,
title={3d shapenets: A deep representation for volumetric shapes},
author={Wu, Zhirong and Song, Shuran and Khosla, Aditya and Yu, Fisher and Zhang, Linguang and Tang, Xiaoou and Xiao, Jianxiong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={1912--1920},
year={2015}
}
@inproceedings{mo2019partnet,
title={Partnet: A large-scale benchmark for fine-grained and hierarchical part-level 3d object understanding},
author={Mo, Kaichun and Zhu, Shilin and Chang, Angel X and Yi, Li and Tripathi, Subarna and Guibas, Leonidas J and Su, Hao},
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
pages={909--918},
year={2019}
}
@article{chang2015shapenet,
title={Shapenet: An information-rich 3d model repository},
author={Chang, Angel X and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and others},
journal={arXiv preprint arXiv:1512.03012},
year={2015}
}
@inproceedings{armeni20163d,
title={3d semantic parsing of large-scale indoor spaces},
author={Armeni, Iro and Sener, Ozan and Zamir, Amir R and Jiang, Helen and Brilakis, Ioannis and Fischer, Martin and Savarese, Silvio},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={1534--1543},
year={2016}
}
@inproceedings{uy-scanobjectnn-iccv19,
title = {Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data},
author = {Mikaela Angelina Uy and Quang-Hieu Pham and Binh-Son Hua and Duc Thanh Nguyen and Sai-Kit Yeung},
booktitle = {International Conference on Computer Vision (ICCV)},
year = {2019}
}
@inproceedings{song2015sun,
title={Sun rgb-d: A rgb-d scene understanding benchmark suite},
author={Song, Shuran and Lichtenberg, Samuel P and Xiao, Jianxiong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={567--576},
year={2015}
}
@article{hackel2017semantic3d,
title={Semantic3d. net: A new large-scale point cloud classification benchmark},
author={Hackel, Timo and Savinov, Nikolay and Ladicky, Lubor and Wegner, Jan D and Schindler, Konrad and Pollefeys, Marc},
journal={arXiv preprint arXiv:1704.03847},
year={2017}
}
```
================================================
FILE: cfgs/PointGPT-B/fewshot.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 30 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 40,
num_heads: 12,
group_size: 32,
num_group: 64,
encoder_dims: 768,
decoder_depth: 4,
}
npoints: 1024
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/finetune_modelnet.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 40,
num_heads: 12,
group_size: 32,
num_group: 64,
encoder_dims: 768,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 1024
total_bs: 128
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/finetune_modelnet_8k.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.00005, weight_decay: 0.005 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 40,
num_heads: 12,
group_size: 32,
num_group: 512,
encoder_dims: 768,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 8192
total_bs: 32
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/finetune_scan_hardest.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 30, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 12,
group_size: 32,
num_group: 128,
encoder_dims: 768,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 30
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/finetune_scan_objbg.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 30, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 12,
group_size: 32,
num_group: 128,
encoder_dims: 768,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 30
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/finetune_scan_objonly.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 12,
group_size: 32,
num_group: 128,
encoder_dims: 768,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/post_pretrain.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 100, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.2,
cls_dim: 87,
num_heads: 12,
group_size: 32,
num_group: 64,
encoder_dims: 768,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 1024
total_bs: 256
step_per_update: 1
max_epoch: 100
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-B/pretrain.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "train", npoints: 1024 },
},
val:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "test", npoints: 1024 },
},
test:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "test", npoints: 1024 },
},
}
model:
{
NAME: PointGPT,
cls_dim: 40,
group_size: 32,
num_group: 64,
loss: cdl12,
weight_center: 1,
transformer_config:
{
mask_ratio: 0.7,
mask_type: "rand",
trans_dim: 768,
encoder_dims: 768,
depth: 12,
drop_path_rate: 0.1,
num_heads: 12,
decoder_depth: 4,
decoder_num_heads: 12,
},
}
npoints: 1024
total_bs: 128
step_per_update: 1
max_epoch: 300
================================================
FILE: cfgs/PointGPT-L/fewshot.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 30 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 768,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 40,
num_heads: 12,
group_size: 32,
num_group: 64,
encoder_dims: 768,
decoder_depth: 4,
}
npoints: 1024
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/finetune_modelnet.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 40,
num_heads: 16,
group_size: 32,
num_group: 64,
encoder_dims: 1024,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 1024
total_bs: 128
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/finetune_modelnet_8k.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.00005, weight_decay: 0.005 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 40,
num_heads: 16,
group_size: 32,
num_group: 512,
encoder_dims: 1024,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 8192
total_bs: 32
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/finetune_scan_hardest.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 16,
group_size: 32,
num_group: 128,
encoder_dims: 1024,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/finetune_scan_objbg.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 16,
group_size: 32,
num_group: 128,
encoder_dims: 1024,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/finetune_scan_objonly.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 50, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 15,
num_heads: 16,
group_size: 32,
num_group: 128,
encoder_dims: 1024,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 50
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/post_pretrain.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 100, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/LabeledHybrid.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 1024,
depth: 24,
drop_path_rate: 0.2,
cls_dim: 87,
num_heads: 16,
group_size: 32,
num_group: 64,
encoder_dims: 1024,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 1024
total_bs: 256
step_per_update: 1
max_epoch: 100
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-L/pretrain.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.00006, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 600, initial_epochs: 80 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "train", npoints: 1024 },
},
val:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "test", npoints: 1024 },
},
test:
{
_base_: cfgs/dataset_configs/UnlabeledHybrid.yaml,
others: { subset: "test", npoints: 1024 },
},
}
model:
{
NAME: PointGPT,
cls_dim: 40,
group_size: 32,
num_group: 64,
loss: cdl12,
weight_center: 1,
transformer_config:
{
mask_ratio: 0.7,
mask_type: "rand",
trans_dim: 1024,
encoder_dims: 1024,
depth: 24,
drop_path_rate: 0.1,
num_heads: 16,
decoder_depth: 4,
decoder_num_heads: 12,
},
}
npoints: 1024
total_bs: 128
step_per_update: 1
max_epoch: 350
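Each config pairs AdamW with a `CosLR` schedule parameterized by `epochs` and `initial_epochs` (warmup). A minimal sketch of what a warmup-plus-cosine schedule of this shape typically computes, using the PointGPT-L pretrain values above (lr 6e-5, 600 epochs, 80 warmup); the repo's actual `CosLR` implementation may differ in details such as the minimum LR:

```python
import math

def cosine_lr(epoch, base_lr, total_epochs, warmup_epochs, min_lr=1e-6):
    """Linear warmup to base_lr, then cosine decay to min_lr.
    A sketch of a CosLR-style schedule; min_lr is an assumption."""
    if epoch < warmup_epochs:
        # linear warmup from ~0 up to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    # cosine decay over the remaining epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# PointGPT-L pretrain: lr=0.00006, epochs=600, initial_epochs=80
lrs = [cosine_lr(e, 6e-5, 600, 80) for e in range(600)]
```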
================================================
FILE: cfgs/PointGPT-S/fewshot.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 30 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40FewShot.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 40,
num_heads: 6,
group_size: 32,
num_group: 64,
encoder_dims: 384,
decoder_depth: 4,
}
npoints: 1024
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/finetune_modelnet.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 40,
num_heads: 6,
group_size: 32,
num_group: 64,
encoder_dims: 384,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 1024
total_bs: 128
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/finetune_modelnet_8k.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.005 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ModelNet40.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 40,
num_heads: 6,
group_size: 32,
num_group: 512,
encoder_dims: 384,
decoder_depth: 4,
loss: cdl2,
weight_center: 1,
}
npoints: 8192
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/finetune_scan_hardest.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 30 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_hardest.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 15,
num_heads: 6,
group_size: 32,
num_group: 128,
encoder_dims: 384,
decoder_depth: 4,
}
npoints: 2048
total_bs: 64
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/finetune_scan_objbg.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 15,
num_heads: 6,
group_size: 32,
num_group: 128,
encoder_dims: 384,
decoder_depth: 4,
}
npoints: 2048
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/finetune_scan_objonly.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "train" },
},
val:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
test:
{
_base_: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml,
others: { subset: "test" },
},
}
model:
{
NAME: PointTransformer,
trans_dim: 384,
depth: 12,
drop_path_rate: 0.1,
cls_dim: 15,
num_heads: 6,
group_size: 32,
num_group: 128,
encoder_dims: 384,
decoder_depth: 4,
}
npoints: 2048
total_bs: 32
step_per_update: 1
max_epoch: 300
grad_norm_clip: 10
================================================
FILE: cfgs/PointGPT-S/pretrain.yaml
================================================
optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }
scheduler: { type: CosLR, kwargs: { epochs: 300, initial_epochs: 10 } }
dataset:
{
train:
{
_base_: cfgs/dataset_configs/ShapeNet-55.yaml,
others: { subset: "train", npoints: 1024 },
},
val:
{
_base_: cfgs/dataset_configs/ShapeNet-55.yaml,
others: { subset: "test", npoints: 1024 },
},
test:
{
_base_: cfgs/dataset_configs/ShapeNet-55.yaml,
others: { subset: "test", npoints: 1024 },
},
}
model:
{
NAME: PointGPT,
cls_dim: 40,
group_size: 32,
num_group: 64,
loss: cdl12,
weight_center: 1,
transformer_config:
{
mask_ratio: 0.7,
mask_type: "rand",
trans_dim: 384,
encoder_dims: 384,
depth: 12,
drop_path_rate: 0.1,
num_heads: 6,
decoder_depth: 4,
decoder_num_heads: 6,
},
}
npoints: 1024
total_bs: 64
step_per_update: 1
max_epoch: 300
================================================
FILE: cfgs/dataset_configs/LabeledHybrid.yaml
================================================
NAME: LabeledHybrid
DATA_PATH: data/HybridDatasets/post_pretrain
N_POINTS: 2048
PC_PATH: data/HybridDatasets
npoints: 1024
NUM_CATEGORY: 87
================================================
FILE: cfgs/dataset_configs/ModelNet40.yaml
================================================
NAME: ModelNet
DATA_PATH: data/ModelNet/modelnet40_normal_resampled
N_POINTS: 8192
NUM_CATEGORY: 40
USE_NORMALS: FALSE
================================================
FILE: cfgs/dataset_configs/ModelNet40FewShot.yaml
================================================
NAME: ModelNetFewShot
DATA_PATH: data/ModelNetFewshot
N_POINTS: 8192
NUM_CATEGORY: 40
USE_NORMALS: FALSE
================================================
FILE: cfgs/dataset_configs/ScanObjectNN_hardest.yaml
================================================
NAME: ScanObjectNN_hardest
ROOT: data/ScanObjectNN/h5_files/main_split
================================================
FILE: cfgs/dataset_configs/ScanObjectNN_objectbg.yaml
================================================
NAME: ScanObjectNN
ROOT: data/ScanObjectNN/h5_files/main_split
================================================
FILE: cfgs/dataset_configs/ScanObjectNN_objectonly.yaml
================================================
NAME: ScanObjectNN
ROOT: data/ScanObjectNN/h5_files/main_split_nobg
================================================
FILE: cfgs/dataset_configs/ShapeNet-55.yaml
================================================
NAME: ShapeNet
DATA_PATH: data/ShapeNet55-34/ShapeNet-55
N_POINTS: 8192
PC_PATH: data/ShapeNet55-34/shapenet_pc
================================================
FILE: cfgs/dataset_configs/UnlabeledHybrid.yaml
================================================
NAME: UnlabeledHybrid
DATA_PATH: data/HybridDatasets/pretrain
N_POINTS: 2048
PC_PATH: data/HybridDatasets
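The experiment configs above reference these dataset configs through `_base_:` and then override or extend keys via `others:`. A sketch of how such inheritance is typically resolved (load the base file, overlay the local keys); the repo's actual merge logic lives in its config utilities and may handle nesting differently:

```python
# Hypothetical helper: overlay a local config node on its `_base_` config.
def merge_base(base_cfg: dict, local_cfg: dict) -> dict:
    merged = dict(base_cfg)           # start from the base config
    for k, v in local_cfg.items():
        if k == '_base_':
            continue                  # the file reference itself is dropped
        merged[k] = v                 # local keys override base keys
    return merged

# Contents of ModelNet40.yaml, as a dict
base = {'NAME': 'ModelNet', 'N_POINTS': 8192, 'NUM_CATEGORY': 40}
# A dataset.train node from one of the finetune configs
local = {'_base_': 'cfgs/dataset_configs/ModelNet40.yaml',
         'others': {'subset': 'train'}}
cfg = merge_base(base, local)
```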
================================================
FILE: datasets/LabeledHybrid.py
================================================
import os
import torch
import numpy as np
import torch.utils.data as data
from .io import IO
from .build import DATASETS
from utils.logger import *
@DATASETS.register_module()
class LabeledHybrid(data.Dataset):
def __init__(self, config):
self.data_root = config.DATA_PATH
self.pc_path = config.PC_PATH
self.subset = config.subset
self.npoints = config.N_POINTS
self.data_list_file = os.path.join(self.data_root, f'{self.subset}.txt')
self.label_list_file = os.path.join(self.data_root, f'{self.subset}_num.txt')
self.sample_points_num = config.npoints
print_log(f'[DATASET] sample out {self.sample_points_num} points', logger = 'LabeledHybrid')
print_log(f'[DATASET] Open file {self.data_list_file}', logger = 'LabeledHybrid')
with open(self.data_list_file, 'r') as f:
lines = f.readlines()
print_log(f'[DATASET] Open file {self.label_list_file}', logger = 'LabeledHybrid')
with open(self.label_list_file, 'r') as f:
lines_label = f.readlines()
self.file_list = []
for line in lines:
self.file_list.append(line.strip())
print_log(f'[DATASET] {len(self.file_list)} instances were loaded', logger = 'LabeledHybrid')
self.label_list = []
for line_label in lines_label:
self.label_list.append(np.array(int(line_label.strip())))
print_log(f'[DATASET] {len(self.label_list)} labels were loaded', logger = 'LabeledHybrid')
def pc_norm(self, pc):
""" pc: NxC, return NxC """
centroid = np.mean(pc, axis=0)
pc = pc - centroid
m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
pc = pc / m
return pc
def random_sample(self, pc, num):
permutation = np.arange(pc.shape[0])
np.random.shuffle(permutation)
pc = pc[permutation[:num]]
return pc
def __getitem__(self, idx):
sample = self.file_list[idx]
label = self.label_list[idx]
data = IO.get(os.path.join(self.pc_path, sample)).astype(np.float32)
data = self.random_sample(data, self.sample_points_num)
data = self.pc_norm(data)
data = torch.from_numpy(data).float()
return 'LabeledHybrid', 'sample', (data, label)
def __len__(self):
return len(self.file_list)
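The `pc_norm` method above (repeated across these dataset classes) centers each cloud and scales it into the unit sphere. A standalone check of that logic on synthetic data:

```python
import numpy as np

# Same normalization as pc_norm: center on the centroid, then divide
# by the radius of the farthest point.
def pc_norm(pc):
    centroid = np.mean(pc, axis=0)
    pc = pc - centroid
    m = np.max(np.sqrt(np.sum(pc ** 2, axis=1)))
    return pc / m

rng = np.random.default_rng(0)
pts = rng.normal(size=(2048, 3)) * 5.0 + 10.0   # off-center, large scale
normed = pc_norm(pts)
radii = np.sqrt((normed ** 2).sum(axis=1))       # all radii end up <= 1
```

After normalization the centroid sits at the origin and the farthest point lies exactly on the unit sphere, so every cloud the model sees occupies the same scale regardless of the source mesh.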
================================================
FILE: datasets/ModelNetDataset.py
================================================
'''
@author: Xu Yan
@file: ModelNet.py
@time: 2021/3/19 15:51
'''
import os
import numpy as np
import warnings
import pickle
from tqdm import tqdm
from torch.utils.data import Dataset
from .build import DATASETS
from utils.logger import *
import torch
warnings.filterwarnings('ignore')
def pc_normalize(pc):
centroid = np.mean(pc, axis=0)
pc = pc - centroid
m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
pc = pc / m
return pc
def farthest_point_sample(point, npoint):
"""
Input:
point: point cloud data, [N, D]
npoint: number of samples
Return:
point: sampled point cloud, [npoint, D]
"""
N, D = point.shape
xyz = point[:,:3]
centroids = np.zeros((npoint,))
distance = np.ones((N,)) * 1e10
farthest = np.random.randint(0, N)
for i in range(npoint):
centroids[i] = farthest
centroid = xyz[farthest, :]
dist = np.sum((xyz - centroid) ** 2, -1)
mask = dist < distance
distance[mask] = dist[mask]
farthest = np.argmax(distance, -1)
point = point[centroids.astype(np.int32)]
return point
@DATASETS.register_module()
class ModelNet(Dataset):
def __init__(self, config):
self.root = config.DATA_PATH
self.npoints = config.N_POINTS
self.use_normals = config.USE_NORMALS
self.num_category = config.NUM_CATEGORY
self.process_data = True
self.uniform = True
split = config.subset
self.subset = config.subset
if self.num_category == 10:
self.catfile = os.path.join(self.root, 'modelnet10_shape_names.txt')
else:
self.catfile = os.path.join(self.root, 'modelnet40_shape_names.txt')
self.cat = [line.rstrip() for line in open(self.catfile)]
self.classes = dict(zip(self.cat, range(len(self.cat))))
shape_ids = {}
if self.num_category == 10:
shape_ids['train'] = [line.rstrip() for line in open(os.path.join(self.root, 'modelnet10_train.txt'))]
shape_ids['test'] = [line.rstrip() for line in open(os.path.join(self.root, 'modelnet10_test.txt'))]
else:
shape_ids['train'] = [line.rstrip() for line in open(os.path.join(self.root, 'modelnet40_train.txt'))]
shape_ids['test'] = [line.rstrip() for line in open(os.path.join(self.root, 'modelnet40_test.txt'))]
assert (split == 'train' or split == 'test')
shape_names = ['_'.join(x.split('_')[0:-1]) for x in shape_ids[split]]
self.datapath = [(shape_names[i], os.path.join(self.root, shape_names[i], shape_ids[split][i]) + '.txt') for i
in range(len(shape_ids[split]))]
print_log('The size of %s data is %d' % (split, len(self.datapath)), logger = 'ModelNet')
if self.uniform:
self.save_path = os.path.join(self.root, 'modelnet%d_%s_%dpts_fps.dat' % (self.num_category, split, self.npoints))
else:
self.save_path = os.path.join(self.root, 'modelnet%d_%s_%dpts.dat' % (self.num_category, split, self.npoints))
if self.process_data:
if not os.path.exists(self.save_path):
print_log('Processing data %s (only runs the first time)...' % self.save_path, logger = 'ModelNet')
self.list_of_points = [None] * len(self.datapath)
self.list_of_labels = [None] * len(self.datapath)
for index in tqdm(range(len(self.datapath)), total=len(self.datapath)):
fn = self.datapath[index]
cls = self.classes[self.datapath[index][0]]
cls = np.array([cls]).astype(np.int32)
point_set = np.loadtxt(fn[1], delimiter=',').astype(np.float32)
if self.uniform:
point_set = farthest_point_sample(point_set, self.npoints)
else:
point_set = point_set[0:self.npoints, :]
self.list_of_points[index] = point_set
self.list_of_labels[index] = cls
with open(self.save_path, 'wb') as f:
pickle.dump([self.list_of_points, self.list_of_labels], f)
else:
print_log('Load processed data from %s...' % self.save_path, logger = 'ModelNet')
with open(self.save_path, 'rb') as f:
self.list_of_points, self.list_of_labels = pickle.load(f)
def __len__(self):
return len(self.datapath)
def _get_item(self, index):
if self.process_data:
point_set, label = self.list_of_points[index], self.list_of_labels[index]
else:
fn = self.datapath[index]
cls = self.classes[self.datapath[index][0]]
label = np.array([cls]).astype(np.int32)
point_set = np.loadtxt(fn[1], delimiter=',').astype(np.float32)
if self.uniform:
point_set = farthest_point_sample(point_set, self.npoints)
else:
point_set = point_set[0:self.npoints, :]
point_set[:, 0:3] = pc_normalize(point_set[:, 0:3])
if not self.use_normals:
point_set = point_set[:, 0:3]
return point_set, label[0]
def __getitem__(self, index):
points, label = self._get_item(index)
pt_idxs = np.arange(0, points.shape[0]) # 2048
if self.subset == 'train':
np.random.shuffle(pt_idxs)
current_points = points[pt_idxs].copy()
current_points = torch.from_numpy(current_points).float()
return 'ModelNet', 'sample', (current_points, label)
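`farthest_point_sample` above is the routine that builds the cached `*_fps.dat` files. A self-contained copy exercised on synthetic data, to show its output shape and that it picks distinct points:

```python
import numpy as np

# Same FPS routine as in ModelNetDataset.py, run standalone.
def farthest_point_sample(point, npoint):
    N, D = point.shape
    xyz = point[:, :3]
    centroids = np.zeros((npoint,))
    distance = np.ones((N,)) * 1e10
    farthest = np.random.randint(0, N)
    for i in range(npoint):
        centroids[i] = farthest          # keep the current farthest point
        centroid = xyz[farthest, :]
        dist = np.sum((xyz - centroid) ** 2, -1)
        mask = dist < distance
        distance[mask] = dist[mask]      # distance to the nearest kept point
        farthest = np.argmax(distance, -1)
    return point[centroids.astype(np.int32)]

pts = np.random.rand(500, 3).astype(np.float32)
sampled = farthest_point_sample(pts, 64)
```

Note the O(N * npoint) cost: sampling 8192 points from each ModelNet mesh is why the result is pickled once and reloaded on later runs.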
================================================
FILE: datasets/ModelNetDatasetFewShot.py
================================================
'''
@author: Xu Yan
@file: ModelNet.py
@time: 2021/3/19 15:51
'''
import os
import numpy as np
import warnings
import pickle
from tqdm import tqdm
from torch.utils.data import Dataset
from .build import DATASETS
from utils.logger import *
import torch
import random
warnings.filterwarnings('ignore')
def pc_normalize(pc):
centroid = np.mean(pc, axis=0)
pc = pc - centroid
m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
pc = pc / m
return pc
@DATASETS.register_module()
class ModelNetFewShot(Dataset):
def __init__(self, config):
self.root = config.DATA_PATH
self.npoints = config.N_POINTS
self.use_normals = config.USE_NORMALS
self.num_category = config.NUM_CATEGORY
self.process_data = True
self.uniform = True
split = config.subset
self.subset = config.subset
self.way = config.way
self.shot = config.shot
self.fold = config.fold
if self.way == -1 or self.shot == -1 or self.fold == -1:
raise RuntimeError('few-shot config requires way, shot and fold to be set')
self.pickle_path = os.path.join(self.root, f'{self.way}way_{self.shot}shot', f'{self.fold}.pkl')
print_log('Load processed data from %s...' % self.pickle_path, logger = 'ModelNetFewShot')
with open(self.pickle_path, 'rb') as f:
self.dataset = pickle.load(f)[self.subset]
print_log('The size of %s data is %d' % (split, len(self.dataset)), logger = 'ModelNetFewShot')
def __len__(self):
return len(self.dataset)
def __getitem__(self, index):
points, label, _ = self.dataset[index]
points[:, 0:3] = pc_normalize(points[:, 0:3])
if not self.use_normals:
points = points[:, 0:3]
pt_idxs = np.arange(0, points.shape[0]) # 2048
if self.subset == 'train':
np.random.shuffle(pt_idxs)
current_points = points[pt_idxs].copy()
current_points = torch.from_numpy(current_points).float()
return 'ModelNet', 'sample', (current_points, label)
================================================
FILE: datasets/ScanObjectNNDataset.py
================================================
import numpy as np
import os, sys, h5py
from torch.utils.data import Dataset
import torch
from .build import DATASETS
from utils.logger import *
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(BASE_DIR)
@DATASETS.register_module()
class ScanObjectNN(Dataset):
def __init__(self, config, **kwargs):
super().__init__()
self.subset = config.subset
self.root = config.ROOT
if self.subset == 'train':
h5 = h5py.File(os.path.join(self.root, 'training_objectdataset.h5'), 'r')
self.points = np.array(h5['data']).astype(np.float32)
self.labels = np.array(h5['label']).astype(int)
h5.close()
elif self.subset == 'test':
h5 = h5py.File(os.path.join(self.root, 'test_objectdataset.h5'), 'r')
self.points = np.array(h5['data']).astype(np.float32)
self.labels = np.array(h5['label']).astype(int)
h5.close()
else:
raise NotImplementedError()
print(f'Successfully loaded ScanObjectNN, shape: {self.points.shape}')
def __getitem__(self, idx):
pt_idxs = np.arange(0, self.points.shape[1]) # 2048
if self.subset == 'train':
np.random.shuffle(pt_idxs)
current_points = self.points[idx, pt_idxs].copy()
current_points = torch.from_numpy(current_points).float()
label = self.labels[idx]
return 'ScanObjectNN', 'sample', (current_points, label)
def __len__(self):
return self.points.shape[0]
@DATASETS.register_module()
class ScanObjectNN_hardest(Dataset):
def __init__(self, config, **kwargs):
super().__init__()
self.subset = config.subset
self.root = config.ROOT
if self.subset == 'train':
h5 = h5py.File(os.path.join(self.root, 'training_objectdataset_augmentedrot_scale75.h5'), 'r')
self.points = np.array(h5['data']).astype(np.float32)
self.labels = np.array(h5['label']).astype(int)
h5.close()
elif self.subset == 'test':
h5 = h5py.File(os.path.join(self.root, 'test_objectdataset_augmentedrot_scale75.h5'), 'r')
self.points = np.array(h5['data']).astype(np.float32)
self.labels = np.array(h5['label']).astype(int)
h5.close()
else:
raise NotImplementedError()
print(f'Successfully loaded ScanObjectNN, shape: {self.points.shape}')
def __getitem__(self, idx):
pt_idxs = np.arange(0, self.points.shape[1]) # 2048
if self.subset == 'train':
np.random.shuffle(pt_idxs)
current_points = self.points[idx, pt_idxs].copy()
current_points = torch.from_numpy(current_points).float()
label = self.labels[idx]
return 'ScanObjectNN', 'sample', (current_points, label)
def __len__(self):
return self.points.shape[0]
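The train-split augmentation in `__getitem__` above only permutes point order via `pt_idxs`; the underlying point set is unchanged. A quick standalone check (array shapes here are illustrative, matching the 2048-point ScanObjectNN clouds):

```python
import numpy as np

points = np.random.rand(4, 2048, 3).astype(np.float32)  # fake batch of clouds
idx = 1
pt_idxs = np.arange(points.shape[1])
np.random.shuffle(pt_idxs)                # same shuffle as the train branch
shuffled = points[idx, pt_idxs]           # reordered copy of cloud `idx`
```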
================================================
FILE: datasets/ShapeNet55Dataset.py
================================================
import os
import torch
import numpy as np
import torch.utils.data as data
from .io import IO
from .build import DATASETS
from utils.logger import *
@DATASETS.register_module()
class ShapeNet(data.Dataset):
def __init__(self, config):
self.data_root = config.DATA_PATH
self.pc_path = config.PC_PATH
self.subset = config.subset
self.npoints = config.N_POINTS
self.data_list_file = os.path.join(
self.data_root, f'{self.subset}.txt')
test_data_list_file = os.path.join(self.data_root, 'test.txt')
self.sample_points_num = config.npoints
self.whole = config.get('whole')
print_log(
f'[DATASET] sample out {self.sample_points_num} points', logger='ShapeNet-55')
print_log(
f'[DATASET] Open file {self.data_list_file}', logger='ShapeNet-55')
with open(self.data_list_file, 'r') as f:
lines = f.readlines()
if self.whole:
with open(test_data_list_file, 'r') as f:
test_lines = f.readlines()
print_log(
f'[DATASET] Open file {test_data_list_file}', logger='ShapeNet-55')
lines = test_lines + lines
self.file_list = []
for line in lines:
line = line.strip()
taxonomy_id = line.split('-')[0]
model_id = line.split('-')[1].split('.')[0]
self.file_list.append({
'taxonomy_id': taxonomy_id,
'model_id': model_id,
'file_path': line
})
print_log(
f'[DATASET] {len(self.file_list)} instances were loaded', logger='ShapeNet-55')
self.permutation = np.arange(self.npoints)
def pc_norm(self, pc):
""" pc: NxC, return NxC """
centroid = np.mean(pc, axis=0)
pc = pc - centroid
m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
pc = pc / m
return pc
def random_sample(self, pc, num):
np.random.shuffle(self.permutation)
pc = pc[self.permutation[:num]]
return pc
def __getitem__(self, idx):
sample = self.file_list[idx]
data = IO.get(os.path.join(
self.pc_path, sample['file_path'])).astype(np.float32)
data = self.random_sample(data, self.sample_points_num)
data = self.pc_norm(data)
data = torch.from_numpy(data).float()
return sample['taxonomy_id'], sample['model_id'], data
def __len__(self):
return len(self.file_list)
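One subtlety worth noting: `ShapeNet.random_sample` shuffles a cached permutation of length `N_POINTS`, so it implicitly assumes every cloud has at least that many points, whereas `UnlabeledHybrid` (below) builds the permutation per cloud, which is safe for ragged inputs. A side-by-side sketch of the two strategies:

```python
import numpy as np

def sample_fixed(pc, permutation, num):
    # ShapeNet style: reuse one permutation of indices 0..N_POINTS-1
    np.random.shuffle(permutation)
    return pc[permutation[:num]]

def sample_per_cloud(pc, num):
    # UnlabeledHybrid style: size the permutation to this cloud
    permutation = np.arange(pc.shape[0])
    np.random.shuffle(permutation)
    return pc[permutation[:num]]

pc = np.random.rand(8192, 3)      # ShapeNet-55 clouds have 8192 points
perm = np.arange(8192)
a = sample_fixed(pc, perm, 1024)
b = sample_per_cloud(pc, 1024)
```

With the ShapeNet-55 data (uniform 8192-point clouds) the two are equivalent; the fixed permutation would index out of range only if a cloud were smaller than `N_POINTS`.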
================================================
FILE: datasets/UnlabeledHybrid.py
================================================
import os
import torch
import numpy as np
import torch.utils.data as data
from .io import IO
from .build import DATASETS
from utils.logger import *
@DATASETS.register_module()
class UnlabeledHybrid(data.Dataset):
def __init__(self, config):
self.data_root = config.DATA_PATH
self.pc_path = config.PC_PATH
self.subset = config.subset
self.npoints = config.N_POINTS
self.data_list_file = os.path.join(
self.data_root, f'{self.subset}.txt')
test_data_list_file = os.path.join(self.data_root, 'test.txt')
self.sample_points_num = config.npoints
self.whole = config.get('whole')
print_log(
f'[DATASET] sample out {self.sample_points_num} points', logger='UnlabeledHybrid')
print_log(
f'[DATASET] Open file {self.data_list_file}', logger='UnlabeledHybrid')
with open(self.data_list_file, 'r') as f:
lines = f.readlines()
if self.whole:
with open(test_data_list_file, 'r') as f:
test_lines = f.readlines()
print_log(
f'[DATASET] Open file {test_data_list_file}', logger='UnlabeledHybrid')
lines = test_lines + lines
self.file_list = []
for line in lines:
line = line.strip()
taxonomy_id = ''
model_id = ''
self.file_list.append({
'taxonomy_id': taxonomy_id,
'model_id': model_id,
'file_path': line
})
print_log(
f'[DATASET] {len(self.file_list)} instances were loaded', logger='UnlabeledHybrid')
self.permutation = np.arange(self.npoints)
def pc_norm(self, pc):
""" pc: NxC, return NxC """
centroid = np.mean(pc, axis=0)
pc = pc - centroid
m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
pc = pc / m
return pc
def random_sample(self, pc, num):
permutation = np.arange(pc.shape[0])
np.random.shuffle(permutation)
pc = pc[permutation[:num]]
return pc
def __getitem__(self, idx):
sample = self.file_list[idx]
data = IO.get(os.path.join(
self.pc_path, sample['file_path'])).astype(np.float32)
data = self.random_sample(data, self.sample_points_num)
data = self.pc_norm(data)
data = torch.from_numpy(data).float()
# sample['taxonomy_id'] and sample['model_id'] are not utilized
return sample['taxonomy_id'], sample['model_id'], data
def __len__(self):
return len(self.file_list)
================================================
FILE: datasets/__init__.py
================================================
from .build import build_dataset_from_cfg
import datasets.ShapeNet55Dataset
import datasets.ModelNetDataset
import datasets.ModelNetDatasetFewShot
import datasets.ScanObjectNNDataset
import datasets.LabeledHybrid
import datasets.UnlabeledHybrid
================================================
FILE: datasets/build.py
================================================
from utils import registry
DATASETS = registry.Registry('dataset')
def build_dataset_from_cfg(cfg, default_args = None):
"""
Build a dataset from a config node.
Args:
cfg (eDICT): config whose NAME field selects the registered dataset class.
default_args (dict, optional): extra kwargs merged into the config.
Returns:
Dataset: the constructed dataset instance.
"""
return DATASETS.build(cfg, default_args = default_args)
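`utils.registry.Registry` is not shown in this extract; a minimal sketch of the `register_module()` / `build()` pattern it supports (the real implementation is more featureful, e.g. `default_args` merging and EasyDict configs):

```python
# Hypothetical minimal registry mirroring how the dataset classes
# in this package register themselves and get built from a config.
class Registry:
    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def _register(cls):
            self._modules[cls.__name__] = cls   # keyed by class name
            return cls
        return _register

    def build(self, cfg):
        # look up the class named by the config's NAME field
        cls = self._modules[cfg['NAME']]
        return cls(cfg)

DATASETS = Registry('dataset')

@DATASETS.register_module()
class DummyDataset:
    def __init__(self, cfg):
        self.subset = cfg.get('subset', 'train')

ds = DATASETS.build({'NAME': 'DummyDataset', 'subset': 'test'})
```

This is why each dataset file only needs the `@DATASETS.register_module()` decorator plus an import in `datasets/__init__.py` for `build_dataset_from_cfg` to find it by its `NAME` key.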
================================================
FILE: datasets/data_transforms.py
================================================
import numpy as np
import torch
import random
class PointcloudRotate(object):
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
rotation_angle = np.random.uniform() * 2 * np.pi
cosval = np.cos(rotation_angle)
sinval = np.sin(rotation_angle)
rotation_matrix = np.array([[cosval, 0, sinval],
[0, 1, 0],
[-sinval, 0, cosval]])
R = torch.from_numpy(rotation_matrix.astype(np.float32)).to(pc.device)
pc[i, :, :] = torch.matmul(pc[i], R)
return pc
class PointcloudScaleAndTranslate(object):
def __init__(self, scale_low=2. / 3., scale_high=3. / 2., translate_range=0.2):
self.scale_low = scale_low
self.scale_high = scale_high
self.translate_range = translate_range
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
xyz1 = np.random.uniform(low=self.scale_low, high=self.scale_high, size=[3])
xyz2 = np.random.uniform(low=-self.translate_range, high=self.translate_range, size=[3])
pc[i, :, 0:3] = torch.mul(pc[i, :, 0:3], torch.from_numpy(xyz1).float().cuda()) + torch.from_numpy(xyz2).float().cuda()
return pc
class PointcloudJitter(object):
def __init__(self, std=0.01, clip=0.05):
self.std, self.clip = std, clip
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
jittered_data = pc.new(pc.size(1), 3).normal_(
mean=0.0, std=self.std
).clamp_(-self.clip, self.clip)
pc[i, :, 0:3] += jittered_data
return pc
class PointcloudScale(object):
def __init__(self, scale_low=2. / 3., scale_high=3. / 2.):
self.scale_low = scale_low
self.scale_high = scale_high
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
xyz1 = np.random.uniform(low=self.scale_low, high=self.scale_high, size=[3])
pc[i, :, 0:3] = torch.mul(pc[i, :, 0:3], torch.from_numpy(xyz1).float().cuda())
return pc
class PointcloudTranslate(object):
def __init__(self, translate_range=0.2):
self.translate_range = translate_range
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
xyz2 = np.random.uniform(low=-self.translate_range, high=self.translate_range, size=[3])
pc[i, :, 0:3] = pc[i, :, 0:3] + torch.from_numpy(xyz2).float().cuda()
return pc
class PointcloudRandomInputDropout(object):
def __init__(self, max_dropout_ratio=0.5):
assert max_dropout_ratio >= 0 and max_dropout_ratio < 1
self.max_dropout_ratio = max_dropout_ratio
def __call__(self, pc):
bsize = pc.size()[0]
for i in range(bsize):
dropout_ratio = np.random.random() * self.max_dropout_ratio # 0 ~ max_dropout_ratio
drop_idx = np.where(np.random.random((pc.size()[1])) <= dropout_ratio)[0]
if len(drop_idx) > 0:
cur_pc = pc[i, :, :]
cur_pc[drop_idx.tolist(), 0:3] = cur_pc[0, 0:3].repeat(len(drop_idx), 1) # set to the first point
pc[i, :, :] = cur_pc
return pc
class RandomHorizontalFlip(object):
def __init__(self, upright_axis = 'z', is_temporal=False):
"""
upright_axis: axis index among x,y,z, i.e. 2 for z
"""
self.is_temporal = is_temporal
self.D = 4 if is_temporal else 3
self.upright_axis = {'x': 0, 'y': 1, 'z': 2}[upright_axis.lower()]
# Use the rest of axes for flipping.
self.horz_axes = set(range(self.D)) - set([self.upright_axis])
def __call__(self, coords):
bsize = coords.size()[0]
for i in range(bsize):
if random.random() < 0.95:
for curr_ax in self.horz_axes:
if random.random() < 0.5:
coord_max = torch.max(coords[i, :, curr_ax])
coords[i, :, curr_ax] = coord_max - coords[i, :, curr_ax]
return coords
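`PointcloudRotate` above rotates each cloud about the y (up) axis. The matrix it builds is orthonormal, so point norms are preserved; a quick numpy check of that property:

```python
import numpy as np

# Same y-axis rotation matrix as PointcloudRotate constructs.
angle = np.random.uniform() * 2 * np.pi
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0, s],
              [0, 1, 0],
              [-s, 0, c]])
pts = np.random.rand(1024, 3)
rotated = pts @ R                 # rotate the whole cloud at once
```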
================================================
FILE: datasets/generate_few_shot_data.py
================================================
import pickle
import numpy as np
import random
import os
root = '../data/ModelNet/modelnet40_normal_resampled'
target = '../data/ModelNetFewshot'
train_data_path = os.path.join(root, 'modelnet40_train_8192pts_fps.dat')
test_data_path = os.path.join(root, 'modelnet40_test_8192pts_fps.dat')
# train
with open(train_data_path, 'rb') as f:
train_list_of_points, train_list_of_labels = pickle.load(f)
with open(test_data_path, 'rb') as f:
test_list_of_points, test_list_of_labels = pickle.load(f)
# list_of_points = train_list_of_points + test_list_of_points
# list_of_labels = train_list_of_labels + test_list_of_labels
def generate_fewshot_data(way, shot, prefix_ind, eval_sample=20):
train_cls_dataset = {}
test_cls_dataset = {}
train_dataset = []
test_dataset = []
# build a dict containing different class
for point, label in zip(train_list_of_points, train_list_of_labels):
label = label[0]
if train_cls_dataset.get(label) is None:
train_cls_dataset[label] = []
train_cls_dataset[label].append(point)
# build a dict containing different class
for point, label in zip(test_list_of_points, test_list_of_labels):
label = label[0]
if test_cls_dataset.get(label) is None:
test_cls_dataset[label] = []
test_cls_dataset[label].append(point)
print(sum(len(train_cls_dataset[i]) for i in range(40)))
print(sum(len(test_cls_dataset[i]) for i in range(40)))
keys = list(train_cls_dataset.keys())
random.shuffle(keys)
for i, key in enumerate(keys[:way]):
train_data_list = train_cls_dataset[key]
random.shuffle(train_data_list)
assert len(train_data_list) > shot
for data in train_data_list[:shot]:
train_dataset.append((data, i, key))
test_data_list = test_cls_dataset[key]
random.shuffle(test_data_list)
assert len(test_data_list) >= eval_sample
for data in test_data_list[:eval_sample]:
test_dataset.append((data, i, key))
random.shuffle(train_dataset)
random.shuffle(test_dataset)
dataset = {
'train': train_dataset,
'test' : test_dataset
}
save_path = os.path.join(target, f'{way}way_{shot}shot')
if not os.path.exists(save_path):
os.makedirs(save_path)
with open(os.path.join(save_path, f'{prefix_ind}.pkl'), 'wb') as f:
pickle.dump(dataset, f)
if __name__ == '__main__':
ways = [5, 10]
shots = [10, 20]
for way in ways:
for shot in shots:
for i in range(10):
generate_fewshot_data(way = way, shot = shot, prefix_ind = i)
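The sampling in `generate_fewshot_data` (pick `way` random classes, `shot` random training items per class, relabel the classes 0..way-1) can be sketched on in-memory data; `fewshot_split` is a name introduced here for illustration:

```python
import random

def fewshot_split(cls_to_items, way, shot):
    """Pick `way` random classes and `shot` random items from each,
    relabelling the chosen classes 0..way-1 as generate_fewshot_data does."""
    keys = list(cls_to_items)
    random.shuffle(keys)
    train = []
    for new_label, key in enumerate(keys[:way]):
        items = list(cls_to_items[key])
        random.shuffle(items)
        assert len(items) >= shot
        train.extend((item, new_label, key) for item in items[:shot])
    random.shuffle(train)
    return train
```

Each tuple keeps both the remapped label and the original class key, matching the `(data, i, key)` records the script pickles.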
================================================
FILE: datasets/io.py
================================================
import h5py
import numpy as np
# import open3d
import os
class IO:
@classmethod
def get(cls, file_path):
_, file_extension = os.path.splitext(file_path)
if file_extension in ['.npy']:
return cls._read_npy(file_path)
# elif file_extension in ['.pcd']:
# return cls._read_pcd(file_path)
elif file_extension in ['.h5']:
return cls._read_h5(file_path)
elif file_extension in ['.txt']:
return cls._read_txt(file_path)
else:
raise Exception('Unsupported file extension: %s' % file_extension)
# References: https://github.com/numpy/numpy/blob/master/numpy/lib/format.py
@classmethod
def _read_npy(cls, file_path):
return np.load(file_path)
# References: https://github.com/dimatura/pypcd/blob/master/pypcd/pypcd.py#L275
# Support PCD files without compression ONLY!
# @classmethod
# def _read_pcd(cls, file_path):
# pc = open3d.io.read_point_cloud(file_path)
# ptcloud = np.array(pc.points)
# return ptcloud
@classmethod
def _read_txt(cls, file_path):
return np.loadtxt(file_path)
@classmethod
def _read_h5(cls, file_path):
# read the dataset into memory so the file handle can be closed
with h5py.File(file_path, 'r') as f:
return f['data'][()]
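`IO.get` dispatches on the file extension and raises for anything unsupported. A standalone sketch of that dispatch pattern for the `.npy`/`.txt` branches (`load_points` is a name introduced here, not the class above):

```python
import os
import tempfile
import numpy as np

def load_points(file_path):
    """Standalone sketch of IO.get's extension dispatch."""
    ext = os.path.splitext(file_path)[1]
    if ext == '.npy':
        return np.load(file_path)
    if ext == '.txt':
        return np.loadtxt(file_path)
    raise ValueError('Unsupported file extension: %s' % ext)

# round trip through the .npy branch
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'cloud.npy')
    pts = np.random.rand(1024, 3).astype(np.float32)
    np.save(path, pts)
    assert np.array_equal(load_points(path), pts)
```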
================================================
FILE: extensions/chamfer_dist/__init__.py
================================================
# -*- coding: utf-8 -*-
# @Author: Thibault GROUEIX
# @Date: 2019-08-07 20:54:24
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-18 15:06:25
# @Email: cshzxie@gmail.com
import torch
import chamfer
class ChamferFunction(torch.autograd.Function):
@staticmethod
def forward(ctx, xyz1, xyz2):
dist1, dist2, idx1, idx2 = chamfer.forward(xyz1, xyz2)
ctx.save_for_backward(xyz1, xyz2, idx1, idx2)
return dist1, dist2
@staticmethod
def backward(ctx, grad_dist1, grad_dist2):
xyz1, xyz2, idx1, idx2 = ctx.saved_tensors
grad_xyz1, grad_xyz2 = chamfer.backward(xyz1, xyz2, idx1, idx2, grad_dist1, grad_dist2)
return grad_xyz1, grad_xyz2
class ChamferDistanceL2(torch.nn.Module):
''' Chamfer Distance L2
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
return torch.mean(dist1) + torch.mean(dist2)
class ChamferDistanceL2_split(torch.nn.Module):
''' Chamfer Distance L2 (split)
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
return torch.mean(dist1), torch.mean(dist2)
class ChamferDistanceL1(torch.nn.Module):
''' Chamfer Distance L1
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
dist1 = torch.sqrt(dist1)
dist2 = torch.sqrt(dist2)
return (torch.mean(dist1) + torch.mean(dist2))/2
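All three loss modules reduce the same per-point nearest-neighbour squared distances computed by the CUDA extension. As a sanity check, the quantity `ChamferDistanceL2` returns can be reproduced on CPU with plain NumPy (a sketch for a single unbatched pair of clouds; `chamfer_l2_ref` is a name introduced here):

```python
import numpy as np

def chamfer_l2_ref(xyz1, xyz2):
    """Reference L2 Chamfer distance for one pair of clouds.

    xyz1: (n, 3), xyz2: (m, 3). Mirrors ChamferDistanceL2: mean over xyz1
    of the min squared distance to xyz2, plus the symmetric term.
    """
    d = ((xyz1[:, None, :] - xyz2[None, :, :]) ** 2).sum(-1)  # (n, m)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

This O(n*m) reference is only practical for small clouds, but it is handy for verifying a freshly built extension.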
================================================
FILE: extensions/chamfer_dist/chamfer.cu
================================================
/*
* @Author: Haozhe Xie
* @Date: 2019-08-07 20:54:24
* @Last Modified by: Haozhe Xie
* @Last Modified time: 2020-06-17 14:58:55
* @Email: cshzxie@gmail.com
*/
#include <cuda.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include <vector>
__global__ void chamfer_dist_kernel(int batch_size,
int n,
const float* xyz1,
int m,
const float* xyz2,
float* dist,
int* indexes) {
const int batch = 512;
__shared__ float buf[batch * 3];
for (int i = blockIdx.x; i < batch_size; i += gridDim.x) {
for (int k2 = 0; k2 < m; k2 += batch) {
int end_k = min(m, k2 + batch) - k2;
for (int j = threadIdx.x; j < end_k * 3; j += blockDim.x) {
buf[j] = xyz2[(i * m + k2) * 3 + j];
}
__syncthreads();
for (int j = threadIdx.x + blockIdx.y * blockDim.x; j < n;
j += blockDim.x * gridDim.y) {
float x1 = xyz1[(i * n + j) * 3 + 0];
float y1 = xyz1[(i * n + j) * 3 + 1];
float z1 = xyz1[(i * n + j) * 3 + 2];
float best_dist = 0;
int best_dist_index = 0;
int end_ka = end_k - (end_k & 3);
if (end_ka == batch) {
for (int k = 0; k < batch; k += 4) {
{
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
{
float x2 = buf[k * 3 + 3] - x1;
float y2 = buf[k * 3 + 4] - y1;
float z2 = buf[k * 3 + 5] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 1;
}
}
{
float x2 = buf[k * 3 + 6] - x1;
float y2 = buf[k * 3 + 7] - y1;
float z2 = buf[k * 3 + 8] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 2;
}
}
{
float x2 = buf[k * 3 + 9] - x1;
float y2 = buf[k * 3 + 10] - y1;
float z2 = buf[k * 3 + 11] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 3;
}
}
}
} else {
for (int k = 0; k < end_ka; k += 4) {
{
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
{
float x2 = buf[k * 3 + 3] - x1;
float y2 = buf[k * 3 + 4] - y1;
float z2 = buf[k * 3 + 5] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 1;
}
}
{
float x2 = buf[k * 3 + 6] - x1;
float y2 = buf[k * 3 + 7] - y1;
float z2 = buf[k * 3 + 8] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 2;
}
}
{
float x2 = buf[k * 3 + 9] - x1;
float y2 = buf[k * 3 + 10] - y1;
float z2 = buf[k * 3 + 11] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 3;
}
}
}
}
for (int k = end_ka; k < end_k; k++) {
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
if (k2 == 0 || dist[(i * n + j)] > best_dist) {
dist[(i * n + j)] = best_dist;
indexes[(i * n + j)] = best_dist_index;
}
}
__syncthreads();
}
}
}
std::vector<torch::Tensor> chamfer_cuda_forward(torch::Tensor xyz1,
torch::Tensor xyz2) {
const int batch_size = xyz1.size(0);
const int n = xyz1.size(1); // num_points point cloud A
const int m = xyz2.size(1); // num_points point cloud B
torch::Tensor dist1 =
torch::zeros({batch_size, n}, torch::CUDA(torch::kFloat));
torch::Tensor dist2 =
torch::zeros({batch_size, m}, torch::CUDA(torch::kFloat));
torch::Tensor idx1 = torch::zeros({batch_size, n}, torch::CUDA(torch::kInt));
torch::Tensor idx2 = torch::zeros({batch_size, m}, torch::CUDA(torch::kInt));
chamfer_dist_kernel<<<dim3(32, 16, 1), 512>>>(
batch_size, n, xyz1.data_ptr<float>(), m, xyz2.data_ptr<float>(),
dist1.data_ptr<float>(), idx1.data_ptr<int>());
chamfer_dist_kernel<<<dim3(32, 16, 1), 512>>>(
batch_size, m, xyz2.data_ptr<float>(), n, xyz1.data_ptr<float>(),
dist2.data_ptr<float>(), idx2.data_ptr<int>());
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
printf("Error in chamfer_cuda_forward: %s\n", cudaGetErrorString(err));
}
return {dist1, dist2, idx1, idx2};
}
__global__ void chamfer_dist_grad_kernel(int b,
int n,
const float* xyz1,
int m,
const float* xyz2,
const float* grad_dist1,
const int* idx1,
float* grad_xyz1,
float* grad_xyz2) {
for (int i = blockIdx.x; i < b; i += gridDim.x) {
for (int j = threadIdx.x + blockIdx.y * blockDim.x; j < n;
j += blockDim.x * gridDim.y) {
float x1 = xyz1[(i * n + j) * 3 + 0];
float y1 = xyz1[(i * n + j) * 3 + 1];
float z1 = xyz1[(i * n + j) * 3 + 2];
int j2 = idx1[i * n + j];
float x2 = xyz2[(i * m + j2) * 3 + 0];
float y2 = xyz2[(i * m + j2) * 3 + 1];
float z2 = xyz2[(i * m + j2) * 3 + 2];
float g = grad_dist1[i * n + j] * 2;
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 0]), g * (x1 - x2));
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 1]), g * (y1 - y2));
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 2]), g * (z1 - z2));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 0]), -(g * (x1 - x2)));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 1]), -(g * (y1 - y2)));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 2]), -(g * (z1 - z2)));
}
}
}
std::vector<torch::Tensor> chamfer_cuda_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2) {
const int batch_size = xyz1.size(0);
const int n = xyz1.size(1); // num_points point cloud A
const int m = xyz2.size(1); // num_points point cloud B
torch::Tensor grad_xyz1 = torch::zeros_like(xyz1, torch::CUDA(torch::kFloat));
torch::Tensor grad_xyz2 = torch::zeros_like(xyz2, torch::CUDA(torch::kFloat));
chamfer_dist_grad_kernel<<<dim3(1, 16, 1), 256>>>(
batch_size, n, xyz1.data_ptr<float>(), m, xyz2.data_ptr<float>(),
grad_dist1.data_ptr<float>(), idx1.data_ptr<int>(),
grad_xyz1.data_ptr<float>(), grad_xyz2.data_ptr<float>());
chamfer_dist_grad_kernel<<<dim3(1, 16, 1), 256>>>(
batch_size, m, xyz2.data_ptr<float>(), n, xyz1.data_ptr<float>(),
grad_dist2.data_ptr<float>(), idx2.data_ptr<int>(),
grad_xyz2.data_ptr<float>(), grad_xyz1.data_ptr<float>());
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
printf("Error in chamfer_cuda_backward: %s\n", cudaGetErrorString(err));
}
return {grad_xyz1, grad_xyz2};
}
================================================
FILE: extensions/chamfer_dist/chamfer_cuda.cpp
================================================
/*
* @Author: Haozhe Xie
* @Date: 2019-08-07 20:54:24
* @Last Modified by: Haozhe Xie
* @Last Modified time: 2019-12-10 10:33:50
* @Email: cshzxie@gmail.com
*/
#include <torch/extension.h>
#include <vector>
std::vector<torch::Tensor> chamfer_cuda_forward(torch::Tensor xyz1,
torch::Tensor xyz2);
std::vector<torch::Tensor> chamfer_cuda_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2);
std::vector<torch::Tensor> chamfer_forward(torch::Tensor xyz1,
torch::Tensor xyz2) {
return chamfer_cuda_forward(xyz1, xyz2);
}
std::vector<torch::Tensor> chamfer_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2) {
return chamfer_cuda_backward(xyz1, xyz2, idx1, idx2, grad_dist1, grad_dist2);
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("forward", &chamfer_forward, "Chamfer forward (CUDA)");
m.def("backward", &chamfer_backward, "Chamfer backward (CUDA)");
}
================================================
FILE: extensions/chamfer_dist/setup.py
================================================
# -*- coding: utf-8 -*-
# @Author: Haozhe Xie
# @Date: 2019-08-07 20:54:24
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-10 10:04:25
# @Email: cshzxie@gmail.com
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(name='chamfer',
version='2.0.0',
ext_modules=[
CUDAExtension('chamfer', [
'chamfer_cuda.cpp',
'chamfer.cu',
]),
],
cmdclass={'build_ext': BuildExtension})
================================================
FILE: extensions/chamfer_dist/test.py
================================================
# -*- coding: utf-8 -*-
# @Author: Haozhe Xie
# @Date: 2019-12-10 10:38:01
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-26 14:21:36
# @Email: cshzxie@gmail.com
#
# Note:
# - To run gradcheck in double precision, replace float -> double and
# kFloat -> kDouble in chamfer.cu before rebuilding.
import os
import sys
import torch
import unittest
from torch.autograd import gradcheck
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir, os.path.pardir)))
from extensions.chamfer_dist import ChamferFunction
class ChamferDistanceTestCase(unittest.TestCase):
def test_chamfer_dist(self):
x = torch.rand(4, 64, 3).double()
y = torch.rand(4, 128, 3).double()
x.requires_grad = True
y.requires_grad = True
print(gradcheck(ChamferFunction.apply, [x.cuda(), y.cuda()]))
if __name__ == '__main__':
unittest.main()
================================================
FILE: extensions/emd/README.md
================================================
# PyTorch Wrapper for Point-cloud Earth-Mover-Distance (EMD)
## Dependency
The code has been tested on Ubuntu 16.04 with PyTorch 1.1.0 and CUDA 9.0.
## Usage
First, compile the extension:
python setup.py install
Then copy the built library to the main directory:
cp build/lib.linux-x86_64-3.6/emd_cuda.cpython-36m-x86_64-linux-gnu.so .
You can then use it as follows:
from emd import earth_mover_distance
d = earth_mover_distance(p1, p2, transpose=False) # p1: B x N1 x 3, p2: B x N2 x 3
Check `test_emd_loss.py` for example.
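For tiny clouds you can cross-check the approximation against the exact EMD by brute force (a sketch; `emd_exact` is a name introduced here, assuming equal-size clouds and the squared-distance cost used by `matchcost`):

```python
import itertools
import numpy as np

def emd_exact(p1, p2):
    """Exact EMD for small equal-size clouds: minimum over all one-to-one
    matchings of the summed squared point-to-point distances."""
    n = len(p1)
    d = ((p1[:, None, :] - p2[None, :, :]) ** 2).sum(-1)  # (n, n) cost matrix
    return min(sum(d[i, perm[i]] for i in range(n))
               for perm in itertools.permutations(range(n)))
```

The factorial search makes this usable only for a handful of points, which is exactly the regime of `test_emd_loss.py`.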
## Author
The CUDA code was originally written by Haoqiang Fan. The PyTorch wrapper was written by Kaichun Mo, with help from Jiayuan Gu.
## License
MIT
================================================
FILE: extensions/emd/__init__.py
================================================
from .emd import earth_mover_distance as emd
__all__ = ['emd']
================================================
FILE: extensions/emd/cuda/emd.cpp
================================================
#ifndef _EMD
#define _EMD
#include <vector>
#include <torch/extension.h>
//CUDA declarations
at::Tensor ApproxMatchForward(
const at::Tensor xyz1,
const at::Tensor xyz2);
at::Tensor MatchCostForward(
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match);
std::vector<at::Tensor> MatchCostBackward(
const at::Tensor grad_cost,
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match);
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("approxmatch_forward", &ApproxMatchForward,"ApproxMatch forward (CUDA)");
m.def("matchcost_forward", &MatchCostForward,"MatchCost forward (CUDA)");
m.def("matchcost_backward", &MatchCostBackward,"MatchCost backward (CUDA)");
}
#endif
================================================
FILE: extensions/emd/cuda/emd_kernel.cu
================================================
/**********************************
* Original Author: Haoqiang Fan
* Modified by: Kaichun Mo
*********************************/
#ifndef _EMD_KERNEL
#define _EMD_KERNEL
#include <cmath>
#include <vector>
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAApplyUtils.cuh> // at::cuda::getApplyGrid
// #include <THC/THC.h>
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
#define CHECK_CONTIGUOUS(x) TORCH_CHECK(x.is_contiguous(), #x " must be contiguous")
#define CHECK_INPUT(x) CHECK_CUDA(x); CHECK_CONTIGUOUS(x)
/********************************
* Forward kernel for approxmatch
*********************************/
template<typename scalar_t>
__global__ void approxmatch(int b,int n,int m,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,scalar_t * __restrict__ match,scalar_t * temp){
scalar_t * remainL=temp+blockIdx.x*(n+m)*2, * remainR=temp+blockIdx.x*(n+m)*2+n,*ratioL=temp+blockIdx.x*(n+m)*2+n+m,*ratioR=temp+blockIdx.x*(n+m)*2+n+m+n;
scalar_t multiL,multiR;
if (n>=m){
multiL=1;
multiR=n/m;
}else{
multiL=m/n;
multiR=1;
}
const int Block=1024;
__shared__ scalar_t buf[Block*4];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
for (int j=threadIdx.x;j<n*m;j+=blockDim.x)
match[i*n*m+j]=0;
for (int j=threadIdx.x;j<n;j+=blockDim.x)
remainL[j]=multiL;
for (int j=threadIdx.x;j<m;j+=blockDim.x)
remainR[j]=multiR;
__syncthreads();
for (int j=7;j>=-2;j--){
scalar_t level=-powf(4.0f,j);
if (j==-2){
level=0;
}
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
scalar_t suml=1e-9f;
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend;l+=blockDim.x){
scalar_t x2=xyz2[i*m*3+l0*3+l*3+0];
scalar_t y2=xyz2[i*m*3+l0*3+l*3+1];
scalar_t z2=xyz2[i*m*3+l0*3+l*3+2];
buf[l*4+0]=x2;
buf[l*4+1]=y2;
buf[l*4+2]=z2;
buf[l*4+3]=remainR[l0+l];
}
__syncthreads();
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*4+0];
scalar_t y2=buf[l*4+1];
scalar_t z2=buf[l*4+2];
scalar_t d=level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1));
scalar_t w=__expf(d)*buf[l*4+3];
suml+=w;
}
__syncthreads();
}
if (k<n)
ratioL[k]=remainL[k]/suml;
}
__syncthreads();
for (int l0=0;l0<m;l0+=blockDim.x){
int l=l0+threadIdx.x;
scalar_t x2=0,y2=0,z2=0;
if (l<m){
x2=xyz2[i*m*3+l*3+0];
y2=xyz2[i*m*3+l*3+1];
z2=xyz2[i*m*3+l*3+2];
}
scalar_t sumr=0;
for (int k0=0;k0<n;k0+=Block){
int kend=min(n,k0+Block)-k0;
for (int k=threadIdx.x;k<kend;k+=blockDim.x){
buf[k*4+0]=xyz1[i*n*3+k0*3+k*3+0];
buf[k*4+1]=xyz1[i*n*3+k0*3+k*3+1];
buf[k*4+2]=xyz1[i*n*3+k0*3+k*3+2];
buf[k*4+3]=ratioL[k0+k];
}
__syncthreads();
for (int k=0;k<kend;k++){
scalar_t x1=buf[k*4+0];
scalar_t y1=buf[k*4+1];
scalar_t z1=buf[k*4+2];
scalar_t w=__expf(level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)))*buf[k*4+3];
sumr+=w;
}
__syncthreads();
}
if (l<m){
sumr*=remainR[l];
scalar_t consumption=fminf(remainR[l]/(sumr+1e-9f),1.0f);
ratioR[l]=consumption*remainR[l];
remainR[l]=fmaxf(0.0f,remainR[l]-sumr);
}
}
__syncthreads();
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
scalar_t suml=0;
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend;l+=blockDim.x){
buf[l*4+0]=xyz2[i*m*3+l0*3+l*3+0];
buf[l*4+1]=xyz2[i*m*3+l0*3+l*3+1];
buf[l*4+2]=xyz2[i*m*3+l0*3+l*3+2];
buf[l*4+3]=ratioR[l0+l];
}
__syncthreads();
scalar_t rl=ratioL[k];
if (k<n){
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*4+0];
scalar_t y2=buf[l*4+1];
scalar_t z2=buf[l*4+2];
scalar_t w=__expf(level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)))*rl*buf[l*4+3];
match[i*n*m+(l0+l)*n+k]+=w;
suml+=w;
}
}
__syncthreads();
}
if (k<n)
remainL[k]=fmaxf(0.0f,remainL[k]-suml);
}
__syncthreads();
}
}
}
//void approxmatchLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,scalar_t * match,scalar_t * temp){
// approxmatch<<<32,512>>>(b,n,m,xyz1,xyz2,match,temp);
//}
/* ApproxMatch forward interface
Input:
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
Output:
match: (B, N2, N1)
*/
at::Tensor ApproxMatchForward(
const at::Tensor xyz1,
const at::Tensor xyz2){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto match = at::zeros({b, m, n}, xyz1.type());
auto temp = at::zeros({b, (n+m)*2}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "ApproxMatchForward", ([&] {
approxmatch<scalar_t><<<32,512>>>(b, n, m, xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), temp.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return match;
}
/********************************
* Forward kernel for matchcost
*********************************/
template<typename scalar_t>
__global__ void matchcost(int b,int n,int m,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ out){
__shared__ scalar_t allsum[512];
const int Block=1024;
__shared__ scalar_t buf[Block*3];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
scalar_t subsum=0;
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend*3;l+=blockDim.x)
buf[l]=xyz2[i*m*3+l0*3+l];
__syncthreads();
if (k<n){
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*3+0];
scalar_t y2=buf[l*3+1];
scalar_t z2=buf[l*3+2];
scalar_t d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);
subsum+=d*match[i*n*m+(l0+l)*n+k];
}
}
__syncthreads();
}
}
allsum[threadIdx.x]=subsum;
for (int j=1;j<blockDim.x;j<<=1){
__syncthreads();
if ((threadIdx.x&j)==0 && threadIdx.x+j<blockDim.x){
allsum[threadIdx.x]+=allsum[threadIdx.x+j];
}
}
if (threadIdx.x==0)
out[i]=allsum[0];
__syncthreads();
}
}
//void matchcostLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,const scalar_t * match,scalar_t * out){
// matchcost<<<32,512>>>(b,n,m,xyz1,xyz2,match,out);
//}
/* MatchCost forward interface
Input:
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
match: (B, N2, N1)
Output:
cost: (B)
*/
at::Tensor MatchCostForward(
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto cost = at::zeros({b}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "MatchCostForward", ([&] {
matchcost<scalar_t><<<32,512>>>(b, n, m, xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), cost.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return cost;
}
/********************************
* matchcostgrad2 kernel
*********************************/
template<typename scalar_t>
__global__ void matchcostgrad2(int b,int n,int m,const scalar_t * __restrict__ grad_cost,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ grad2){
__shared__ scalar_t sum_grad[256*3];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
int kbeg=m*blockIdx.y/gridDim.y;
int kend=m*(blockIdx.y+1)/gridDim.y;
for (int k=kbeg;k<kend;k++){
scalar_t x2=xyz2[(i*m+k)*3+0];
scalar_t y2=xyz2[(i*m+k)*3+1];
scalar_t z2=xyz2[(i*m+k)*3+2];
scalar_t subsumx=0,subsumy=0,subsumz=0;
for (int j=threadIdx.x;j<n;j+=blockDim.x){
scalar_t x1=x2-xyz1[(i*n+j)*3+0];
scalar_t y1=y2-xyz1[(i*n+j)*3+1];
scalar_t z1=z2-xyz1[(i*n+j)*3+2];
scalar_t d=match[i*n*m+k*n+j]*2;
subsumx+=x1*d;
subsumy+=y1*d;
subsumz+=z1*d;
}
sum_grad[threadIdx.x*3+0]=subsumx;
sum_grad[threadIdx.x*3+1]=subsumy;
sum_grad[threadIdx.x*3+2]=subsumz;
for (int j=1;j<blockDim.x;j<<=1){
__syncthreads();
int j1=threadIdx.x;
int j2=threadIdx.x+j;
if ((j1&j)==0 && j2<blockDim.x){
sum_grad[j1*3+0]+=sum_grad[j2*3+0];
sum_grad[j1*3+1]+=sum_grad[j2*3+1];
sum_grad[j1*3+2]+=sum_grad[j2*3+2];
}
}
if (threadIdx.x==0){
grad2[(i*m+k)*3+0]=sum_grad[0]*grad_cost[i];
grad2[(i*m+k)*3+1]=sum_grad[1]*grad_cost[i];
grad2[(i*m+k)*3+2]=sum_grad[2]*grad_cost[i];
}
__syncthreads();
}
}
}
/********************************
* matchcostgrad1 kernel
*********************************/
template<typename scalar_t>
__global__ void matchcostgrad1(int b,int n,int m,const scalar_t * __restrict__ grad_cost,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ grad1){
for (int i=blockIdx.x;i<b;i+=gridDim.x){
for (int l=threadIdx.x;l<n;l+=blockDim.x){
scalar_t x1=xyz1[i*n*3+l*3+0];
scalar_t y1=xyz1[i*n*3+l*3+1];
scalar_t z1=xyz1[i*n*3+l*3+2];
scalar_t dx=0,dy=0,dz=0;
for (int k=0;k<m;k++){
scalar_t x2=xyz2[i*m*3+k*3+0];
scalar_t y2=xyz2[i*m*3+k*3+1];
scalar_t z2=xyz2[i*m*3+k*3+2];
scalar_t d=match[i*n*m+k*n+l]*2;
dx+=(x1-x2)*d;
dy+=(y1-y2)*d;
dz+=(z1-z2)*d;
}
grad1[i*n*3+l*3+0]=dx*grad_cost[i];
grad1[i*n*3+l*3+1]=dy*grad_cost[i];
grad1[i*n*3+l*3+2]=dz*grad_cost[i];
}
}
}
//void matchcostgradLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,const scalar_t * match,scalar_t * grad1,scalar_t * grad2){
// matchcostgrad1<<<32,512>>>(b,n,m,xyz1,xyz2,match,grad1);
// matchcostgrad2<<<dim3(32,32),256>>>(b,n,m,xyz1,xyz2,match,grad2);
//}
/* MatchCost backward interface
Input:
grad_cost: (B) # gradients on cost
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
match: (B, N2, N1)
Output:
grad1: (B, N1, 3)
grad2: (B, N2, 3)
*/
std::vector<at::Tensor> MatchCostBackward(
const at::Tensor grad_cost,
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto grad1 = at::zeros({b, n, 3}, xyz1.type());
auto grad2 = at::zeros({b, m, 3}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "MatchCostBackward", ([&] {
matchcostgrad1<scalar_t><<<32,512>>>(b, n, m, grad_cost.data<scalar_t>(), xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), grad1.data<scalar_t>());
matchcostgrad2<scalar_t><<<dim3(32,32),256>>>(b, n, m, grad_cost.data<scalar_t>(), xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), grad2.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return std::vector<at::Tensor>({grad1, grad2});
}
#endif
================================================
FILE: extensions/emd/emd.py
================================================
import torch
import emd_cuda
class EarthMoverDistanceFunction(torch.autograd.Function):
@staticmethod
def forward(ctx, xyz1, xyz2):
xyz1 = xyz1.contiguous()
xyz2 = xyz2.contiguous()
assert xyz1.is_cuda and xyz2.is_cuda, "Only support cuda currently."
match = emd_cuda.approxmatch_forward(xyz1, xyz2)
cost = emd_cuda.matchcost_forward(xyz1, xyz2, match)
ctx.save_for_backward(xyz1, xyz2, match)
return cost
@staticmethod
def backward(ctx, grad_cost):
xyz1, xyz2, match = ctx.saved_tensors
grad_cost = grad_cost.contiguous()
grad_xyz1, grad_xyz2 = emd_cuda.matchcost_backward(grad_cost, xyz1, xyz2, match)
return grad_xyz1, grad_xyz2
class earth_mover_distance(torch.nn.Module):
''' Earth Mover Distance (approximate)
'''
def __init__(self):
super().__init__()
def forward(self, xyz1, xyz2, transpose=False):
"""Earth Mover Distance (Approx)
Args:
xyz1 (torch.Tensor): (b, n1, 3)
xyz2 (torch.Tensor): (b, n2, 3)
transpose (bool): whether to transpose inputs as it might be BCN format.
Extensions only support BNC format.
Returns:
cost (torch.Tensor): (b)
"""
# honor the documented `transpose` flag: convert BCN input to BNC
if transpose:
xyz1 = xyz1.transpose(1, 2)
xyz2 = xyz2.transpose(1, 2)
cost = EarthMoverDistanceFunction.apply(xyz1, xyz2)
cost = cost / xyz1.size(1)
return cost.mean()
# def earth_mover_distance(xyz1, xyz2, transpose=True):
# """Earth Mover Distance (Approx)
# Args:
# xyz1 (torch.Tensor): (b, 3, n1)
# xyz2 (torch.Tensor): (b, 3, n1)
# transpose (bool): whether to transpose inputs as it might be BCN format.
# Extensions only support BNC format.
# Returns:
# cost (torch.Tensor): (b)
# """
# if xyz1.dim() == 2:
# xyz1 = xyz1.unsqueeze(0)
# if xyz2.dim() == 2:
# xyz2 = xyz2.unsqueeze(0)
# if transpose:
# xyz1 = xyz1.transpose(1, 2)
# xyz2 = xyz2.transpose(1, 2)
# cost = EarthMoverDistanceFunction.apply(xyz1, xyz2)
# return cost
================================================
FILE: extensions/emd/setup.py
================================================
"""Setup extension
Notes:
If extra_compile_args is provided, you need to provide different instances for different extensions.
Refer to https://github.com/pytorch/pytorch/issues/20169
"""
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(
name='emd_ext',
ext_modules=[
CUDAExtension(
name='emd_cuda',
sources=[
'cuda/emd.cpp',
'cuda/emd_kernel.cu',
],
extra_compile_args={'cxx': ['-g'], 'nvcc': ['-O2']}
),
],
cmdclass={
'build_ext': BuildExtension
})
================================================
FILE: extensions/emd/test_emd_loss.py
================================================
import torch
import numpy as np
import time
from emd import earth_mover_distance
# gt
p1 = torch.from_numpy(np.array([[[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]], dtype=np.float32)).cuda()
p1 = p1.repeat(3, 1, 1)
p2 = torch.from_numpy(np.array([[[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]], dtype=np.float32)).cuda()
p2 = p2.repeat(3, 1, 1)
print(p1)
print(p2)
print(p1.shape)
p1.requires_grad = True
p2.requires_grad = True
gt_dist = (((p1[0, 0] - p2[0, 1])**2).sum() + ((p1[0, 1] - p2[0, 0])**2).sum()) / 2 + \
(((p1[1, 0] - p2[1, 1])**2).sum() + ((p1[1, 1] - p2[1, 0])**2).sum()) * 2 + \
(((p1[2, 0] - p2[2, 1])**2).sum() + ((p1[2, 1] - p2[2, 0])**2).sum()) / 3
print('gt_dist: ', gt_dist)
gt_dist.backward()
print(p1.grad)
print(p2.grad)
# emd
p1 = torch.from_numpy(np.array([[[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]], dtype=np.float32)).cuda()
p1 = p1.repeat(3, 1, 1)
p2 = torch.from_numpy(np.array([[[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]], dtype=np.float32)).cuda()
p2 = p2.repeat(3, 1, 1)
print(p1)
print(p2)
p1.requires_grad = True
p2.requires_grad = True
from emd import EarthMoverDistanceFunction # functional interface returning per-sample costs
d = EarthMoverDistanceFunction.apply(p1, p2) # shape (3,); the Module wrapper would return only the batch mean
print(d)
loss = d[0] / 2 + d[1] * 2 + d[2] / 3
print(loss)
loss.backward()
print(p1.grad)
print(p2.grad)
================================================
FILE: figures/a
================================================
================================================
FILE: main.py
================================================
from tools import pretrain_run_net as pretrain
from tools import finetune_run_net as finetune
from tools import test_run_net as test_net
from utils import parser, dist_utils, misc
from utils.logger import *
from utils.config import *
import time
import os
import torch
from tensorboardX import SummaryWriter
from torchstat import stat
def main():
# args
args = parser.get_args()
# CUDA
args.use_gpu = torch.cuda.is_available()
if args.use_gpu:
torch.backends.cudnn.benchmark = True
# init distributed env first, since logger depends on the dist info.
if args.launcher == 'none':
args.distributed = False
else:
args.distributed = True
dist_utils.init_dist(args.launcher)
# re-set gpu_ids with distributed training mode
_, world_size = dist_utils.get_dist_info()
args.world_size = world_size
# logger
timestamp = time.strftime('%Y%m%d_%H%M%S', time.localtime())
log_file = os.path.join(args.experiment_path, f'{timestamp}.log')
logger = get_root_logger(log_file=log_file, name=args.log_name)
# define the tensorboard writer
if not args.test:
if args.local_rank == 0:
train_writer = SummaryWriter(
os.path.join(args.tfboard_path, 'train'))
val_writer = SummaryWriter(os.path.join(args.tfboard_path, 'test'))
else:
train_writer = None
val_writer = None
# config
config = get_config(args, logger=logger)
# batch size
if args.distributed:
assert config.total_bs % world_size == 0
config.dataset.train.others.bs = config.total_bs // world_size
if config.dataset.get('extra_train'):
config.dataset.extra_train.others.bs = config.total_bs // world_size * 2
config.dataset.val.others.bs = config.total_bs // world_size * 2
if config.dataset.get('test'):
config.dataset.test.others.bs = config.total_bs // world_size
else:
config.dataset.train.others.bs = config.total_bs
if config.dataset.get('extra_train'):
config.dataset.extra_train.others.bs = config.total_bs * 2
config.dataset.val.others.bs = config.total_bs * 2
if config.dataset.get('test'):
config.dataset.test.others.bs = config.total_bs
# log
log_args_to_file(args, 'args', logger=logger)
log_config_to_file(config, 'config', logger=logger)
# exit()
logger.info(f'Distributed training: {args.distributed}')
# set random seeds
if args.seed is not None:
logger.info(f'Set random seed to {args.seed}, '
f'deterministic: {args.deterministic}')
# seed + rank, for augmentation
misc.set_random_seed(args.seed + args.local_rank,
deterministic=args.deterministic)
if args.distributed:
assert args.local_rank == torch.distributed.get_rank()
if args.shot != -1:
config.dataset.train.others.shot = args.shot
config.dataset.train.others.way = args.way
config.dataset.train.others.fold = args.fold
config.dataset.val.others.shot = args.shot
config.dataset.val.others.way = args.way
config.dataset.val.others.fold = args.fold
# run
if args.test:
test_net(args, config)
else:
if args.finetune_model or args.scratch_model:
finetune(args, config, train_writer, val_writer)
else:
pretrain(args, config, train_writer, val_writer)
if __name__ == '__main__':
main()
================================================
FILE: main_vis.py
================================================
# from tools import run_net
from tools import test_net
from utils import parser, dist_utils, misc
from utils.logger import *
from utils.config import *
import time
import os
import torch
from tensorboardX import SummaryWriter
def main():
# args
args = parser.get_args()
# CUDA
args.use_gpu = torch.cuda.is_available()
if args.use_gpu:
torch.backends.cudnn.benchmark = True
# init distributed env first, since logger depends on the dist info.
if args.launcher == 'none':
args.distributed = False
else:
args.distributed = True
dist_utils.init_dist(args.launcher)
# re-set gpu_ids with distributed training mode
_, world_size = dist_utils.get_dist_info()
args.world_size = world_size
# logger
timestamp = time.strftime('%Y%m%d_%H%M%S', time.localtime())
log_file = os.path.join(args.experiment_path, f'{timestamp}.log')
logger = get_root_logger(log_file=log_file, name=args.log_name)
# define the tensorboard writer
if not args.test:
if args.local_rank == 0:
train_writer = SummaryWriter(os.path.join(args.tfboard_path, 'train'))
val_writer = SummaryWriter(os.path.join(args.tfboard_path, 'test'))
else:
train_writer = None
val_writer = None
# config
config = get_config(args, logger = logger)
# batch size
if args.distributed:
assert config.total_bs % world_size == 0
config.dataset.train.others.bs = config.total_bs // world_size
config.dataset.val.others.bs = 1
config.dataset.test.others.bs = 1
else:
config.dataset.train.others.bs = config.total_bs
config.dataset.val.others.bs = 1
config.dataset.test.others.bs = 1
# log
log_args_to_file(args, 'args', logger = logger)
log_config_to_file(config, 'config', logger = logger)
# exit()
logger.info(f'Distributed training: {args.distributed}')
# set random seeds
if args.seed is not None:
logger.info(f'Set random seed to {args.seed}, '
f'deterministic: {args.deterministic}')
misc.set_random_seed(args.seed + args.local_rank, deterministic=args.deterministic) # seed + rank, for augmentation
if args.distributed:
assert args.local_rank == torch.distributed.get_rank()
# run
if args.test:
test_net(args, config)
else:
# run_net(args, config, train_writer, val_writer)
raise NotImplementedError
if __name__ == '__main__':
main()
================================================
FILE: models/GPT.py
================================================
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
class Block(nn.Module):
def __init__(self, embed_dim, num_heads):
super(Block, self).__init__()
self.ln_1 = nn.LayerNorm(embed_dim)
self.ln_2 = nn.LayerNorm(embed_dim)
self.attn = nn.MultiheadAttention(embed_dim, num_heads)
self.mlp = nn.Sequential(
nn.Linear(embed_dim, embed_dim * 4),
nn.GELU(),
nn.Linear(embed_dim * 4, embed_dim),
)
def forward(self, x, attn_mask):
x = self.ln_1(x)
a, _ = self.attn(x, x, x, attn_mask=attn_mask, need_weights=False)
x = x + a
m = self.mlp(self.ln_2(x))
x = x + m
return x
class GPT_extractor(nn.Module):
def __init__(
self, embed_dim, num_heads, num_layers, num_classes, trans_dim, group_size, pretrained=False
):
super(GPT_extractor, self).__init__()
self.embed_dim = embed_dim
self.trans_dim = trans_dim
self.group_size = group_size
# start of sequence token
self.sos = torch.nn.Parameter(torch.zeros(embed_dim))
nn.init.normal_(self.sos)
self.layers = nn.ModuleList()
for _ in range(num_layers):
self.layers.append(Block(embed_dim, num_heads))
self.ln_f = nn.LayerNorm(embed_dim)
# prediction head
self.increase_dim = nn.Sequential(
nn.Conv1d(self.trans_dim, 3*(self.group_size), 1)
)
if not pretrained:
self.cls_head_finetune = nn.Sequential(
nn.Linear(self.trans_dim * 2, 256),
nn.BatchNorm1d(256),
nn.ReLU(inplace=True),
nn.Dropout(0.5),
nn.Linear(256, 256),
nn.BatchNorm1d(256),
nn.ReLU(inplace=True),
nn.Dropout(0.5),
nn.Linear(256, num_classes)
)
self.cls_norm = nn.LayerNorm(self.trans_dim)
def forward(self, h, pos, attn_mask, classify=False):
"""
Expect input as shape [sequence len, batch]
If classify, return classification logits
"""
batch, length, C = h.shape
h = h.transpose(0, 1)
pos = pos.transpose(0, 1)
# prepend sos token
sos = torch.ones(1, batch, self.embed_dim, device=h.device) * self.sos
if not classify:
h = torch.cat([sos, h[:-1, :, :]], axis=0)
else:
h = torch.cat([sos, h], axis=0)
# transformer
for layer in self.layers:
h = layer(h + pos, attn_mask)
h = self.ln_f(h)
encoded_points = h.transpose(0, 1)
if not classify:
return encoded_points
h = h.transpose(0, 1)
h = self.cls_norm(h)
concat_f = torch.cat([h[:, 1], h[:, 2:].max(1)[0]], dim=-1)
ret = self.cls_head_finetune(concat_f)
return ret, encoded_points
class GPT_generator(nn.Module):
def __init__(
self, embed_dim, num_heads, num_layers, trans_dim, group_size
):
super(GPT_generator, self).__init__()
self.embed_dim = embed_dim
self.trans_dim = trans_dim
self.group_size = group_size
# start of sequence token
self.sos = torch.nn.Parameter(torch.zeros(embed_dim))
nn.init.normal_(self.sos)
self.layers = nn.ModuleList()
for _ in range(num_layers):
self.layers.append(Block(embed_dim, num_heads))
self.ln_f = nn.LayerNorm(embed_dim)
self.increase_dim = nn.Sequential(
nn.Conv1d(self.trans_dim, 3*(self.group_size), 1)
)
def forward(self, h, pos, attn_mask):
"""
Expect input as shape [sequence len, batch]
If classify, return classification logits
"""
batch, length, C = h.shape
h = h.transpose(0, 1)
pos = pos.transpose(0, 1)
# transformer
for layer in self.layers:
h = layer(h + pos, attn_mask)
h = self.ln_f(h)
rebuild_points = self.increase_dim(h.transpose(1, 2)).transpose(
1, 2).transpose(0, 1).reshape(batch * length, -1, 3)
return rebuild_points
================================================
FILE: models/PointGPT.py
================================================
import torch
import torch.nn as nn
import torch.nn.functional as F
import timm
from timm.models.layers import DropPath, trunc_normal_
import numpy as np
from .build import MODELS
from utils import misc
from utils.checkpoint import get_missing_parameters_message, get_unexpected_parameters_message
from utils.logger import *
import random
from knn_cuda import KNN
from extensions.chamfer_dist import ChamferDistanceL1, ChamferDistanceL2
from models.GPT import GPT_extractor, GPT_generator
import math
from models.z_order import *
class Encoder_large(nn.Module): # Embedding module
def __init__(self, encoder_channel):
super().__init__()
self.encoder_channel = encoder_channel
self.first_conv = nn.Sequential(
nn.Conv1d(3, 256, 1),
nn.BatchNorm1d(256),
nn.ReLU(inplace=True),
nn.Conv1d(256, 512, 1),
nn.BatchNorm1d(512),
nn.ReLU(inplace=True),
nn.Conv1d(512, 1024, 1)
)
self.second_conv = nn.Sequential(
nn.Conv1d(2048, 2048, 1),
nn.BatchNorm1d(2048),
nn.ReLU(inplace=True),
nn.Conv1d(2048, self.encoder_channel, 1)
)
def forward(self, point_groups):
'''
point_groups : B G N 3
-----------------
feature_global : B G C
'''
bs, g, n, _ = point_groups.shape
point_groups = point_groups.reshape(bs * g, n, 3)
# encoder
feature = self.first_conv(point_groups.transpose(2, 1)) # BG 1024 n
feature_global = torch.max(feature, dim=2, keepdim=True)[0] # BG 1024 1
feature = torch.cat(
[feature_global.expand(-1, -1, n), feature], dim=1) # BG 2048 n
feature = self.second_conv(feature) # BG C n
feature_global = torch.max(feature, dim=2, keepdim=False)[0] # BG C
return feature_global.reshape(bs, g, self.encoder_channel)
class Encoder_small(nn.Module): # Embedding module
def __init__(self, encoder_channel):
super().__init__()
self.encoder_channel = encoder_channel
self.first_conv = nn.Sequential(
nn.Conv1d(3, 128, 1),
nn.BatchNorm1d(128),
nn.ReLU(inplace=True),
nn.Conv1d(128, 256, 1)
)
self.second_conv = nn.Sequential(
nn.Conv1d(512, 512, 1),
nn.BatchNorm1d(512),
nn.ReLU(inplace=True),
nn.Conv1d(512, self.encoder_channel, 1)
)
def forward(self, point_groups):
'''
point_groups : B G N 3
-----------------
feature_global : B G C
'''
bs, g, n, _ = point_groups.shape
point_groups = point_groups.reshape(bs * g, n, 3)
# encoder
feature = self.first_conv(point_groups.transpose(2, 1))
feature_global = torch.max(feature, dim=2, keepdim=True)[0]
feature = torch.cat(
[feature_global.expand(-1, -1, n), feature], dim=1)
feature = self.second_conv(feature)
feature_global = torch.max(feature, dim=2, keepdim=False)[0]
return feature_global.reshape(bs, g, self.encoder_channel)
class Group(nn.Module):
def __init__(self, num_group, group_size):
super().__init__()
self.num_group = num_group
self.group_size = group_size
self.knn = KNN(k=self.group_size, transpose_mode=True)
self.knn_2 = KNN(k=1, transpose_mode=True)
def simplied_morton_sorting(self, xyz, center):
'''
Simplified Morton-order sorting: greedily take the not-yet-visited patch nearest to the previously selected patch as the next patch. We found this to be more efficient than full Morton-code sorting.
'''
batch_size, num_points, _ = xyz.shape
distances_batch = torch.cdist(center, center)
distances_batch[:, torch.eye(self.num_group).bool()] = float("inf")
idx_base = torch.arange(
0, batch_size, device=xyz.device) * self.num_group
sorted_indices_list = []
sorted_indices_list.append(idx_base)
distances_batch = distances_batch.view(batch_size, self.num_group, self.num_group).transpose(
1, 2).contiguous().view(batch_size * self.num_group, self.num_group)
distances_batch[idx_base] = float("inf")
distances_batch = distances_batch.view(
batch_size, self.num_group, self.num_group).transpose(1, 2).contiguous()
for i in range(self.num_group - 1):
distances_batch = distances_batch.view(
batch_size * self.num_group, self.num_group)
distances_to_last_batch = distances_batch[sorted_indices_list[-1]]
closest_point_idx = torch.argmin(distances_to_last_batch, dim=-1)
closest_point_idx = closest_point_idx + idx_base
sorted_indices_list.append(closest_point_idx)
distances_batch = distances_batch.view(batch_size, self.num_group, self.num_group).transpose(
1, 2).contiguous().view(batch_size * self.num_group, self.num_group)
distances_batch[closest_point_idx] = float("inf")
distances_batch = distances_batch.view(
batch_size, self.num_group, self.num_group).transpose(1, 2).contiguous()
sorted_indices = torch.stack(sorted_indices_list, dim=-1)
sorted_indices = sorted_indices.view(-1)
return sorted_indices
def morton_sorting(self, xyz, center):
batch_size, num_points, _ = xyz.shape
all_indices = []
for index in range(batch_size):
points = center[index]
z = get_z_values(points.cpu().numpy())
idxs = np.argsort(z)
all_indices.append(idxs)
all_indices = torch.tensor(all_indices, device=xyz.device)
idx_base = torch.arange(
0, batch_size, device=xyz.device).view(-1, 1) * self.num_group
sorted_indices = all_indices + idx_base
sorted_indices = sorted_indices.view(-1)
return sorted_indices
def forward(self, xyz):
'''
input: B N 3
---------------------------
output: B G M 3
center : B G 3
'''
batch_size, num_points, _ = xyz.shape
# fps the centers out
center = misc.fps(xyz, self.num_group) # B G 3
# knn to get the neighborhood
_, idx = self.knn(xyz, center) # B G M
assert idx.size(1) == self.num_group
assert idx.size(2) == self.group_size
idx_base = torch.arange(
0, batch_size, device=xyz.device).view(-1, 1, 1) * num_points
idx = idx + idx_base
idx = idx.view(-1)
neighborhood = xyz.view(batch_size * num_points, -1)[idx, :]
neighborhood = neighborhood.view(
batch_size, self.num_group, self.group_size, 3).contiguous()
# normalize
neighborhood = neighborhood - center.unsqueeze(2)
# can utilize morton_sorting by choosing morton_sorting function
sorted_indices = self.simplied_morton_sorting(xyz, center)
neighborhood = neighborhood.view(
batch_size * self.num_group, self.group_size, 3)[sorted_indices, :, :]
neighborhood = neighborhood.view(
batch_size, self.num_group, self.group_size, 3).contiguous()
center = center.view(
batch_size * self.num_group, 3)[sorted_indices, :]
center = center.view(
batch_size, self.num_group, 3).contiguous()
return neighborhood, center
# Transformers
class Mlp(nn.Module):
def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
super().__init__()
out_features = out_features or in_features
hidden_features = hidden_features or in_features
self.fc1 = nn.Linear(in_features, hidden_features)
self.act = act_layer()
self.fc2 = nn.Linear(hidden_features, out_features)
self.drop = nn.Dropout(drop)
def forward(self, x):
x = self.fc1(x)
x = self.act(x)
x = self.drop(x)
x = self.fc2(x)
x = self.drop(x)
return x
class Attention(nn.Module):
def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.):
super().__init__()
self.num_heads = num_heads
head_dim = dim // num_heads
self.scale = qk_scale or head_dim ** -0.5
self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
self.attn_drop = nn.Dropout(attn_drop)
self.proj = nn.Linear(dim, dim)
self.proj_drop = nn.Dropout(proj_drop)
def forward(self, x):
B, N, C = x.shape
qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C //
self.num_heads).permute(2, 0, 3, 1, 4)
# make torchscript happy (cannot use tensor as tuple)
q, k, v = qkv[0], qkv[1], qkv[2]
attn = (q @ k.transpose(-2, -1)) * self.scale
attn = attn.softmax(dim=-1)
attn = self.attn_drop(attn)
x = (attn @ v).transpose(1, 2).reshape(B, N, C)
x = self.proj(x)
x = self.proj_drop(x)
return x
class Block(nn.Module):
def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0.,
drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm):
super().__init__()
self.norm1 = norm_layer(dim)
self.drop_path = DropPath(
drop_path) if drop_path > 0. else nn.Identity()
self.norm2 = norm_layer(dim)
mlp_hidden_dim = int(dim * mlp_ratio)
self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim,
act_layer=act_layer, drop=drop)
self.attn = Attention(
dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)
def forward(self, x):
x = x + self.drop_path(self.attn(self.norm1(x)))
x = x + self.drop_path(self.mlp(self.norm2(x)))
return x
class PositionEmbeddingCoordsSine(nn.Module):
"""Similar to transformer's position encoding, but generalizes it to
arbitrary dimensions and continuous coordinates.
Args:
n_dim: Number of input dimensions, e.g. 2 for image coordinates.
d_model: Number of dimensions to encode into
temperature:
scale:
"""
def __init__(self, n_dim: int = 1, d_model: int = 256, temperature=10000, scale=None):
super().__init__()
self.n_dim = n_dim
self.num_pos_feats = d_model // n_dim // 2 * 2
self.temperature = temperature
self.padding = d_model - self.num_pos_feats * self.n_dim
if scale is None:
scale = 1.0
self.scale = scale * 2 * math.pi
def forward(self, xyz: torch.Tensor) -> torch.Tensor:
"""
Args:
xyz: Point positions (*, d_in)
Returns:
pos_emb (*, d_out)
"""
assert xyz.shape[-1] == self.n_dim
dim_t = torch.arange(self.num_pos_feats,
dtype=torch.float32, device=xyz.device)
dim_t = self.temperature ** (2 * torch.div(dim_t,
2, rounding_mode='trunc') / self.num_pos_feats)
xyz = xyz * self.scale
pos_divided = xyz.unsqueeze(-1) / dim_t
pos_sin = pos_divided[..., 0::2].sin()
pos_cos = pos_divided[..., 1::2].cos()
pos_emb = torch.stack([pos_sin, pos_cos], dim=-
1).reshape(*xyz.shape[:-1], -1)
# Pad unused dimensions with zeros
pos_emb = F.pad(pos_emb, (0, self.padding))
return pos_emb
class GPT_Transformer(nn.Module):
def __init__(self, config, **kwargs):
super().__init__()
self.config = config
# define the transformer argparse
self.mask_ratio = config.transformer_config.mask_ratio
self.trans_dim = config.transformer_config.trans_dim
self.depth = config.transformer_config.depth
self.decoder_depth = config.transformer_config.decoder_depth
self.drop_path_rate = config.transformer_config.drop_path_rate
self.num_heads = config.transformer_config.num_heads
self.group_size = config.group_size
print_log(f'[args] {config.transformer_config}', logger='Transformer')
self.encoder_dims = config.transformer_config.encoder_dims
assert self.encoder_dims in [384, 768, 1024]
if self.encoder_dims == 384:
self.encoder = Encoder_small(encoder_channel=self.encoder_dims)
else:
self.encoder = Encoder_large(encoder_channel=self.encoder_dims)
self.pos_embed = PositionEmbeddingCoordsSine(3, self.encoder_dims, 1.0)
self.blocks = GPT_extractor(
embed_dim=self.encoder_dims,
num_heads=self.num_heads,
num_layers=self.depth,
num_classes=config.cls_dim,
trans_dim=self.trans_dim,
group_size=self.group_size,
pretrained=True,
)
self.generator_blocks = GPT_generator(
embed_dim=self.encoder_dims,
num_heads=self.num_heads,
num_layers=self.decoder_depth,
trans_dim=self.trans_dim,
group_size=self.group_size
)
# do not perform additional mask on the first (self.keep_attend) tokens
self.keep_attend = 10
self.num_groups = config.num_group
self.num_mask = int(
(self.num_groups - self.keep_attend) * self.mask_ratio)
self.sos_pos = nn.Parameter(torch.zeros(1, 1, self.trans_dim))
self.norm = nn.LayerNorm(self.trans_dim)
self.apply(self._init_weights)
def _init_weights(self, m):
if isinstance(m, nn.Linear):
trunc_normal_(m.weight, std=.02)
if isinstance(m, nn.Linear) and m.bias is not None:
nn.init.constant_(m.bias, 0)
elif isinstance(m, nn.LayerNorm):
nn.init.constant_(m.bias, 0)
nn.init.constant_(m.weight, 1.0)
elif isinstance(m, nn.Conv1d):
trunc_normal_(m.weight, std=.02)
if m.bias is not None:
nn.init.constant_(m.bias, 0)
def forward(self, neighborhood, center, noaug=False, classify=False):
# generate mask
group_input_tokens = self.encoder(neighborhood) # B G C
batch_size, seq_len, C = group_input_tokens.size()
relative_position = center[:, 1:, :] - center[:, :-1, :]
relative_norm = torch.norm(relative_position, dim=-1, keepdim=True)
relative_direction = relative_position / relative_norm
position = torch.cat(
[center[:, 0, :].unsqueeze(1), relative_direction], dim=1)
pos_relative = self.pos_embed(position)
sos_pos = self.sos_pos.expand(group_input_tokens.size(0), -1, -1)
pos_absolute = self.pos_embed(center[:, :-1, :])
pos_absolute = torch.cat([sos_pos, pos_absolute], dim=1)
attn_mask = torch.full(
(seq_len, seq_len), -float("Inf"), device=group_input_tokens.device, dtype=group_input_tokens.dtype
).to(torch.bool)
with torch.no_grad():
attn_mask = torch.triu(attn_mask, diagonal=1)
# point wise
# overall_mask = np.zeros([self.num_groups, self.num_groups])
# for i in range(self.num_groups):
# mask = np.hstack([
# np.zeros(self.num_groups-self.num_mask),
# np.ones(self.num_mask),
# ])
# np.random.shuffle(mask)
# overall_mask[i, :] = mask
# overall_mask = torch.from_numpy(
# overall_mask).to(torch.bool).to('cuda')
# column wise
overall_mask = np.hstack([
np.zeros(self.num_groups-self.keep_attend-self.num_mask),
np.ones(self.num_mask),
])
np.random.shuffle(overall_mask)
overall_mask = np.hstack([
np.zeros(self.keep_attend),
overall_mask,
])
overall_mask = torch.from_numpy(
overall_mask).to(torch.bool).to(group_input_tokens.device)
eye_mask = torch.eye(self.num_groups).to(
torch.bool).to(group_input_tokens.device)
attn_mask = attn_mask | (overall_mask.unsqueeze(0) & ~eye_mask)
# transformer
if not classify:
encoded_features = self.blocks(
group_input_tokens, pos_absolute, attn_mask, classify=classify)
generated_points = self.generator_blocks(
encoded_features, pos_relative, attn_mask)
return generated_points
else:
print('----error---- this code path is detached and should not be reached ----error----')
logits, generated_points = self.blocks(
group_input_tokens, pos_absolute, classify=classify)
return logits, generated_points
@MODELS.register_module()
class PointGPT(nn.Module):
def __init__(self, config):
super().__init__()
print_log(f'[PointGPT] ', logger='PointGPT')
self.config = config
self.trans_dim = config.transformer_config.trans_dim
self.GPT_Transformer = GPT_Transformer(config)
self.group_size = config.group_size
self.num_group = config.num_group
self.drop_path_rate = config.transformer_config.drop_path_rate
self.weight_center = config.weight_center
print_log(
f'[PointGPT] divide point cloud into G{self.num_group} x S{self.group_size} points ...', logger='PointGPT')
self.group_divider = Group(
num_group=self.num_group, group_size=self.group_size)
self.loss = config.loss
self.build_loss_func(self.loss)
def build_loss_func(self, loss_type):
if loss_type == "cdl1":
self.loss_func_p = ChamferDistanceL1().cuda()
elif loss_type == 'cdl2':
self.loss_func_p = ChamferDistanceL2().cuda()
elif loss_type == 'cdl12':
self.loss_func_p1 = ChamferDistanceL1().cuda()
self.loss_func_p2 = ChamferDistanceL2().cuda()
else:
raise NotImplementedError
self.loss_func_c = nn.MSELoss().cuda()
def forward(self, pts, vis=False, **kwargs):
neighborhood, center = self.group_divider(pts)
B = neighborhood.shape[0]
generated_points = self.GPT_Transformer(
neighborhood, center)
gt_points = neighborhood.reshape(
B*(self.num_group), self.group_size, 3)
loss1 = self.loss_func_p1(generated_points, gt_points)
loss2 = self.loss_func_p2(generated_points, gt_points)
if vis: # visualization
gt_points = gt_points.reshape(
B, self.num_group, self.group_size, 3)
gt_points = (gt_points + center.unsqueeze(-2)
).reshape(-1, 3).unsqueeze(0)
generated_points = generated_points.reshape(
B, self.num_group, self.group_size, 3) + center.unsqueeze(-2)
generated_points = generated_points.reshape(-1, 3).unsqueeze(0)
return generated_points, gt_points, center
return loss1 + loss2
@MODELS.register_module()
class PointTransformer(nn.Module):
def __init__(self, config, **kwargs):
super().__init__()
self.config = config
self.trans_dim = config.trans_dim
self.depth = config.depth
self.decoder_depth = config.decoder_depth
self.drop_path_rate = config.drop_path_rate
self.cls_dim = config.cls_dim
self.num_heads = config.num_heads
self.group_size = config.group_size
self.num_group = config.num_group
self.encoder_dims = config.encoder_dims
self.group_divider = Group(
num_group=self.num_group, group_size=self.group_size)
assert self.encoder_dims in [384, 768, 1024]
if self.encoder_dims == 384:
self.encoder = Encoder_small(encoder_channel=self.encoder_dims)
else:
self.encoder = Encoder_large(encoder_channel=self.encoder_dims)
self.pos_embed = PositionEmbeddingCoordsSine(3, self.encoder_dims, 1.0)
self.blocks = GPT_extractor(
embed_dim=self.encoder_dims,
num_heads=self.num_heads,
num_layers=self.depth,
num_classes=config.cls_dim,
trans_dim=self.trans_dim,
group_size=self.group_size
)
self.generator_blocks = GPT_generator(
embed_dim=self.encoder_dims,
num_heads=self.num_heads,
num_layers=self.decoder_depth,
trans_dim=self.trans_dim,
group_size=self.group_size
)
self.norm = nn.LayerNorm(self.trans_dim)
self.cls_token = nn.Parameter(torch.zeros(1, 1, self.trans_dim))
self.cls_pos = nn.Parameter(torch.randn(1, 1, self.trans_dim))
self.sos_pos = nn.Parameter(torch.zeros(1, 1, self.trans_dim))
self.build_loss_func()
trunc_normal_(self.cls_token, std=.02)
trunc_normal_(self.cls_pos, std=.02)
def build_loss_func(self, loss_type='cdl12'):
self.loss_ce = nn.CrossEntropyLoss()
if loss_type == "cdl1":
self.loss_func_p = ChamferDistanceL1().cuda()
elif loss_type == 'cdl2':
self.loss_func_p = ChamferDistanceL2().cuda()
elif loss_type == 'cdl12':
self.loss_func_p1 = ChamferDistanceL1().cuda()
self.loss_func_p2 = ChamferDistanceL2().cuda()
else:
raise NotImplementedError
def get_loss_acc(self, ret, gt):
loss = self.loss_ce(ret, gt.long())
pred = ret.argmax(-1)
acc = (pred == gt).sum() / float(gt.size(0))
return loss, acc * 100
def load_model_from_ckpt(self, bert_ckpt_path):
if bert_ckpt_path is not None:
ckpt = torch.load(bert_ckpt_path)
base_ckpt = {k.replace("module.", ""): v for k,
v in ckpt['base_model'].items()}
for k in list(base_ckpt.keys()):
if k.startswith('GPT_Transformer'):
base_ckpt[k[len('GPT_Transformer.'):]] = base_ckpt[k]
del base_ckpt[k]
elif k.startswith('base_model'):
base_ckpt[k[len('base_model.'):]] = base_ckpt[k]
del base_ckpt[k]
if 'cls_head_finetune' in k:
del base_ckpt[k]
incompatible = self.load_state_dict(base_ckpt, strict=False)
if incompatible.missing_keys:
print_log('missing_keys', logger='Transformer')
print_log(
get_missing_parameters_message(incompatible.missing_keys),
logger='Transformer'
)
if incompatible.unexpected_keys:
print_log('unexpected_keys', logger='Transformer')
print_log(
get_unexpected_parameters_message(
incompatible.unexpected_keys),
logger='Transformer'
)
print_log(
f'[Transformer] Successful Loading the ckpt from {bert_ckpt_path}', logger='Transformer')
else:
print_log('Training from scratch!!!', logger='Transformer')
self.apply(self._init_weights)
def _init_weights(self, m):
if isinstance(m, nn.Linear):
trunc_normal_(m.weight, std=.02)
if isinstance(m, nn.Linear) and m.bias is not None:
nn.init.constant_(m.bias, 0)
elif isinstance(m, nn.LayerNorm):
nn.init.constant_(m.bias, 0)
nn.init.constant_(m.weight, 1.0)
elif isinstance(m, nn.Conv1d):
trunc_normal_(m.weight, std=.02)
if m.bias is not None:
nn.init.constant_(m.bias, 0)
def forward(self, pts):
neighborhood, center = self.group_divider(pts)
group_input_tokens = self.encoder(neighborhood) # B G C
B, L, _ = group_input_tokens.shape
cls_tokens = self.cls_token.expand(group_input_tokens.size(0), -1, -1)
cls_pos = self.cls_pos.expand(group_input_tokens.size(0), -1, -1)
pos = self.pos_embed(center)
sos_pos = self.sos_pos.expand(group_input_tokens.size(0), -1, -1)
pos = torch.cat([sos_pos, pos], dim=1)
relative_position = center[:, 1:, :] - center[:, :-1, :]
relative_norm = torch.norm(relative_position, dim=-1, keepdim=True)
relative_direction = relative_position / relative_norm
position = torch.cat(
[center[:, 0, :].unsqueeze(1), relative_direction], dim=1)
pos_relative = self.pos_embed(position)
x = torch.cat((cls_tokens, group_input_tokens), dim=1)
pos = torch.cat((cls_pos, pos), dim=1)
attn_mask = torch.full(
(L+2, L+2), -float("Inf"), device=group_input_tokens.device, dtype=group_input_tokens.dtype
).to(torch.bool)
attn_mask = torch.triu(attn_mask, diagonal=1)
# transformer
ret, encoded_features = self.blocks(x, pos, attn_mask, classify=True)
encoded_features = torch.cat(
[encoded_features[:, 0, :].unsqueeze(1), encoded_features[:, 2:-1, :]], dim=1)
attn_mask = torch.full(
(L, L), -float("Inf"), device=group_input_tokens.device, dtype=group_input_tokens.dtype
).to(torch.bool)
attn_mask = torch.triu(attn_mask, diagonal=1)
generated_points = self.generator_blocks(
encoded_features, pos_relative, attn_mask)
neighborhood = neighborhood + center.unsqueeze(2)
gt_points = neighborhood.reshape(
B*(self.num_group), self.group_size, 3)
loss1 = self.loss_func_p1(generated_points, gt_points)
loss2 = self.loss_func_p2(generated_points, gt_points)
return ret, loss1 + loss2
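`Group.simplied_morton_sorting` above amounts to a greedy nearest-neighbor chain over patch centers: start from the first patch and repeatedly hop to the closest unvisited center. A minimal single-batch NumPy sketch of the same idea (the `greedy_nn_order` helper is illustrative, not the batched tensor implementation used above):

```python
import numpy as np

def greedy_nn_order(centers):
    """Greedy nearest-neighbor chain: start at patch 0, then repeatedly
    visit the closest not-yet-visited center."""
    n = len(centers)
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    order = [0]
    dist[:, 0] = np.inf              # mark patch 0 as visited
    for _ in range(n - 1):
        nxt = int(np.argmin(dist[order[-1]]))
        order.append(nxt)
        dist[:, nxt] = np.inf        # mark as visited
    return order

centers = np.array([[0.0, 0, 0], [5.0, 0, 0], [1.0, 0, 0], [6.0, 0, 0]])
print(greedy_nn_order(centers))  # [0, 2, 1, 3]
```

Masking visited columns with `inf` mirrors the `distances_batch[...] = float("inf")` writes in the batched version.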
================================================
FILE: models/__init__.py
================================================
from .build import build_model_from_cfg
import models.PointGPT
================================================
FILE: models/build.py
================================================
from utils import registry
MODELS = registry.Registry('models')
def build_model_from_cfg(cfg, **kwargs):
"""
Build a dataset, defined by `dataset_name`.
Args:
cfg (eDICT):
Returns:
Dataset: a constructed dataset specified by dataset_name.
"""
return MODELS.build(cfg, **kwargs)
================================================
FILE: models/z_order.py
================================================
import numpy as np
def round_to_int_32(data):
"""
Takes a Numpy array of float values between
-1 and 1, and rounds them to significant
32-bit integer values, to be used in the
morton code computation
:param data: multidimensional numpy array
:return: same as data but in 32-bit int format
"""
# first we rescale points to 0-512
min_data = np.abs(np.min(data)-0.5)
data = 256*(data + min_data)
# now convert to int
data = np.round(2 ** 21 - data).astype(dtype=np.int32)
return data
def split_by_3(x):
"""
Method to separate bits of a 32-bit integer
by 3 positions apart, using the magic bits
https://www.forceflow.be/2013/10/07/morton-encodingdecoding-through-bit-interleaving-implementations/
:param x: 32-bit integer
:return: x with bits separated
"""
# we only look at 21 bits, since we want to generate
# a 64-bit code eventually (3 x 21 bits = 63 bits, which
# is the maximum we can fit in a 64-bit code)
x &= 0x1fffff # only take first 21 bits
# shift left 32 bits, OR with self, and 00011111000000000000000000000000000000001111111111111111
x = (x | (x << 32)) & 0x1f00000000ffff
# shift left 16 bits, OR with self, and 00011111000000000000000011111111000000000000000011111111
x = (x | (x << 16)) & 0x1f0000ff0000ff
# shift left 8 bits, OR with self, and 0001000000001111000000001111000000001111000000001111000000000000
x = (x | (x << 8)) & 0x100f00f00f00f00f
# shift left 4 bits, OR with self, and 0001000011000011000011000011000011000011000011000011000100000000
x = (x | (x << 4)) & 0x10c30c30c30c30c3
# shift left 2 bits, OR with self, and 0001001001001001001001001001001001001001001001001001001001001001
x = (x | (x << 2)) & 0x1249249249249249
return x
def get_z_order(x, y, z):
"""
Given 3 arrays of corresponding x, y, z
coordinates, compute the morton (or z) code for
each point and return an index array
We compute the Morton order as follows:
1- Split all coordinates by 3 (add 2 zeros between bits)
2- Shift bits left by 1 for y and 2 for z
3- Interleave x, shifted y, and shifted z
The Morton order is the final interleaved bit sequence
:param x: x coordinates
:param y: y coordinates
:param z: z coordinates
:return: index array with morton code
"""
res = 0
res |= split_by_3(x) | split_by_3(y) << 1 | split_by_3(z) << 2
return res
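As a sanity check, the bit interleaving can be verified with plain Python integers (which avoid any fixed-width overflow concerns). This is a standalone re-statement of `split_by_3` and `get_z_order` for illustration, not part of the repository:

```python
def split_by_3(x):
    # spread the low 21 bits of x three positions apart (magic-bits method)
    x &= 0x1fffff
    x = (x | (x << 32)) & 0x1f00000000ffff
    x = (x | (x << 16)) & 0x1f0000ff0000ff
    x = (x | (x << 8)) & 0x100f00f00f00f00f
    x = (x | (x << 4)) & 0x10c30c30c30c30c3
    x = (x | (x << 2)) & 0x1249249249249249
    return x

def get_z_order(x, y, z):
    # interleave: bit i of x, y, z become bits 3i, 3i+1, 3i+2 of the code
    return split_by_3(x) | split_by_3(y) << 1 | split_by_3(z) << 2

# (1, 1, 1) interleaves to binary 111 = 7; the two set bits of x=5 (binary 101)
# land at positions 0 and 6, giving 65
print(get_z_order(1, 1, 1))  # 7
print(get_z_order(5, 0, 0))  # 65
```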
def get_z_values(data):
"""
Computes the z values for a point array
:param data: Nx3 array of x, y, and z location
:return: Nx1 array of z values
"""
points_round = round_to_int_32(data) # convert to int
z = get_z_order(points_round[:, 0], points_round[:, 1], points_round[:, 2])
return z
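For intuition, `get_z_values` is typically used to reorder a point cloud along the space-filling curve via `np.argsort`. A minimal self-contained sketch (re-stating `split_by_3` on plain Python ints, and assuming already-quantized integer coordinates rather than the float pipeline above):

```python
import numpy as np

def _split_by_3(x):
    # same magic-bit spreading as split_by_3 above, on Python ints
    x &= 0x1fffff
    x = (x | (x << 32)) & 0x1f00000000ffff
    x = (x | (x << 16)) & 0x1f0000ff0000ff
    x = (x | (x << 8)) & 0x100f00f00f00f00f
    x = (x | (x << 4)) & 0x10c30c30c30c30c3
    x = (x | (x << 2)) & 0x1249249249249249
    return x

def morton_sort(points):
    """Return indices that order an (N, 3) integer array by Morton code."""
    codes = [
        _split_by_3(int(x)) | _split_by_3(int(y)) << 1 | _split_by_3(int(z)) << 2
        for x, y, z in points
    ]
    return np.argsort(codes, kind="stable")

pts = np.array([[1, 1, 1], [0, 0, 0], [1, 0, 0], [0, 1, 0]])
order = morton_sort(pts)  # codes are [7, 0, 1, 2] -> order [1, 2, 3, 0]
```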
================================================
FILE: requirements.txt
================================================
argparse
easydict
h5py
matplotlib
numpy
open3d==0.9
opencv-python
pyyaml
scipy
tensorboardX
timm==0.4.5
tqdm
transforms3d
termcolor
================================================
FILE: segmentation/__init__.py
================================================
================================================
FILE: segmentation/dataset.py
================================================
import numpy as np
import os
from torch.utils.data import Dataset
import torch
from pointnet_util import farthest_point_sample, pc_normalize
import json
class ModelNetDataLoader(Dataset):
def __init__(self, root, npoint=1024, split='train', uniform=False, normal_channel=True, cache_size=15000):
self.root = root
self.npoints = npoint
self.uniform = uniform
self.catfile = os.path.join(self.root, 'modelnet40_shape_names.txt')
self.cat = [line.rstrip() for line in open(self.catfile)]
self.classes = dict(zip(self.cat, range(len(self.cat))))
self.normal_channel = normal_channel
shape_ids = {}
shape_ids['train'] = [line.rstrip() for line in open(
os.path.join(self.root, 'modelnet40_train.txt'))]
shape_ids['test'] = [line.rstrip() for line in open(
os.path.join(self.root, 'modelnet40_test.txt'))]
assert (split == 'train' or split == 'test')
shape_names = ['_'.join(x.split('_')[0:-1]) for x in shape_ids[split]]
# list of (shape_name, shape_txt_file_path) tuple
self.datapath = [(shape_names[i], os.path.join(self.root, shape_names[i], shape_ids[split][i]) + '.txt') for i
in range(len(shape_ids[split]))]
print('The size of %s data is %d' % (split, len(self.datapath)))
self.cache_size = cache_size # how many data points to cache in memory
self.cache = {} # from index to (point_set, cls) tuple
def __len__(self):
return len(self.datapath)
def _get_item(self, index):
if index in self.cache:
point_set, cls = self.cache[index]
else:
fn = self.datapath[index]
cls = self.classes[self.datapath[index][0]]
cls = np.array([cls]).astype(np.int32)
point_set = np.loadtxt(fn[1], delimiter=',').astype(np.float32)
if self.uniform:
point_set = farthest_point_sample(point_set, self.npoints)
else:
point_set = point_set[0:self.npoints, :]
point_set[:, 0:3] = pc_normalize(point_set[:, 0:3])
if not self.normal_channel:
point_set = point_set[:, 0:3]
if len(self.cache) < self.cache_size:
self.cache[index] = (point_set, cls)
return point_set, cls
def __getitem__(self, index):
return self._get_item(index)
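`pc_normalize` is imported from `pointnet_util`, which is not included in this extract; the conventional PointNet implementation (centroid-centering followed by unit-sphere scaling) is sketched below as an assumption about what it does:

```python
import numpy as np

def pc_normalize(pc):
    # center the cloud on its centroid, then scale so the farthest
    # point lies on the unit sphere
    centroid = np.mean(pc, axis=0)
    pc = pc - centroid
    m = np.max(np.sqrt(np.sum(pc ** 2, axis=1)))
    return pc / m
```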
class PartNormalDataset(Dataset):
def __init__(self, root='/data/cgy/ShapenetPart/shapenetcore_partanno_segmentation_benchmark_v0_normal', npoints=2500, split='train', class_choice=None, normal_channel=False):
self.npoints = npoints
self.root = root
self.catfile = os.path.join(self.root, 'synsetoffset2category.txt')
self.cat = {}
self.normal_channel = normal_channel
with open(self.catfile, 'r') as f:
for line in f:
ls = line.strip().split()
self.cat[ls[0]] = ls[1]
self.cat = {k: v for k, v in self.cat.items()}
self.classes_original = dict(zip(self.cat, range(len(self.cat))))
if class_choice is not None:
self.cat = {k: v for k, v in self.cat.items() if k in class_choice}
# print(self.cat)
self.meta = {}
with open(os.path.join(self.root, 'train_test_split', 'shuffled_train_file_list.json'), 'r') as f:
train_ids = set([str(d.split('/')[2]) for d in json.load(f)])
with open(os.path.join(self.root, 'train_test_split', 'shuffled_val_file_list.json'), 'r') as f:
val_ids = set([str(d.split('/')[2]) for d in json.load(f)])
with open(os.path.join(self.root, 'train_test_split', 'shuffled_test_file_list.json'), 'r') as f:
test_ids = set([str(d.split('/')[2]) for d in json.load(f)])
for item in self.cat:
# print('category', item)
self.meta[item] = []
dir_point = os.path.join(self.root, self.cat[item])
fns = sorted(os.listdir(dir_point))
# print(fns[0][0:-4])
if split == 'trainval':
fns = [fn for fn in fns if (
(fn[0:-4] in train_ids) or (fn[0:-4] in val_ids))]
elif split == 'train':
fns = [fn for fn in fns if fn[0:-4] in train_ids]
elif split == 'val':
fns = [fn for fn in fns if fn[0:-4] in val_ids]
elif split == 'test':
fns = [fn for fn in fns if fn[0:-4] in test_ids]
else:
print('Unknown split: %s. Exiting..' % (split))
exit(-1)
# print(os.path.basename(fns))
for fn in fns:
token = (os.path.splitext(os.path.basename(fn))[0])
self.meta[item].append(os.path.join(dir_point, token + '.txt'))
self.datapath = []
for item in self.cat:
for fn in self.meta[item]:
self.datapath.append((item, fn))
self.classes = {}
for i in self.cat.keys():
self.classes[i] = self.classes_original[i]
# Mapping from category ('Chair') to a list of int [10,11,12,13] as segmentation labels
self.seg_classes = {'Earphone': [16, 17, 18], 'Motorbike': [30, 31, 32, 33, 34, 35], 'Rocket': [41, 42, 43],
'Car': [8, 9, 10, 11], 'Laptop': [28, 29], 'Cap': [6, 7], 'Skateboard': [44, 45, 46],
'Mug': [36, 37], 'Guitar': [19, 20, 21], 'Bag': [4, 5], 'Lamp': [24, 25, 26, 27],
'Table': [47, 48, 49], 'Airplane': [0, 1, 2, 3], 'Pistol': [38, 39, 40],
'Chair': [12, 13, 14, 15], 'Knife': [22, 23]}
# for cat in sorted(self.seg_classes.keys()):
# print(cat, self.seg_classes[cat])
self.cache = {} # from index to (point_set, cls, seg) tuple
self.cache_size = 20000
def __getitem__(self, index):
if index in self.cache:
point_set, cls, seg = self.cache[index]
else:
fn = self.datapath[index]
cat = self.datapath[index][0]
cls = self.classes[cat]
cls = np.array([cls]).astype(np.int32)
data = np.loadtxt(fn[1]).astype(np.float32)
if not self.normal_channel:
point_set = data[:, 0:3]
else:
point_set = data[:, 0:6]
seg = data[:, -1].astype(np.int32)
if len(self.cache) < self.cache_size:
self.cache[index] = (point_set, cls, seg)
point_set[:, 0:3] = pc_normalize(point_set[:, 0:3])
choice = np.random.choice(len(seg), self.npoints, replace=True)
# resample
point_set = point_set[choice, :]
seg = seg[choice]
return point_set, cls, seg
def __len__(self):
return len(self.datapath)
if __name__ == '__main__':
data = ModelNetDataLoader('modelnet40_normal_resampled/',
split='train', uniform=False, normal_channel=True)
DataLoader = torch.utils.data.DataLoader(data, batch_size=12, shuffle=True)
for point, label in DataLoader:
print(point.shape)
print(label.shape)
================================================
FILE: segmentation/extensions/chamfer_dist/__init__.py
================================================
# -*- coding: utf-8 -*-
# @Author: Thibault GROUEIX
# @Date: 2019-08-07 20:54:24
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-18 15:06:25
# @Email: cshzxie@gmail.com
import torch
import chamfer
class ChamferFunction(torch.autograd.Function):
@staticmethod
def forward(ctx, xyz1, xyz2):
dist1, dist2, idx1, idx2 = chamfer.forward(xyz1, xyz2)
ctx.save_for_backward(xyz1, xyz2, idx1, idx2)
return dist1, dist2
@staticmethod
def backward(ctx, grad_dist1, grad_dist2):
xyz1, xyz2, idx1, idx2 = ctx.saved_tensors
grad_xyz1, grad_xyz2 = chamfer.backward(xyz1, xyz2, idx1, idx2, grad_dist1, grad_dist2)
return grad_xyz1, grad_xyz2
class ChamferDistanceL2(torch.nn.Module):
''' Chamfer Distance L2
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
return torch.mean(dist1) + torch.mean(dist2)
class ChamferDistanceL2_split(torch.nn.Module):
''' Chamfer Distance L2 (split)
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
return torch.mean(dist1), torch.mean(dist2)
class ChamferDistanceL1(torch.nn.Module):
''' Chamfer Distance L1
'''
def __init__(self, ignore_zeros=False):
super().__init__()
self.ignore_zeros = ignore_zeros
def forward(self, xyz1, xyz2):
batch_size = xyz1.size(0)
if batch_size == 1 and self.ignore_zeros:
non_zeros1 = torch.sum(xyz1, dim=2).ne(0)
non_zeros2 = torch.sum(xyz2, dim=2).ne(0)
xyz1 = xyz1[non_zeros1].unsqueeze(dim=0)
xyz2 = xyz2[non_zeros2].unsqueeze(dim=0)
dist1, dist2 = ChamferFunction.apply(xyz1, xyz2)
dist1 = torch.sqrt(dist1)
dist2 = torch.sqrt(dist2)
return (torch.mean(dist1) + torch.mean(dist2))/2
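The CUDA kernels below accelerate what is, conceptually, a brute-force nearest-neighbour search. A NumPy reference of the L2 variant, useful for checking the compiled extension on small inputs (a hypothetical helper, not part of the repository):

```python
import numpy as np

def chamfer_l2(xyz1, xyz2):
    """Reference Chamfer-L2: (B, N, 3) x (B, M, 3) -> scalar.

    Matches ChamferDistanceL2: mean squared distance from each point to its
    nearest neighbour in the other cloud, summed over both directions.
    """
    # pairwise squared distances per batch element: (B, N, M)
    diff = xyz1[:, :, None, :] - xyz2[:, None, :, :]
    d = np.sum(diff ** 2, axis=-1)
    dist1 = d.min(axis=2)  # (B, N): nearest point in xyz2 for each xyz1 point
    dist2 = d.min(axis=1)  # (B, M): nearest point in xyz1 for each xyz2 point
    return dist1.mean() + dist2.mean()
```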
================================================
FILE: segmentation/extensions/chamfer_dist/chamfer.cu
================================================
/*
* @Author: Haozhe Xie
* @Date: 2019-08-07 20:54:24
* @Last Modified by: Haozhe Xie
* @Last Modified time: 2020-06-17 14:58:55
* @Email: cshzxie@gmail.com
*/
#include <cuda.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include <vector>
__global__ void chamfer_dist_kernel(int batch_size,
int n,
const float* xyz1,
int m,
const float* xyz2,
float* dist,
int* indexes) {
const int batch = 512;
__shared__ float buf[batch * 3];
for (int i = blockIdx.x; i < batch_size; i += gridDim.x) {
for (int k2 = 0; k2 < m; k2 += batch) {
int end_k = min(m, k2 + batch) - k2;
for (int j = threadIdx.x; j < end_k * 3; j += blockDim.x) {
buf[j] = xyz2[(i * m + k2) * 3 + j];
}
__syncthreads();
for (int j = threadIdx.x + blockIdx.y * blockDim.x; j < n;
j += blockDim.x * gridDim.y) {
float x1 = xyz1[(i * n + j) * 3 + 0];
float y1 = xyz1[(i * n + j) * 3 + 1];
float z1 = xyz1[(i * n + j) * 3 + 2];
float best_dist = 0;
int best_dist_index = 0;
int end_ka = end_k - (end_k & 3);
if (end_ka == batch) {
for (int k = 0; k < batch; k += 4) {
{
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
{
float x2 = buf[k * 3 + 3] - x1;
float y2 = buf[k * 3 + 4] - y1;
float z2 = buf[k * 3 + 5] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 1;
}
}
{
float x2 = buf[k * 3 + 6] - x1;
float y2 = buf[k * 3 + 7] - y1;
float z2 = buf[k * 3 + 8] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 2;
}
}
{
float x2 = buf[k * 3 + 9] - x1;
float y2 = buf[k * 3 + 10] - y1;
float z2 = buf[k * 3 + 11] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 3;
}
}
}
} else {
for (int k = 0; k < end_ka; k += 4) {
{
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
{
float x2 = buf[k * 3 + 3] - x1;
float y2 = buf[k * 3 + 4] - y1;
float z2 = buf[k * 3 + 5] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 1;
}
}
{
float x2 = buf[k * 3 + 6] - x1;
float y2 = buf[k * 3 + 7] - y1;
float z2 = buf[k * 3 + 8] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 2;
}
}
{
float x2 = buf[k * 3 + 9] - x1;
float y2 = buf[k * 3 + 10] - y1;
float z2 = buf[k * 3 + 11] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2 + 3;
}
}
}
}
for (int k = end_ka; k < end_k; k++) {
float x2 = buf[k * 3 + 0] - x1;
float y2 = buf[k * 3 + 1] - y1;
float z2 = buf[k * 3 + 2] - z1;
float dist = x2 * x2 + y2 * y2 + z2 * z2;
if (k == 0 || dist < best_dist) {
best_dist = dist;
best_dist_index = k + k2;
}
}
if (k2 == 0 || dist[(i * n + j)] > best_dist) {
dist[(i * n + j)] = best_dist;
indexes[(i * n + j)] = best_dist_index;
}
}
__syncthreads();
}
}
}
std::vector<torch::Tensor> chamfer_cuda_forward(torch::Tensor xyz1,
torch::Tensor xyz2) {
const int batch_size = xyz1.size(0);
const int n = xyz1.size(1); // num_points point cloud A
const int m = xyz2.size(1); // num_points point cloud B
torch::Tensor dist1 =
torch::zeros({batch_size, n}, torch::CUDA(torch::kFloat));
torch::Tensor dist2 =
torch::zeros({batch_size, m}, torch::CUDA(torch::kFloat));
torch::Tensor idx1 = torch::zeros({batch_size, n}, torch::CUDA(torch::kInt));
torch::Tensor idx2 = torch::zeros({batch_size, m}, torch::CUDA(torch::kInt));
chamfer_dist_kernel<<<dim3(32, 16, 1), 512>>>(
batch_size, n, xyz1.data_ptr<float>(), m, xyz2.data_ptr<float>(),
dist1.data_ptr<float>(), idx1.data_ptr<int>());
chamfer_dist_kernel<<<dim3(32, 16, 1), 512>>>(
batch_size, m, xyz2.data_ptr<float>(), n, xyz1.data_ptr<float>(),
dist2.data_ptr<float>(), idx2.data_ptr<int>());
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
printf("Error in chamfer_cuda_forward: %s\n", cudaGetErrorString(err));
}
return {dist1, dist2, idx1, idx2};
}
__global__ void chamfer_dist_grad_kernel(int b,
int n,
const float* xyz1,
int m,
const float* xyz2,
const float* grad_dist1,
const int* idx1,
float* grad_xyz1,
float* grad_xyz2) {
for (int i = blockIdx.x; i < b; i += gridDim.x) {
for (int j = threadIdx.x + blockIdx.y * blockDim.x; j < n;
j += blockDim.x * gridDim.y) {
float x1 = xyz1[(i * n + j) * 3 + 0];
float y1 = xyz1[(i * n + j) * 3 + 1];
float z1 = xyz1[(i * n + j) * 3 + 2];
int j2 = idx1[i * n + j];
float x2 = xyz2[(i * m + j2) * 3 + 0];
float y2 = xyz2[(i * m + j2) * 3 + 1];
float z2 = xyz2[(i * m + j2) * 3 + 2];
float g = grad_dist1[i * n + j] * 2;
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 0]), g * (x1 - x2));
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 1]), g * (y1 - y2));
atomicAdd(&(grad_xyz1[(i * n + j) * 3 + 2]), g * (z1 - z2));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 0]), -(g * (x1 - x2)));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 1]), -(g * (y1 - y2)));
atomicAdd(&(grad_xyz2[(i * m + j2) * 3 + 2]), -(g * (z1 - z2)));
}
}
}
std::vector<torch::Tensor> chamfer_cuda_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2) {
const int batch_size = xyz1.size(0);
const int n = xyz1.size(1); // num_points point cloud A
const int m = xyz2.size(1); // num_points point cloud B
torch::Tensor grad_xyz1 = torch::zeros_like(xyz1, torch::CUDA(torch::kFloat));
torch::Tensor grad_xyz2 = torch::zeros_like(xyz2, torch::CUDA(torch::kFloat));
chamfer_dist_grad_kernel<<<dim3(1, 16, 1), 256>>>(
batch_size, n, xyz1.data_ptr<float>(), m, xyz2.data_ptr<float>(),
grad_dist1.data_ptr<float>(), idx1.data_ptr<int>(),
grad_xyz1.data_ptr<float>(), grad_xyz2.data_ptr<float>());
chamfer_dist_grad_kernel<<<dim3(1, 16, 1), 256>>>(
batch_size, m, xyz2.data_ptr<float>(), n, xyz1.data_ptr<float>(),
grad_dist2.data_ptr<float>(), idx2.data_ptr<int>(),
grad_xyz2.data_ptr<float>(), grad_xyz1.data_ptr<float>());
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
printf("Error in chamfer_cuda_backward: %s\n", cudaGetErrorString(err));
}
return {grad_xyz1, grad_xyz2};
}
================================================
FILE: segmentation/extensions/chamfer_dist/chamfer_cuda.cpp
================================================
/*
* @Author: Haozhe Xie
* @Date: 2019-08-07 20:54:24
* @Last Modified by: Haozhe Xie
* @Last Modified time: 2019-12-10 10:33:50
* @Email: cshzxie@gmail.com
*/
#include <torch/extension.h>
#include <vector>
std::vector<torch::Tensor> chamfer_cuda_forward(torch::Tensor xyz1,
torch::Tensor xyz2);
std::vector<torch::Tensor> chamfer_cuda_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2);
std::vector<torch::Tensor> chamfer_forward(torch::Tensor xyz1,
torch::Tensor xyz2) {
return chamfer_cuda_forward(xyz1, xyz2);
}
std::vector<torch::Tensor> chamfer_backward(torch::Tensor xyz1,
torch::Tensor xyz2,
torch::Tensor idx1,
torch::Tensor idx2,
torch::Tensor grad_dist1,
torch::Tensor grad_dist2) {
return chamfer_cuda_backward(xyz1, xyz2, idx1, idx2, grad_dist1, grad_dist2);
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("forward", &chamfer_forward, "Chamfer forward (CUDA)");
m.def("backward", &chamfer_backward, "Chamfer backward (CUDA)");
}
================================================
FILE: segmentation/extensions/chamfer_dist/setup.py
================================================
# -*- coding: utf-8 -*-
# @Author: Haozhe Xie
# @Date: 2019-08-07 20:54:24
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-10 10:04:25
# @Email: cshzxie@gmail.com
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(name='chamfer',
version='2.0.0',
ext_modules=[
CUDAExtension('chamfer', [
'chamfer_cuda.cpp',
'chamfer.cu',
]),
],
cmdclass={'build_ext': BuildExtension})
================================================
FILE: segmentation/extensions/chamfer_dist/test.py
================================================
# -*- coding: utf-8 -*-
# @Author: Haozhe Xie
# @Date: 2019-12-10 10:38:01
# @Last Modified by: Haozhe Xie
# @Last Modified time: 2019-12-26 14:21:36
# @Email: cshzxie@gmail.com
#
# Note:
# - Replace float -> double, kFloat -> kDouble in chamfer.cu
import os
import sys
import torch
import unittest
from torch.autograd import gradcheck
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir, os.path.pardir)))
from extensions.chamfer_dist import ChamferFunction
class ChamferDistanceTestCase(unittest.TestCase):
def test_chamfer_dist(self):
x = torch.rand(4, 64, 3).double()
y = torch.rand(4, 128, 3).double()
x.requires_grad = True
y.requires_grad = True
print(gradcheck(ChamferFunction.apply, [x.cuda(), y.cuda()]))
if __name__ == '__main__':
unittest.main()
================================================
FILE: segmentation/extensions/emd/README.md
================================================
# PyTorch Wrapper for Point-cloud Earth-Mover-Distance (EMD)
## Dependency
The code has been tested on Ubuntu 16.04, PyTorch 1.1.0, CUDA 9.0.
## Usage
First compile using

    python setup.py install

Then, copy the lib file out to the main directory:

    cp build/lib.linux-x86_64-3.6/emd_cuda.cpython-36m-x86_64-linux-gnu.so .

Then, you can use it simply with:

    from emd import earth_mover_distance
    d = earth_mover_distance(p1, p2, transpose=False)  # p1: B x N1 x 3, p2: B x N2 x 3

Check `test_emd_loss.py` for an example.
## Author
The CUDA code was originally written by Haoqiang Fan. The PyTorch wrapper was written by Kaichun Mo, with help from Jiayuan Gu.
## License
MIT
================================================
FILE: segmentation/extensions/emd/__init__.py
================================================
from .emd import earth_mover_distance as emd
__all__ = ['emd']
================================================
FILE: segmentation/extensions/emd/cuda/emd.cpp
================================================
#ifndef _EMD
#define _EMD
#include <vector>
#include <torch/extension.h>
//CUDA declarations
at::Tensor ApproxMatchForward(
const at::Tensor xyz1,
const at::Tensor xyz2);
at::Tensor MatchCostForward(
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match);
std::vector<at::Tensor> MatchCostBackward(
const at::Tensor grad_cost,
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match);
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("approxmatch_forward", &ApproxMatchForward,"ApproxMatch forward (CUDA)");
m.def("matchcost_forward", &MatchCostForward,"MatchCost forward (CUDA)");
m.def("matchcost_backward", &MatchCostBackward,"MatchCost backward (CUDA)");
}
#endif
================================================
FILE: segmentation/extensions/emd/cuda/emd_kernel.cu
================================================
/**********************************
* Original Author: Haoqiang Fan
* Modified by: Kaichun Mo
*********************************/
#ifndef _EMD_KERNEL
#define _EMD_KERNEL
#include <cmath>
#include <vector>
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAApplyUtils.cuh> // at::cuda::getApplyGrid
// #include <THC/THC.h>
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
#define CHECK_CONTIGUOUS(x) TORCH_CHECK(x.is_contiguous(), #x " must be contiguous")
#define CHECK_INPUT(x) CHECK_CUDA(x); CHECK_CONTIGUOUS(x)
/********************************
* Forward kernel for approxmatch
*********************************/
template<typename scalar_t>
__global__ void approxmatch(int b,int n,int m,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,scalar_t * __restrict__ match,scalar_t * temp){
scalar_t * remainL=temp+blockIdx.x*(n+m)*2, * remainR=temp+blockIdx.x*(n+m)*2+n,*ratioL=temp+blockIdx.x*(n+m)*2+n+m,*ratioR=temp+blockIdx.x*(n+m)*2+n+m+n;
scalar_t multiL,multiR;
if (n>=m){
multiL=1;
multiR=n/m;
}else{
multiL=m/n;
multiR=1;
}
const int Block=1024;
__shared__ scalar_t buf[Block*4];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
for (int j=threadIdx.x;j<n*m;j+=blockDim.x)
match[i*n*m+j]=0;
for (int j=threadIdx.x;j<n;j+=blockDim.x)
remainL[j]=multiL;
for (int j=threadIdx.x;j<m;j+=blockDim.x)
remainR[j]=multiR;
__syncthreads();
for (int j=7;j>=-2;j--){
scalar_t level=-powf(4.0f,j);
if (j==-2){
level=0;
}
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
scalar_t suml=1e-9f;
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend;l+=blockDim.x){
scalar_t x2=xyz2[i*m*3+l0*3+l*3+0];
scalar_t y2=xyz2[i*m*3+l0*3+l*3+1];
scalar_t z2=xyz2[i*m*3+l0*3+l*3+2];
buf[l*4+0]=x2;
buf[l*4+1]=y2;
buf[l*4+2]=z2;
buf[l*4+3]=remainR[l0+l];
}
__syncthreads();
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*4+0];
scalar_t y2=buf[l*4+1];
scalar_t z2=buf[l*4+2];
scalar_t d=level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1));
scalar_t w=__expf(d)*buf[l*4+3];
suml+=w;
}
__syncthreads();
}
if (k<n)
ratioL[k]=remainL[k]/suml;
}
__syncthreads();
for (int l0=0;l0<m;l0+=blockDim.x){
int l=l0+threadIdx.x;
scalar_t x2=0,y2=0,z2=0;
if (l<m){
x2=xyz2[i*m*3+l*3+0];
y2=xyz2[i*m*3+l*3+1];
z2=xyz2[i*m*3+l*3+2];
}
scalar_t sumr=0;
for (int k0=0;k0<n;k0+=Block){
int kend=min(n,k0+Block)-k0;
for (int k=threadIdx.x;k<kend;k+=blockDim.x){
buf[k*4+0]=xyz1[i*n*3+k0*3+k*3+0];
buf[k*4+1]=xyz1[i*n*3+k0*3+k*3+1];
buf[k*4+2]=xyz1[i*n*3+k0*3+k*3+2];
buf[k*4+3]=ratioL[k0+k];
}
__syncthreads();
for (int k=0;k<kend;k++){
scalar_t x1=buf[k*4+0];
scalar_t y1=buf[k*4+1];
scalar_t z1=buf[k*4+2];
scalar_t w=__expf(level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)))*buf[k*4+3];
sumr+=w;
}
__syncthreads();
}
if (l<m){
sumr*=remainR[l];
scalar_t consumption=fminf(remainR[l]/(sumr+1e-9f),1.0f);
ratioR[l]=consumption*remainR[l];
remainR[l]=fmaxf(0.0f,remainR[l]-sumr);
}
}
__syncthreads();
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
scalar_t suml=0;
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend;l+=blockDim.x){
buf[l*4+0]=xyz2[i*m*3+l0*3+l*3+0];
buf[l*4+1]=xyz2[i*m*3+l0*3+l*3+1];
buf[l*4+2]=xyz2[i*m*3+l0*3+l*3+2];
buf[l*4+3]=ratioR[l0+l];
}
__syncthreads();
scalar_t rl=ratioL[k];
if (k<n){
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*4+0];
scalar_t y2=buf[l*4+1];
scalar_t z2=buf[l*4+2];
scalar_t w=__expf(level*((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1)))*rl*buf[l*4+3];
match[i*n*m+(l0+l)*n+k]+=w;
suml+=w;
}
}
__syncthreads();
}
if (k<n)
remainL[k]=fmaxf(0.0f,remainL[k]-suml);
}
__syncthreads();
}
}
}
//void approxmatchLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,scalar_t * match,scalar_t * temp){
// approxmatch<<<32,512>>>(b,n,m,xyz1,xyz2,match,temp);
//}
/* ApproxMatch forward interface
Input:
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
Output:
match: (B, N2, N1)
*/
at::Tensor ApproxMatchForward(
const at::Tensor xyz1,
const at::Tensor xyz2){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto match = at::zeros({b, m, n}, xyz1.type());
auto temp = at::zeros({b, (n+m)*2}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "ApproxMatchForward", ([&] {
approxmatch<scalar_t><<<32,512>>>(b, n, m, xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), temp.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return match;
}
/********************************
* Forward kernel for matchcost
*********************************/
template<typename scalar_t>
__global__ void matchcost(int b,int n,int m,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ out){
__shared__ scalar_t allsum[512];
const int Block=1024;
__shared__ scalar_t buf[Block*3];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
scalar_t subsum=0;
for (int k0=0;k0<n;k0+=blockDim.x){
int k=k0+threadIdx.x;
scalar_t x1=0,y1=0,z1=0;
if (k<n){
x1=xyz1[i*n*3+k*3+0];
y1=xyz1[i*n*3+k*3+1];
z1=xyz1[i*n*3+k*3+2];
}
for (int l0=0;l0<m;l0+=Block){
int lend=min(m,l0+Block)-l0;
for (int l=threadIdx.x;l<lend*3;l+=blockDim.x)
buf[l]=xyz2[i*m*3+l0*3+l];
__syncthreads();
if (k<n){
for (int l=0;l<lend;l++){
scalar_t x2=buf[l*3+0];
scalar_t y2=buf[l*3+1];
scalar_t z2=buf[l*3+2];
scalar_t d=(x2-x1)*(x2-x1)+(y2-y1)*(y2-y1)+(z2-z1)*(z2-z1);
subsum+=d*match[i*n*m+(l0+l)*n+k];
}
}
__syncthreads();
}
}
allsum[threadIdx.x]=subsum;
for (int j=1;j<blockDim.x;j<<=1){
__syncthreads();
if ((threadIdx.x&j)==0 && threadIdx.x+j<blockDim.x){
allsum[threadIdx.x]+=allsum[threadIdx.x+j];
}
}
if (threadIdx.x==0)
out[i]=allsum[0];
__syncthreads();
}
}
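The `allsum` loop above is a shared-memory butterfly reduction: at step `j`, each thread whose index has bit `j` clear accumulates the value from its partner `j` slots away, so `allsum[0]` ends up holding the block-wide sum. The same index logic in plain Python (a sketch, serializing what the threads do in parallel):

```python
def tree_reduce(vals):
    """Butterfly reduction: after the loop, vals[0] holds the total sum.

    Works for any length, not just powers of two, mirroring the
    (threadIdx.x & j) == 0 and threadIdx.x + j < blockDim.x guards.
    """
    n = len(vals)
    j = 1
    while j < n:
        for t in range(n):
            if (t & j) == 0 and t + j < n:
                vals[t] += vals[t + j]
        j <<= 1
    return vals[0]
```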
//void matchcostLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,const scalar_t * match,scalar_t * out){
// matchcost<<<32,512>>>(b,n,m,xyz1,xyz2,match,out);
//}
/* MatchCost forward interface
Input:
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
match: (B, N2, N1)
Output:
cost: (B)
*/
at::Tensor MatchCostForward(
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto cost = at::zeros({b}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "MatchCostForward", ([&] {
matchcost<scalar_t><<<32,512>>>(b, n, m, xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), cost.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return cost;
}
/********************************
* matchcostgrad2 kernel
*********************************/
template<typename scalar_t>
__global__ void matchcostgrad2(int b,int n,int m,const scalar_t * __restrict__ grad_cost,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ grad2){
__shared__ scalar_t sum_grad[256*3];
for (int i=blockIdx.x;i<b;i+=gridDim.x){
int kbeg=m*blockIdx.y/gridDim.y;
int kend=m*(blockIdx.y+1)/gridDim.y;
for (int k=kbeg;k<kend;k++){
scalar_t x2=xyz2[(i*m+k)*3+0];
scalar_t y2=xyz2[(i*m+k)*3+1];
scalar_t z2=xyz2[(i*m+k)*3+2];
scalar_t subsumx=0,subsumy=0,subsumz=0;
for (int j=threadIdx.x;j<n;j+=blockDim.x){
scalar_t x1=x2-xyz1[(i*n+j)*3+0];
scalar_t y1=y2-xyz1[(i*n+j)*3+1];
scalar_t z1=z2-xyz1[(i*n+j)*3+2];
scalar_t d=match[i*n*m+k*n+j]*2;
subsumx+=x1*d;
subsumy+=y1*d;
subsumz+=z1*d;
}
sum_grad[threadIdx.x*3+0]=subsumx;
sum_grad[threadIdx.x*3+1]=subsumy;
sum_grad[threadIdx.x*3+2]=subsumz;
for (int j=1;j<blockDim.x;j<<=1){
__syncthreads();
int j1=threadIdx.x;
int j2=threadIdx.x+j;
if ((j1&j)==0 && j2<blockDim.x){
sum_grad[j1*3+0]+=sum_grad[j2*3+0];
sum_grad[j1*3+1]+=sum_grad[j2*3+1];
sum_grad[j1*3+2]+=sum_grad[j2*3+2];
}
}
if (threadIdx.x==0){
grad2[(i*m+k)*3+0]=sum_grad[0]*grad_cost[i];
grad2[(i*m+k)*3+1]=sum_grad[1]*grad_cost[i];
grad2[(i*m+k)*3+2]=sum_grad[2]*grad_cost[i];
}
__syncthreads();
}
}
}
/********************************
* matchcostgrad1 kernel
*********************************/
template<typename scalar_t>
__global__ void matchcostgrad1(int b,int n,int m,const scalar_t * __restrict__ grad_cost,const scalar_t * __restrict__ xyz1,const scalar_t * __restrict__ xyz2,const scalar_t * __restrict__ match,scalar_t * __restrict__ grad1){
for (int i=blockIdx.x;i<b;i+=gridDim.x){
for (int l=threadIdx.x;l<n;l+=blockDim.x){
scalar_t x1=xyz1[i*n*3+l*3+0];
scalar_t y1=xyz1[i*n*3+l*3+1];
scalar_t z1=xyz1[i*n*3+l*3+2];
scalar_t dx=0,dy=0,dz=0;
for (int k=0;k<m;k++){
scalar_t x2=xyz2[i*m*3+k*3+0];
scalar_t y2=xyz2[i*m*3+k*3+1];
scalar_t z2=xyz2[i*m*3+k*3+2];
scalar_t d=match[i*n*m+k*n+l]*2;
dx+=(x1-x2)*d;
dy+=(y1-y2)*d;
dz+=(z1-z2)*d;
}
grad1[i*n*3+l*3+0]=dx*grad_cost[i];
grad1[i*n*3+l*3+1]=dy*grad_cost[i];
grad1[i*n*3+l*3+2]=dz*grad_cost[i];
}
}
}
//void matchcostgradLauncher(int b,int n,int m,const scalar_t * xyz1,const scalar_t * xyz2,const scalar_t * match,scalar_t * grad1,scalar_t * grad2){
// matchcostgrad1<<<32,512>>>(b,n,m,xyz1,xyz2,match,grad1);
// matchcostgrad2<<<dim3(32,32),256>>>(b,n,m,xyz1,xyz2,match,grad2);
//}
/* MatchCost backward interface
Input:
grad_cost: (B) # gradients on cost
xyz1: (B, N1, 3) # dataset_points
xyz2: (B, N2, 3) # query_points
match: (B, N2, N1)
Output:
grad1: (B, N1, 3)
grad2: (B, N2, 3)
*/
std::vector<at::Tensor> MatchCostBackward(
const at::Tensor grad_cost,
const at::Tensor xyz1,
const at::Tensor xyz2,
const at::Tensor match){
const auto b = xyz1.size(0);
const auto n = xyz1.size(1);
const auto m = xyz2.size(1);
CHECK_EQ(xyz2.size(0), b);
CHECK_EQ(xyz1.size(2), 3);
CHECK_EQ(xyz2.size(2), 3);
CHECK_INPUT(xyz1);
CHECK_INPUT(xyz2);
auto grad1 = at::zeros({b, n, 3}, xyz1.type());
auto grad2 = at::zeros({b, m, 3}, xyz1.type());
AT_DISPATCH_FLOATING_TYPES(xyz1.scalar_type(), "MatchCostBackward", ([&] {
matchcostgrad1<scalar_t><<<32,512>>>(b, n, m, grad_cost.data<scalar_t>(), xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), grad1.data<scalar_t>());
matchcostgrad2<scalar_t><<<dim3(32,32),256>>>(b, n, m, grad_cost.data<scalar_t>(), xyz1.data<scalar_t>(), xyz2.data<scalar_t>(), match.data<scalar_t>(), grad2.data<scalar_t>());
}));
AT_CUDA_CHECK(cudaGetLastError());
return std::vector<at::Tensor>({grad1, grad2});
}
#endif
================================================
FILE: segmentation/extensions/emd/emd.py
================================================
import torch
import emd_cuda


class EarthMoverDistanceFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, xyz1, xyz2):
        xyz1 = xyz1.contiguous()
        xyz2 = xyz2.contiguous()
        assert xyz1.is_cuda and xyz2.is_cuda, "Only support cuda currently."
        match = emd_cuda.approxmatch_forward(xyz1, xyz2)
        cost = emd_cuda.matchcost_forward(xyz1, xyz2, match)
        ctx.save_for_backward(xyz1, xyz2, match)
        return cost

    @staticmethod
    def backward(ctx, grad_cost):
        xyz1, xyz2, match = ctx.saved_tensors
        grad_cost = grad_cost.contiguous()
        grad_xyz1, grad_xyz2 = emd_cuda.matchcost_backward(grad_cost, xyz1, xyz2, match)
        return grad_xyz1, grad_xyz2


class earth_mover_distance(torch.nn.Module):
    """Earth Mover Distance (approximate), averaged over points and batch."""

    def __init__(self):
        super().__init__()

    def forward(self, xyz1, xyz2, transpose=False):
        """Earth Mover Distance (Approx)

        Args:
            xyz1 (torch.Tensor): (b, n1, 3)
            xyz2 (torch.Tensor): (b, n2, 3)
            transpose (bool): whether to transpose inputs as it might be BCN format.
                Extensions only support BNC format.

        Returns:
            cost (torch.Tensor): scalar, the matched cost averaged over points
                and over the batch.
        """
        if transpose:
            xyz1 = xyz1.transpose(1, 2)
            xyz2 = xyz2.transpose(1, 2)
        cost = EarthMoverDistanceFunction.apply(xyz1, xyz2)
        cost = cost / xyz1.size(1)
        return cost.mean()


# Functional variant kept for reference:
# def earth_mover_distance(xyz1, xyz2, transpose=True):
#     """Earth Mover Distance (Approx)
#
#     Args:
#         xyz1 (torch.Tensor): (b, 3, n1)
#         xyz2 (torch.Tensor): (b, 3, n2)
#         transpose (bool): whether to transpose inputs as it might be BCN format.
#             Extensions only support BNC format.
#
#     Returns:
#         cost (torch.Tensor): (b)
#     """
#     if xyz1.dim() == 2:
#         xyz1 = xyz1.unsqueeze(0)
#     if xyz2.dim() == 2:
#         xyz2 = xyz2.unsqueeze(0)
#     if transpose:
#         xyz1 = xyz1.transpose(1, 2)
#         xyz2 = xyz2.transpose(1, 2)
#     cost = EarthMoverDistanceFunction.apply(xyz1, xyz2)
#     return cost
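As a CPU-side reference for what `matchcost_forward` approximates, the sketch below (illustrative only, not part of the repository; the helper name `exact_emd` is made up) brute-forces the minimum-cost perfect matching under the same squared-Euclidean point costs, for equal-size point sets small enough to enumerate:

```python
import itertools


def exact_emd(xyz1, xyz2):
    # Exact EMD cost between two equal-size lists of 3D points, found by
    # enumerating all perfect matchings (O(n!) -- tiny inputs only).
    best = float("inf")
    for perm in itertools.permutations(range(len(xyz2))):
        cost = sum(
            sum((a - b) ** 2 for a, b in zip(xyz1[i], xyz2[j]))
            for i, j in enumerate(perm)
        )
        best = min(best, cost)
    return best


p1 = [[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]
p2 = [[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]
# Optimal matching pairs p1[0] with p2[1] and p1[1] with p2[0]: cost ~ 0.71
print(exact_emd(p1, p2))
```

On the two-point example reused in `test_emd_loss.py`, the CUDA extension should return approximately this value per batch element; the approximate matcher may deviate slightly on larger sets.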
================================================
FILE: segmentation/extensions/emd/setup.py
================================================
"""Setup extension
Notes:
If extra_compile_args is provided, you need to provide different instances for different extensions.
Refer to https://github.com/pytorch/pytorch/issues/20169
"""
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
setup(
name='emd_ext',
ext_modules=[
CUDAExtension(
name='emd_cuda',
sources=[
'cuda/emd.cpp',
'cuda/emd_kernel.cu',
],
extra_compile_args={'cxx': ['-g'], 'nvcc': ['-O2']}
),
],
cmdclass={
'build_ext': BuildExtension
})
================================================
FILE: segmentation/extensions/emd/test_emd_loss.py
================================================
import torch
import numpy as np
import time

from emd import EarthMoverDistanceFunction

# gt
p1 = torch.from_numpy(np.array([[[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]], dtype=np.float32)).cuda()
p1 = p1.repeat(3, 1, 1)
p2 = torch.from_numpy(np.array([[[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]], dtype=np.float32)).cuda()
p2 = p2.repeat(3, 1, 1)
print(p1)
print(p2)
print(p1.shape)
p1.requires_grad = True
p2.requires_grad = True

gt_dist = (((p1[0, 0] - p2[0, 1])**2).sum() + ((p1[0, 1] - p2[0, 0])**2).sum()) / 2 + \
          (((p1[1, 0] - p2[1, 1])**2).sum() + ((p1[1, 1] - p2[1, 0])**2).sum()) * 2 + \
          (((p1[2, 0] - p2[2, 1])**2).sum() + ((p1[2, 1] - p2[2, 0])**2).sum()) / 3
print('gt_dist: ', gt_dist)
gt_dist.backward()
print(p1.grad)
print(p2.grad)

# emd
p1 = torch.from_numpy(np.array([[[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]], dtype=np.float32)).cuda()
p1 = p1.repeat(3, 1, 1)
p2 = torch.from_numpy(np.array([[[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]], dtype=np.float32)).cuda()
p2 = p2.repeat(3, 1, 1)
print(p1)
print(p2)
p1.requires_grad = True
p2.requires_grad = True

# Use the autograd Function directly: it returns the per-batch matched cost
# (shape (b,)), which the weighted loss below indexes into. The
# earth_mover_distance module would reduce it to a scalar mean instead.
d = EarthMoverDistanceFunction.apply(p1, p2)
print(d)

loss = d[0] / 2 + d[1] * 2 + d[2] / 3
print(loss)
loss.backward()
print(p1.grad)
print(p2.grad)
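The ground-truth constants in the script above can be sanity-checked without a GPU. This small sketch (illustrative only, not part of the repository; `sq_dist` is a made-up helper) recomputes the matched cost for one batch element and the weighted combination the test builds across its three identical batch elements:

```python
def sq_dist(a, b):
    # squared Euclidean distance between two 3D points
    return sum((x - y) ** 2 for x, y in zip(a, b))


p1 = [[1.7, -0.1, 0.1], [0.1, 1.2, 0.3]]
p2 = [[0.3, 1.8, 0.2], [1.2, -0.2, 0.3]]

# the test matches p1[0] with p2[1] and p1[1] with p2[0]
cost = sq_dist(p1[0], p2[1]) + sq_dist(p1[1], p2[0])  # ~ 0.71
gt_dist = cost / 2 + cost * 2 + cost / 3              # ~ 2.0117
print(cost, gt_dist)
```

If the CUDA extension is working, `gt_dist` in the test should print a value close to this, and the approximate EMD loss should agree with it to within the matcher's tolerance.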
================================================
FILE: segmentation/logger.py
================================================
import logging
import copy
import os
from collections import defaultdict
from typing import Any
from typing import Optional, List, Dict, NamedTuple, Tuple, Iterable

import torch
import torch.distributed as dist
import torch.nn as nn
from termcolor import colored

logger_initialized = {}
def get_root_logger(log_file=None, log_level=logging.INFO, name='main'):
    """Get root logger and add a keyword filter to it.

    The logger will be initialized if it has not been initialized. By default a
    StreamHandler will be added. If `log_file` is specified, a FileHandler will
    also be added. The name of the root logger is the top-level package name,
    e.g., "mmdet3d".

    Args:
        log_file (str, optional): File path of log. Defaults to None.
        log_level (int, optional): The level of logger.
            Defaults to logging.INFO.
        name (str, optional): The name of the root logger, also used as a
            filter keyword. Defaults to 'main'.

    Returns:
        :obj:`logging.Logger`: The obtained logger
    """
    logger = get_logger(name=name, log_file=log_file, log_level=log_level)
    # add a logging filter
    logging_filter = logging.Filter(name)
    logging_filter.filter = lambda record: record.find(name) != -1
    return logger
def get_logger(name, log_file=None, log_level=logging.INFO, file_mode='w'):
    """Initialize and get a logger by name.

    If the logger has not been initialized, this method will initialize the
    logger by adding one or two handlers, otherwise the initialized logger will
    be directly returned. During initialization, a StreamHandler will always be
    added. If `log_file` is specified and the process rank is 0, a FileHandler
    will also be added.

    Args:
        name (str): Logger name.
        log_file (str | None): The log filename. If specified, a FileHandler
            will be added to the logger.
        log_level (int): The logger level. Note that only the process of
            rank 0 is affected, and other processes will set the level to
            "Error" thus be silent most of the time.
        file_mode (str): The file mode used in opening log file.
            Defaults to 'w'.

    Returns:
        logging.Logger: The expected logger.
    """
    logger = logging.getLogger(name)
    if name in logger_initialized:
        return logger
    # handle hierarchical names
    # e.g., logger "a" is initialized, then logger "a.b" will skip the
    # initialization since it is a child of "a".
    for logger_name in logger_initialized:
        if name.startswith(logger_name):
            return logger

    # handle duplicate logs to the console
    # Starting in 1.8.0, PyTorch DDP attaches a StreamHandler <stderr> (NOTSET)
    # to the root logger. As logger.propagate is True by default, this root
    # level handler causes logging messages from rank>0 processes to
    # unexpectedly show up on the console, creating much unwanted clutter.
    # To fix this issue, we set the root logger's StreamHandler, if any, to log
    # at the ERROR level.
    for handler in logger.root.handlers:
        if type(handler) is logging.StreamHandler:
            handler.setLevel(logging.ERROR)

    stream_handler = logging.StreamHandler()
    handlers = [stream_handler]

    if dist.is_available() and dist.is_initialized():
        rank = dist.get_rank()
    else:
        rank = 0

    # only rank 0 will add a FileHandler
    if rank == 0 and log_file is not None:
        # Here, the default behaviour of the official logger is 'a'. Thus, we
        # provide an interface to change the file mode to the default
        # behaviour.
        file_handler = logging.FileHandler(log_file, file_mode)
        handlers.append(file_handler)

    formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    for handler in handlers:
        handler.setFormatter(formatter)
        handler.setLevel(log_level)
        logger.addHandler(handler)

    if rank == 0:
        logger.setLevel(log_level)
    else:
        logger.setLevel(logging.ERROR)

    logger_initialized[name] = True

    return logger
def print_log(msg, logger=None, level=logging.INFO):
    """Print a log message.

    Args:
        msg (str): The message to be logged.
        logger (logging.Logger | str | None): The logger to be used.
            Some special loggers are:
            - "silent": no message will be printed.
            - other str: the logger obtained with `get_root_logger(logger)`.
            - None: The `print()` method will be used to print log messages.
        level (int): Logging level. Only available when `logger` is a Logger
            object or "root".
    """
    if logger is None:
        print(msg)
    elif isinstance(logger, logging.Logger):
        logger.log(level, msg)
    elif logger == 'silent':
        pass
    elif isinstance(logger, str):
        _logger = get_logger(logger)
        _logger.log(level, msg)
    else:
        raise TypeError(
            'logger should be either a logging.Logger object, str, '
            f'"silent" or None, but got {type(logger)}')
def get_missing_parameters_message(keys: List[str]) -> str:
    """
    Get a logging-friendly message to report parameter names (keys) that are in
    the model but not found in a checkpoint.

    Args:
        keys (list[str]): List of keys that were not found in the checkpoint.

    Returns:
        str: message.
    """
    groups = _group_checkpoint_keys(keys)
    msg = "Some model parameters or buffers are not found in the checkpoint:\n"
    msg += "\n".join(
        " " + colored(k + _group_to_str(v), "blue") for k, v in groups.items()
    )
    return msg


def get_unexpected_parameters_message(keys: List[str]) -> str:
    """
    Get a logging-friendly message to report parameter names (keys) that are in
    the checkpoint but not found in the model.

    Args:
        keys (list[str]): List of keys that were not found in the model.

    Returns:
        str: message.
    """
    groups = _group_checkpoint_keys(keys)
    msg = "The checkpoint state_dict contains keys that are not used by the model:\n"
    msg += "\n".join(
        " " + colored(k + _group_to_str(v), "magenta") for k, v in groups.items()
    )
    return msg
def _strip_prefix_if_present(state_dict: Dict[str, Any], prefix: str) -> None:
    """
    Strip the prefix in metadata, if any.

    Args:
        state_dict (OrderedDict): a state-dict to be loaded to the model.
        prefix (str): prefix.
    """
    keys = sorted(state_dict.keys())
    if not all(len(key) == 0 or key.startswith(prefix) for key in keys):
        return

    for key in keys:
        newkey = key[len(prefix):]
        state_dict[newkey] = state_dict.pop(key)

    # also strip the prefix in metadata, if any..
    try:
        metadata = state_dict._metadata  # pyre-ignore
    except AttributeError:
        pass
    else:
        for key in list(metadata.keys()):
            # for the metadata dict, the key can be:
            # '': for the DDP module, which we want to remove.
            # 'module': for the actual model.
            # 'module.xx.xx': for the rest.
            if len(key) == 0:
                continue
            newkey = key[len(prefix):]
            metadata[newkey] = metadata.pop(key)
def _group_checkpoint_keys(keys: List[str]) -> Dict[str, List[str]]:
    """
    Group keys based on common prefixes. A prefix is the string up to the final
    "." in each key.

    Args:
        keys (list[str]): list of parameter names, i.e. keys in the model
            checkpoint dict.

    Returns:
        dict[list]: keys with common prefixes are grouped into lists.
    """
    groups = defaultdict(list)
    for key in keys:
        pos = key.rfind(".")
        if pos >= 0:
            head, tail = key[:pos], [key[pos + 1:]]
        else:
            head, tail = key, []
        groups[head].extend(tail)
    return groups
def _group_to_str(group: List[str]) -> str:
    """
    Format a group of parameter name suffixes into a loggable string.

    Args:
        group (list[str]): list of parameter name suffixes.

    Returns:
        str: formatted string.
    """
    if len(group) == 0:
        return ""
    if len(group) == 1:
        return "." + group[0]
    return ".{" + ", ".join(group) + "}"


def _named_modules_with_dup(
    model: nn.Module, prefix: str = ""
) -> Iterable[Tuple[str, nn.Module]]:
    """
    The same as `model.named_modules()`, except that it includes
    duplicated modules that have more than one name.
    """
    yield prefix, model
    for name, module in model._modules.items():
        if module is None:
            continue
        submodule_prefix = prefix + ("." if prefix else "") + name
        yield from _named_modules_with_dup(module, submodule_prefix)

================================================
DIRECTORY STRUCTURE
================================================
gitextract_eruklvu8/
├── DATASET.md
├── LICENSE
├── README.md
├── cfgs/
│ ├── PointGPT-B/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ ├── post_pretrain.yaml
│ │ └── pretrain.yaml
│ ├── PointGPT-L/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ ├── post_pretrain.yaml
│ │ └── pretrain.yaml
│ ├── PointGPT-S/
│ │ ├── fewshot.yaml
│ │ ├── finetune_modelnet.yaml
│ │ ├── finetune_modelnet_8k.yaml
│ │ ├── finetune_scan_hardest.yaml
│ │ ├── finetune_scan_objbg.yaml
│ │ ├── finetune_scan_objonly.yaml
│ │ └── pretrain.yaml
│ └── dataset_configs/
│ ├── LabeledHybrid.yaml
│ ├── ModelNet40.yaml
│ ├── ModelNet40FewShot.yaml
│ ├── ScanObjectNN_hardest.yaml
│ ├── ScanObjectNN_objectbg.yaml
│ ├── ScanObjectNN_objectonly.yaml
│ ├── ShapeNet-55.yaml
│ └── UnlabeledHybrid.yaml
├── datasets/
│ ├── LabeledHybrid.py
│ ├── ModelNetDataset.py
│ ├── ModelNetDatasetFewShot.py
│ ├── ScanObjectNNDataset.py
│ ├── ShapeNet55Dataset.py
│ ├── UnlabeledHybrid.py
│ ├── __init__.py
│ ├── build.py
│ ├── data_transforms.py
│ ├── generate_few_shot_data.py
│ └── io.py
├── extensions/
│ ├── chamfer_dist/
│ │ ├── __init__.py
│ │ ├── chamfer.cu
│ │ ├── chamfer_cuda.cpp
│ │ ├── setup.py
│ │ └── test.py
│ └── emd/
│ ├── README.md
│ ├── __init__.py
│ ├── cuda/
│ │ ├── emd.cpp
│ │ └── emd_kernel.cu
│ ├── emd.py
│ ├── setup.py
│ └── test_emd_loss.py
├── figures/
│ └── a
├── main.py
├── main_vis.py
├── models/
│ ├── GPT.py
│ ├── PointGPT.py
│ ├── __init__.py
│ ├── build.py
│ └── z_order.py
├── requirements.txt
├── segmentation/
│ ├── __init__.py
│ ├── dataset.py
│ ├── extensions/
│ │ ├── chamfer_dist/
│ │ │ ├── __init__.py
│ │ │ ├── chamfer.cu
│ │ │ ├── chamfer_cuda.cpp
│ │ │ ├── setup.py
│ │ │ └── test.py
│ │ └── emd/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── cuda/
│ │ │ ├── emd.cpp
│ │ │ └── emd_kernel.cu
│ │ ├── emd.py
│ │ ├── setup.py
│ │ └── test_emd_loss.py
│ ├── logger.py
│ ├── main.py
│ ├── misc.py
│ ├── models/
│ │ ├── gpt2_seg.py
│ │ ├── pointnet2_utils.py
│ │ ├── pt.py
│ │ └── z_order.py
│ ├── pointnet_util.py
│ └── provider.py
├── tools/
│ ├── __init__.py
│ ├── builder.py
│ ├── runner.py
│ ├── runner_finetune.py
│ └── runner_pretrain.py
└── utils/
├── AverageMeter.py
├── checkpoint.py
├── config.py
├── dist_utils.py
├── logger.py
├── misc.py
├── parser.py
└── registry.py
SYMBOL INDEX (392 symbols across 48 files)
FILE: datasets/LabeledHybrid.py
class LabeledHybrid (line 10) | class LabeledHybrid(data.Dataset):
method __init__ (line 11) | def __init__(self, config):
method pc_norm (line 40) | def pc_norm(self, pc):
method random_sample (line 49) | def random_sample(self, pc, num):
method __getitem__ (line 55) | def __getitem__(self, idx):
method __len__ (line 66) | def __len__(self):
FILE: datasets/ModelNetDataset.py
function pc_normalize (line 20) | def pc_normalize(pc):
function farthest_point_sample (line 29) | def farthest_point_sample(point, npoint):
class ModelNet (line 53) | class ModelNet(Dataset):
method __init__ (line 54) | def __init__(self, config):
method __len__ (line 118) | def __len__(self):
method _get_item (line 121) | def _get_item(self, index):
method __getitem__ (line 142) | def __getitem__(self, index):
FILE: datasets/ModelNetDatasetFewShot.py
function pc_normalize (line 21) | def pc_normalize(pc):
class ModelNetFewShot (line 29) | class ModelNetFewShot(Dataset):
method __init__ (line 30) | def __init__(self, config):
method __len__ (line 56) | def __len__(self):
method __getitem__ (line 59) | def __getitem__(self, index):
FILE: datasets/ScanObjectNNDataset.py
class ScanObjectNN (line 12) | class ScanObjectNN(Dataset):
method __init__ (line 13) | def __init__(self, config, **kwargs):
method __getitem__ (line 33) | def __getitem__(self, idx):
method __len__ (line 46) | def __len__(self):
class ScanObjectNN_hardest (line 52) | class ScanObjectNN_hardest(Dataset):
method __init__ (line 53) | def __init__(self, config, **kwargs):
method __getitem__ (line 73) | def __getitem__(self, idx):
method __len__ (line 86) | def __len__(self):
FILE: datasets/ShapeNet55Dataset.py
class ShapeNet (line 11) | class ShapeNet(data.Dataset):
method __init__ (line 12) | def __init__(self, config):
method pc_norm (line 52) | def pc_norm(self, pc):
method random_sample (line 60) | def random_sample(self, pc, num):
method __getitem__ (line 65) | def __getitem__(self, idx):
method __len__ (line 76) | def __len__(self):
FILE: datasets/UnlabeledHybrid.py
class UnlabeledHybrid (line 11) | class UnlabeledHybrid(data.Dataset):
method __init__ (line 12) | def __init__(self, config):
method pc_norm (line 52) | def pc_norm(self, pc):
method random_sample (line 60) | def random_sample(self, pc, num):
method __getitem__ (line 66) | def __getitem__(self, idx):
method __len__ (line 78) | def __len__(self):
FILE: datasets/build.py
function build_dataset_from_cfg (line 7) | def build_dataset_from_cfg(cfg, default_args = None):
FILE: datasets/data_transforms.py
class PointcloudRotate (line 6) | class PointcloudRotate(object):
method __call__ (line 7) | def __call__(self, pc):
class PointcloudScaleAndTranslate (line 20) | class PointcloudScaleAndTranslate(object):
method __init__ (line 21) | def __init__(self, scale_low=2. / 3., scale_high=3. / 2., translate_ra...
method __call__ (line 26) | def __call__(self, pc):
class PointcloudJitter (line 36) | class PointcloudJitter(object):
method __init__ (line 37) | def __init__(self, std=0.01, clip=0.05):
method __call__ (line 40) | def __call__(self, pc):
class PointcloudScale (line 50) | class PointcloudScale(object):
method __init__ (line 51) | def __init__(self, scale_low=2. / 3., scale_high=3. / 2.):
method __call__ (line 55) | def __call__(self, pc):
class PointcloudTranslate (line 64) | class PointcloudTranslate(object):
method __init__ (line 65) | def __init__(self, translate_range=0.2):
method __call__ (line 68) | def __call__(self, pc):
class PointcloudRandomInputDropout (line 78) | class PointcloudRandomInputDropout(object):
method __init__ (line 79) | def __init__(self, max_dropout_ratio=0.5):
method __call__ (line 83) | def __call__(self, pc):
class RandomHorizontalFlip (line 95) | class RandomHorizontalFlip(object):
method __init__ (line 98) | def __init__(self, upright_axis = 'z', is_temporal=False):
method __call__ (line 109) | def __call__(self, coords):
FILE: datasets/generate_few_shot_data.py
function generate_fewshot_data (line 20) | def generate_fewshot_data(way, shot, prefix_ind, eval_sample=20):
FILE: datasets/io.py
class IO (line 6) | class IO:
method get (line 8) | def get(cls, file_path):
method _read_npy (line 24) | def _read_npy(cls, file_path):
method _read_txt (line 36) | def _read_txt(cls, file_path):
method _read_h5 (line 40) | def _read_h5(cls, file_path):
FILE: extensions/chamfer_dist/__init__.py
class ChamferFunction (line 13) | class ChamferFunction(torch.autograd.Function):
method forward (line 15) | def forward(ctx, xyz1, xyz2):
method backward (line 22) | def backward(ctx, grad_dist1, grad_dist2):
class ChamferDistanceL2 (line 28) | class ChamferDistanceL2(torch.nn.Module):
method __init__ (line 31) | def __init__(self, ignore_zeros=False):
method forward (line 35) | def forward(self, xyz1, xyz2):
class ChamferDistanceL2_split (line 46) | class ChamferDistanceL2_split(torch.nn.Module):
method __init__ (line 49) | def __init__(self, ignore_zeros=False):
method forward (line 53) | def forward(self, xyz1, xyz2):
class ChamferDistanceL1 (line 64) | class ChamferDistanceL1(torch.nn.Module):
method __init__ (line 67) | def __init__(self, ignore_zeros=False):
method forward (line 71) | def forward(self, xyz1, xyz2):
FILE: extensions/chamfer_dist/chamfer_cuda.cpp
function chamfer_forward (line 22) | std::vector<torch::Tensor> chamfer_forward(torch::Tensor xyz1,
function chamfer_backward (line 27) | std::vector<torch::Tensor> chamfer_backward(torch::Tensor xyz1,
function PYBIND11_MODULE (line 36) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: extensions/chamfer_dist/test.py
class ChamferDistanceTestCase (line 23) | class ChamferDistanceTestCase(unittest.TestCase):
method test_chamfer_dist (line 24) | def test_chamfer_dist(self):
FILE: extensions/emd/cuda/emd.cpp
function PYBIND11_MODULE (line 23) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: extensions/emd/emd.py
class EarthMoverDistanceFunction (line 5) | class EarthMoverDistanceFunction(torch.autograd.Function):
method forward (line 7) | def forward(ctx, xyz1, xyz2):
method backward (line 17) | def backward(ctx, grad_cost):
class earth_mover_distance (line 26) | class earth_mover_distance(torch.nn.Module):
method __init__ (line 29) | def __init__(self):
method forward (line 32) | def forward(self, xyz1, xyz2, transpose=False):
FILE: main.py
function main (line 14) | def main():
FILE: main_vis.py
function main (line 11) | def main():
FILE: models/GPT.py
class Block (line 8) | class Block(nn.Module):
method __init__ (line 9) | def __init__(self, embed_dim, num_heads):
method forward (line 20) | def forward(self, x, attn_mask):
class GPT_extractor (line 31) | class GPT_extractor(nn.Module):
method __init__ (line 32) | def __init__(
method forward (line 70) | def forward(self, h, pos, attn_mask, classify=False):
class GPT_generator (line 104) | class GPT_generator(nn.Module):
method __init__ (line 105) | def __init__(
method forward (line 127) | def forward(self, h, pos, attn_mask):
FILE: models/PointGPT.py
class Encoder_large (line 18) | class Encoder_large(nn.Module): # Embedding module
method __init__ (line 19) | def __init__(self, encoder_channel):
method forward (line 38) | def forward(self, point_groups):
class Encoder_small (line 55) | class Encoder_small(nn.Module): # Embedding module
method __init__ (line 56) | def __init__(self, encoder_channel):
method forward (line 72) | def forward(self, point_groups):
class Group (line 90) | class Group(nn.Module):
method __init__ (line 91) | def __init__(self, num_group, group_size):
method simplied_morton_sorting (line 98) | def simplied_morton_sorting(self, xyz, center):
method morton_sorting (line 130) | def morton_sorting(self, xyz, center):
method forward (line 148) | def forward(self, xyz):
class Mlp (line 188) | class Mlp(nn.Module):
method __init__ (line 189) | def __init__(self, in_features, hidden_features=None, out_features=Non...
method forward (line 198) | def forward(self, x):
class Attention (line 207) | class Attention(nn.Module):
method __init__ (line 208) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
method forward (line 218) | def forward(self, x):
class Block (line 235) | class Block(nn.Module):
method __init__ (line 236) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
method forward (line 251) | def forward(self, x):
class PositionEmbeddingCoordsSine (line 257) | class PositionEmbeddingCoordsSine(nn.Module):
method __init__ (line 268) | def __init__(self, n_dim: int = 1, d_model: int = 256, temperature=100...
method forward (line 280) | def forward(self, xyz: torch.Tensor) -> torch.Tensor:
class GPT_Transformer (line 307) | class GPT_Transformer(nn.Module):
method __init__ (line 308) | def __init__(self, config, **kwargs):
method _init_weights (line 360) | def _init_weights(self, m):
method forward (line 373) | def forward(self, neighborhood, center, noaug=False, classify=False):
class PointGPT (line 442) | class PointGPT(nn.Module):
method __init__ (line 443) | def __init__(self, config):
method build_loss_func (line 463) | def build_loss_func(self, loss_type):
method forward (line 475) | def forward(self, pts, vis=False, **kwargs):
class PointTransformer (line 503) | class PointTransformer(nn.Module):
method __init__ (line 504) | def __init__(self, config, **kwargs):
method build_loss_func (line 561) | def build_loss_func(self, loss_type='cdl12'):
method get_loss_acc (line 574) | def get_loss_acc(self, ret, gt):
method load_model_from_ckpt (line 580) | def load_model_from_ckpt(self, bert_ckpt_path):
method _init_weights (line 618) | def _init_weights(self, m):
method forward (line 631) | def forward(self, pts):
FILE: models/build.py
function build_model_from_cfg (line 7) | def build_model_from_cfg(cfg, **kwargs):
FILE: models/z_order.py
function round_to_int_32 (line 4) | def round_to_int_32(data):
function split_by_3 (line 23) | def split_by_3(x):
function get_z_order (line 50) | def get_z_order(x, y, z):
function get_z_values (line 72) | def get_z_values(data):
FILE: segmentation/dataset.py
class ModelNetDataLoader (line 9) | class ModelNetDataLoader(Dataset):
method __init__ (line 10) | def __init__(self, root, npoint=1024, split='train', uniform=False, no...
method __len__ (line 36) | def __len__(self):
method _get_item (line 39) | def _get_item(self, index):
method __getitem__ (line 62) | def __getitem__(self, index):
class PartNormalDataset (line 66) | class PartNormalDataset(Dataset):
method __init__ (line 67) | def __init__(self, root='/data/cgy/ShapenetPart/shapenetcore_partanno_...
method __getitem__ (line 138) | def __getitem__(self, index):
method __len__ (line 163) | def __len__(self):
FILE: segmentation/extensions/chamfer_dist/__init__.py
class ChamferFunction (line 13) | class ChamferFunction(torch.autograd.Function):
method forward (line 15) | def forward(ctx, xyz1, xyz2):
method backward (line 22) | def backward(ctx, grad_dist1, grad_dist2):
class ChamferDistanceL2 (line 28) | class ChamferDistanceL2(torch.nn.Module):
method __init__ (line 31) | def __init__(self, ignore_zeros=False):
method forward (line 35) | def forward(self, xyz1, xyz2):
class ChamferDistanceL2_split (line 46) | class ChamferDistanceL2_split(torch.nn.Module):
method __init__ (line 49) | def __init__(self, ignore_zeros=False):
method forward (line 53) | def forward(self, xyz1, xyz2):
class ChamferDistanceL1 (line 64) | class ChamferDistanceL1(torch.nn.Module):
method __init__ (line 67) | def __init__(self, ignore_zeros=False):
method forward (line 71) | def forward(self, xyz1, xyz2):
FILE: segmentation/extensions/chamfer_dist/chamfer_cuda.cpp
function chamfer_forward (line 22) | std::vector<torch::Tensor> chamfer_forward(torch::Tensor xyz1,
function chamfer_backward (line 27) | std::vector<torch::Tensor> chamfer_backward(torch::Tensor xyz1,
function PYBIND11_MODULE (line 36) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: segmentation/extensions/chamfer_dist/test.py
class ChamferDistanceTestCase (line 23) | class ChamferDistanceTestCase(unittest.TestCase):
method test_chamfer_dist (line 24) | def test_chamfer_dist(self):
FILE: segmentation/extensions/emd/cuda/emd.cpp
function PYBIND11_MODULE (line 23) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: segmentation/extensions/emd/emd.py
class EarthMoverDistanceFunction (line 5) | class EarthMoverDistanceFunction(torch.autograd.Function):
method forward (line 7) | def forward(ctx, xyz1, xyz2):
method backward (line 17) | def backward(ctx, grad_cost):
class earth_mover_distance (line 26) | class earth_mover_distance(torch.nn.Module):
method __init__ (line 29) | def __init__(self):
method forward (line 32) | def forward(self, xyz1, xyz2, transpose=False):
FILE: segmentation/logger.py
function get_root_logger (line 18) | def get_root_logger(log_file=None, log_level=logging.INFO, name='main'):
function get_logger (line 41) | def get_logger(name, log_file=None, log_level=logging.INFO, file_mode='w'):
function print_log (line 115) | def print_log(msg, logger=None, level=logging.INFO):
function get_missing_parameters_message (line 141) | def get_missing_parameters_message(keys: List[str]) -> str:
function get_unexpected_parameters_message (line 158) | def get_unexpected_parameters_message(keys: List[str]) -> str:
function _strip_prefix_if_present (line 175) | def _strip_prefix_if_present(state_dict: Dict[str, Any], prefix: str) ->...
function _group_checkpoint_keys (line 208) | def _group_checkpoint_keys(keys: List[str]) -> Dict[str, List[str]]:
function _group_to_str (line 229) | def _group_to_str(group: List[str]) -> str:
function _named_modules_with_dup (line 246) | def _named_modules_with_dup(
FILE: segmentation/main.py
function inplace_relu (line 35) | def inplace_relu(m):
function to_categorical (line 41) | def to_categorical(y, num_classes):
function parse_args (line 49) | def parse_args():
function get_model_loss (line 79) | def get_model_loss(MODEL, args, num_part):
function main (line 97) | def main(args):
FILE: segmentation/misc.py
function fps (line 13) | def fps(data, number):
function worker_init_fn (line 23) | def worker_init_fn(worker_id):
function build_lambda_sche (line 27) | def build_lambda_sche(opti, config):
function build_lambda_bnsche (line 36) | def build_lambda_bnsche(model, config):
function set_random_seed (line 45) | def set_random_seed(seed, deterministic=False):
function is_seq_of (line 72) | def is_seq_of(seq, expected_type, seq_type=None):
function set_bn_momentum_default (line 94) | def set_bn_momentum_default(bn_momentum):
class BNMomentumScheduler (line 102) | class BNMomentumScheduler(object):
method __init__ (line 104) | def __init__(
method step (line 122) | def step(self, epoch=None):
method get_momentum (line 129) | def get_momentum(self, epoch=None):
function seprate_point_cloud (line 135) | def seprate_point_cloud(xyz, num_points, crop, fixed_points=None, paddin...
function get_ptcloud_img (line 191) | def get_ptcloud_img(ptcloud):
function visualize_KITTI (line 211) | def visualize_KITTI(path, data_list, titles=['input', 'pred'], cmap=['bw...
function random_dropping (line 241) | def random_dropping(pc, e):
function random_scale (line 251) | def random_scale(partial, scale_range=[0.8, 1.2]):
FILE: segmentation/models/gpt2_seg.py
class Block (line 7) | class Block(nn.Module):
method __init__ (line 8) | def __init__(self, embed_dim, num_heads, drop_path):
method forward (line 21) | def forward(self, x):
class GPT_extractor (line 35) | class GPT_extractor(nn.Module):
method __init__ (line 36) | def __init__(
method forward (line 62) | def forward(self, h, pos, classify=False):
class GPT_generator (line 95) | class GPT_generator(nn.Module):
method __init__ (line 96) | def __init__(
method forward (line 123) | def forward(self, h, pos):
FILE: segmentation/models/pointnet2_utils.py
function timeit (line 7) | def timeit(tag, t):
function pc_normalize (line 11) | def pc_normalize(pc):
function square_distance (line 19) | def square_distance(src, dst):
function index_points (line 41) | def index_points(points, idx):
function farthest_point_sample (line 60) | def farthest_point_sample(xyz, npoint):
function query_ball_point (line 84) | def query_ball_point(radius, nsample, xyz, new_xyz):
function sample_and_group (line 107) | def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=Fal...
function sample_and_group_all (line 138) | def sample_and_group_all(xyz, points):
class PointNetSetAbstraction (line 158) | class PointNetSetAbstraction(nn.Module):
method __init__ (line 159) | def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all):
method forward (line 173) | def forward(self, xyz, points):
class PointNetSetAbstractionMsg (line 202) | class PointNetSetAbstractionMsg(nn.Module):
method __init__ (line 203) | def __init__(self, npoint, radius_list, nsample_list, in_channel, mlp_...
method forward (line 221) | def forward(self, xyz, points):
class PointNetFeaturePropagation (line 262) | class PointNetFeaturePropagation(nn.Module):
method __init__ (line 263) | def __init__(self, in_channel, mlp):
method forward (line 273) | def forward(self, xyz1, xyz2, points1, points2):
FILE: segmentation/models/pt.py
function fps (line 17) | def fps(data, number):
class Group (line 28) | class Group(nn.Module):
method __init__ (line 29) | def __init__(self, num_group, group_size):
method simplied_morton_sorting (line 36) | def simplied_morton_sorting(self, xyz, center):
method morton_sorting (line 65) | def morton_sorting(self, xyz, center):
method forward (line 83) | def forward(self, xyz):
class Encoder_small (line 121) | class Encoder_small(nn.Module):
method __init__ (line 122) | def __init__(self, encoder_channel):
method forward (line 138) | def forward(self, point_groups):
class Encoder_large (line 154) | class Encoder_large(nn.Module): # Embedding module
method __init__ (line 155) | def __init__(self, encoder_channel):
method forward (line 174) | def forward(self, point_groups):
class Mlp (line 191) | class Mlp(nn.Module):
method __init__ (line 192) | def __init__(self, in_features, hidden_features=None, out_features=Non...
method forward (line 201) | def forward(self, x):
class Attention (line 210) | class Attention(nn.Module):
method __init__ (line 211) | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, at...
method forward (line 223) | def forward(self, x):
class Block (line 240) | class Block(nn.Module):
method __init__ (line 241) | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_sc...
method forward (line 255) | def forward(self, x):
class TransformerEncoder (line 261) | class TransformerEncoder(nn.Module):
method __init__ (line 265) | def __init__(self, embed_dim=768, depth=4, num_heads=12, mlp_ratio=4.,...
method forward (line 278) | def forward(self, x, pos):
class PositionEmbeddingCoordsSine (line 288) | class PositionEmbeddingCoordsSine(nn.Module):
method __init__ (line 299) | def __init__(self, n_dim: int = 1, d_model: int = 256, temperature=100...
method forward (line 311) | def forward(self, xyz: torch.Tensor) -> torch.Tensor:
class get_model (line 338) | class get_model(nn.Module):
method __init__ (line 339) | def __init__(self, cls_dim, trans_dim=384, depth=12, drop_path_rate=0....
method get_loss_acc (line 416) | def get_loss_acc(self, ret, gt):
method load_model_from_ckpt (line 422) | def load_model_from_ckpt(self, bert_ckpt_path):
method forward (line 454) | def forward(self, pts, cls_label):
class get_loss (line 522) | class get_loss(nn.Module):
method __init__ (line 523) | def __init__(self):
method forward (line 526) | def forward(self, pred, target):
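`PositionEmbeddingCoordsSine` in `pt.py` maps raw xyz coordinates to a `d_model`-dimensional vector using sine/cosine functions at geometrically spaced frequencies, so the transformer blocks receive continuous positional information. A NumPy sketch of the usual per-axis construction (channel split and parameter defaults are assumptions, not the repo's exact layout):

```python
import numpy as np

def sine_coord_embedding(xyz, d_model=96, temperature=10000.0):
    """Map (N, 3) coordinates to (N, d_model) sine/cosine embeddings.

    Each of the 3 axes gets d_model // 3 channels, alternating sin/cos
    over geometrically increasing wavelengths (transformer-style
    positional encoding applied per coordinate axis).
    """
    n_dim = xyz.shape[1]
    d_axis = d_model // n_dim                  # channels per axis (assumed divisible)
    freq = np.arange(d_axis // 2)
    inv_wavelength = 1.0 / (temperature ** (2 * freq / d_axis))
    out = []
    for axis in range(n_dim):
        angles = xyz[:, axis:axis + 1] * inv_wavelength[None, :]   # (N, d_axis // 2)
        out.append(np.sin(angles))
        out.append(np.cos(angles))
    return np.concatenate(out, axis=1)         # (N, d_model)
```

At the origin every sine channel is 0 and every cosine channel is 1, which is a quick sanity check on the layout.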
FILE: segmentation/models/z_order.py
function round_to_int_32 (line 4) | def round_to_int_32(data):
function split_by_3 (line 23) | def split_by_3(x):
function get_z_order (line 50) | def get_z_order(x, y, z):
function get_z_values (line 72) | def get_z_values(data):
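`z_order.py` builds Morton (Z-order) keys by quantizing coordinates to integers and interleaving their bits: `split_by_3` spreads each coordinate's bits three positions apart so the three axes can be OR-ed together. The file does this with bitwise magic constants; a loop-based pure-Python sketch computes the same mapping for 10-bit inputs:

```python
def split_by_3(x):
    """Spread the low 10 bits of x so consecutive bits land 3 apart."""
    out = 0
    for i in range(10):
        out |= ((x >> i) & 1) << (3 * i)
    return out

def morton_code(x, y, z):
    """Interleave three 10-bit integers into one 30-bit Z-order key."""
    return split_by_3(x) | (split_by_3(y) << 1) | (split_by_3(z) << 2)
```

Sorting points by this key gives the locality-preserving ordering used by the `morton_sorting` methods above.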
FILE: segmentation/pointnet_util.py
function timeit (line 11) | def timeit(tag, t):
function pc_normalize (line 15) | def pc_normalize(pc):
function square_distance (line 22) | def square_distance(src, dst):
function index_points (line 39) | def index_points(points, idx):
function farthest_point_sample (line 53) | def farthest_point_sample(xyz, npoint):
function query_ball_point (line 76) | def query_ball_point(radius, nsample, xyz, new_xyz):
function sample_and_group (line 99) | def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=Fal...
function sample_and_group_all (line 139) | def sample_and_group_all(xyz, points):
class PointNetSetAbstraction (line 159) | class PointNetSetAbstraction(nn.Module):
method __init__ (line 160) | def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all...
method forward (line 175) | def forward(self, xyz, points):
class PointNetSetAbstractionMsg (line 199) | class PointNetSetAbstractionMsg(nn.Module):
method __init__ (line 200) | def __init__(self, npoint, radius_list, nsample_list, in_channel, mlp_...
method forward (line 219) | def forward(self, xyz, points, seed_idx=None):
class PointNetFeaturePropagation (line 261) | class PointNetFeaturePropagation(nn.Module):
method __init__ (line 262) | def __init__(self, in_channel, mlp):
method forward (line 272) | def forward(self, xyz1, xyz2, points1, points2):
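Two workhorses of `pointnet_util.py` are `square_distance`, which expands pairwise distances as `||s - d||^2 = ||s||^2 + ||d||^2 - 2 s·d` to avoid materializing difference tensors, and `farthest_point_sample`, which greedily picks the point maximizing the minimum distance to the already-chosen set. A NumPy sketch (the repo's versions are batched torch code; the deterministic start point here is a simplification):

```python
import numpy as np

def square_distance(src, dst):
    """Pairwise squared distances via ||s-d||^2 = ||s||^2 + ||d||^2 - 2 s.d."""
    return (src ** 2).sum(-1)[:, None] + (dst ** 2).sum(-1)[None, :] - 2 * src @ dst.T

def farthest_point_sample(xyz, npoint):
    """Greedy FPS: repeatedly take the point farthest from the chosen set."""
    n = xyz.shape[0]
    chosen = np.zeros(npoint, dtype=np.int64)
    min_d2 = np.full(n, np.inf)      # distance of each point to the chosen set
    idx = 0                          # fixed start (the repo uses a random start)
    for i in range(npoint):
        chosen[i] = idx
        d2 = ((xyz - xyz[idx]) ** 2).sum(-1)
        min_d2 = np.minimum(min_d2, d2)
        idx = int(min_d2.argmax())   # next: farthest from all chosen points
    return chosen
```

FPS produces the group centers that `query_ball_point` / `sample_and_group` then build local neighborhoods around.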
FILE: segmentation/provider.py
function normalize_data (line 3) | def normalize_data(batch_data):
function shuffle_data (line 22) | def shuffle_data(data, labels):
function shuffle_points (line 34) | def shuffle_points(batch_data):
function rotate_point_cloud (line 46) | def rotate_point_cloud(batch_data):
function rotate_point_cloud_z (line 66) | def rotate_point_cloud_z(batch_data):
function rotate_point_cloud_with_normal (line 86) | def rotate_point_cloud_with_normal(batch_xyz_normal):
function rotate_perturbation_point_cloud_with_normal (line 106) | def rotate_perturbation_point_cloud_with_normal(batch_data, angle_sigma=...
function rotate_point_cloud_by_angle (line 133) | def rotate_point_cloud_by_angle(batch_data, rotation_angle):
function rotate_point_cloud_by_angle_with_normal (line 152) | def rotate_point_cloud_by_angle_with_normal(batch_data, rotation_angle):
function rotate_perturbation_point_cloud (line 176) | def rotate_perturbation_point_cloud(batch_data, angle_sigma=0.06, angle_...
function jitter_point_cloud (line 201) | def jitter_point_cloud(batch_data, sigma=0.01, clip=0.05):
function shift_point_cloud (line 214) | def shift_point_cloud(batch_data, shift_range=0.1):
function random_scale_point_cloud (line 228) | def random_scale_point_cloud(batch_data, scale_low=0.8, scale_high=1.25):
function random_point_dropout (line 241) | def random_point_dropout(batch_pc, max_dropout_ratio=0.875):
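`provider.py` collects the standard batch point-cloud augmentations: axis rotations, clipped Gaussian jitter, per-cloud shifting and scaling, and random point dropout. A NumPy sketch of the jitter and per-cloud scale steps, matching the default hyperparameters shown in the signatures above:

```python
import numpy as np

def jitter_point_cloud(batch, sigma=0.01, clip=0.05):
    """Add clipped Gaussian noise to every point; batch is (B, N, 3)."""
    noise = np.clip(sigma * np.random.randn(*batch.shape), -clip, clip)
    return batch + noise

def random_scale_point_cloud(batch, scale_low=0.8, scale_high=1.25):
    """Scale each cloud in the batch by its own uniform random factor."""
    scales = np.random.uniform(scale_low, scale_high, size=(batch.shape[0], 1, 1))
    return batch * scales
```

Both operations preserve the batch shape; the clip bound keeps jitter from creating outlier points.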
FILE: tools/builder.py
function dataset_builder (line 16) | def dataset_builder(args, config):
function model_builder (line 39) | def model_builder(config):
function build_opti_sche (line 44) | def build_opti_sche(base_model, config):
function resume_model (line 103) | def resume_model(base_model, args, logger=None):
function resume_optimizer (line 133) | def resume_optimizer(optimizer, args, logger=None):
function save_checkpoint (line 147) | def save_checkpoint(base_model, optimizer, epoch, metrics, best_metrics,...
function load_model (line 160) | def load_model(base_model, ckpt_path, logger=None):
FILE: tools/runner.py
function test_net (line 14) | def test_net(args, config):
function test (line 34) | def test(base_model, test_dataloader, args, config, logger=None):
FILE: tools/runner_finetune.py
class Acc_Metric (line 37) | class Acc_Metric:
method __init__ (line 38) | def __init__(self, acc=0.):
method better_than (line 46) | def better_than(self, other):
method state_dict (line 52) | def state_dict(self):
function run_net (line 58) | def run_net(args, config, train_writer=None, val_writer=None):
function validate (line 249) | def validate(base_model, test_dataloader, epoch, val_writer, args, confi...
function validate_vote (line 293) | def validate_vote(base_model, test_dataloader, epoch, val_writer, args, ...
function test_net (line 361) | def test_net(args, config):
function test (line 380) | def test(base_model, test_dataloader, args, config, logger=None):
function test_vote (line 429) | def test_vote(base_model, test_dataloader, epoch, val_writer, args, conf...
FILE: tools/runner_pretrain.py
class Acc_Metric (line 31) | class Acc_Metric:
method __init__ (line 32) | def __init__(self, acc=0.):
method better_than (line 38) | def better_than(self, other):
method state_dict (line 44) | def state_dict(self):
function evaluate_svm (line 50) | def evaluate_svm(train_features, train_labels, test_features, test_labels):
function run_net (line 57) | def run_net(args, config, train_writer=None, val_writer=None):
function validate (line 209) | def validate(base_model, extra_train_dataloader, test_dataloader, epoch,...
function test_net (line 272) | def test_net():
FILE: utils/AverageMeter.py
class AverageMeter (line 2) | class AverageMeter(object):
method __init__ (line 3) | def __init__(self, items=None):
method reset (line 8) | def reset(self):
method update (line 13) | def update(self, values):
method val (line 24) | def val(self, idx=None):
method count (line 30) | def count(self, idx=None):
method avg (line 36) | def avg(self, idx=None):
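`AverageMeter` tracks running sums and counts so training loops can report means over an epoch; the repo's version handles several metrics at once via the `items`/`idx` machinery listed above. A single-metric sketch of the pattern:

```python
class AverageMeter:
    """Track a running sum and count, exposing the mean."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.sum = 0.0
        self.count = 0

    def update(self, value, n=1):
        self.sum += value * n
        self.count += n

    @property
    def avg(self):
        return self.sum / self.count if self.count else 0.0
```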
FILE: utils/checkpoint.py
function get_missing_parameters_message (line 16) | def get_missing_parameters_message(keys: List[str]) -> str:
function get_unexpected_parameters_message (line 33) | def get_unexpected_parameters_message(keys: List[str]) -> str:
function _strip_prefix_if_present (line 50) | def _strip_prefix_if_present(state_dict: Dict[str, Any], prefix: str) ->...
function _group_checkpoint_keys (line 83) | def _group_checkpoint_keys(keys: List[str]) -> Dict[str, List[str]]:
function _group_to_str (line 104) | def _group_to_str(group: List[str]) -> str:
function _named_modules_with_dup (line 121) | def _named_modules_with_dup(
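`_strip_prefix_if_present` in `utils/checkpoint.py` removes a common key prefix — typically the `"module."` that `DataParallel`/`DistributedDataParallel` prepends — so a saved state dict loads into a bare model. A sketch of the idea (the repo's version, derived from fvcore, mutates the dict in place; this one returns a copy):

```python
def strip_prefix_if_present(state_dict, prefix):
    """Return a copy of state_dict with `prefix` removed from every key.

    The prefix is stripped only if ALL keys carry it; a mixed dict is
    returned unchanged.
    """
    keys = list(state_dict.keys())
    if not keys or not all(k.startswith(prefix) for k in keys):
        return dict(state_dict)
    return {k[len(prefix):]: v for k, v in state_dict.items()}
```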
FILE: utils/config.py
function log_args_to_file (line 7) | def log_args_to_file(args, pre='args', logger=None):
function log_config_to_file (line 12) | def log_config_to_file(cfg, pre='cfg', logger=None):
function merge_new_config (line 21) | def merge_new_config(config, new_config):
function cfg_from_yaml_file (line 41) | def cfg_from_yaml_file(cfg_file):
function get_config (line 52) | def get_config(args, logger=None):
function save_experiment_config (line 66) | def save_experiment_config(args, config, logger=None):
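`merge_new_config` in `utils/config.py` recursively folds one config dict into another, which is how the per-experiment YAMLs in `cfgs/` override shared defaults: nested dicts are merged key by key and leaf values from the new config win. A sketch of the recursion (omitting the `_base_` YAML loading the repo's version also performs):

```python
def merge_new_config(config, new_config):
    """Recursively merge new_config into config; new leaf values win."""
    for key, val in new_config.items():
        if isinstance(val, dict) and isinstance(config.get(key), dict):
            merge_new_config(config[key], val)   # descend into shared sub-dicts
        else:
            config[key] = val                    # overwrite or add the leaf
    return config
```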
FILE: utils/dist_utils.py
function init_dist (line 9) | def init_dist(launcher, backend='nccl', **kwargs):
function _init_dist_pytorch (line 18) | def _init_dist_pytorch(backend, **kwargs):
function get_dist_info (line 27) | def get_dist_info():
function reduce_tensor (line 41) | def reduce_tensor(tensor, args):
function gather_tensor (line 50) | def gather_tensor(tensor, args):
FILE: utils/logger.py
function get_root_logger (line 6) | def get_root_logger(log_file=None, log_level=logging.INFO, name='main'):
function get_logger (line 29) | def get_logger(name, log_file=None, log_level=logging.INFO, file_mode='w'):
function print_log (line 103) | def print_log(msg, logger=None, level=logging.INFO):
FILE: utils/misc.py
function fps (line 13) | def fps(data, number):
function worker_init_fn (line 23) | def worker_init_fn(worker_id):
function build_lambda_sche (line 26) | def build_lambda_sche(opti, config):
function build_lambda_bnsche (line 34) | def build_lambda_bnsche(model, config):
function set_random_seed (line 42) | def set_random_seed(seed, deterministic=False):
function is_seq_of (line 69) | def is_seq_of(seq, expected_type, seq_type=None):
function set_bn_momentum_default (line 91) | def set_bn_momentum_default(bn_momentum):
class BNMomentumScheduler (line 97) | class BNMomentumScheduler(object):
method __init__ (line 99) | def __init__(
method step (line 117) | def step(self, epoch=None):
method get_momentum (line 124) | def get_momentum(self, epoch=None):
function seprate_point_cloud (line 131) | def seprate_point_cloud(xyz, num_points, crop, fixed_points = None, padd...
function get_ptcloud_img (line 186) | def get_ptcloud_img(ptcloud,roll,pitch):
function visualize_KITTI (line 207) | def visualize_KITTI(path, data_list, titles = ['input','pred'], cmap=['b...
function random_dropping (line 236) | def random_dropping(pc, e):
function random_scale (line 246) | def random_scale(partial, scale_range=[0.8, 1.2]):
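`build_lambda_sche` in `utils/misc.py` builds a `LambdaLR`-style scheduler from config. A common shape in this codebase family is exponential decay per epoch clamped to a floor; a sketch of such a lambda factory (parameter names and defaults here are illustrative, not the repo's exact config keys):

```python
def build_lr_lambda(lr_decay=0.9, decay_step=1, lowest_decay=0.02):
    """Return an epoch -> LR-multiplier function: exponential decay with a floor."""
    def lr_lambda(epoch):
        return max(lr_decay ** (epoch / decay_step), lowest_decay)
    return lr_lambda
```

The returned function would be handed to `torch.optim.lr_scheduler.LambdaLR`, which multiplies the base learning rate by `lr_lambda(epoch)` each step.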
FILE: utils/parser.py
function get_args (line 5) | def get_args():
function create_experiment_dir (line 104) | def create_experiment_dir(args):
FILE: utils/registry.py
class Registry (line 6) | class Registry:
method __init__ (line 32) | def __init__(self, name, build_func=None, parent=None, scope=None):
method __len__ (line 56) | def __len__(self):
method __contains__ (line 59) | def __contains__(self, key):
method __repr__ (line 62) | def __repr__(self):
method infer_scope (line 69) | def infer_scope():
method split_scope_key (line 89) | def split_scope_key(key):
method name (line 108) | def name(self):
method scope (line 112) | def scope(self):
method module_dict (line 116) | def module_dict(self):
method children (line 120) | def children(self):
method get (line 123) | def get(self, key):
method build (line 146) | def build(self, *args, **kwargs):
method _add_children (line 149) | def _add_children(self, registry):
method _register_module (line 168) | def _register_module(self, module_class, module_name=None, force=False):
method deprecated_register_module (line 183) | def deprecated_register_module(self, cls=None, force=False):
method register_module (line 193) | def register_module(self, name=None, force=False, module=None):
function build_from_cfg (line 246) | def build_from_cfg(cfg, registry, default_args=None):
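`utils/registry.py` implements the name-to-class registry that lets YAML configs instantiate datasets and models by their `NAME` field: `register_module` is used as a class decorator, and `build_from_cfg` looks the name up and constructs the class from the remaining config fields. A minimal sketch of the pattern, with a hypothetical `ShapeNet` class as the usage example (the repo's version adds scopes, children, and force/override handling):

```python
class Registry:
    """Minimal name -> class registry with a decorator-based register."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, module=None):
        def _register(cls):
            self._module_dict[cls.__name__] = cls
            return cls
        # support both @reg.register_module() and reg.register_module(cls)
        return _register(module) if module is not None else _register

    def get(self, key):
        return self._module_dict[key]

def build_from_cfg(cfg, registry):
    """Instantiate cfg['NAME'] from the registry with the remaining fields."""
    cfg = dict(cfg)                      # don't mutate the caller's config
    cls = registry.get(cfg.pop('NAME'))
    return cls(**cfg)

# Usage (hypothetical dataset class, mirroring the cfgs/dataset_configs/ YAMLs):
DATASETS = Registry('dataset')

@DATASETS.register_module()
class ShapeNet:
    def __init__(self, N_POINTS=8192):
        self.n_points = N_POINTS
```

With this in place, `build_from_cfg({'NAME': 'ShapeNet', 'N_POINTS': 1024}, DATASETS)` returns a configured instance, which is essentially what `datasets/build.py` and `models/build.py` do.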
Condensed preview — 102 files, each showing path, character count, and a content snippet (full structured content is 369K chars).
[
{
"path": "DATASET.md",
"chars": 2369,
"preview": "## Dataset\n\nThe overall directory structure should be:\n```\n│Point-MAE/\n├──cfgs/\n├──data/\n│ ├──ModelNet/\n│ ├──ModelNe"
},
{
"path": "LICENSE",
"chars": 1077,
"preview": "MIT License\n\nCopyright (c) 2022 PANG-Yatian, YUAN-Li\n\nPermission is hereby granted, free of charge, to any person obtain"
},
{
"path": "README.md",
"chars": 17253,
"preview": "# PointGPT\n\n## PointGPT: Auto-regressively Generative Pre-training from Point Clouds [ArXiv](https://arxiv.org/abs/2305."
},
{
"path": "cfgs/PointGPT-B/fewshot.yaml",
"chars": 832,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-B/finetune_modelnet.yaml",
"chars": 855,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-B/finetune_modelnet_8k.yaml",
"chars": 857,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.00005, weight_decay: 0.005 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50"
},
{
"path": "cfgs/PointGPT-B/finetune_scan_hardest.yaml",
"chars": 847,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 30, "
},
{
"path": "cfgs/PointGPT-B/finetune_scan_objbg.yaml",
"chars": 850,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 30, "
},
{
"path": "cfgs/PointGPT-B/finetune_scan_objonly.yaml",
"chars": 856,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-B/post_pretrain.yaml",
"chars": 866,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 100,"
},
{
"path": "cfgs/PointGPT-B/pretrain.yaml",
"chars": 1040,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-L/fewshot.yaml",
"chars": 832,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-L/finetune_modelnet.yaml",
"chars": 857,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-L/finetune_modelnet_8k.yaml",
"chars": 859,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.00005, weight_decay: 0.005 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50"
},
{
"path": "cfgs/PointGPT-L/finetune_scan_hardest.yaml",
"chars": 849,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-L/finetune_scan_objbg.yaml",
"chars": 852,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-L/finetune_scan_objonly.yaml",
"chars": 858,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 50, "
},
{
"path": "cfgs/PointGPT-L/post_pretrain.yaml",
"chars": 868,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 100,"
},
{
"path": "cfgs/PointGPT-L/pretrain.yaml",
"chars": 1043,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.00006, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 600"
},
{
"path": "cfgs/PointGPT-S/fewshot.yaml",
"chars": 831,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0005, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-S/finetune_modelnet.yaml",
"chars": 856,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-S/finetune_modelnet_8k.yaml",
"chars": 857,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.005 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300"
},
{
"path": "cfgs/PointGPT-S/finetune_scan_hardest.yaml",
"chars": 848,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-S/finetune_scan_objbg.yaml",
"chars": 851,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-S/finetune_scan_objonly.yaml",
"chars": 857,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/PointGPT-S/pretrain.yaml",
"chars": 1025,
"preview": "optimizer: { type: AdamW, kwargs: { lr: 0.0001, weight_decay: 0.05 } }\n\nscheduler: { type: CosLR, kwargs: { epochs: 300,"
},
{
"path": "cfgs/dataset_configs/LabeledHybrid.yaml",
"chars": 140,
"preview": "NAME: LabeledHybrid\nDATA_PATH: data/HybridDatasets/post_pretrain\nN_POINTS: 2048\nPC_PATH: data/HybridDatasets\nnpoints: 10"
},
{
"path": "cfgs/dataset_configs/ModelNet40.yaml",
"chars": 118,
"preview": "NAME: ModelNet\nDATA_PATH: data/ModelNet/modelnet40_normal_resampled\nN_POINTS: 8192\nNUM_CATEGORY: 40\nUSE_NORMALS: FALSE"
},
{
"path": "cfgs/dataset_configs/ModelNet40FewShot.yaml",
"chars": 104,
"preview": "NAME: ModelNetFewShot\nDATA_PATH: data/ModelNetFewshot\nN_POINTS: 8192\nNUM_CATEGORY: 40\nUSE_NORMALS: FALSE"
},
{
"path": "cfgs/dataset_configs/ScanObjectNN_hardest.yaml",
"chars": 70,
"preview": "NAME: ScanObjectNN_hardest\nROOT: data/ScanObjectNN/h5_files/main_split"
},
{
"path": "cfgs/dataset_configs/ScanObjectNN_objectbg.yaml",
"chars": 62,
"preview": "NAME: ScanObjectNN\nROOT: data/ScanObjectNN/h5_files/main_split"
},
{
"path": "cfgs/dataset_configs/ScanObjectNN_objectonly.yaml",
"chars": 67,
"preview": "NAME: ScanObjectNN\nROOT: data/ScanObjectNN/h5_files/main_split_nobg"
},
{
"path": "cfgs/dataset_configs/ShapeNet-55.yaml",
"chars": 112,
"preview": "NAME: ShapeNet\nDATA_PATH: data/ShapeNet55-34/ShapeNet-55\nN_POINTS: 8192\nPC_PATH: data/ShapeNet55-34/shapenet_pc\n"
},
{
"path": "cfgs/dataset_configs/UnlabeledHybrid.yaml",
"chars": 106,
"preview": "NAME: UnlabeledHybrid\nDATA_PATH: data/HybridDatasets/pretrain\nN_POINTS: 2048\nPC_PATH: data/HybridDatasets\n"
},
{
"path": "datasets/LabeledHybrid.py",
"chars": 2375,
"preview": "import os\nimport torch\nimport numpy as np\nimport torch.utils.data as data\nfrom .io import IO\nfrom .build import DATASETS"
},
{
"path": "datasets/ModelNetDataset.py",
"chars": 5674,
"preview": "'''\n@author: Xu Yan\n@file: ModelNet.py\n@time: 2021/3/19 15:51\n'''\nimport os\nimport numpy as np\nimport warnings\nimport pi"
},
{
"path": "datasets/ModelNetDatasetFewShot.py",
"chars": 2029,
"preview": "'''\n@author: Xu Yan\n@file: ModelNet.py\n@time: 2021/3/19 15:51\n'''\nimport os\nimport numpy as np\nimport warnings\nimport pi"
},
{
"path": "datasets/ScanObjectNNDataset.py",
"chars": 2983,
"preview": "import numpy as np\nimport os, sys, h5py\nfrom torch.utils.data import Dataset\nimport torch\nfrom .build import DATASETS\nfr"
},
{
"path": "datasets/ShapeNet55Dataset.py",
"chars": 2530,
"preview": "import os\nimport torch\nimport numpy as np\nimport torch.utils.data as data\nfrom .io import IO\nfrom .build import DATASETS"
},
{
"path": "datasets/UnlabeledHybrid.py",
"chars": 2618,
"preview": "import os\nimport torch\nimport numpy as np\nimport torch.utils.data as data\nfrom .io import IO\nfrom .build import DATASETS"
},
{
"path": "datasets/__init__.py",
"chars": 244,
"preview": "from .build import build_dataset_from_cfg\nimport datasets.ShapeNet55Dataset\nimport datasets.ModelNetDataset\nimport datas"
},
{
"path": "datasets/build.py",
"chars": 362,
"preview": "from utils import registry\n\n\nDATASETS = registry.Registry('dataset')\n\n\ndef build_dataset_from_cfg(cfg, default_args = No"
},
{
"path": "datasets/data_transforms.py",
"chars": 4190,
"preview": "import numpy as np\nimport torch\nimport random\n\n\nclass PointcloudRotate(object):\n def __call__(self, pc):\n bsiz"
},
{
"path": "datasets/generate_few_shot_data.py",
"chars": 2746,
"preview": "import pickle\nimport numpy as np\nimport random\nimport os\n\nroot = '../data/ModelNet/modelnet40_normal_resampled'\ntarget ="
},
{
"path": "datasets/io.py",
"chars": 1296,
"preview": "import h5py\nimport numpy as np\n# import open3d\nimport os\n\nclass IO:\n @classmethod\n def get(cls, file_path):\n "
},
{
"path": "extensions/chamfer_dist/__init__.py",
"chars": 2753,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Thibault GROUEIX\n# @Date: 2019-08-07 20:54:24\n# @Last Modified by: Haozhe Xie\n# @"
},
{
"path": "extensions/chamfer_dist/chamfer.cu",
"chars": 9130,
"preview": "/*\n * @Author: Haozhe Xie\n * @Date: 2019-08-07 20:54:24\n * @Last Modified by: Haozhe Xie\n * @Last Modified time: 202"
},
{
"path": "extensions/chamfer_dist/chamfer_cuda.cpp",
"chars": 1617,
"preview": "/*\n * @Author: Haozhe Xie\n * @Date: 2019-08-07 20:54:24\n * @Last Modified by: Haozhe Xie\n * @Last Modified time: 201"
},
{
"path": "extensions/chamfer_dist/setup.py",
"chars": 515,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Haozhe Xie\n# @Date: 2019-08-07 20:54:24\n# @Last Modified by: Haozhe Xie\n# @Last M"
},
{
"path": "extensions/chamfer_dist/test.py",
"chars": 950,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Haozhe Xie\n# @Date: 2019-12-10 10:38:01\n# @Last Modified by: Haozhe Xie\n# @Last M"
},
{
"path": "extensions/emd/README.md",
"chars": 721,
"preview": "# PyTorch Wrapper for Point-cloud Earth-Mover-Distance (EMD)\n\n## Dependency\n\nThe code has been tested on Ubuntu 16.04, P"
},
{
"path": "extensions/emd/__init__.py",
"chars": 63,
"preview": "from .emd import earth_mover_distance as emd\n\n__all__ = ['emd']"
},
{
"path": "extensions/emd/cuda/emd.cpp",
"chars": 744,
"preview": "#ifndef _EMD\n#define _EMD\n\n#include <vector>\n#include <torch/extension.h>\n\n//CUDA declarations\nat::Tensor ApproxMatchFor"
},
{
"path": "extensions/emd/cuda/emd_kernel.cu",
"chars": 11830,
"preview": "/**********************************\n * Original Author: Haoqiang Fan\n * Modified by: Kaichun Mo\n ***********************"
},
{
"path": "extensions/emd/emd.py",
"chars": 2076,
"preview": "import torch\nimport emd_cuda\n\n\nclass EarthMoverDistanceFunction(torch.autograd.Function):\n @staticmethod\n def forw"
},
{
"path": "extensions/emd/setup.py",
"chars": 642,
"preview": "\"\"\"Setup extension\n\nNotes:\n If extra_compile_args is provided, you need to provide different instances for different "
},
{
"path": "extensions/emd/test_emd_loss.py",
"chars": 1223,
"preview": "import torch\nimport numpy as np\nimport time\nfrom emd import earth_mover_distance\n\n# gt\np1 = torch.from_numpy(np.array([["
},
{
"path": "figures/a",
"chars": 1,
"preview": "\n"
},
{
"path": "main.py",
"chars": 3563,
"preview": "from tools import pretrain_run_net as pretrain\nfrom tools import finetune_run_net as finetune\nfrom tools import test_run"
},
{
"path": "main_vis.py",
"chars": 2550,
"preview": "# from tools import run_net\nfrom tools import test_net\nfrom utils import parser, dist_utils, misc\nfrom utils.logger impo"
},
{
"path": "models/GPT.py",
"chars": 4365,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n\n"
},
{
"path": "models/PointGPT.py",
"chars": 26320,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport timm\nfrom timm.models.layers import DropPath, "
},
{
"path": "models/__init__.py",
"chars": 63,
"preview": "from .build import build_model_from_cfg\nimport models.PointGPT\n"
},
{
"path": "models/build.py",
"chars": 325,
"preview": "from utils import registry\n\n\nMODELS = registry.Registry('models')\n\n\ndef build_model_from_cfg(cfg, **kwargs):\n \"\"\"\n "
},
{
"path": "models/z_order.py",
"chars": 2812,
"preview": "import numpy as np\n\n\ndef round_to_int_32(data):\n \"\"\"\n Takes a Numpy array of float values between\n -1 and 1, an"
},
{
"path": "requirements.txt",
"chars": 131,
"preview": "argparse\neasydict\nh5py\nmatplotlib\nnumpy\nopen3d==0.9\nopencv-python\npyyaml\nscipy\ntensorboardX\ntimm==0.4.5\ntqdm\ntransforms3"
},
{
"path": "segmentation/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "segmentation/dataset.py",
"chars": 7251,
"preview": "import numpy as np\nimport os\nfrom torch.utils.data import Dataset\nimport torch\nfrom pointnet_util import farthest_point_"
},
{
"path": "segmentation/extensions/chamfer_dist/__init__.py",
"chars": 2753,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Thibault GROUEIX\n# @Date: 2019-08-07 20:54:24\n# @Last Modified by: Haozhe Xie\n# @"
},
{
"path": "segmentation/extensions/chamfer_dist/chamfer.cu",
"chars": 9130,
"preview": "/*\n * @Author: Haozhe Xie\n * @Date: 2019-08-07 20:54:24\n * @Last Modified by: Haozhe Xie\n * @Last Modified time: 202"
},
{
"path": "segmentation/extensions/chamfer_dist/chamfer_cuda.cpp",
"chars": 1617,
"preview": "/*\n * @Author: Haozhe Xie\n * @Date: 2019-08-07 20:54:24\n * @Last Modified by: Haozhe Xie\n * @Last Modified time: 201"
},
{
"path": "segmentation/extensions/chamfer_dist/setup.py",
"chars": 515,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Haozhe Xie\n# @Date: 2019-08-07 20:54:24\n# @Last Modified by: Haozhe Xie\n# @Last M"
},
{
"path": "segmentation/extensions/chamfer_dist/test.py",
"chars": 950,
"preview": "# -*- coding: utf-8 -*-\n# @Author: Haozhe Xie\n# @Date: 2019-12-10 10:38:01\n# @Last Modified by: Haozhe Xie\n# @Last M"
},
{
"path": "segmentation/extensions/emd/README.md",
"chars": 721,
"preview": "# PyTorch Wrapper for Point-cloud Earth-Mover-Distance (EMD)\n\n## Dependency\n\nThe code has been tested on Ubuntu 16.04, P"
},
{
"path": "segmentation/extensions/emd/__init__.py",
"chars": 63,
"preview": "from .emd import earth_mover_distance as emd\n\n__all__ = ['emd']"
},
{
"path": "segmentation/extensions/emd/cuda/emd.cpp",
"chars": 744,
"preview": "#ifndef _EMD\n#define _EMD\n\n#include <vector>\n#include <torch/extension.h>\n\n//CUDA declarations\nat::Tensor ApproxMatchFor"
},
{
"path": "segmentation/extensions/emd/cuda/emd_kernel.cu",
"chars": 11830,
"preview": "/**********************************\n * Original Author: Haoqiang Fan\n * Modified by: Kaichun Mo\n ***********************"
},
{
"path": "segmentation/extensions/emd/emd.py",
"chars": 2076,
"preview": "import torch\nimport emd_cuda\n\n\nclass EarthMoverDistanceFunction(torch.autograd.Function):\n @staticmethod\n def forw"
},
{
"path": "segmentation/extensions/emd/setup.py",
"chars": 642,
"preview": "\"\"\"Setup extension\n\nNotes:\n If extra_compile_args is provided, you need to provide different instances for different "
},
{
"path": "segmentation/extensions/emd/test_emd_loss.py",
"chars": 1223,
"preview": "import torch\nimport numpy as np\nimport time\nfrom emd import earth_mover_distance\n\n# gt\np1 = torch.from_numpy(np.array([["
},
{
"path": "segmentation/logger.py",
"chars": 8884,
"preview": "import logging\nimport torch.distributed as dist\n\nimport copy\nimport logging\nimport os\nfrom collections import defaultdic"
},
{
"path": "segmentation/main.py",
"chars": 15552,
"preview": "\"\"\"\nAuthor: Benny\nDate: Nov 2019\n\"\"\"\nimport argparse\nimport os\nimport torch\nimport datetime\nimport logging\nimport sys\nim"
},
{
"path": "segmentation/misc.py",
"chars": 7920,
"preview": "import numpy as np\nimport matplotlib.pyplot as plt\nfrom mpl_toolkits.mplot3d import Axes3D\nimport random\nimport torch\nim"
},
{
"path": "segmentation/models/gpt2_seg.py",
"chars": 4310,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom timm.models.layers import DropPath, trunc_normal"
},
{
"path": "segmentation/models/pointnet2_utils.py",
"chars": 11163,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom time import time\nimport numpy as np\n\ndef timeit("
},
{
"path": "segmentation/models/pt.py",
"chars": 20353,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom timm.models.layers import DropPath, trunc_normal"
},
{
"path": "segmentation/models/z_order.py",
"chars": 2812,
"preview": "import numpy as np\n\n\ndef round_to_int_32(data):\n \"\"\"\n Takes a Numpy array of float values between\n -1 and 1, an"
},
{
"path": "segmentation/pointnet_util.py",
"chars": 11280,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom time import time\nimport numpy as np\n\n\n# referenc"
},
{
"path": "segmentation/provider.py",
"chars": 9955,
"preview": "import numpy as np\n\ndef normalize_data(batch_data):\n \"\"\" Normalize the batch data, use coordinates of the block cente"
},
{
"path": "tools/__init__.py",
"chars": 226,
"preview": "# from .runner import run_net\nfrom .runner import test_net\nfrom .runner_pretrain import run_net as pretrain_run_net\nfrom"
},
{
"path": "tools/builder.py",
"chars": 7865,
"preview": "import os\nimport sys\n# online package\nimport torch\n# optimizer\nimport torch.optim as optim\n# dataloader\nfrom datasets im"
},
{
"path": "tools/runner.py",
"chars": 3973,
"preview": "import torch\nimport torch.nn as nn\nimport os\nimport json\nfrom tools import builder\nfrom utils import misc, dist_utils\nim"
},
{
"path": "tools/runner_finetune.py",
"chars": 17953,
"preview": "import torch\nimport torch.nn as nn\nfrom tools import builder\nfrom utils import misc, dist_utils\nimport time\nfrom utils.l"
},
{
"path": "tools/runner_pretrain.py",
"chars": 10198,
"preview": "import torch\nimport torch.nn as nn\nimport os\nimport json\nfrom tools import builder\nfrom utils import misc, dist_utils\nim"
},
{
"path": "utils/AverageMeter.py",
"chars": 1361,
"preview": "\nclass AverageMeter(object):\n def __init__(self, items=None):\n self.items = items\n self.n_items = 1 if "
},
{
"path": "utils/checkpoint.py",
"chars": 4056,
"preview": "#!/usr/bin/env python3\n# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n\nimport copy\nimport loggi"
},
{
"path": "utils/config.py",
"chars": 2362,
"preview": "import yaml\nfrom easydict import EasyDict\nimport os\nfrom .logger import print_log\n\n\ndef log_args_to_file(args, pre='args"
},
{
"path": "utils/dist_utils.py",
"chars": 1497,
"preview": "import os\n\nimport torch\nimport torch.multiprocessing as mp\nfrom torch import distributed as dist\n\n\n\ndef init_dist(launch"
},
{
"path": "utils/logger.py",
"chars": 4922,
"preview": "import logging\nimport torch.distributed as dist\n\nlogger_initialized = {}\n\ndef get_root_logger(log_file=None, log_level=l"
},
{
"path": "utils/misc.py",
"chars": 7915,
"preview": "import numpy as np\nimport matplotlib.pyplot as plt\nfrom mpl_toolkits.mplot3d import Axes3D\nimport random\nimport torch\nim"
},
{
"path": "utils/parser.py",
"chars": 3942,
"preview": "import os\nimport argparse\nfrom pathlib import Path\n\ndef get_args():\n parser = argparse.ArgumentParser()\n parser.ad"
},
{
"path": "utils/registry.py",
"chars": 10823,
"preview": "import inspect\nimport warnings\nfrom functools import partial\nfrom utils import config\n\nclass Registry:\n \"\"\"A registry"
}
]
About this extraction
This page contains the source code of the CGuangyan-BIT/PointGPT GitHub repository, extracted and formatted as plain text: 102 files (341.7 KB, approximately 97.0k tokens) plus a symbol index with 392 extracted functions, classes, methods, constants, and types. Extracted by GitExtract.