Full Code of bagh2178/SG-Nav for AI

Repository: bagh2178/SG-Nav
Branch: main
Commit: d56863c96dea
Files: 618
Total size: 16.2 MB

Directory structure:
gitextract_t68xqa9y/

├── .gitignore
├── GLIP/
│   ├── CODE_OF_CONDUCT.md
│   ├── DATA.md
│   ├── LICENSE
│   ├── README.md
│   ├── SECURITY.md
│   ├── SUPPORT.md
│   ├── configs/
│   │   ├── flickr/
│   │   │   ├── test.yaml
│   │   │   └── val.yaml
│   │   ├── lvis/
│   │   │   ├── minival.yaml
│   │   │   └── val.yaml
│   │   ├── odinw/
│   │   │   └── Aquarium_Aquarium_Combined.v2-raw-1024.coco.yaml
│   │   └── pretrain/
│   │       ├── glip_Swin_L.yaml
│   │       └── glip_Swin_T_O365_GoldG.yaml
│   ├── maskrcnn_benchmark/
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   ├── defaults.py
│   │   │   └── paths_catalog.py
│   │   ├── csrc/
│   │   │   ├── ROIAlign.h
│   │   │   ├── ROIPool.h
│   │   │   ├── SigmoidFocalLoss.h
│   │   │   ├── cpu/
│   │   │   │   ├── ROIAlign_cpu.cpp
│   │   │   │   ├── nms_cpu.cpp
│   │   │   │   ├── soft_nms.cpp
│   │   │   │   └── vision.h
│   │   │   ├── cuda/
│   │   │   │   ├── ROIAlign_cuda.cu
│   │   │   │   ├── ROIPool_cuda.cu
│   │   │   │   ├── SigmoidFocalLoss_cuda.cu
│   │   │   │   ├── deform_conv_cuda.cu
│   │   │   │   ├── deform_conv_kernel_cuda.cu
│   │   │   │   ├── deform_pool_cuda.cu
│   │   │   │   ├── deform_pool_kernel_cuda.cu
│   │   │   │   ├── ml_nms.cu
│   │   │   │   ├── nms.cu
│   │   │   │   └── vision.h
│   │   │   ├── deform_conv.h
│   │   │   ├── deform_pool.h
│   │   │   ├── ml_nms.h
│   │   │   ├── nms.h
│   │   │   └── vision.cpp
│   │   ├── data/
│   │   │   ├── __init__.py
│   │   │   ├── build.py
│   │   │   ├── collate_batch.py
│   │   │   ├── datasets/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── background.py
│   │   │   │   ├── box_label_loader.py
│   │   │   │   ├── caption.py
│   │   │   │   ├── coco.py
│   │   │   │   ├── coco_dt.py
│   │   │   │   ├── concat_dataset.py
│   │   │   │   ├── custom_distributed_sampler.py
│   │   │   │   ├── duplicate_dataset.py
│   │   │   │   ├── evaluation/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── box_aug.py
│   │   │   │   │   ├── coco/
│   │   │   │   │   │   ├── __init__.py
│   │   │   │   │   │   └── coco_eval.py
│   │   │   │   │   ├── flickr/
│   │   │   │   │   │   ├── __init__.py
│   │   │   │   │   │   └── flickr_eval.py
│   │   │   │   │   ├── lvis/
│   │   │   │   │   │   ├── _change_lvis_annotation.py
│   │   │   │   │   │   ├── lvis.py
│   │   │   │   │   │   └── lvis_eval.py
│   │   │   │   │   ├── od_eval.py
│   │   │   │   │   ├── od_to_grounding/
│   │   │   │   │   │   ├── __init__.py
│   │   │   │   │   │   └── od_eval.py
│   │   │   │   │   ├── vg/
│   │   │   │   │   │   ├── __init__.py
│   │   │   │   │   │   └── vg_eval.py
│   │   │   │   │   └── voc/
│   │   │   │   │       ├── __init__.py
│   │   │   │   │       └── voc_eval.py
│   │   │   │   ├── flickr.py
│   │   │   │   ├── gqa.py
│   │   │   │   ├── imagenet.py
│   │   │   │   ├── list_dataset.py
│   │   │   │   ├── lvis.py
│   │   │   │   ├── mixed.py
│   │   │   │   ├── mixup.py
│   │   │   │   ├── modulated_coco.py
│   │   │   │   ├── object365.py
│   │   │   │   ├── od_to_grounding.py
│   │   │   │   ├── phrasecut.py
│   │   │   │   ├── pseudo_data.py
│   │   │   │   ├── refexp.py
│   │   │   │   ├── tsv.py
│   │   │   │   ├── vg.py
│   │   │   │   └── voc.py
│   │   │   ├── samplers/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── distributed.py
│   │   │   │   ├── grouped_batch_sampler.py
│   │   │   │   └── iteration_based_batch_sampler.py
│   │   │   └── transforms/
│   │   │       ├── __init__.py
│   │   │       ├── build.py
│   │   │       └── transforms.py
│   │   ├── engine/
│   │   │   ├── __init__.py
│   │   │   ├── alter_trainer.py
│   │   │   ├── evolution.py
│   │   │   ├── inference.py
│   │   │   ├── predictor.py
│   │   │   ├── predictor_glip.py
│   │   │   ├── singlepath_trainer.py
│   │   │   ├── stage_trainer.py
│   │   │   └── trainer.py
│   │   ├── layers/
│   │   │   ├── __init__.py
│   │   │   ├── batch_norm.py
│   │   │   ├── deform_conv.py
│   │   │   ├── deform_pool.py
│   │   │   ├── dropblock.py
│   │   │   ├── dyhead.py
│   │   │   ├── dyrelu.py
│   │   │   ├── evonorm.py
│   │   │   ├── iou_loss.py
│   │   │   ├── misc.py
│   │   │   ├── nms.py
│   │   │   ├── roi_align.py
│   │   │   ├── roi_pool.py
│   │   │   ├── se.py
│   │   │   ├── set_loss.py
│   │   │   ├── sigmoid_focal_loss.py
│   │   │   └── smooth_l1_loss.py
│   │   ├── modeling/
│   │   │   ├── __init__.py
│   │   │   ├── backbone/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── bifpn.py
│   │   │   │   ├── blocks.py
│   │   │   │   ├── efficientdet.py
│   │   │   │   ├── efficientnet.py
│   │   │   │   ├── fbnet.py
│   │   │   │   ├── fpn.py
│   │   │   │   ├── mixer.py
│   │   │   │   ├── ops.py
│   │   │   │   ├── resnet.py
│   │   │   │   ├── swint.py
│   │   │   │   ├── swint_v2.py
│   │   │   │   ├── swint_v2_vl.py
│   │   │   │   └── swint_vl.py
│   │   │   ├── balanced_positive_negative_sampler.py
│   │   │   ├── box_coder.py
│   │   │   ├── detector/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── generalized_rcnn.py
│   │   │   │   └── generalized_vl_rcnn.py
│   │   │   ├── language_backbone/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── backbone.py
│   │   │   │   ├── bert_model.py
│   │   │   │   ├── build.py
│   │   │   │   ├── clip_model.py
│   │   │   │   ├── hfpt_tokenizer.py
│   │   │   │   ├── rnn_model.py
│   │   │   │   ├── simple_tokenizer.py
│   │   │   │   ├── test_clip_tokenizer.py
│   │   │   │   └── word_utils.py
│   │   │   ├── make_layers.py
│   │   │   ├── matcher.py
│   │   │   ├── poolers.py
│   │   │   ├── registry.py
│   │   │   ├── roi_heads/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── box_head/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── box_head.py
│   │   │   │   │   ├── inference.py
│   │   │   │   │   ├── loss.py
│   │   │   │   │   ├── roi_box_feature_extractors.py
│   │   │   │   │   └── roi_box_predictors.py
│   │   │   │   ├── keypoint_head/
│   │   │   │   │   ├── inference.py
│   │   │   │   │   ├── keypoint_head.py
│   │   │   │   │   ├── loss.py
│   │   │   │   │   ├── roi_keypoint_feature_extractors.py
│   │   │   │   │   └── roi_keypoint_predictors.py
│   │   │   │   └── mask_head/
│   │   │   │       ├── __init__.py
│   │   │   │       ├── hourglass.py
│   │   │   │       ├── inference.py
│   │   │   │       ├── loss.py
│   │   │   │       ├── mask_head.py
│   │   │   │       ├── roi_mask_feature_extractors.py
│   │   │   │       └── roi_mask_predictors.py
│   │   │   ├── rpn/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── anchor_generator.py
│   │   │   │   ├── atss.py
│   │   │   │   ├── dyhead.py
│   │   │   │   ├── fcos.py
│   │   │   │   ├── inference.py
│   │   │   │   ├── loss.py
│   │   │   │   ├── modeling_bert.py
│   │   │   │   ├── retina.py
│   │   │   │   ├── rpn.py
│   │   │   │   ├── transformer.py
│   │   │   │   └── vldyhead.py
│   │   │   └── utils.py
│   │   ├── solver/
│   │   │   ├── __init__.py
│   │   │   ├── build.py
│   │   │   └── lr_scheduler.py
│   │   ├── structures/
│   │   │   ├── __init__.py
│   │   │   ├── bounding_box.py
│   │   │   ├── boxlist_ops.py
│   │   │   ├── image_list.py
│   │   │   ├── keypoint.py
│   │   │   └── segmentation_mask.py
│   │   └── utils/
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── amp.py
│   │       ├── big_model_loading.py
│   │       ├── c2_model_loading.py
│   │       ├── checkpoint.py
│   │       ├── collect_env.py
│   │       ├── comm.py
│   │       ├── cv2_util.py
│   │       ├── dist.py
│   │       ├── ema.py
│   │       ├── env.py
│   │       ├── flops.py
│   │       ├── fuse_helper.py
│   │       ├── imports.py
│   │       ├── logger.py
│   │       ├── mdetr_dist.py
│   │       ├── metric_logger.py
│   │       ├── miscellaneous.py
│   │       ├── model_serialization.py
│   │       ├── model_zoo.py
│   │       ├── pretrain_model_loading.py
│   │       ├── registry.py
│   │       ├── shallow_contrastive_loss_helper.py
│   │       └── stats.py
│   ├── setup.py
│   └── tools/
│       ├── cityscapes/
│       │   ├── convert_cityscapes_to_coco.py
│       │   └── instances2dict_with_polygons.py
│       ├── eval_all.py
│       ├── finetune.py
│       ├── test_grounding_net.py
│       ├── test_net.py
│       └── train_net.py
├── GroundingDINO/
│   ├── LICENSE
│   ├── README.md
│   ├── demo/
│   │   ├── gradio_app.py
│   │   └── inference_on_a_image.py
│   ├── groundingdino/
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── GroundingDINO_SwinB.py
│   │   │   └── GroundingDINO_SwinT_OGC.py
│   │   ├── datasets/
│   │   │   ├── __init__.py
│   │   │   └── transforms.py
│   │   ├── models/
│   │   │   ├── GroundingDINO/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── backbone/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── backbone.py
│   │   │   │   │   ├── position_encoding.py
│   │   │   │   │   └── swin_transformer.py
│   │   │   │   ├── bertwarper.py
│   │   │   │   ├── csrc/
│   │   │   │   │   ├── MsDeformAttn/
│   │   │   │   │   │   ├── ms_deform_attn.h
│   │   │   │   │   │   ├── ms_deform_attn_cpu.cpp
│   │   │   │   │   │   ├── ms_deform_attn_cpu.h
│   │   │   │   │   │   ├── ms_deform_attn_cuda.cu
│   │   │   │   │   │   ├── ms_deform_attn_cuda.h
│   │   │   │   │   │   └── ms_deform_im2col_cuda.cuh
│   │   │   │   │   ├── cuda_version.cu
│   │   │   │   │   └── vision.cpp
│   │   │   │   ├── fuse_modules.py
│   │   │   │   ├── groundingdino.py
│   │   │   │   ├── ms_deform_attn.py
│   │   │   │   ├── transformer.py
│   │   │   │   ├── transformer_vanilla.py
│   │   │   │   └── utils.py
│   │   │   ├── __init__.py
│   │   │   └── registry.py
│   │   ├── util/
│   │   │   ├── __init__.py
│   │   │   ├── box_ops.py
│   │   │   ├── get_tokenlizer.py
│   │   │   ├── inference.py
│   │   │   ├── logger.py
│   │   │   ├── misc.py
│   │   │   ├── slconfig.py
│   │   │   ├── slio.py
│   │   │   ├── time_counter.py
│   │   │   ├── utils.py
│   │   │   ├── visualizer.py
│   │   │   └── vl_utils.py
│   │   └── version.py
│   ├── pyproject.toml
│   ├── requirements.txt
│   └── setup.py
├── LICENSE
├── README.md
├── SG_Nav.py
├── configs/
│   ├── challenge_objectnav2021.local.rgbd.yaml
│   ├── challenge_objectnav2022.local.rgbd.yaml
│   ├── challenge_pointnav2021.local.rgbd.yaml
│   ├── challenge_pointnav2021.local.rgbd_test_scene.yaml
│   ├── ddppo_objectnav.yaml
│   ├── ddppo_pointnav.yaml
│   └── ddppo_pointnav_.yaml
├── habitat-lab/
│   ├── .circleci/
│   │   └── config.yml
│   ├── .editorconfig
│   ├── .github/
│   │   ├── ISSUE_TEMPLATE/
│   │   │   ├── bug-report.md
│   │   │   ├── feature-request.md
│   │   │   └── questions-help-support.md
│   │   └── PULL_REQUEST_TEMPLATE.md
│   ├── .gitignore
│   ├── .pre-commit-config.yaml
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── Dockerfile
│   ├── LICENSE
│   ├── MANIFEST.in
│   ├── README.md
│   ├── configs/
│   │   ├── baselines/
│   │   │   └── ppo.yaml
│   │   ├── datasets/
│   │   │   ├── eqa/
│   │   │   │   └── mp3d.yaml
│   │   │   ├── imagenav/
│   │   │   │   ├── gibson.yaml
│   │   │   │   └── mp3d.yaml
│   │   │   ├── objectnav/
│   │   │   │   ├── hm3d.yaml
│   │   │   │   └── mp3d.yaml
│   │   │   ├── pointnav/
│   │   │   │   ├── gibson.yaml
│   │   │   │   ├── gibson_0_plus.yaml
│   │   │   │   ├── gibson_v2.yaml
│   │   │   │   ├── habitat_test.yaml
│   │   │   │   ├── hm3d.yaml
│   │   │   │   └── mp3d.yaml
│   │   │   ├── rearrangepick/
│   │   │   │   └── replica_cad.yaml
│   │   │   ├── single_episode.yaml
│   │   │   └── vln/
│   │   │       └── mp3d_r2r.yaml
│   │   ├── tasks/
│   │   │   ├── eqa_mp3d.yaml
│   │   │   ├── imagenav.yaml
│   │   │   ├── imagenav_gibson.yaml
│   │   │   ├── objectnav_hm3d.yaml
│   │   │   ├── objectnav_mp3d.yaml
│   │   │   ├── pointnav.yaml
│   │   │   ├── pointnav_gibson.yaml
│   │   │   ├── pointnav_hm3d.yaml
│   │   │   ├── pointnav_mp3d.yaml
│   │   │   ├── pointnav_rgbd.yaml
│   │   │   ├── rearrange/
│   │   │   │   ├── pick.yaml
│   │   │   │   ├── pick_spa.yaml
│   │   │   │   ├── pick_state.yaml
│   │   │   │   └── play.yaml
│   │   │   └── vln_r2r.yaml
│   │   └── test/
│   │       ├── habitat_all_sensors_test.yaml
│   │       ├── habitat_mp3d_eqa_test.yaml
│   │       ├── habitat_mp3d_object_nav_test.yaml
│   │       ├── habitat_r2r_vln_test.yaml
│   │       └── new_keys_test.yaml
│   ├── docs/
│   │   ├── .gitignore
│   │   ├── build-public.sh
│   │   ├── build.sh
│   │   ├── conf-public.py
│   │   ├── conf.py
│   │   ├── docs.rst
│   │   └── pages/
│   │       ├── habitat-lab-demo.rst
│   │       ├── habitat-sim-demo.rst
│   │       ├── index.rst
│   │       ├── quickstart.rst
│   │       └── view-transform-warp.rst
│   ├── examples/
│   │   ├── __init__.py
│   │   ├── benchmark.py
│   │   ├── example.py
│   │   ├── example_pointnav.py
│   │   ├── interactive_play.py
│   │   ├── new_actions.py
│   │   ├── register_new_sensors_and_measures.py
│   │   ├── shortest_path_follower_example.py
│   │   ├── tutorials/
│   │   │   ├── colabs/
│   │   │   │   └── Habitat_Lab.ipynb
│   │   │   └── nb_python/
│   │   │       └── Habitat_Lab.py
│   │   ├── visualization_examples.py
│   │   ├── vln_benchmark.py
│   │   └── vln_reference_path_follower_example.py
│   ├── habitat/
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   └── default.py
│   │   ├── core/
│   │   │   ├── __init__.py
│   │   │   ├── agent.py
│   │   │   ├── benchmark.py
│   │   │   ├── challenge.py
│   │   │   ├── dataset.py
│   │   │   ├── embodied_task.py
│   │   │   ├── env.py
│   │   │   ├── environments.py
│   │   │   ├── logging.py
│   │   │   ├── registry.py
│   │   │   ├── simulator.py
│   │   │   ├── spaces.py
│   │   │   ├── utils.py
│   │   │   └── vector_env.py
│   │   ├── datasets/
│   │   │   ├── __init__.py
│   │   │   ├── eqa/
│   │   │   │   ├── __init__.py
│   │   │   │   └── mp3d_eqa_dataset.py
│   │   │   ├── object_nav/
│   │   │   │   ├── __init__.py
│   │   │   │   └── object_nav_dataset.py
│   │   │   ├── pointnav/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── pointnav_dataset.py
│   │   │   │   └── pointnav_generator.py
│   │   │   ├── rearrange/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── configs/
│   │   │   │   │   ├── pick/
│   │   │   │   │   │   └── counter.yaml
│   │   │   │   │   └── test_config.yaml
│   │   │   │   ├── generate_episode_inits.py
│   │   │   │   ├── rearrange_dataset.py
│   │   │   │   ├── rearrange_generator.py
│   │   │   │   ├── receptacle.py
│   │   │   │   └── samplers.py
│   │   │   ├── registration.py
│   │   │   ├── utils.py
│   │   │   └── vln/
│   │   │       ├── __init__.py
│   │   │       └── r2r_vln_dataset.py
│   │   ├── py.typed
│   │   ├── sims/
│   │   │   ├── __init__.py
│   │   │   ├── habitat_simulator/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── actions.py
│   │   │   │   ├── debug_visualizer.py
│   │   │   │   ├── habitat_simulator.py
│   │   │   │   └── sim_utilities.py
│   │   │   ├── pyrobot/
│   │   │   │   ├── __init__.py
│   │   │   │   └── pyrobot.py
│   │   │   └── registration.py
│   │   ├── tasks/
│   │   │   ├── __init__.py
│   │   │   ├── eqa/
│   │   │   │   ├── __init__.py
│   │   │   │   └── eqa.py
│   │   │   ├── nav/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── nav.py
│   │   │   │   ├── object_nav_task.py
│   │   │   │   └── shortest_path_follower.py
│   │   │   ├── rearrange/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── actions.py
│   │   │   │   ├── grip_actions.py
│   │   │   │   ├── marker_info.py
│   │   │   │   ├── policy_modules.py
│   │   │   │   ├── rearrange_grasp_manager.py
│   │   │   │   ├── rearrange_sensors.py
│   │   │   │   ├── rearrange_sim.py
│   │   │   │   ├── rearrange_task.py
│   │   │   │   ├── sub_tasks/
│   │   │   │   │   ├── pick_sensors.py
│   │   │   │   │   └── pick_task.py
│   │   │   │   └── utils.py
│   │   │   ├── registration.py
│   │   │   ├── utils.py
│   │   │   └── vln/
│   │   │       ├── __init__.py
│   │   │       └── vln.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── geometry_utils.py
│   │   │   ├── pickle5_multiprocessing.py
│   │   │   ├── profiling_wrapper.py
│   │   │   ├── test_utils.py
│   │   │   └── visualizations/
│   │   │       ├── __init__.py
│   │   │       ├── fog_of_war.py
│   │   │       ├── maps.py
│   │   │       └── utils.py
│   │   └── version.py
│   ├── habitat_baselines/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── agents/
│   │   │   ├── __init__.py
│   │   │   ├── benchmark_gym.py
│   │   │   ├── mp_agents.py
│   │   │   ├── ppo_agents.py
│   │   │   ├── simple_agents.py
│   │   │   └── slam_agents.py
│   │   ├── common/
│   │   │   ├── __init__.py
│   │   │   ├── base_il_trainer.py
│   │   │   ├── base_trainer.py
│   │   │   ├── baseline_registry.py
│   │   │   ├── environments.py
│   │   │   ├── obs_transformers.py
│   │   │   ├── rollout_storage.py
│   │   │   ├── tensor_dict.py
│   │   │   └── tensorboard_utils.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   ├── default.py
│   │   │   ├── eqa/
│   │   │   │   ├── il_eqa_cnn_pretrain.yaml
│   │   │   │   ├── il_pacman_nav.yaml
│   │   │   │   └── il_vqa.yaml
│   │   │   ├── imagenav/
│   │   │   │   ├── ddppo_imagenav_example.yaml
│   │   │   │   ├── ddppo_imagenav_gibson.yaml
│   │   │   │   └── ppo_imagenav_example.yaml
│   │   │   ├── objectnav/
│   │   │   │   ├── ddppo_objectnav.yaml
│   │   │   │   └── ddppo_objectnav_hm3d.yaml
│   │   │   ├── pointnav/
│   │   │   │   ├── ddppo_pointnav.yaml
│   │   │   │   ├── ppo_pointnav.yaml
│   │   │   │   ├── ppo_pointnav_example.yaml
│   │   │   │   └── ppo_pointnav_habitat_iccv19.yaml
│   │   │   ├── rearrange/
│   │   │   │   ├── rl_pick.yaml
│   │   │   │   ├── rl_pick_state.yaml
│   │   │   │   └── spap_pick.yaml
│   │   │   └── test/
│   │   │       ├── ddppo_imagenav_test.yaml
│   │   │       ├── ddppo_pointnav_test.yaml
│   │   │       ├── ppo_imagenav_test.yaml
│   │   │       └── ppo_pointnav_test.yaml
│   │   ├── il/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── metrics.py
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   └── models.py
│   │   │   ├── requirements.txt
│   │   │   └── trainers/
│   │   │       ├── __init__.py
│   │   │       ├── eqa_cnn_pretrain_trainer.py
│   │   │       ├── pacman_trainer.py
│   │   │       └── vqa_trainer.py
│   │   ├── motion_planning/
│   │   │   ├── __init__.py
│   │   │   ├── grasp_generator.py
│   │   │   ├── motion_plan.py
│   │   │   ├── mp_sim.py
│   │   │   ├── mp_spaces.py
│   │   │   └── robot_target.py
│   │   ├── py.typed
│   │   ├── rl/
│   │   │   ├── __init__.py
│   │   │   ├── ddppo/
│   │   │   │   ├── README.md
│   │   │   │   ├── __init__.py
│   │   │   │   ├── algo/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   └── ddppo.py
│   │   │   │   ├── data_generation/
│   │   │   │   │   ├── create_gibson_large_dataset.py
│   │   │   │   │   └── gibson_dset_with_qual.json
│   │   │   │   ├── ddp_utils.py
│   │   │   │   ├── multi_node_slurm.sh
│   │   │   │   ├── policy/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── resnet.py
│   │   │   │   │   ├── resnet_policy.py
│   │   │   │   │   └── running_mean_and_var.py
│   │   │   │   ├── requirements.txt
│   │   │   │   └── single_node.sh
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── rnn_state_encoder.py
│   │   │   │   └── simple_cnn.py
│   │   │   ├── ppo/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── policy.py
│   │   │   │   ├── ppo.py
│   │   │   │   └── ppo_trainer.py
│   │   │   └── requirements.txt
│   │   ├── run.py
│   │   ├── slambased/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── install_deps.sh
│   │   │   ├── mappers.py
│   │   │   ├── monodepth.py
│   │   │   ├── path_planners.py
│   │   │   ├── reprojection.py
│   │   │   ├── requirements.txt
│   │   │   └── utils.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── common.py
│   │       ├── env_utils.py
│   │       ├── gym_adapter.py
│   │       ├── gym_definitions.py
│   │       ├── render_wrapper.py
│   │       └── visualizations/
│   │           ├── __init__.py
│   │           └── utils.py
│   ├── mypy.ini
│   ├── pyproject.toml
│   ├── requirements.txt
│   ├── scripts/
│   │   └── generate_profile_shell_scripts.py
│   ├── setup.cfg
│   ├── setup.py
│   └── test/
│       ├── test_baseline_agents.py
│       ├── test_baseline_trainers.py
│       ├── test_config.py
│       ├── test_dataset.py
│       ├── test_ddppo_reduce.py
│       ├── test_demo_notebook.py
│       ├── test_examples.py
│       ├── test_gym_wrapper.py
│       ├── test_habitat_env.py
│       ├── test_habitat_example.py
│       ├── test_habitat_sim.py
│       ├── test_habitat_task.py
│       ├── test_install.py
│       ├── test_motion_plan.py
│       ├── test_mp3d_eqa.py
│       ├── test_object_nav_task.py
│       ├── test_pointnav_dataset.py
│       ├── test_pyrobot.py
│       ├── test_r2r_vln.py
│       ├── test_rearrange_task.py
│       ├── test_rnn_state_encoder.py
│       ├── test_sensors.py
│       ├── test_spaces.py
│       ├── test_tensor_dict.py
│       └── test_visual_utils.py
├── requirements.txt
├── scenegraph.py
├── segment_anything/
│   ├── .flake8
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── LICENSE
│   ├── README.md
│   ├── linter.sh
│   ├── notebooks/
│   │   ├── automatic_mask_generator_example.ipynb
│   │   ├── onnx_model_example.ipynb
│   │   └── predictor_example.ipynb
│   ├── scripts/
│   │   ├── amg.py
│   │   └── export_onnx_model.py
│   ├── segment_anything/
│   │   ├── __init__.py
│   │   ├── automatic_mask_generator.py
│   │   ├── build_sam.py
│   │   ├── build_sam_hq.py
│   │   ├── modeling/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── image_encoder.py
│   │   │   ├── mask_decoder.py
│   │   │   ├── mask_decoder_hq.py
│   │   │   ├── prompt_encoder.py
│   │   │   ├── sam.py
│   │   │   └── transformer.py
│   │   ├── predictor.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── amg.py
│   │       ├── onnx.py
│   │       └── transforms.py
│   ├── setup.cfg
│   └── setup.py
├── tools/
│   ├── agent.py
│   ├── download_mp.py
│   ├── matterport_category_mappings.tsv
│   ├── obj.npy
│   ├── replica.yaml
│   └── room.npy
└── utils/
    ├── __init__.py
    ├── image_process.py
    ├── utils_fmm/
    │   ├── __init__.py
    │   ├── control_helper.py
    │   ├── depth_utils.py
    │   ├── fmm_planner.py
    │   ├── mapping.py
    │   ├── model.py
    │   ├── pose_utils.py
    │   └── rotation_utils.py
    ├── utils_glip.py
    └── utils_scenegraph/
        ├── __init__.py
        ├── grounded_sam_demo.py
        ├── iou.py
        ├── mapping.py
        ├── slam_classes.py
        └── utils.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
**/__pycache__/
glip_large_model.pth
/data
/GLIP/build
/GLIP/maskrcnn_benchmark.egg-info
/segment_anything/segment_anything.egg-info/
*.so
segment_anything/sam_vit_*
GroundingDINO/groundingdino_swint_ogc.pth
.vscode
start.py
start_multiprocess.py

================================================
FILE: GLIP/CODE_OF_CONDUCT.md
================================================
# Microsoft Open Source Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

Resources:

- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns


================================================
FILE: GLIP/DATA.md
================================================
We provide guidance for preparing the data used by GLIP. Note that not all data are needed for a specific experiment. Please check the ``Required Data`` fields in [README](README.md) to download the necessary data. All data should be placed under the ``DATASET`` folder.


#### ``COCO``
Download the original [COCO](https://cocodataset.org/#download) data into the ``DATASET/coco`` folder. The contents should be organized as follows:

###### train2017
    DATASET/coco/train2017
    DATASET/coco/annotations/instances_train2017.json

###### val2017
    DATASET/coco/val2017
    DATASET/coco/annotations/instances_val2017.json
###### test2017
    DATASET/coco/test2017
    DATASET/coco/annotations/image_info_test-dev2017.json
###### train2014
    DATASET/coco/train2014
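
As a quick sanity check that the layout above is in place, a minimal Python sketch (the paths simply mirror this section and are illustrative; adjust if your ``DATASET`` root lives elsewhere):

```python
import os

# Paths copied from the COCO layout described above.
EXPECTED = [
    "DATASET/coco/train2017",
    "DATASET/coco/annotations/instances_train2017.json",
    "DATASET/coco/val2017",
    "DATASET/coco/annotations/instances_val2017.json",
    "DATASET/coco/test2017",
    "DATASET/coco/annotations/image_info_test-dev2017.json",
    "DATASET/coco/train2014",
]

for path in EXPECTED:
    print(("ok   " if os.path.exists(path) else "MISS ") + path)
```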

#### ``LVIS``
LVIS uses the same images as COCO, so prepare the COCO images first.

    DATASET/coco

Download the following annotation files:

    "wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/coco/annotations/lvis_v1_minival_inserted_image_name.json -O DATASET/coco/annotations/lvis_v1_minival_inserted_image_name.json"
    "wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/coco/annotations/lvis_od_val.json -O coco/annotations/lvis_od_val.json"

#### ``Object Detection in the Wild (ODinW)``
Please see the "ODinW / Custom Dataset Evaluation" section in [README.md](README.md) for preparing the Aquarium dataset. We will release all the data in ODinW in the next update.


#### ``Objects365``
We store Objects365 data in the TSV format. Please see [link](https://github.com/microsoft/scene_graph_benchmark/tree/main/tools/mini_tsv) for a description of the TSV format. We provide the annotation files:

    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/objects365_train_vgoiv6.cas2000.yaml -O DATASET/Objects365/objects365_train_vgoiv6.cas2000.yaml
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/train.label.tsv -O DATASET/Objects365/train.label.tsv
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/train.label.linelist.cas.2000.tsv -O DATASET/Objects365/train.label.linelist.cas.2000.tsv
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/train.label.lineidx -O DATASET/Objects365/train.label.lineidx
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/train.hw.tsv -O DATASET/Objects365/train.hw.tsv
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/train.hw.lineidx -O DATASET/Objects365/train.hw.lineidx
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/Objects365/object365_vgoiv6_class2ind.json -O DATASET/Objects365/object365_vgoiv6_class2ind.json

We cannot host the image data. Please download the original image data and organize them into ``DATASET/Objects365/images.tsv`` and ``DATASET/Objects365/images.lineidx``.
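
As a rough illustration of how such TSV shards are typically read — assuming the ``mini_tsv`` convention linked above, where the ``.lineidx`` file stores one byte offset per TSV row and the image payload is a base64 string in the last tab-separated column (the column layout is an assumption, not taken from this repo's code):

```python
import base64

def read_tsv_row(tsv_path, lineidx_path, row):
    """Random access into a TSV shard via its .lineidx byte offsets."""
    with open(lineidx_path) as f:
        offsets = [int(line) for line in f]
    with open(tsv_path, "rb") as f:
        f.seek(offsets[row])  # jump straight to the requested row
        return f.readline().decode("utf-8").rstrip("\n").split("\t")

# Illustrative usage with the files named in this section.
fields = read_tsv_row("DATASET/Objects365/images.tsv",
                      "DATASET/Objects365/images.lineidx", 0)
image_bytes = base64.b64decode(fields[-1])  # assumed base64 image column
```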
    
#### ``Flickr30K``
Download the Flickr30K images from [Link](http://shannon.cs.illinois.edu/DenotationGraph/) and put them under ``DATASET/flickr30k/flickr30k_images/``. Download the [MDETR annotations](https://zenodo.org/record/4729015/files/mdetr_annotations.tar.gz?download=1) and put them under ``DATASET/mdetr_annotations/``. The dataset structure should look like:

    DATASET/flickr30k/flickr30k_images/
    DATASET/mdetr_annotations/final_flickr_separateGT_*

#### ``MixedGrounding``
This is the grounding dataset curated by [MDETR](https://github.com/ashkamath/mdetr/blob/main/.github/pretrain.md). 
Please prepare the COCO train2014 data and put them under ``DATASET/coco/train2014``. 
Prepare the [GQA images](https://nlp.stanford.edu/data/gqa/images.zip) and put them under ``DATASET/gqa/images/``. 

Then download the annotation files. The original MDETR annotation file contains COCO images; we provide a version without COCO images: ``wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/mdetr_annotations/final_mixed_train_no_coco.json -O DATASET/mdetr_annotations/final_mixed_train_no_coco.json``.

The dataset structure should look like:

    "DATASET/coco/train2014" 
    "DATASET/gqa/images"
    "DATASET/mdetr_annotations/final_mixed_train_no_coco.json",

#### ``GCC``
Google Conceptual Captions with pseudo-grounding annotations.
To be released in the next update.

================================================
FILE: GLIP/LICENSE
================================================
    MIT License

    Copyright (c) Microsoft Corporation.

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all
    copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    SOFTWARE.


================================================
FILE: GLIP/README.md
================================================
# GLIP: Grounded Language-Image Pre-training  

<img src="docs/main_model.png" width="800"> 

## Updates
04/30/2022: Updated [Demo](https://colab.research.google.com/drive/12x7v-_miN7-SRiziK3Cx4ffJzstBJNqb?usp=sharing)!

04/14/2022: GLIP has been accepted to CVPR 2022 as an oral presentation! First version of code and pre-trained models are released!

12/06/2021: GLIP paper on arxiv https://arxiv.org/abs/2112.03857.

11/23/2021: Project page built. <br/>

## Introduction
This repository is the project page for [GLIP](https://arxiv.org/abs/2112.03857). GLIP demonstrates strong zero-shot and few-shot transferability to various object-level recognition tasks.

1. When directly evaluated on COCO and LVIS (without seeing any images in COCO), GLIP achieves 49.8 AP and 26.9 AP, respectively, surpassing many supervised baselines.
2. After fine-tuning on COCO, GLIP achieves 60.8 AP on val and 61.5 AP on test-dev, surpassing prior SoTA.
3. When transferred to 13 downstream object detection tasks, a few-shot GLIP rivals a fully-supervised Dynamic Head.

We provide code for:

1. **pre-training** GLIP on detection and grounding data;
2. **zero-shot evaluating** GLIP on standard benchmarks (COCO, LVIS, Flickr30K) and custom COCO-formatted datasets;
3. **fine-tuning** GLIP on standard benchmarks (COCO) and custom COCO-formatted datasets;
4. **a Colab demo**.

Please see respective sections for instructions.

## Demo
Please see a Colab demo at [link](https://colab.research.google.com/drive/12x7v-_miN7-SRiziK3Cx4ffJzstBJNqb?usp=sharing)!

## Installation and Setup

***Environment***
This repo requires PyTorch>=1.9 and torchvision. We recommend using Docker to set up the environment. You can use this pre-built docker image ``docker pull pengchuanzhang/maskrcnn:ubuntu18-py3.7-cuda10.2-pytorch1.9`` or this one ``docker pull pengchuanzhang/pytorch:ubuntu20.04_torch1.9-cuda11.3-nccl2.9.9``, depending on your GPU.

Then install the following packages:
```
pip install einops shapely timm yacs tensorboardX ftfy prettytable pymongo
pip install transformers 
python setup.py build develop --user
```

***Backbone Checkpoints.*** Download the ImageNet pre-trained backbone checkpoints into the ``MODEL`` folder. 
```
mkdir MODEL
wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/swin_tiny_patch4_window7_224.pth -O MODEL/swin_tiny_patch4_window7_224.pth
wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/swin_large_patch4_window12_384_22k.pth -O MODEL/swin_large_patch4_window12_384_22k.pth
```


## Model Zoo

Model | COCO [1] | LVIS [2] | LVIS [3] | ODinW [4] | Pre-Train Data | Config  | Weight
-- | -- | -- | -- | -- | -- | -- | --
GLIP-T (C) | 46.7 / 55.1 | 14.3 | [17.7](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_tiny_model_o365_goldg_lvisbest.pth) | 44.4 | Objects365,GoldG | [config](configs/pretrain/glip_Swin_T_O365_GoldG.yaml) | [weight](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_tiny_model_o365_goldg.pth)
GLIP-T [5]  | 46.6 / 55.2  | 17.6  | [20.1](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_tiny_model_o365_goldg_cc_sbu_lvisbest.pth) | 42.7 | Objects365,GoldG,CC3M,SBU | [config](configs/pretrain/glip_Swin_T_O365_GoldG.yaml) [6] | [weight](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_tiny_model_o365_goldg_cc_sbu.pth)
GLIP-L [7] | 51.4 / 61.7 [8]  | 29.3 | [30.1](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_large_model_lvisbest.pth) | 51.2 | FourODs,GoldG,CC3M+12M,SBU | [config](configs/pretrain/glip_Swin_L.yaml) [9] | [weight](https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/models/glip_large_model.pth)

[1] Zero-shot and fine-tuning performance on COCO val2017.

[2] Zero-shot performance on LVIS minival (APr) with the last pre-trained checkpoint.

[3] On LVIS, the model could overfit slightly during the pre-training course. Thus we reported two numbers on LVIS: the performance of the last checkpoint (LVIS[2]) and the performance of the best checkpoint during the pre-training course (LVIS[3]).

[4] Zero-shot performance on the 13 ODinW datasets.

[5] GLIP-T released in this repo is pre-trained on Conceptual Captions 3M and SBU captions. It is referred to in the paper in Table 1 and Appendix C.3. It differs slightly from the GLIP-T in the main paper in terms of downstream performance. We will release the pre-training support for using CC3M and SBU captions data in the next update.

[6] This config is only intended for zero-shot evaluation and fine-tuning. Pre-training config with support for using CC3M and SBU captions data will be updated.

[7] GLIP-L released in this repo is pre-trained on Conceptual Captions 3M+12M and SBU captions. It slightly outperforms the GLIP-L in the main paper because the model used to annotate the caption data is improved over the one used for the main paper. We will release the pre-training support for using CC3M+12M and SBU captions data in the next update.

[8] Multi-scale testing used.

[9] This config is only intended for zero-shot evaluation and fine-tuning. Pre-training config with support for using CC3M+12M and SBU captions data to be updated.


## Pre-Training


***Required Data.***  Prepare ``Objects365``, ``Flickr30K``, and ``MixedGrounding`` data as in [DATA.md](DATA.md). Support for training using caption data (Conceptual Captions and SBU captions) will be released soon.

***Command.***

Perform pre-training with the following command (please change the config file accordingly; check the Model Zoo for the corresponding config; change ``{output_dir}`` to your desired output directory):

```
python -m torch.distributed.launch --nnodes 2 --nproc_per_node=16 tools/train_net.py \
    --config-file configs/pretrain/glip_Swin_T_O365_GoldG.yaml \
    --skip-test --use-tensorboard --override_output_dir {output_dir}
```

For training GLIP-T models, we used `nnodes = 2`, `nproc_per_node=16` on 32GB V100 machines. For training GLIP-L models, we used `nnodes = 4`, `nproc_per_node=16` on 32GB V100 machines. Please set up the environment accordingly based on your local machine.


## (Zero-Shot) Evaluation

### COCO Evaluation

Prepare ``COCO/val2017`` data as in [DATA.md](DATA.md). Set ``{config_file}``, ``{model_checkpoint}`` according to the ``Model Zoo``; set ``{output_dir}`` to a folder where the evaluation results will be stored.

```
python tools/test_grounding_net.py --config-file {config_file} --weight {model_checkpoint} \
        TEST.IMS_PER_BATCH 1 \
        MODEL.DYHEAD.SCORE_AGG "MEAN" \
        TEST.EVAL_TASK detection \
        MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS False \
        OUTPUT_DIR {output_dir}
```

### LVIS Evaluation

We follow MDETR to evaluate with the [FixedAP](https://arxiv.org/pdf/2102.01066.pdf) criterion. Set ``{config_file}``, ``{model_checkpoint}`` according to the ``Model Zoo``. Prepare ``COCO/val2017`` data as in [DATA.md](DATA.md).

```
python -m torch.distributed.launch --nproc_per_node=4 \
        tools/test_grounding_net.py \
        --config-file {config_file} \
        --task_config configs/lvis/minival.yaml \
        --weight {model_checkpoint} \
        TEST.EVAL_TASK detection OUTPUT_DIR {output_dir} \
        TEST.CHUNKED_EVALUATION 40  TEST.IMS_PER_BATCH 4 SOLVER.IMS_PER_BATCH 4 TEST.MDETR_STYLE_AGGREGATE_CLASS_NUM 3000 MODEL.RETINANET.DETECTIONS_PER_IMG 300 MODEL.FCOS.DETECTIONS_PER_IMG 300 MODEL.ATSS.DETECTIONS_PER_IMG 300 MODEL.ROI_HEADS.DETECTIONS_PER_IMG 300
```
If you wish to evaluate on Val 1.0, set ``--task_config`` to ``configs/lvis/val.yaml``.


### ODinW / Custom Dataset Evaluation

GLIP supports easy evaluation on a custom dataset. Currently, the code supports evaluation on [COCO-formatted](https://cocodataset.org/#format-data) datasets.

We will use the [Aquarium](https://public.roboflow.com/object-detection/aquarium) dataset from ODinW as an example to show how to evaluate on a custom COCO-formatted dataset.

1. Download the raw dataset from RoboFlow in the COCO format into ``DATASET/odinw/Aquarium``. Each train/val/test split has a corresponding ``annotation`` file and an ``image`` folder.

2. Remove the background class from the annotation file. This can be as simple as opening "_annotations.coco.json" and removing the entry with "id": 0 from "categories" (see the sketch after the evaluation command below). For convenience, we provide the modified annotation files for Aquarium:
    ```
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/odinw/Aquarium/Aquarium%20Combined.v2-raw-1024.coco/test/annotations_without_background.json -O DATASET/odinw/Aquarium/Aquarium\ Combined.v2-raw-1024.coco/test/annotations_without_background.json
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/odinw/Aquarium/Aquarium%20Combined.v2-raw-1024.coco/train/annotations_without_background.json -O DATASET/odinw/Aquarium/Aquarium\ Combined.v2-raw-1024.coco/train/annotations_without_background.json
    wget https://penzhanwu2bbs.blob.core.windows.net/data/GLIPv1_Open/odinw/Aquarium/Aquarium%20Combined.v2-raw-1024.coco/valid/annotations_without_background.json -O DATASET/odinw/Aquarium/Aquarium\ Combined.v2-raw-1024.coco/valid/annotations_without_background.json
    ```
    
3. Then create a yaml file as in ``configs/odinw/Aquarium_Aquarium_Combined.v2-raw-1024.coco.yaml``. A few fields to note in the yaml:

    DATASET.CAPTION_PROMPT allows manually changing the prompt (the default prompt simply concatenates all the categories);

    MODELS.\*.NUM_CLASSES needs to be set to the number of categories in the dataset (including the background class). E.g., Aquarium has 7 non-background categories, so MODELS.\*.NUM_CLASSES is set to 8;

4. Run the following command to evaluate on the dataset. Set ``{config_file}``, ``{model_checkpoint}`` according to the ``Model Zoo``. Set {odinw_configs} to the path of the task yaml file we just prepared.

```
python tools/test_grounding_net.py --config-file {config_file} --weight {model_checkpoint} \
      --task_config {odinw_configs} \
      TEST.IMS_PER_BATCH 1 SOLVER.IMS_PER_BATCH 1 \
      TEST.EVAL_TASK detection \
      DATASETS.TRAIN_DATASETNAME_SUFFIX _grounding \
      DATALOADER.DISTRIBUTE_CHUNK_AMONG_NODE False \
      DATASETS.USE_OVERRIDE_CATEGORY True \
      DATASETS.USE_CAPTION_PROMPT True
```
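
For step 2 above, a minimal sketch of stripping the background category from a COCO-format annotation file; the paths are placeholders, and the printed count is only a convenience for filling in MODELS.\*.NUM_CLASSES from step 3:

```python
import json

ann_path = "path/to/_annotations.coco.json"  # placeholder for your split
with open(ann_path) as f:
    coco = json.load(f)

# Drop the background entry (id 0) from "categories", per step 2.
coco["categories"] = [c for c in coco["categories"] if c["id"] != 0]

with open("annotations_without_background.json", "w") as f:
    json.dump(coco, f)

# NUM_CLASSES for the yaml in step 3: non-background categories + background.
print("MODELS.*.NUM_CLASSES =", len(coco["categories"]) + 1)
```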

### Flickr30K Evaluation
Prepare ``Flickr30K`` data as in [DATA.md](DATA.md). Set ``{config_file}``, ``{model_checkpoint}`` according to the ``Model Zoo``.

```
python tools/test_grounding_net.py \
        --config-file {config_file} \
        --task_config configs/flickr/test.yaml,configs/flickr/val.yaml \
        --weight {model_checkpoint} \
        OUTPUT_DIR {output_dir} TEST.IMS_PER_BATCH 1 SOLVER.IMS_PER_BATCH 1 TEST.MDETR_STYLE_AGGREGATE_CLASS_NUM 100 TEST.EVAL_TASK grounding MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS False
```



## Fine-Tuning

### COCO Fine-Tuning
Prepare the ``COCO`` data as in [DATA.md](DATA.md). Set ``{config_file}``, ``{model_checkpoint}`` according to the ``Model Zoo``.

Below is the fine-tuning script for tuning the Tiny models:
```
python -m torch.distributed.launch --nproc_per_node=16 tools/train_net.py \
       --config-file {config_file} \
       --skip-test \
       MODEL.WEIGHT {model_checkpoint} \
       DATASETS.TRAIN '("coco_grounding_train", )' \
       MODEL.BACKBONE.FREEZE_CONV_BODY_AT -1 SOLVER.IMS_PER_BATCH 32 SOLVER.USE_AMP True SOLVER.MAX_EPOCH 24 TEST.DURING_TRAINING False TEST.IMS_PER_BATCH 16 SOLVER.FIND_UNUSED_PARAMETERS False SOLVER.BASE_LR 0.00001 SOLVER.LANG_LR 0.00001 SOLVER.STEPS \(0.67,0.89\) DATASETS.DISABLE_SHUFFLE True MODEL.DYHEAD.SCORE_AGG "MEAN" TEST.EVAL_TASK detection
```

For evaluation, please follow the instructions in ``COCO Evaluation``. Scripts for tuning the Large model will be released soon.

### ODinW / Custom Dataset Fine-Tuning
Prepare the dataset as in ``ODinW / Custom Dataset Evaluation``.

#### Full Model Fine-Tuning

For tuning with 1/3/5/10-shot, set {custom_shot_and_epoch_and_general_copy} to "1_200_8", "3_200_4", "5_200_2", "10_200_1", respectively.

For tuning with all the data, set {custom_shot_and_epoch_and_general_copy} to "0_200_1"; set SOLVER.STEP_PATIENCE to 2; set SOLVER.AUTO_TERMINATE_PATIENCE to 4.
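
The settings above follow a ``{shot}_{epoch}_{general_copy}`` pattern; purely as an illustration (this helper is not part of the codebase), the strings can be composed as:

```python
# Reproduces the values given in this README; shot 0 means tuning with all data.
SHOT_TO_GENERAL_COPY = {1: 8, 3: 4, 5: 2, 10: 1, 0: 1}

def custom_setting(shot, epoch=200):
    return "{}_{}_{}".format(shot, epoch, SHOT_TO_GENERAL_COPY[shot])

assert custom_setting(3) == "3_200_4"
assert custom_setting(0) == "0_200_1"
```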

```
python -m torch.distributed.launch --nproc_per_node=4 tools/finetune.py \
      --config-file {config_file}  --ft-tasks {configs} --skip-test \
      --custom_shot_and_epoch_and_general_copy {custom_shot_and_epoch_and_general_copy} \
      --evaluate_only_best_on_test --push_both_val_and_test \
      MODEL.WEIGHT {model_checkpoint} \
      SOLVER.USE_AMP True TEST.DURING_TRAINING True TEST.IMS_PER_BATCH 4 SOLVER.IMS_PER_BATCH 4 SOLVER.WEIGHT_DECAY 0.05 TEST.EVAL_TASK detection DATASETS.TRAIN_DATASETNAME_SUFFIX _grounding MODEL.BACKBONE.FREEZE_CONV_BODY_AT 2 MODEL.DYHEAD.USE_CHECKPOINT True SOLVER.FIND_UNUSED_PARAMETERS False SOLVER.TEST_WITH_INFERENCE True SOLVER.USE_AUTOSTEP True DATASETS.USE_OVERRIDE_CATEGORY True SOLVER.SEED 10 DATASETS.SHUFFLE_SEED 3 DATASETS.USE_CAPTION_PROMPT True DATASETS.DISABLE_SHUFFLE True \
      SOLVER.STEP_PATIENCE 3 SOLVER.CHECKPOINT_PER_EPOCH 1.0 SOLVER.AUTO_TERMINATE_PATIENCE 8 SOLVER.MODEL_EMA 0.0 SOLVER.TUNING_HIGHLEVEL_OVERRIDE full
```

#### Prompt Tuning
Follow the command as in ``Full Model Fine-Tuning``, but set the following hyper-parameters:
```
SOLVER.WEIGHT_DECAY 0.25 \
SOLVER.BASE_LR 0.05 \
SOLVER.TUNING_HIGHLEVEL_OVERRIDE language_prompt_v2
```


## Citations
Please consider citing this paper if you use the code:
```
@inproceedings{li2021grounded,
      title={Grounded Language-Image Pre-training},
      author={Liunian Harold Li* and Pengchuan Zhang* and Haotian Zhang* and Jianwei Yang and Chunyuan Li and Yiwu Zhong and Lijuan Wang and Lu Yuan and Lei Zhang and Jenq-Neng Hwang and Kai-Wei Chang and Jianfeng Gao},
      year={2022},
      booktitle={CVPR},
}
```


================================================
FILE: GLIP/SECURITY.md
================================================
<!-- BEGIN MICROSOFT SECURITY.MD V0.0.5 BLOCK -->

## Security

Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).

If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.

## Reporting Security Issues

**Please do not report security vulnerabilities through public GitHub issues.**

Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).

If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com).  If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).

You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc). 

Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:

  * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
  * Full paths of source file(s) related to the manifestation of the issue
  * The location of the affected source code (tag/branch/commit or direct URL)
  * Any special configuration required to reproduce the issue
  * Step-by-step instructions to reproduce the issue
  * Proof-of-concept or exploit code (if possible)
  * Impact of the issue, including how an attacker might exploit the issue

This information will help us triage your report more quickly.

If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs.

## Preferred Languages

We prefer all communications to be in English.

## Policy

Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).

<!-- END MICROSOFT SECURITY.MD BLOCK -->

================================================
FILE: GLIP/SUPPORT.md
================================================
# TODO: The maintainer of this repo has not yet edited this file

**REPO OWNER**: Do you want Customer Service & Support (CSS) support for this product/project?

- **No CSS support:** Fill out this template with information about how to file issues and get help.
- **Yes CSS support:** Fill out an intake form at [aka.ms/spot](https://aka.ms/spot). CSS will work with/help you to determine next steps. More details also available at [aka.ms/onboardsupport](https://aka.ms/onboardsupport).
- **Not sure?** Fill out a SPOT intake as though the answer were "Yes". CSS will help you decide.

*Then remove this first heading from this SUPPORT.MD file before publishing your repo.*

# Support

## How to file issues and get help  

This project uses GitHub Issues to track bugs and feature requests. Please search the existing 
issues before filing new issues to avoid duplicates.  For new issues, file your bug or 
feature request as a new Issue.

For help and questions about using this project, please **REPO MAINTAINER: INSERT INSTRUCTIONS HERE 
FOR HOW TO ENGAGE REPO OWNERS OR COMMUNITY FOR HELP. COULD BE A STACK OVERFLOW TAG OR OTHER
CHANNEL. WHERE WILL YOU HELP PEOPLE?**.

## Microsoft Support Policy  

Support for this **PROJECT or PRODUCT** is limited to the resources listed above.


================================================
FILE: GLIP/configs/flickr/test.yaml
================================================
MODEL:
  ATSS:
    NUM_CLASSES: 8 # Placeholder
  FCOS:
    NUM_CLASSES: 8 # Placeholder
  ROI_BOX_HEAD:
    NUM_CLASSES: 8 # Placeholder
  DYHEAD:
    NUM_CLASSES: 8 # Placeholder
DATASETS:
  TRAIN: ("flickr30k_test", )
  TEST: ("flickr30k_test", )
  FLICKR_GT_TYPE: "separate"

INPUT:
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False

================================================
FILE: GLIP/configs/flickr/val.yaml
================================================
MODEL:
  ATSS:
    NUM_CLASSES: 8 # Placeholder
  FCOS:
    NUM_CLASSES: 8 # Placeholder
  ROI_BOX_HEAD:
    NUM_CLASSES: 8 # Placeholder
  DYHEAD:
    NUM_CLASSES: 8 # Placeholder
DATASETS:
  TRAIN: ("flickr30k_val", )
  TEST: ("flickr30k_val", )
  FLICKR_GT_TYPE: "separate"

INPUT:
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False
SOLVER:
  WARMUP_ITERS: 0
  MAX_EPOCH: 12
  CHECKPOINT_PERIOD: 100
TEST:
  IMS_PER_BATCH: 8

================================================
FILE: GLIP/configs/lvis/minival.yaml
================================================
MODEL:
  ATSS:
    NUM_CLASSES: 8 # these fields are not used; just a placeholder
  FCOS:
    NUM_CLASSES: 8
  ROI_BOX_HEAD:
    NUM_CLASSES: 8
  DYHEAD:
    NUM_CLASSES: 8
DATASETS:
  REGISTER:
    lvis_evaluation_mini_val:
      img_dir: "coco"
      ann_file: "coco/annotations/lvis_v1_minival_inserted_image_name.json"
    lvis_evaluation_val:
      img_dir: "coco"
      ann_file: "coco/annotations/lvis_od_val.json"
  TRAIN: ("lvis_evaluation_mini_val",) 
  TEST: ("lvis_evaluation_mini_val",)

INPUT:
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False
TEST:
  IMS_PER_BATCH: 8


================================================
FILE: GLIP/configs/lvis/val.yaml
================================================
MODEL:
  ATSS:
    NUM_CLASSES: 8 # these fields are not used; just a placeholder
  FCOS:
    NUM_CLASSES: 8
  ROI_BOX_HEAD:
    NUM_CLASSES: 8
  DYHEAD:
    NUM_CLASSES: 8
DATASETS:
  REGISTER:
    lvis_evaluation_mini_val:
      img_dir: "coco"
      ann_file: "coco/annotations/lvis_v1_minival_inserted_image_name.json"
    lvis_evaluation_val:
      img_dir: "coco"
      ann_file: "coco/annotations/lvis_od_val.json"
  TRAIN: ("lvis_evaluation_val",) 
  TEST: ("lvis_evaluation_val",)

INPUT:
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False
TEST:
  IMS_PER_BATCH: 8


================================================
FILE: GLIP/configs/odinw/Aquarium_Aquarium_Combined.v2-raw-1024.coco.yaml
================================================
DATALOADER:
  ASPECT_RATIO_GROUPING: false
  SIZE_DIVISIBILITY: 32

DATASETS:
  GENERAL_COPY: 16
  CAPTION_PROMPT: '[{"prefix": " ", "name": "fish", "suffix": ""}, {"prefix": "", "name": "jellyfish", "suffix": ""}, {"prefix": "", "name": "penguin", "suffix": " , which is black and white"}, {"prefix": "", "name": "puffin", "suffix": " with orange beaks "}, {"prefix": "", "name": "shark", "suffix": ""}, {"prefix": "", "name": "starfish", "suffix": ""}, {"prefix": "", "name": "stingray", "suffix": " which is flat and round"}, ]'
  REGISTER:
    test:
      ann_file: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/test/annotations_without_background.json
      img_dir: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/test
    train:
      ann_file: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/train/annotations_without_background.json
      img_dir: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/train
    val:
      ann_file: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/valid/annotations_without_background.json
      img_dir: odinw/Aquarium/Aquarium Combined.v2-raw-1024.coco/valid
  TEST: ("val",)
  TRAIN: ("train",)
INPUT:
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN: 800
MODEL:
  ATSS:
    NUM_CLASSES: 8
  DYHEAD:
    NUM_CLASSES: 8
  FCOS:
    NUM_CLASSES: 8
  ROI_BOX_HEAD:
    NUM_CLASSES: 8
SOLVER:
  CHECKPOINT_PERIOD: 100
  MAX_EPOCH: 12
  WARMUP_ITERS: 0
TEST:
  IMS_PER_BATCH: 8


================================================
FILE: GLIP/configs/pretrain/glip_Swin_L.yaml
================================================
MODEL:
  META_ARCHITECTURE: "GeneralizedVLRCNN"
  WEIGHT: "swin_large_patch4_window12_384_22k.pth"
  RPN_ONLY: True
  RPN_ARCHITECTURE: "VLDYHEAD"

  BACKBONE:
    CONV_BODY: "SWINT-FPN-RETINANET"
    OUT_CHANNELS: 256

  SWINT:
    EMBED_DIM: 192
    DEPTHS: (2, 2, 18, 2)
    NUM_HEADS: (6, 12, 24, 48)
    WINDOW_SIZE: 12
    OUT_CHANNELS: (192, 384, 768, 1536)
    DROP_PATH_RATE: 0.4

  LANGUAGE_BACKBONE:
    FREEZE: False
    MODEL_TYPE: "bert-base-uncased" # "roberta-base", "clip"
    MASK_SPECIAL: False

  RPN:
    USE_FPN: True
    ANCHOR_SIZES: (64, 128, 256, 512, 1024)
    ANCHOR_STRIDE: (8, 16, 32, 64, 128)
    ASPECT_RATIOS: (1.0,)
    SCALES_PER_OCTAVE: 1

  DYHEAD:
    CHANNELS: 256
    NUM_CONVS: 8
    USE_GN: True
    USE_DYRELU: True
    USE_DFCONV: True
    USE_DYFUSE: True
    TOPK: 9 # topk for selecting candidate positive samples from each level
    SCORE_AGG: "MEAN"
    LOG_SCALE: 0.0

    USE_CHECKPOINT: True
    FUSE_CONFIG:
      USE_FUSED_FEATURES_DOT_PRODUCT: True
      EARLY_FUSE_ON: True
      TYPE: "MHA-B"
      USE_CLASSIFICATION_LOSS: False
      USE_TOKEN_LOSS: False
      USE_CONTRASTIVE_ALIGN_LOSS: False
      CONTRASTIVE_HIDDEN_DIM: 64
      USE_DOT_PRODUCT_TOKEN_LOSS: True
      USE_LAYER_SCALE: True
      CLAMP_MIN_FOR_UNDERFLOW: True
      CLAMP_MAX_FOR_OVERFLOW: True
      CLAMP_BERTATTN_MIN_FOR_UNDERFLOW: True
      CLAMP_BERTATTN_MAX_FOR_OVERFLOW: True
      CLAMP_DOT_PRODUCT: True

DATASETS:

  TRAIN: ("mixed_train_no_coco",) # Place holder dataset for now. To be updated in the next version
  TEST: ("coco_2017_val", )

  ONE_HOT: False
  FLICKR_COPY: 8 # 0.15 * 8 = ~1.2M
  MIXED_COPY: 4 # 0.6 * 4 = ~2.4M
  OBJECT365_COPY: 2 # 1.4 * 2 = ~2.8M
  VG_COPY: 3 # 0.4 * 3 = ~1.2M
  IN_COPY: 2 # 0.67 * 2 = ~1.33M
  OI_COPY: 1 # 2M * 1 = 2M

  DISABLE_SHUFFLE: False
  ADD_DET_PROMPT: False
  RANDOM_SAMPLE_NEG: 85
  CONTROL_PROB: (0.0, 0.0, 0.5, 0.0)
  FURTHER_SCREEN: True
  CAPTION_CONF: 0.5
  CAPTION_NMS: -1.0
  CAPTION_MIN_BOX: 1

  SEPARATION_TOKENS: ". "

  PACK_RANDOM_CAPTION_NUMBER: 20
  NO_RANDOM_PACK_PROBABILITY: 0.4
  RANDOM_PACK_PROB: 0.5
  CAPTION_FORMAT_VERSION: "v2"

INPUT:
  PIXEL_MEAN: [ 103.530, 116.280, 123.675 ]
  PIXEL_STD: [ 57.375, 57.120, 58.395 ]
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333

AUGMENT:
  MULT_MIN_SIZE_TRAIN: (480,560,640,720,800)

DATALOADER:
  SIZE_DIVISIBILITY: 32

SOLVER:
  OPTIMIZER: ADAMW
  BASE_LR: 0.0001
  LANG_LR: 0.00001
  WEIGHT_DECAY: 0.01
  WEIGHT_DECAY_SCHEDULE: True
  STEPS: (0.67, 0.89)
  MAX_ITER: 1000000
  IMS_PER_BATCH: 64
  WARMUP_ITERS: 2000
  WARMUP_FACTOR: 0.001

  FIND_UNUSED_PARAMETERS: False

  CLIP_GRADIENTS:
    ENABLED: True
    CLIP_TYPE: "full_model"
    CLIP_VALUE: 1.0
    NORM_TYPE: 2.0


================================================
FILE: GLIP/configs/pretrain/glip_Swin_T_O365_GoldG.yaml
================================================
MODEL:
  META_ARCHITECTURE: "GeneralizedVLRCNN"
  WEIGHT: "swin_tiny_patch4_window7_224.pth"
  RPN_ONLY: True
  RPN_ARCHITECTURE: "VLDYHEAD"

  BACKBONE:
    CONV_BODY: "SWINT-FPN-RETINANET"
    OUT_CHANNELS: 256
    FREEZE_CONV_BODY_AT: -1

  LANGUAGE_BACKBONE:
    FREEZE: False
    MODEL_TYPE: "bert-base-uncased" # "roberta-base", "clip"
    MASK_SPECIAL: False

  RPN:
    USE_FPN: True
    ANCHOR_SIZES: (64, 128, 256, 512, 1024)
    ANCHOR_STRIDE: (8, 16, 32, 64, 128)
    ASPECT_RATIOS: (1.0,)
    SCALES_PER_OCTAVE: 1

  DYHEAD:
    CHANNELS: 256
    NUM_CONVS: 6
    USE_GN: True
    USE_DYRELU: True
    USE_DFCONV: True
    USE_DYFUSE: True
    TOPK: 9 # topk for selecting candidate positive samples from each level
    SCORE_AGG: "MEAN"
    LOG_SCALE: 0.0

    FUSE_CONFIG:
      EARLY_FUSE_ON: True
      TYPE: "MHA-B"
      USE_CLASSIFICATION_LOSS: False
      USE_TOKEN_LOSS: False
      USE_CONTRASTIVE_ALIGN_LOSS: False
      CONTRASTIVE_HIDDEN_DIM: 64
      USE_DOT_PRODUCT_TOKEN_LOSS: True
      USE_FUSED_FEATURES_DOT_PRODUCT: True
      USE_LAYER_SCALE: True
      CLAMP_MIN_FOR_UNDERFLOW: True
      CLAMP_MAX_FOR_OVERFLOW: True
      CLAMP_BERTATTN_MIN_FOR_UNDERFLOW: True
      CLAMP_BERTATTN_MAX_FOR_OVERFLOW: True
      CLAMP_DOT_PRODUCT: True
           
    USE_CHECKPOINT: True

TEST:
  DURING_TRAINING: False
  IMS_PER_BATCH: 64

# Used for the grounding model
DATASETS:
  TRAIN: ("object365_dt_train", "mixed_train_no_coco", "flickr30k_train", )
  TEST: ("coco_2017_val", )
  DISABLE_SHUFFLE: False
  ADD_DET_PROMPT: False
  RANDOM_SAMPLE_NEG: 85
  CONTROL_PROB: (0.0, 0.0, 0.5, 0.0)

  SEPARATION_TOKENS: ". "

INPUT:
  PIXEL_MEAN: [ 103.530, 116.280, 123.675 ]
  PIXEL_STD: [ 57.375, 57.120, 58.395 ]
  MIN_SIZE_TRAIN: 800
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333

AUGMENT:
  MULT_MIN_SIZE_TRAIN: (480,560,640,720,800)

DATALOADER:
  SIZE_DIVISIBILITY: 32

SOLVER:
  OPTIMIZER: ADAMW
  BASE_LR: 0.0001
  LANG_LR: 0.00001
  WEIGHT_DECAY: 0.0001
  STEPS: (0.67, 0.89)
  MAX_EPOCH: 30
  IMS_PER_BATCH: 64
  WARMUP_ITERS: 2000
  WARMUP_FACTOR: 0.001
  USE_AMP: True
  MODEL_EMA: 0.999
  FIND_UNUSED_PARAMETERS: False

  CLIP_GRADIENTS:
    ENABLED: True
    CLIP_TYPE: "full_model"
    CLIP_VALUE: 1.0
    NORM_TYPE: 2.0
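
These YAML files are consumed by merging them into the yacs defaults defined
in maskrcnn_benchmark/config/defaults.py (shown below). A minimal sketch,
assuming the GLIP directory is on PYTHONPATH and is the working directory:

from maskrcnn_benchmark.config import cfg  # the _C defaults node

cfg.merge_from_file("configs/pretrain/glip_Swin_T_O365_GoldG.yaml")
cfg.merge_from_list(["SOLVER.IMS_PER_BATCH", 8])  # optional CLI-style override
cfg.freeze()
print(cfg.MODEL.RPN_ARCHITECTURE)  # "VLDYHEAD"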

================================================
FILE: GLIP/maskrcnn_benchmark/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.


================================================
FILE: GLIP/maskrcnn_benchmark/config/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
from .defaults import _C as cfg
from .paths_catalog import try_to_find

================================================
FILE: GLIP/maskrcnn_benchmark/config/defaults.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import os

from yacs.config import CfgNode as CN

# -----------------------------------------------------------------------------
# Convention about Training / Test specific parameters
# -----------------------------------------------------------------------------
# Whenever an argument can be either used for training or for testing, the
# corresponding name will be post-fixed by a _TRAIN for a training parameter,
# or _TEST for a test-specific parameter.
# For example, the number of images during training will be
# IMAGES_PER_BATCH_TRAIN, while the number of images for testing will be
# IMAGES_PER_BATCH_TEST

# -----------------------------------------------------------------------------
# Config definition
# -----------------------------------------------------------------------------

_C = CN()

_C.MODEL = CN()
_C.MODEL.RPN_ONLY = False
_C.MODEL.BOX_ON = True
_C.MODEL.MASK_ON = False
_C.MODEL.KEYPOINT_ON = False
_C.MODEL.DEVICE = "cuda"

_C.MODEL.META_ARCHITECTURE = "GeneralizedRCNN"

_C.MODEL.RPN_ARCHITECTURE = "RPN"
_C.MODEL.DEBUG = False  # add debug flag
_C.MODEL.ONNX = False  # add onnx flag

# If the WEIGHT starts with a catalog://, like :R-50, the code will look for
# the path in paths_catalog. Else, it will use it as the specified absolute
# path
_C.MODEL.WEIGHT = ""
_C.MODEL.PRETRAIN_NAME = ""

# If LINEAR_PROB = True, only the last linear layers in rpn and roi_head are trainable
_C.MODEL.LINEAR_PROB = False

# -----------------------------------------------------------------------------
# Multitask Training / Test specific parameters
# -----------------------------------------------------------------------------
_C.MODEL.MULTITASK = CN(new_allowed=True)

# -----------------------------------------------------------------------------
# INPUT
# -----------------------------------------------------------------------------
_C.INPUT = CN()
# Size of the smallest side of the image during training
_C.INPUT.MIN_SIZE_TRAIN = 800  # (800,)
# Maximum size of the side of the image during training
_C.INPUT.MAX_SIZE_TRAIN = 1333
# Size of the smallest side of the image during testing
_C.INPUT.MIN_SIZE_TEST = 800
# Maximum size of the side of the image during testing
_C.INPUT.MAX_SIZE_TEST = 1333
# Values to be used for image normalization
_C.INPUT.PIXEL_MEAN = [102.9801, 115.9465, 122.7717]
# Values to be used for image normalization
_C.INPUT.PIXEL_STD = [1., 1., 1.]
# Convert image to BGR format (for Caffe2 models), in range 0-255
_C.INPUT.TO_BGR255 = True
_C.INPUT.FORMAT = ''
_C.INPUT.FIX_RES = False

# -----------------------------------------------------------------------------
# Augmentation
# -----------------------------------------------------------------------------
_C.AUGMENT = CN()
_C.AUGMENT.USE_RA = 0
_C.AUGMENT.FLIP_PROB_TRAIN = 0.5
_C.AUGMENT.VERTICAL_FLIP_PROB_TRAIN = 0.0
_C.AUGMENT.MULT_MIN_SIZE_TRAIN = ()

_C.AUGMENT.BRIGHTNESS = 0.0
_C.AUGMENT.CONTRAST = 0.0
_C.AUGMENT.SATURATION = 0.0
_C.AUGMENT.HUE = 0.0

_C.AUGMENT.CROP_PROB = 0.5
_C.AUGMENT.CROP_MIN_IOUS = (0.1, 0.3, 0.5, 0.7, 0.9)
_C.AUGMENT.CROP_MIN_SIZE = 0.3

# -----------------------------------------------------------------------------
# Dataset
# -----------------------------------------------------------------------------
_C.DATASETS = CN()
# List of the dataset names for training, as present in paths_catalog.py
_C.DATASETS.TRAIN = ()
# List of the dataset names for testing, as present in paths_catalog.py
_C.DATASETS.TEST = ()
# Use is_crowd label
_C.DATASETS.USE_CROWD = False
_C.DATASETS.CLASS_AGNOSTIC = False
_C.DATASETS.CLASS_CONCAT = False
_C.DATASETS.MAX_BOX = -1
_C.DATASETS.SAMPLE_RATIO = 0.0
_C.DATASETS.FEW_SHOT = 0
# SHUFFLE_SEED != 0 means shuffle the dataset in the few shot setting
_C.DATASETS.SHUFFLE_SEED = 0
_C.DATASETS.PREDEFINED_TEXT = ''
_C.DATASETS.ALTERNATIVE_TRAINING = False
_C.DATASETS.MULTISTAGE_TRAINING = False
_C.DATASETS.REGISTER = CN(new_allowed=True)
_C.DATASETS.BOX_THRESHOLD = 0.1
# Duplicate Dataset
_C.DATASETS.COCO_COPY = 1
_C.DATASETS.LVIS_COPY = 1
_C.DATASETS.FLICKR_COPY = 1
_C.DATASETS.MIXED_COPY = 1
_C.DATASETS.OBJECT365_COPY = 1
_C.DATASETS.VG_COPY = 1
_C.DATASETS.OI_COPY = 1
_C.DATASETS.IN_COPY = 1

_C.DATASETS.GENERAL_COPY = -1
_C.DATASETS.GENERAL_COPY_TEST = -1

# OD to Grounding
_C.DATASETS.RANDOM_SAMPLE_NEG = -1
_C.DATASETS.ADD_DET_PROMPT = False
_C.DATASETS.ADD_DET_PROMPT_ADVANCED = False
_C.DATASETS.USE_OD_AUG = False
_C.DATASETS.USE_COCO_FORMAT = False
_C.DATASETS.CONTROL_PROB = ()
_C.DATASETS.DISABLE_SHUFFLE = False
_C.DATASETS.PROMPT_VERSION = ""
_C.DATASETS.PROMPT_LIMIT_NEG = -1
_C.DATASETS.POS_QUESTION_PROB = 0.6
_C.DATASETS.NEG_QUESTION_PROB = 0.8
_C.DATASETS.FULL_QUESTION_PROB = 0.5
_C.DATASETS.ONE_HOT = False
_C.DATASETS.NO_MINUS_ONE_FOR_ONE_HOT = False

_C.DATASETS.DISABLE_CLIP_TO_IMAGE = False
_C.DATASETS.SEPARATION_TOKENS = " "

# LVIS
_C.DATASETS.LVIS_USE_NORMAL_AP = False
_C.DATASETS.SPECIAL_SAFEGUARD_FOR_COCO_GROUNDING = False

# Caption
_C.DATASETS.BING_INDEX_LIST = []
_C.DATASETS.CAPTION_MIN_BOX = 1
_C.DATASETS.REPLACE_CLEAN_LABEL = False
_C.DATASETS.FURTHER_SCREEN = False
_C.DATASETS.CAPTION_CONF = 0.9
_C.DATASETS.CAPTION_NMS = 0.9
_C.DATASETS.PACK_RANDOM_CAPTION_NUMBER = 0
_C.DATASETS.INFERENCE_CAPTION = False
_C.DATASETS.SAMPLE_NEGATIVE_FOR_GROUNDING_DATA = -1.0
_C.DATASETS.RANDOM_PACK_PROB = -1.0
_C.DATASETS.NO_RANDOM_PACK_PROBABILITY = 0.0
_C.DATASETS.SAFEGUARD_POSITIVE_CAPTION = True
_C.DATASETS.CAPTION_FORMAT_VERSION = "v1"
_C.DATASETS.LOCAL_DEBUG = False


# Od in the wild
_C.DATASETS.PREDEFINED_TEXT = None
_C.DATASETS.TRAIN_DATASETNAME_SUFFIX = ""
_C.DATASETS.TEST_DATASETNAME_SUFFIX = ""
_C.DATASETS.OVERRIDE_CATEGORY = None
_C.DATASETS.USE_OVERRIDE_CATEGORY = False
_C.DATASETS.SUPRESS_QUERY = None
_C.DATASETS.USE_SUPRESS_QUERY = False
_C.DATASETS.USE_CAPTION_PROMPT = False
_C.DATASETS.CAPTION_PROMPT = None

_C.DATASETS.FLICKR_GT_TYPE = "separate"

# VQA
_C.DATASETS.DIVER_BOX_FOR_VQA = False
# -----------------------------------------------------------------------------
# DataLoader
# -----------------------------------------------------------------------------
_C.DATALOADER = CN()
# Number of data loading threads
_C.DATALOADER.NUM_WORKERS = 4
# If > 0, this enforces that each collated batch should have a size divisible
# by SIZE_DIVISIBILITY
_C.DATALOADER.SIZE_DIVISIBILITY = 0
# If True, each batch should contain only images for which the aspect ratio
# is compatible. This groups portrait images together, and landscape images
# are not batched with portrait images.
_C.DATALOADER.ASPECT_RATIO_GROUPING = True
# Define min number of keypoints required from GT, for example 10 out of 17
_C.DATALOADER.MIN_KPS_PER_IMS = 0
# Use random sampler during training
_C.DATALOADER.USE_RANDOM_SEED = False

_C.DATALOADER.DISTRIBUTE_CHUNK_AMONG_NODE = False
# ---------------------------------------------------------------------------- #
# Backbone options
# ---------------------------------------------------------------------------- #
_C.MODEL.BACKBONE = CN()

# The backbone conv body to use
# The string must match a function that is imported in modeling.model_builder
# (e.g., 'FPN.add_fpn_ResNet101_conv5_body' to specify a ResNet-101-FPN
# backbone)
_C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"

# Add StopGrad at a specified stage so the bottom layers are frozen
_C.MODEL.BACKBONE.FREEZE_CONV_BODY_AT = 2
_C.MODEL.BACKBONE.FREEZE = False
_C.MODEL.BACKBONE.GROUP = 1
_C.MODEL.BACKBONE.OUT_CHANNELS = 256 * 4
# Option to reset BN running statistics
_C.MODEL.BACKBONE.RESET_BN = False
# Backbone Normalization Level
_C.MODEL.BACKBONE.NORM_LEVEL = 3
# BN for backbone
_C.MODEL.BACKBONE.USE_BN = False
# Sync BN for backbone
_C.MODEL.BACKBONE.USE_SYNCBN = False
_C.MODEL.BACKBONE.USE_NSYNCBN = False
# GN for backbone
_C.MODEL.BACKBONE.USE_GN = False
# Evo Norm for backbone
_C.MODEL.BACKBONE.USE_EN = False
# Layers for backbone
_C.MODEL.BACKBONE.USE_DFCONV = False
_C.MODEL.BACKBONE.USE_DYRELU = False
_C.MODEL.BACKBONE.USE_SE = False
_C.MODEL.BACKBONE.LAYER_SETUP = (3, 4, 6, 3)
_C.MODEL.BACKBONE.LAYER_SEARCH = CN(new_allowed=True)
_C.MODEL.BACKBONE.OUT_FEATURES = ("stage2", "stage3", "stage4", "stage5")
_C.MODEL.BACKBONE.FPN_LAYER = ()
_C.MODEL.BACKBONE.USE_CHECKPOINT = False
# Add JF efficient det cfgs
_C.MODEL.BACKBONE.EFFICIENT_DET_START_FROM = 3
_C.MODEL.BACKBONE.EFFICIENT_DET_COMPOUND = 0
_C.MODEL.BACKBONE.EFFICIENT_DET_BIFPN_VERSION = 0

_C.MODEL.LANGUAGE_BACKBONE = CN()
_C.MODEL.LANGUAGE_BACKBONE.WEIGHT = ""
_C.MODEL.LANGUAGE_BACKBONE.FREEZE = False
_C.MODEL.LANGUAGE_BACKBONE.USE_CHECKPOINT = False
_C.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE = "bert-base-uncased"
_C.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE = "bert-base-uncased"
_C.MODEL.LANGUAGE_BACKBONE.LANG_DIM = 768
_C.MODEL.LANGUAGE_BACKBONE.MAX_QUERY_LEN = 256
_C.MODEL.LANGUAGE_BACKBONE.N_LAYERS = 1
_C.MODEL.LANGUAGE_BACKBONE.UNUSED_TOKEN = 106
_C.MODEL.LANGUAGE_BACKBONE.MASK_SPECIAL = False

_C.MODEL.LANGUAGE_BACKBONE.RNN_TYPE = "lstm"
_C.MODEL.LANGUAGE_BACKBONE.VARIABLE_LENGTH = True
_C.MODEL.LANGUAGE_BACKBONE.WORD_EMBEDDING_SIZE = 512
_C.MODEL.LANGUAGE_BACKBONE.WORD_VEC_SIZE = 512
_C.MODEL.LANGUAGE_BACKBONE.HIDDEN_SIZE = 512
_C.MODEL.LANGUAGE_BACKBONE.BIDIRECTIONAL = True
_C.MODEL.LANGUAGE_BACKBONE.INPUT_DROPOUT_P = 0.5
_C.MODEL.LANGUAGE_BACKBONE.DROPOUT_P = 0.2
_C.MODEL.LANGUAGE_BACKBONE.CORPUS_PATH = ""
_C.MODEL.LANGUAGE_BACKBONE.VOCAB_SIZE = 0

_C.MODEL.LANGUAGE_BACKBONE.PAD_MAX = True
# ---------------------------------------------------------------------------- #
# FPN options
# ---------------------------------------------------------------------------- #
_C.MODEL.FPN = CN()
_C.MODEL.FPN.FREEZE = False
_C.MODEL.FPN.USE_GN = False
_C.MODEL.FPN.USE_RELU = False
_C.MODEL.FPN.USE_DYRELU = False
_C.MODEL.FPN.DROP_BLOCK = True
_C.MODEL.FPN.DROP_PROB = 0.3
_C.MODEL.FPN.DROP_SIZE = 3
_C.MODEL.FPN.USE_SPP = False
_C.MODEL.FPN.USE_PAN = False
_C.MODEL.FPN.USE_DYHEAD = False
_C.MODEL.FPN.RETURN_SWINT_FEATURE_BEFORE_FUSION = False
# ---------------------------------------------------------------------------- #
# BIFPN options
# ---------------------------------------------------------------------------- #
_C.MODEL.BIFPN = CN()
_C.MODEL.BIFPN.NUM_REPEATS = 1
_C.MODEL.BIFPN.USE_ATTENTION = True

# ---------------------------------------------------------------------------- #
# Group Norm options
# ---------------------------------------------------------------------------- #
_C.MODEL.GROUP_NORM = CN()
# Number of dimensions per group in GroupNorm (-1 if using NUM_GROUPS)
_C.MODEL.GROUP_NORM.DIM_PER_GP = -1
# Number of groups in GroupNorm (-1 if using DIM_PER_GP)
_C.MODEL.GROUP_NORM.NUM_GROUPS = 16
# GroupNorm's small constant in the denominator
_C.MODEL.GROUP_NORM.EPSILON = 1e-5

# ---------------------------------------------------------------------------- #
# Evo Norm options
# ---------------------------------------------------------------------------- #
_C.MODEL.EVO_NORM = CN()
# Number of groups in EvoNorm (-1 if using DIM_PER_GP)
_C.MODEL.EVO_NORM.NUM_GROUPS = 8
# EvoNorm's small constant in the denominator
_C.MODEL.EVO_NORM.EPSILON = 1e-5

# ---------------------------------------------------------------------------- #
# RetinaNet Options (Follow the Detectron version)
# ---------------------------------------------------------------------------- #
_C.MODEL.RETINANET = CN()
# This is the number of foreground classes plus the background class.
_C.MODEL.RETINANET.NUM_CLASSES = 81
# Convolutions to use in the cls and bbox tower
# NOTE: this doesn't include the last conv for logits
_C.MODEL.RETINANET.NUM_CONVS = 4
# During inference, #locs to select based on cls score before NMS is performed
# per FPN level
_C.MODEL.RETINANET.PRE_NMS_TOP_N = 1000
# Prior prob for the positives at the beginning of training. This is used to set
# the bias init for the logits layer
_C.MODEL.RETINANET.PRIOR_PROB = 0.01
# Inference cls score threshold, anchors with score > INFERENCE_TH are
# considered for inference
_C.MODEL.RETINANET.INFERENCE_TH = 0.05
# NMS threshold used in RetinaNet
_C.MODEL.RETINANET.NMS_TH = 0.4
_C.MODEL.RETINANET.DETECTIONS_PER_IMG = 100

# ---------------------------------------------------------------------------- #
# Focal Loss Options (Follow the Detectron version)
# ---------------------------------------------------------------------------- #
_C.MODEL.FOCAL = CN()
# Weight for bbox_regression loss
_C.MODEL.FOCAL.BBOX_REG_WEIGHT = 4.0
# Smooth L1 loss beta for bbox regression
_C.MODEL.FOCAL.BBOX_REG_BETA = 0.11
# IoU overlap ratio for labeling an anchor as positive
# Anchors with >= iou overlap are labeled positive
_C.MODEL.FOCAL.FG_IOU_THRESHOLD = 0.5
# IoU overlap ratio for labeling an anchor as negative
# Anchors with < iou overlap are labeled negative
_C.MODEL.FOCAL.BG_IOU_THRESHOLD = 0.4
# Focal loss parameter: alpha
_C.MODEL.FOCAL.LOSS_ALPHA = 0.25
# Focal loss parameter: gamma
_C.MODEL.FOCAL.LOSS_GAMMA = 2.0

# ---------------------------------------------------------------------------- #
# FCOS Options
# ---------------------------------------------------------------------------- #
_C.MODEL.FCOS = CN()
_C.MODEL.FCOS.NUM_CLASSES = 81  # the number of classes including background
_C.MODEL.FCOS.FPN_STRIDES = [8, 16, 32, 64, 128]
_C.MODEL.FCOS.PRIOR_PROB = 0.01
_C.MODEL.FCOS.INFERENCE_TH = 0.05
_C.MODEL.FCOS.NMS_TH = 0.6
_C.MODEL.FCOS.PRE_NMS_TOP_N = 1000

# the number of convolutions used in the cls and bbox tower
_C.MODEL.FCOS.NUM_CONVS = 4
# if use deformable conv to align features
_C.MODEL.FCOS.USE_DFCONV = False

# if CENTER_SAMPLING_RADIUS <= 0, it will disable center sampling
_C.MODEL.FCOS.CENTER_SAMPLING_RADIUS = 0.0
# IOU_LOSS_TYPE can be "iou", "linear_iou" or "giou"
_C.MODEL.FCOS.IOU_LOSS_TYPE = "iou"

_C.MODEL.FCOS.NORM_REG_TARGETS = False
_C.MODEL.FCOS.CENTERNESS_ON_REG = False
_C.MODEL.FCOS.USE_GT_CENTER = False

_C.MODEL.FCOS.DETECTIONS_PER_IMG = 100
_C.MODEL.FCOS.USE_GN = False
_C.MODEL.FCOS.USE_BN = False

_C.MODEL.FCOS.INFERENCE_TH_TRAIN = 0.0
_C.MODEL.FCOS.PRE_NMS_TOP_N_TRAIN = 3000
_C.MODEL.FCOS.POST_NMS_TOP_N_TRAIN = 1000

# ---------------------------------------------------------------------------- #
# ATSS Options
# ---------------------------------------------------------------------------- #
_C.MODEL.ATSS = CN()
_C.MODEL.ATSS.NUM_CLASSES = 81  # the number of classes including background
_C.MODEL.ATSS.PRIOR_PROB = 0.01
_C.MODEL.ATSS.INFERENCE_TH = 0.05
_C.MODEL.ATSS.NMS_TH = 0.6
_C.MODEL.ATSS.PRE_NMS_TOP_N = 1000

# the number of convolutions used in the cls and bbox tower
_C.MODEL.ATSS.NUM_CONVS = 4
# the channels of convolutions used in the cls and bbox tower
_C.MODEL.ATSS.CHANNELS = 128
# if use deformable conv to align features
_C.MODEL.ATSS.USE_DFCONV = False

# topk for selecting candidate positive samples from each level
_C.MODEL.ATSS.TOPK = 9

# Weight for bbox_regression loss
_C.MODEL.ATSS.REG_LOSS_WEIGHT = 2.0

_C.MODEL.ATSS.DETECTIONS_PER_IMG = 100
_C.MODEL.ATSS.USE_GN = False
_C.MODEL.ATSS.USE_BN = False

_C.MODEL.ATSS.USE_DYRELU = False
_C.MODEL.ATSS.USE_SE = False

_C.MODEL.ATSS.INFERENCE_TH_TRAIN = 0.0
_C.MODEL.ATSS.PRE_NMS_TOP_N_TRAIN = 3000
_C.MODEL.ATSS.POST_NMS_TOP_N_TRAIN = 1000
# ---------------------------------------------------------------------------- #
# DYHEAD Options
# ---------------------------------------------------------------------------- #
_C.MODEL.DYHEAD = CN()
_C.MODEL.DYHEAD.NUM_CLASSES = 81  # the number of classes including background
_C.MODEL.DYHEAD.PRIOR_PROB = 0.01

# the number of convolutions used in the cls and bbox tower
_C.MODEL.DYHEAD.NUM_CONVS = 4
# the channels of convolutions used in the cls and bbox tower
_C.MODEL.DYHEAD.CHANNELS = 128
_C.MODEL.DYHEAD.GROUPS = 1
# if use deformable conv to align features
_C.MODEL.DYHEAD.USE_DFCONV = False

# topk for selecting candidate positive samples from each level
_C.MODEL.DYHEAD.TOPK = 9

_C.MODEL.DYHEAD.SCORE_AGG = "MEAN"  # MEAN or MAX, for binary focal loss score aggregation

_C.MODEL.DYHEAD.LOG_SCALE = 0.0  # temperature (dot product)
_C.MODEL.DYHEAD.SHALLOW_LOG_SCALE = 0.0  # temperature (shallow contrastive)

_C.MODEL.DYHEAD.USE_GN = False
_C.MODEL.DYHEAD.USE_NSYNCBN = False
_C.MODEL.DYHEAD.USE_SYNCBN = False

_C.MODEL.DYHEAD.USE_DYFUSE = False
_C.MODEL.DYHEAD.USE_DYRELU = False

_C.MODEL.DYHEAD.CONV_FUNC = ''

# CosineSimOutputLayers: https://github.com/ucbdrive/few-shot-object-detection/blob/master/fsdet/modeling/roi_heads/fast_rcnn.py#L448-L464
_C.MODEL.DYHEAD.COSINE_SCALE = -1.0

_C.MODEL.DYHEAD.FUSE_CONFIG = CN()
_C.MODEL.DYHEAD.FUSE_CONFIG.EARLY_FUSE_ON = False
_C.MODEL.DYHEAD.FUSE_CONFIG.TYPE = ""
_C.MODEL.DYHEAD.FUSE_CONFIG.JOINT_EMB_SIZE = 256
_C.MODEL.DYHEAD.FUSE_CONFIG.JOINT_OUT_SIZE = 256
_C.MODEL.DYHEAD.FUSE_CONFIG.JOINT_EMB_DROPOUT = 0.1
_C.MODEL.DYHEAD.FUSE_CONFIG.JOINT_MLP_LAYERS = 2

_C.MODEL.DYHEAD.FUSE_CONFIG.USE_CLASSIFICATION_LOSS = False

_C.MODEL.DYHEAD.FUSE_CONFIG.USE_TOKEN_LOSS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.TOKEN_LOSS_WEIGHT = 1.0
_C.MODEL.DYHEAD.FUSE_CONFIG.TOKEN_GAMMA = 2.0
_C.MODEL.DYHEAD.FUSE_CONFIG.TOKEN_ALPHA = 0.25

_C.MODEL.DYHEAD.FUSE_CONFIG.USE_DOT_PRODUCT_TOKEN_LOSS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.USE_CONTRASTIVE_ALIGN_LOSS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.CONTRASTIVE_HIDDEN_DIM = 64
_C.MODEL.DYHEAD.FUSE_CONFIG.CONTRASTIVE_ALIGN_LOSS_WEIGHT = 1.0
_C.MODEL.DYHEAD.FUSE_CONFIG.DOT_PRODUCT_TOKEN_LOSS_WEIGHT = 1.0
_C.MODEL.DYHEAD.FUSE_CONFIG.USE_LAYER_SCALE = True
_C.MODEL.DYHEAD.FUSE_CONFIG.SEPARATE_BIDIRECTIONAL = False
_C.MODEL.DYHEAD.FUSE_CONFIG.STABLE_SOFTMAX_2D = False

_C.MODEL.DYHEAD.FUSE_CONFIG.DO_LANG_PROJ_OUTSIDE_CHECKPOINT = False

_C.MODEL.DYHEAD.FUSE_CONFIG.USE_FUSED_FEATURES_DOT_PRODUCT = False

# Controls for clamping intermediate values to avoid numerical underflow/overflow
_C.MODEL.DYHEAD.FUSE_CONFIG.CLAMP_MIN_FOR_UNDERFLOW = False
_C.MODEL.DYHEAD.FUSE_CONFIG.CLAMP_MAX_FOR_OVERFLOW = False
_C.MODEL.DYHEAD.FUSE_CONFIG.CLAMP_BERTATTN_MIN_FOR_UNDERFLOW = False
_C.MODEL.DYHEAD.FUSE_CONFIG.CLAMP_BERTATTN_MAX_FOR_OVERFLOW = False
_C.MODEL.DYHEAD.FUSE_CONFIG.CLAMP_DOT_PRODUCT = False

# MLM Loss
_C.MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS_FOR_ONLY_POSITIVES = True
_C.MODEL.DYHEAD.FUSE_CONFIG.NO_MASK_FOR_OD = False
_C.MODEL.DYHEAD.FUSE_CONFIG.NO_MASK_FOR_GOLD = False
_C.MODEL.DYHEAD.FUSE_CONFIG.MLM_LOSS_COEF = 1.0
_C.MODEL.DYHEAD.FUSE_CONFIG.MLM_OBJ_FOR_ONLY_POSITIVE = False

# Shallow Contrastive Loss (FPN)
_C.MODEL.DYHEAD.FUSE_CONFIG.USE_SHALLOW_CONTRASTIVE_LOSS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.SHALLOW_MAX_POSITIVE_ANCHORS = 100
_C.MODEL.DYHEAD.FUSE_CONFIG.USE_SHALLOW_ZERO_PADS = False
_C.MODEL.DYHEAD.FUSE_CONFIG.SHALLOW_CONTRASTIVE_HIDDEN_DIM = 64
_C.MODEL.DYHEAD.FUSE_CONFIG.SHALLOW_CONTRASTIVE_LOSS_WEIGHT = 1.0

# Shallow Contrastive Loss (BACKBONE)
_C.MODEL.DYHEAD.FUSE_CONFIG.USE_BACKBONE_SHALLOW_CONTRASTIVE_LOSS = False

_C.MODEL.DYHEAD.FUSE_CONFIG.ADD_LINEAR_LAYER = False

# use checkpoint to save memory
_C.MODEL.DYHEAD.USE_CHECKPOINT = False

# ---------------------------------------------------------------------------- #
# RPN options
# ---------------------------------------------------------------------------- #
_C.MODEL.RPN = CN()
_C.MODEL.RPN.USE_FPN = False
# Base RPN anchor sizes given in absolute pixels w.r.t. the scaled network input
_C.MODEL.RPN.ANCHOR_SIZES = (32, 64, 128, 256, 512)
# Stride of the feature map that the RPN is attached to.
# For FPN, number of strides should match number of scales
_C.MODEL.RPN.ANCHOR_STRIDE = (16,)
# RPN anchor aspect ratios
_C.MODEL.RPN.ASPECT_RATIOS = (0.5, 1.0, 2.0)
# Anchor shift-away ratio from the center for r, t, l, d
_C.MODEL.RPN.ANCHOR_SHIFT = (0.0, 0.0, 0.0, 0.0)
# Use center to decide anchor size
_C.MODEL.RPN.USE_RELATIVE_SIZE = False
# Remove RPN anchors that go outside the image by RPN_STRADDLE_THRESH pixels
# Set to -1 or a large value, e.g. 100000, to disable pruning anchors
_C.MODEL.RPN.STRADDLE_THRESH = 0
# Anchor scales per octave for complex anchors
_C.MODEL.RPN.OCTAVE = 2.0
_C.MODEL.RPN.SCALES_PER_OCTAVE = 3
# Minimum overlap required between an anchor and ground-truth box for the
# (anchor, gt box) pair to be a positive example (IoU >= FG_IOU_THRESHOLD
# ==> positive RPN example)
_C.MODEL.RPN.FG_IOU_THRESHOLD = 0.7
# Maximum overlap allowed between an anchor and ground-truth box for the
# (anchor, gt box) pair to be a negative example (IoU < BG_IOU_THRESHOLD
# ==> negative RPN example)
_C.MODEL.RPN.BG_IOU_THRESHOLD = 0.3
# Total number of RPN examples per image
_C.MODEL.RPN.BATCH_SIZE_PER_IMAGE = 256
# Target fraction of foreground (positive) examples per RPN minibatch
_C.MODEL.RPN.POSITIVE_FRACTION = 0.5
# Number of top scoring RPN proposals to keep before applying NMS
# When FPN is used, this is *per FPN level* (not total)
_C.MODEL.RPN.PRE_NMS_TOP_N_TRAIN = 12000
_C.MODEL.RPN.PRE_NMS_TOP_N_TEST = 6000
# Number of top scoring RPN proposals to keep after applying NMS
_C.MODEL.RPN.POST_NMS_TOP_N_TRAIN = 2000
_C.MODEL.RPN.POST_NMS_TOP_N_TEST = 1000
# NMS threshold used on RPN proposals
_C.MODEL.RPN.NMS_THRESH = 0.7
# Proposal height and width both need to be greater than RPN_MIN_SIZE
# (at the scale used during training or inference)
_C.MODEL.RPN.MIN_SIZE = 0
# Number of top scoring RPN proposals to keep after combining proposals from
# all FPN levels
_C.MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN = 2000
_C.MODEL.RPN.FPN_POST_NMS_TOP_N_TEST = 2000
# Custom rpn head, empty to use default conv or separable conv
_C.MODEL.RPN.RPN_HEAD = "SingleConvRPNHead"
_C.MODEL.RPN.FREEZE = False
_C.MODEL.RPN.FORCE_BOXES = False
_C.MODEL.RPN.RETURN_FUSED_FEATURES = False

# ---------------------------------------------------------------------------- #
# ROI HEADS options
# ---------------------------------------------------------------------------- #
_C.MODEL.ROI_HEADS = CN()
_C.MODEL.ROI_HEADS.USE_FPN = False
# Overlap threshold for an RoI to be considered foreground (if >= FG_IOU_THRESHOLD)
_C.MODEL.ROI_HEADS.FG_IOU_THRESHOLD = 0.5
# Overlap threshold for an RoI to be considered background
# (class = 0 if overlap in [0, BG_IOU_THRESHOLD))
_C.MODEL.ROI_HEADS.BG_IOU_THRESHOLD = 0.5
# Default weights on (dx, dy, dw, dh) for normalizing bbox regression targets
# These are empirically chosen to approximately lead to unit variance targets
_C.MODEL.ROI_HEADS.BBOX_REG_WEIGHTS = (10., 10., 5., 5.)
# RoI minibatch size *per image* (number of regions of interest [ROIs])
# Total number of RoIs per training minibatch =
#   TRAIN.BATCH_SIZE_PER_IM * TRAIN.IMS_PER_BATCH * NUM_GPUS
# E.g., a common configuration is: 512 * 2 * 8 = 8192
_C.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512
# Target fraction of RoI minibatch that is labeled foreground (i.e. class > 0)
_C.MODEL.ROI_HEADS.POSITIVE_FRACTION = 0.25

# Only used in test mode

# Minimum score threshold (assuming scores in a [0, 1] range); a value chosen to
# balance obtaining high recall with not having too many low precision
# detections that will slow down inference post-processing steps (like NMS)
_C.MODEL.ROI_HEADS.SCORE_THRESH = 0.05
# Overlap threshold used for non-maximum suppression (suppress boxes with
# IoU >= this threshold)
_C.MODEL.ROI_HEADS.NMS = 0.5
# Maximum number of detections to return per image (100 is based on the limit
# established for the COCO dataset)
_C.MODEL.ROI_HEADS.DETECTIONS_PER_IMG = 100

_C.MODEL.ROI_BOX_HEAD = CN()
_C.MODEL.ROI_BOX_HEAD.FEATURE_EXTRACTOR = "ResNet50Conv5ROIFeatureExtractor"
_C.MODEL.ROI_BOX_HEAD.PREDICTOR = "FastRCNNPredictor"
_C.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION = 14
_C.MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO = 0
_C.MODEL.ROI_BOX_HEAD.POOLER_SCALES = (1.0 / 16,)
_C.MODEL.ROI_BOX_HEAD.NUM_CLASSES = 81
# Hidden layer dimension when using an MLP for the RoI box head
_C.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM = 1024
# GN
_C.MODEL.ROI_BOX_HEAD.USE_GN = False
# Dilation
_C.MODEL.ROI_BOX_HEAD.DILATION = 1
_C.MODEL.ROI_BOX_HEAD.CONV_HEAD_DIM = 256
_C.MODEL.ROI_BOX_HEAD.NUM_STACKED_CONVS = 4
# Use D2 style ROIAlignV2
_C.MODEL.ROI_BOX_HEAD.POOLER_ALIGNED = False

_C.MODEL.ROI_MASK_HEAD = CN()
_C.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR = "ResNet50Conv5ROIFeatureExtractor"
_C.MODEL.ROI_MASK_HEAD.PREDICTOR = "MaskRCNNC4Predictor"
_C.MODEL.ROI_MASK_HEAD.POOLER_RESOLUTION = 14
_C.MODEL.ROI_MASK_HEAD.POOLER_SAMPLING_RATIO = 0
_C.MODEL.ROI_MASK_HEAD.POOLER_SCALES = (1.0 / 16,)
_C.MODEL.ROI_MASK_HEAD.MLP_HEAD_DIM = 1024
_C.MODEL.ROI_MASK_HEAD.CONV_LAYERS = (256, 256, 256, 256)
_C.MODEL.ROI_MASK_HEAD.RESOLUTION = 14
_C.MODEL.ROI_MASK_HEAD.SHARE_BOX_FEATURE_EXTRACTOR = True
# Whether or not to resize and translate masks to the input image.
_C.MODEL.ROI_MASK_HEAD.POSTPROCESS_MASKS = False
_C.MODEL.ROI_MASK_HEAD.POSTPROCESS_MASKS_THRESHOLD = 0.5
# Dilation
_C.MODEL.ROI_MASK_HEAD.DILATION = 1
# GN
_C.MODEL.ROI_MASK_HEAD.USE_GN = False
# HG
_C.MODEL.ROI_MASK_HEAD.HG_SCALE = 1

_C.MODEL.ROI_KEYPOINT_HEAD = CN()
_C.MODEL.ROI_KEYPOINT_HEAD.FEATURE_EXTRACTOR = "KeypointRCNNFeatureExtractor"
_C.MODEL.ROI_KEYPOINT_HEAD.PREDICTOR = "KeypointRCNNPredictor"
_C.MODEL.ROI_KEYPOINT_HEAD.POOLER_RESOLUTION = 14
_C.MODEL.ROI_KEYPOINT_HEAD.POOLER_SAMPLING_RATIO = 0
_C.MODEL.ROI_KEYPOINT_HEAD.POOLER_SCALES = (1.0 / 16,)
_C.MODEL.ROI_KEYPOINT_HEAD.MLP_HEAD_DIM = 1024
_C.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS = tuple(512 for _ in range(8))
_C.MODEL.ROI_KEYPOINT_HEAD.RESOLUTION = 14
_C.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES = 17
_C.MODEL.ROI_KEYPOINT_HEAD.KEYPOINT_NAME = ()  # If left empty, use default names
_C.MODEL.ROI_KEYPOINT_HEAD.SHARE_BOX_FEATURE_EXTRACTOR = True

# ---------------------------------------------------------------------------- #
# ResNe[X]t options (ResNets = {ResNet, ResNeXt})
# Note that parts of a resnet may be used for both the backbone and the head
# These options apply to both
# ---------------------------------------------------------------------------- #
_C.MODEL.RESNETS = CN()

_C.MODEL.RESNETS.USE_STEM3X3 = False
_C.MODEL.RESNETS.WITH_SE = False
_C.MODEL.RESNETS.USE_AVG_DOWN = False

# Number of groups to use; 1 ==> ResNet; > 1 ==> ResNeXt
_C.MODEL.RESNETS.NUM_GROUPS = 1

# Baseline width of each group
_C.MODEL.RESNETS.WIDTH_PER_GROUP = 64

# Place the stride 2 conv on the 1x1 filter
# Use True only for the original MSRA ResNet; use False for C2 and Torch models
_C.MODEL.RESNETS.STRIDE_IN_1X1 = True

# Residual transformation function
_C.MODEL.RESNETS.TRANS_FUNC = "BottleneckWithFixedBatchNorm"
# ResNet's stem function (conv1 and pool1)
_C.MODEL.RESNETS.STEM_FUNC = "StemWithFixedBatchNorm"

# Apply dilation in stage "res5"
_C.MODEL.RESNETS.RES5_DILATION = 1

_C.MODEL.RESNETS.BACKBONE_OUT_CHANNELS = 256 * 4
_C.MODEL.RESNETS.RES2_OUT_CHANNELS = 256
_C.MODEL.RESNETS.STEM_OUT_CHANNELS = 64

_C.MODEL.RESNETS.REVISION = "resnet_light"
# Deformable convolutions
_C.MODEL.RESNETS.STAGE_WITH_DCN = (False, False, False, False)
_C.MODEL.RESNETS.WITH_MODULATED_DCN = False
_C.MODEL.RESNETS.DEFORMABLE_GROUPS = 1

# ---------------------------------------------------------------------------- #
# Swin Transformer
# ---------------------------------------------------------------------------- #
_C.MODEL.SWINT = CN()
_C.MODEL.SWINT.EMBED_DIM = 96
_C.MODEL.SWINT.OUT_CHANNELS = (96, 192, 384, 768)
_C.MODEL.SWINT.DEPTHS = (2, 2, 6, 2)
_C.MODEL.SWINT.NUM_HEADS = (3, 6, 12, 24)
_C.MODEL.SWINT.WINDOW_SIZE = 7
_C.MODEL.SWINT.MLP_RATIO = 4
_C.MODEL.SWINT.DROP_PATH_RATE = 0.2
_C.MODEL.SWINT.APE = False
_C.MODEL.SWINT.VERSION = "v1"
_C.MODEL.SWINT.OUT_NORM = True
_C.MODEL.SWINT.LAYER_SCALE = 0

# ---------------------------------------------------------------------------- #
# CVT SPEC
# ---------------------------------------------------------------------------- #
_C.MODEL.SPEC = CN(new_allowed=True)

# ---------------------------------------------------------------------------- #
# CLIP SPEC
# ---------------------------------------------------------------------------- #
_C.MODEL.CLIP = CN()
_C.MODEL.CLIP.CONTEXT_LENGTH = 256  # default 77
_C.MODEL.CLIP.WIDTH = 512
_C.MODEL.CLIP.LAYERS = 12
_C.MODEL.CLIP.HEADS = 8
_C.MODEL.CLIP.DROP_PATH = 0.0
_C.MODEL.CLIP.TOKENIZER = "clip"
_C.MODEL.CLIP.VOCAB_SIZE = 49408

# ---------------------------------------------------------------------------- #
# SEARCH
# ---------------------------------------------------------------------------- #

_C.SEARCH = CN()
_C.SEARCH.MAX_EPOCH = 20
_C.SEARCH.SELECT_NUM = 20
_C.SEARCH.POPULATION_NUM = 64
_C.SEARCH.MUTATION_NUM = 24
_C.SEARCH.CROSSOVER_NUM = 24
_C.SEARCH.MUTATION_PROB = 0.1

# ---------------------------------------------------------------------------- #
# Solver
# ---------------------------------------------------------------------------- #
_C.SOLVER = CN()
_C.SOLVER.USE_AMP = False

_C.SOLVER.MAX_ITER = 40000
_C.SOLVER.MULTI_MAX_ITER = ()  # set a different max iter for each stage
_C.SOLVER.MAX_EPOCH = 0  # any epoch number > 0 overrides MAX_ITER
_C.SOLVER.MULTI_MAX_EPOCH = ()  # set a different max epoch for each stage

_C.SOLVER.OPTIMIZER = "SGD"  # "ADAMW"

_C.SOLVER.BASE_LR = 0.001

_C.SOLVER.LANG_LR = 0.00001
_C.SOLVER.BACKBONE_BODY_LR_FACTOR = 1.0

_C.SOLVER.BIAS_LR_FACTOR = 2
_C.SOLVER.GRAD_CLIP = 0.0
# D2 gradient clip
_C.SOLVER.CLIP_GRADIENTS = CN()
_C.SOLVER.CLIP_GRADIENTS.ENABLED = False
_C.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 0.0
_C.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "full_model"
_C.SOLVER.CLIP_GRADIENTS.NORM_TYPE = 2.0
_C.SOLVER.MODEL_EMA = 0.0

_C.SOLVER.MOMENTUM = 0.9

_C.SOLVER.WEIGHT_DECAY = 0.0005
_C.SOLVER.WEIGHT_DECAY_BIAS = 0.0
_C.SOLVER.WEIGHT_DECAY_NORM_FACTOR = 1.0

# use a cosine LR schedule instead of the default multi-step schedule
_C.SOLVER.USE_COSINE = False
_C.SOLVER.MIN_LR = 0.000001

_C.SOLVER.GAMMA = 0.1
_C.SOLVER.STEPS = (30000,)

_C.SOLVER.USE_AUTOSTEP = False
_C.SOLVER.STEP_PATIENCE = 5

_C.SOLVER.WARMUP_FACTOR = 1.0 / 3
_C.SOLVER.WARMUP_ITERS = 500
_C.SOLVER.WARMUP_METHOD = "linear"

_C.SOLVER.CHECKPOINT_PERIOD = 2500
_C.SOLVER.CHECKPOINT_PER_EPOCH = -1.0
_C.SOLVER.TEST_WITH_INFERENCE = False
_C.SOLVER.AUTO_TERMINATE_PATIENCE = -1
# Number of images per batch
# This is global, so if we have 8 GPUs and IMS_PER_BATCH = 16, each GPU will
# see 2 images per batch
_C.SOLVER.IMS_PER_BATCH = 16
# This is the max negative ratio allowed per batch
_C.SOLVER.MAX_NEG_PER_BATCH = 0.1

_C.SOLVER.SEED = 0
_C.SOLVER.DISABLE_OUTPUT_DISTRIBUTED = False


_C.SOLVER.PROMPT_PROBING_LEVEL = -1.0 
# -1 means tuning the whole model; 
# 1 means tuning the whole language model; 1.5 means tuning the box head as well

_C.SOLVER.FIND_UNUSED_PARAMETERS = True
_C.SOLVER.DATASET_LENGTH = -1  # Just for logging purposes
_C.SOLVER.TUNING_HIGHLEVEL_OVERRIDE = None
_C.SOLVER.USE_EMA_FOR_MONITOR = False

_C.SOLVER.WEIGHT_DECAY_SCHEDULE = False
_C.SOLVER.WEIGHT_DECAY_SCHEDULE_RATIO = 0.667

# ---------------------------------------------------------------------------- #
# Specific test options
# ---------------------------------------------------------------------------- #
_C.TEST = CN()
_C.TEST.EXPECTED_RESULTS = []
_C.TEST.EXPECTED_RESULTS_SIGMA_TOL = 4
_C.TEST.DURING_TRAINING = False
# Number of images per batch
# This is global, so if we have 8 GPUs and IMS_PER_BATCH = 16, each GPU will
# see 2 images per batch
_C.TEST.IMS_PER_BATCH = 16
# Special Test Configuration
_C.TEST.USE_MULTISCALE = False
# _C.TEST.SCALES = (400, 600, 800, 1000, 1200, 1400)
# _C.TEST.RANGES = ((96, 10000), (64, 10000), (0, 10000), (0, 10000), (0, 256), (0, 192))
_C.TEST.SCALES = (400, 500, 600, 640, 700, 900, 1000, 1100, 1200, 1300, 1400, 1800)
_C.TEST.RANGES = ((96, 10000), (96, 10000), (64, 10000), (64, 10000), (64, 10000), (0, 10000), (0, 10000), (0, 256), (0, 256), (0, 192), (0, 192), (0, 96))
_C.TEST.MAX_SIZE = 2500
_C.TEST.FLIP = True
_C.TEST.SPECIAL_NMS = 'none'  # ('none', 'soft-nms', 'vote', 'soft-vote')
_C.TEST.TH = 0.6  # threshold for nms or vote
_C.TEST.PRE_NMS_TOP_N = 1000
_C.TEST.NUM_CLASSES = 81
_C.TEST.SELECT_CLASSES = ()

_C.TEST.EVAL_TASK = ""
_C.TEST.SUBSET = -1
_C.TEST.CHUNKED_EVALUATION = -1
_C.TEST.MDETR_STYLE_AGGREGATE_CLASS_NUM = -1
# ---------------------------------------------------------------------------- #
# Misc options
# ---------------------------------------------------------------------------- #
_C.OUTPUT_DIR = "OUTPUT"

_C.PATHS_CATALOG = os.path.join(os.path.dirname(__file__), "paths_catalog.py")

# TensorBoard experiment location
_C.TENSORBOARD_EXP = "OUTPUT"
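
Several nodes above (MODEL.MULTITASK, MODEL.BACKBONE.LAYER_SEARCH,
DATASETS.REGISTER, MODEL.SPEC) are created with CN(new_allowed=True), which
lets merged configs introduce keys the defaults never declared. A minimal
self-contained sketch of that behavior (my_set and img_dir are hypothetical
keys used only for illustration):

from yacs.config import CfgNode as CN

base = CN()
base.DATASETS = CN()
base.DATASETS.REGISTER = CN(new_allowed=True)

override = CN()
override.DATASETS = CN()
override.DATASETS.REGISTER = CN()
override.DATASETS.REGISTER.my_set = CN()
override.DATASETS.REGISTER.my_set.img_dir = "custom/images"

base.merge_from_other_cfg(override)  # accepted because REGISTER allows new keys
print(base.DATASETS.REGISTER.my_set.img_dir)  # "custom/images"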


================================================
FILE: GLIP/maskrcnn_benchmark/config/paths_catalog.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
"""Centralized catalog of paths."""

import os


def try_to_find(file, return_dir=False, search_path=['./DATASET', './OUTPUT', './data', './MODEL']):
    if not file:
        return file

    if file.startswith('catalog://'):
        return file

    DATASET_PATH = ['./']
    if 'DATASET' in os.environ:
        DATASET_PATH.append(os.environ['DATASET'])
    DATASET_PATH += search_path

    for path in DATASET_PATH:
        if os.path.exists(os.path.join(path, file)):
            if return_dir:
                return path
            else:
                return os.path.join(path, file)

    print('Cannot find {} in {}'.format(file, DATASET_PATH))
    exit(1)


class DatasetCatalog(object):
    DATASETS = {
        # pretrained grounding dataset
        # mixed vg and coco
        "mixed_train": {
            "coco_img_dir": "coco/train2014",
            "vg_img_dir": "gqa/images",
            "ann_file": "mdetr_annotations/final_mixed_train.json",
        },
        "mixed_train_no_coco": {
            "coco_img_dir": "coco/train2014",
            "vg_img_dir": "gqa/images",
            "ann_file": "mdetr_annotations/final_mixed_train_no_coco.json",
        },

        # flickr30k
        "flickr30k_train": {
            "img_folder": "flickr30k/flickr30k_images/train",
            "ann_file": "mdetr_annotations/final_flickr_separateGT_train.json",
            "is_train": True
        },
        "flickr30k_val": {
            "img_folder": "flickr30k/flickr30k_images/val",
            "ann_file": "mdetr_annotations/final_flickr_separateGT_val.json",
            "is_train": False
        },
        "flickr30k_test": {
            "img_folder": "flickr30k/flickr30k_images/test",
            "ann_file": "mdetr_annotations/final_flickr_separateGT_test.json",
            "is_train": False
        },

        # refcoco
        "refexp_all_val": {
            "img_dir": "refcoco/train2014",
            "ann_file": "mdetr_annotations/final_refexp_val.json",
            "is_train": False
        },

        # gqa
        "gqa_val": {
            "img_dir": "gqa/images",
            "ann_file": "mdetr_annotations/final_gqa_val.json",
            "is_train": False
        },

        # phrasecut
        "phrasecut_train": {
            "img_dir": "gqa/images",
            "ann_file": "mdetr_annotations/finetune_phrasecut_train.json",
            "is_train": True
        },


        # od to grounding
        # coco tsv
        "coco_dt_train": {
            "dataset_file": "coco_dt",
            "yaml_path": "coco_tsv/coco_obj.yaml",
            "is_train": True,
        },
        "COCO_odinw_train_8copy_dt_train": {
            "dataset_file": "coco_odinw_dt",
            "yaml_path": "coco_tsv/COCO_odinw_train_8copy.yaml",
            "is_train": True,
        },
        "COCO_odinw_val_dt_train": {
            "dataset_file": "coco_odinw_dt",
            "yaml_path": "coco_tsv/COCO_odinw_val.yaml",
            "is_train": False,
        },
        # lvis tsv
        "lvisv1_dt_train": {
            "dataset_file": "lvisv1_dt",
            "yaml_path": "coco_tsv/LVIS_v1_train.yaml",
            "is_train": True,
        },
        "LVIS_odinw_train_8copy_dt_train": {
            "dataset_file": "coco_odinw_dt",
            "yaml_path": "coco_tsv/LVIS_odinw_train_8copy.yaml",
            "is_train": True,
        },
        # object365 tsv
        "object365_dt_train": {
            "dataset_file": "object365_dt",
            "yaml_path": "Objects365/objects365_train_vgoiv6.cas2000.yaml",
            "is_train": True,
        },
        "object365_odinw_2copy_dt_train": {
            "dataset_file": "object365_odinw_dt",
            "yaml_path": "Objects365/objects365_train_odinw.cas2000_2copy.yaml",
            "is_train": True,
        },
        "objects365_odtsv_train": {
            "dataset_file": "objects365_odtsv",
            "yaml_path": "Objects365/train.cas2000.yaml",
            "is_train": True,
        },
        "objects365_odtsv_val": {
            "dataset_file": "objects365_odtsv",
            "yaml_path": "Objects365/val.yaml",
            "is_train": False,
        },

        # ImageNet OD
        "imagenetod_train_odinw_2copy_dt": {
            "dataset_file": "imagenetod_odinw_dt",
            "yaml_path": "imagenet_od/imagenetod_train_odinw_2copy.yaml",
            "is_train": True,
        },

        # OpenImage OD
        "oi_train_odinw_dt": {
            "dataset_file": "oi_odinw_dt",
            "yaml_path": "openimages_v5c/oi_train_odinw.cas.2000.yaml",
            "is_train": True,
        },

        # vg tsv
        "vg_dt_train": {
            "dataset_file": "vg_dt",
            "yaml_path": "visualgenome/train_vgoi6_clipped.yaml",
            "is_train": True,
        },

        "vg_odinw_clipped_8copy_dt_train": {
            "dataset_file": "vg_odinw_clipped_8copy_dt",
            "yaml_path": "visualgenome/train_odinw_clipped_8copy.yaml",
            "is_train": True,
        },
        "vg_vgoi6_clipped_8copy_dt_train": {
            "dataset_file": "vg_vgoi6_clipped_8copy_dt",
            "yaml_path": "visualgenome/train_vgoi6_clipped_8copy.yaml",
            "is_train": True,
        },

        # coco json
        "coco_grounding_train": {
            "img_dir": "coco/train2017",
            "ann_file": "coco/annotations/instances_train2017.json",
            "is_train": True,
        },

        "lvis_grounding_train": {
            "img_dir": "coco",
            "ann_file": "coco/annotations/lvis_od_train.json"
        },


        "lvis_val": {
            "img_dir": "coco",
            "ann_file": "coco/annotations/lvis_od_val.json"
        },
        "coco_2017_train": {
            "img_dir": "coco/train2017",
            "ann_file": "coco/annotations/instances_train2017.json"
        },
        "coco_2017_val": {
            "img_dir": "coco/val2017",
            "ann_file": "coco/annotations/instances_val2017.json"
        },
        "coco_2017_test": {
            "img_dir": "coco/test2017",
            "ann_file": "coco/annotations/image_info_test-dev2017.json"
        },
        "coco_2014_train": {
            "img_dir": "coco/train2014",
            "ann_file": "coco/annotations/instances_train2014.json"
        },
        "coco_2014_val": {
            "img_dir": "coco/val2014",
            "ann_file": "coco/annotations/instances_val2014.json"
        },
        "coco_2014_minival": {
            "img_dir": "coco/val2014",
            "ann_file": "coco/annotations/instances_minival2014.json"
        },
    }

    @staticmethod
    def set(name, info):
        DatasetCatalog.DATASETS.update({name: info})

    @staticmethod
    def get(name):

        if name.endswith('_bg'):
            attrs = DatasetCatalog.DATASETS[name]
            data_dir = try_to_find(attrs["ann_file"], return_dir=True)
            args = dict(
                root=os.path.join(data_dir, attrs["img_dir"]),
                ann_file=os.path.join(data_dir, attrs["ann_file"]),
            )
            return dict(
                factory="Background",
                args=args,
            )
        else:
            if "bing" in name.split("_"):
                attrs = DatasetCatalog.DATASETS["bing_caption_train"]
            else:
                attrs = DatasetCatalog.DATASETS[name]

            if "voc" in name and 'split' in attrs:
                data_dir = try_to_find(attrs["data_dir"], return_dir=True)
                args = dict(
                    data_dir=os.path.join(data_dir, attrs["data_dir"]),
                    split=attrs["split"],
                )
                return dict(
                    factory="PascalVOCDataset",
                    args=args,
                )
            elif "mixed" in name:
                vg_img_dir = try_to_find(attrs["vg_img_dir"], return_dir=True)
                coco_img_dir = try_to_find(attrs["coco_img_dir"], return_dir=True)
                ann_file = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder_coco=os.path.join(coco_img_dir, attrs["coco_img_dir"]),
                    img_folder_vg=os.path.join(vg_img_dir, attrs["vg_img_dir"]),
                    ann_file=os.path.join(ann_file, attrs["ann_file"])
                )
                return dict(
                    factory="MixedDataset",
                    args=args,
                )
            elif "flickr" in name:
                img_dir = try_to_find(attrs["img_folder"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_folder"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                    is_train=attrs["is_train"]
                )
                return dict(
                    factory="FlickrDataset",
                    args=args,
                )
            elif "refexp" in name:
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                return dict(
                    factory="RefExpDataset",
                    args=args,
                )
            elif "gqa" in name:
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                return dict(
                    factory="GQADataset",
                    args=args,
                )
            elif "phrasecut" in name:
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                return dict(
                    factory="PhrasecutDetection",
                    args=args,
                )
            elif "_caption" in name:
                yaml_path = try_to_find(attrs["yaml_path"], return_dir=True)
                if "no_coco" in name:
                    yaml_name = attrs["yaml_name_no_coco"]
                else:
                    yaml_name = attrs["yaml_name"]
                yaml_file_name = "{}.{}.yaml".format(yaml_name, name.split("_")[2])
                args = dict(
                    yaml_file=os.path.join(yaml_path, attrs["yaml_path"], yaml_file_name)
                )
                return dict(
                    factory="CaptionTSV",
                    args=args,
                )
            elif "inferencecap" in name:
                yaml_file_name = try_to_find(attrs["yaml_path"])
                args = dict(
                    yaml_file=yaml_file_name)
                return dict(
                    factory="CaptionTSV",
                    args=args,
                )
            elif "pseudo_data" in name:
                args = dict(
                    yaml_file=try_to_find(attrs["yaml_path"])
                )
                return dict(
                    factory="PseudoData",
                    args=args,
                )
            elif "_dt" in name:
                dataset_file = attrs["dataset_file"]
                yaml_path = try_to_find(attrs["yaml_path"], return_dir=True)
                args = dict(
                    name=dataset_file,
                    yaml_file=os.path.join(yaml_path, attrs["yaml_path"]),
                )
                return dict(
                    factory="CocoDetectionTSV",
                    args=args,
                )
            elif "_odtsv" in name:
                dataset_file = attrs["dataset_file"]
                yaml_path = try_to_find(attrs["yaml_path"], return_dir=True)
                args = dict(
                    name=dataset_file,
                    yaml_file=os.path.join(yaml_path, attrs["yaml_path"]),
                )
                return dict(
                    factory="ODTSVDataset",
                    args=args,
                )
            elif "_grounding" in name:
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                return dict(
                    factory="CocoGrounding",
                    args=args,
                )
            elif "lvis_evaluation" in name:
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                args = dict(
                    img_folder=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                return dict(
                    factory="LvisDetection",
                    args=args,
                )
            else:
                ann_dir = try_to_find(attrs["ann_file"], return_dir=True)
                img_dir = try_to_find(attrs["img_dir"], return_dir=True)
                args = dict(
                    root=os.path.join(img_dir, attrs["img_dir"]),
                    ann_file=os.path.join(ann_dir, attrs["ann_file"]),
                )
                for k, v in attrs.items():
                    args.update({k: os.path.join(ann_dir, v)})
                return dict(
                    factory="COCODataset",
                    args=args,
                )

        raise RuntimeError("Dataset not available: {}".format(name))


class ModelCatalog(object):
    S3_C2_DETECTRON_URL = "https://dl.fbaipublicfiles.com/detectron"
    C2_IMAGENET_MODELS = {
        "MSRA/R-50": "ImageNetPretrained/MSRA/R-50.pkl",
        "MSRA/R-50-GN": "ImageNetPretrained/47261647/R-50-GN.pkl",
        "MSRA/R-101": "ImageNetPretrained/MSRA/R-101.pkl",
        "MSRA/R-101-GN": "ImageNetPretrained/47592356/R-101-GN.pkl",
        "FAIR/20171220/X-101-32x8d": "ImageNetPretrained/20171220/X-101-32x8d.pkl",
        "FAIR/20171220/X-101-64x4d": "ImageNetPretrained/FBResNeXt/X-101-64x4d.pkl",
    }

    C2_DETECTRON_SUFFIX = "output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl"
    C2_DETECTRON_MODELS = {
        "35857197/e2e_faster_rcnn_R-50-C4_1x": "01_33_49.iAX0mXvW",
        "35857345/e2e_faster_rcnn_R-50-FPN_1x": "01_36_30.cUF7QR7I",
        "35857890/e2e_faster_rcnn_R-101-FPN_1x": "01_38_50.sNxI7sX7",
        "36761737/e2e_faster_rcnn_X-101-32x8d-FPN_1x": "06_31_39.5MIHi1fZ",
        "35858791/e2e_mask_rcnn_R-50-C4_1x": "01_45_57.ZgkA7hPB",
        "35858933/e2e_mask_rcnn_R-50-FPN_1x": "01_48_14.DzEQe4wC",
        "35861795/e2e_mask_rcnn_R-101-FPN_1x": "02_31_37.KqyEK4tT",
        "36761843/e2e_mask_rcnn_X-101-32x8d-FPN_1x": "06_35_59.RZotkLKI",
    }

    @staticmethod
    def get(name):
        if name.startswith("Caffe2Detectron/COCO"):
            return ModelCatalog.get_c2_detectron_12_2017_baselines(name)
        if name.startswith("ImageNetPretrained"):
            return ModelCatalog.get_c2_imagenet_pretrained(name)
        raise RuntimeError("model not present in the catalog {}".format(name))

    @staticmethod
    def get_c2_imagenet_pretrained(name):
        prefix = ModelCatalog.S3_C2_DETECTRON_URL
        name = name[len("ImageNetPretrained/"):]
        name = ModelCatalog.C2_IMAGENET_MODELS[name]
        url = "/".join([prefix, name])
        return url

    @staticmethod
    def get_c2_detectron_12_2017_baselines(name):
        # Detectron C2 models are stored following the structure
        # prefix/<model_id>/12_2017_baselines/<model_name>.yaml.<signature>/suffix
        # we use as identifiers in the catalog Caffe2Detectron/COCO/<model_id>/<model_name>
        prefix = ModelCatalog.S3_C2_DETECTRON_URL
        suffix = ModelCatalog.C2_DETECTRON_SUFFIX
        # remove identification prefix
        name = name[len("Caffe2Detectron/COCO/"):]
        # split in <model_id> and <model_name>
        model_id, model_name = name.split("/")
        # parsing to make it match the url address from the Caffe2 models
        model_name = "{}.yaml".format(model_name)
        signature = ModelCatalog.C2_DETECTRON_MODELS[name]
        unique_name = ".".join([model_name, signature])
        url = "/".join([prefix, model_id, "12_2017_baselines", unique_name, suffix])
        return url
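
A quick sketch of how these catalogs are queried. DatasetCatalog.get resolves
paths through try_to_find, so it assumes the listed files actually exist under
one of the search roots (./DATASET, ./OUTPUT, ./data, ./MODEL, or $DATASET):

spec = DatasetCatalog.get("coco_2017_val")
print(spec["factory"])           # "COCODataset"
print(spec["args"]["ann_file"])  # <root>/coco/annotations/instances_val2017.json

url = ModelCatalog.get("ImageNetPretrained/MSRA/R-50")
print(url)  # https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl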


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/ROIAlign.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif

// Interface for Python
at::Tensor ROIAlign_forward(const at::Tensor& input,
                            const at::Tensor& rois,
                            const float spatial_scale,
                            const int pooled_height,
                            const int pooled_width,
                            const int sampling_ratio) {
  if (input.device().is_cuda()) {
#ifdef WITH_CUDA
    return ROIAlign_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  return ROIAlign_forward_cpu(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);
}

at::Tensor ROIAlign_backward(const at::Tensor& grad,
                             const at::Tensor& rois,
                             const float spatial_scale,
                             const int pooled_height,
                             const int pooled_width,
                             const int batch_size,
                             const int channels,
                             const int height,
                             const int width,
                             const int sampling_ratio) {
  if (grad.device().is_cuda()) {
#ifdef WITH_CUDA
    return ROIAlign_backward_cuda(grad, rois, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width, sampling_ratio);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}
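
These C++ entry points are exposed to Python via csrc/vision.cpp as the
compiled extension maskrcnn_benchmark._C. A rough usage sketch, assuming the
upstream binding name roi_align_forward and a CUDA-enabled build:

import torch
from maskrcnn_benchmark import _C  # built from the csrc/ sources

feat = torch.randn(1, 256, 50, 50, device="cuda")
rois = torch.tensor([[0., 10., 10., 100., 100.]], device="cuda")  # (batch_idx, x1, y1, x2, y2)
out = _C.roi_align_forward(feat, rois, 1.0 / 16, 7, 7, 2)  # scale, pooled h, pooled w, sampling ratio
print(out.shape)  # torch.Size([1, 256, 7, 7])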



================================================
FILE: GLIP/maskrcnn_benchmark/csrc/ROIPool.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif


std::tuple<at::Tensor, at::Tensor> ROIPool_forward(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width) {
  if (input.device().is_cuda()) {
#ifdef WITH_CUDA
    return ROIPool_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}

at::Tensor ROIPool_backward(const at::Tensor& grad,
                                 const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const at::Tensor& argmax,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int batch_size,
                                 const int channels,
                                 const int height,
                                 const int width) {
  if (grad.device().is_cuda()) {
#ifdef WITH_CUDA
    return ROIPool_backward_cuda(grad, input, rois, argmax, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}





================================================
FILE: GLIP/maskrcnn_benchmark/csrc/SigmoidFocalLoss.h
================================================
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif

// Interface for Python
at::Tensor SigmoidFocalLoss_forward(
		const at::Tensor& logits,
                const at::Tensor& targets,
		const int num_classes, 
		const float gamma, 
		const float alpha) {
  if (logits.device().is_cuda()) {
#ifdef WITH_CUDA
    return SigmoidFocalLoss_forward_cuda(logits, targets, num_classes, gamma, alpha);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}

at::Tensor SigmoidFocalLoss_backward(
			     const at::Tensor& logits,
                             const at::Tensor& targets,
			     const at::Tensor& d_losses,
			     const int num_classes,
			     const float gamma,
			     const float alpha) {
  if (logits.device().is_cuda()) {
#ifdef WITH_CUDA
    return SigmoidFocalLoss_backward_cuda(logits, targets, d_losses, num_classes, gamma, alpha);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "cpu/vision.h"

// implementation taken from Caffe2
template <typename T>
struct PreCalc {
  int pos1;
  int pos2;
  int pos3;
  int pos4;
  T w1;
  T w2;
  T w3;
  T w4;
};

template <typename T>
void pre_calc_for_bilinear_interpolate(
    const int height,
    const int width,
    const int pooled_height,
    const int pooled_width,
    const int iy_upper,
    const int ix_upper,
    T roi_start_h,
    T roi_start_w,
    T bin_size_h,
    T bin_size_w,
    int roi_bin_grid_h,
    int roi_bin_grid_w,
    std::vector<PreCalc<T>>& pre_calc) {
  int pre_calc_index = 0;
  for (int ph = 0; ph < pooled_height; ph++) {
    for (int pw = 0; pw < pooled_width; pw++) {
      for (int iy = 0; iy < iy_upper; iy++) {
        const T yy = roi_start_h + ph * bin_size_h +
            static_cast<T>(iy + .5f) * bin_size_h /
                static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
        for (int ix = 0; ix < ix_upper; ix++) {
          const T xx = roi_start_w + pw * bin_size_w +
              static_cast<T>(ix + .5f) * bin_size_w /
                  static_cast<T>(roi_bin_grid_w);

          T x = xx;
          T y = yy;
          // deal with cases where the sampling point falls outside the feature map boundary
          if (y < -1.0 || y > height || x < -1.0 || x > width) {
            // empty
            PreCalc<T> pc;
            pc.pos1 = 0;
            pc.pos2 = 0;
            pc.pos3 = 0;
            pc.pos4 = 0;
            pc.w1 = 0;
            pc.w2 = 0;
            pc.w3 = 0;
            pc.w4 = 0;
            pre_calc[pre_calc_index] = pc;
            pre_calc_index += 1;
            continue;
          }

          if (y <= 0) {
            y = 0;
          }
          if (x <= 0) {
            x = 0;
          }

          int y_low = (int)y;
          int x_low = (int)x;
          int y_high;
          int x_high;

          if (y_low >= height - 1) {
            y_high = y_low = height - 1;
            y = (T)y_low;
          } else {
            y_high = y_low + 1;
          }

          if (x_low >= width - 1) {
            x_high = x_low = width - 1;
            x = (T)x_low;
          } else {
            x_high = x_low + 1;
          }

          T ly = y - y_low;
          T lx = x - x_low;
          T hy = 1. - ly, hx = 1. - lx;
          T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

          // save weights and indices
          PreCalc<T> pc;
          pc.pos1 = y_low * width + x_low;
          pc.pos2 = y_low * width + x_high;
          pc.pos3 = y_high * width + x_low;
          pc.pos4 = y_high * width + x_high;
          pc.w1 = w1;
          pc.w2 = w2;
          pc.w3 = w3;
          pc.w4 = w4;
          pre_calc[pre_calc_index] = pc;

          pre_calc_index += 1;
        }
      }
    }
  }
}

template <typename T>
void ROIAlignForward_cpu_kernel(
    const int nthreads,
    const T* bottom_data,
    const T& spatial_scale,
    const int channels,
    const int height,
    const int width,
    const int pooled_height,
    const int pooled_width,
    const int sampling_ratio,
    const T* bottom_rois,
    //int roi_cols,
    T* top_data) {
  //AT_ASSERT(roi_cols == 4 || roi_cols == 5);
  int roi_cols = 5;

  int n_rois = nthreads / channels / pooled_width / pooled_height;
  // (n, c, ph, pw) is an element in the pooled output
  // can be parallelized using omp
  // #pragma omp parallel for num_threads(32)
  for (int n = 0; n < n_rois; n++) {
    int index_n = n * channels * pooled_width * pooled_height;

    // roi could have 4 or 5 columns
    const T* offset_bottom_rois = bottom_rois + n * roi_cols;
    int roi_batch_ind = 0;
    if (roi_cols == 5) {
      roi_batch_ind = offset_bottom_rois[0];
      offset_bottom_rois++;
    }

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[0] * spatial_scale;
    T roi_start_h = offset_bottom_rois[1] * spatial_scale;
    T roi_end_w = offset_bottom_rois[2] * spatial_scale;
    T roi_end_h = offset_bottom_rois[3] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[0] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[3] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = std::max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = std::max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0)
        ? sampling_ratio
        : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w =
        (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    // we want to precalculate indices and weights shared by all channels;
    // this is the key point of the optimization
    std::vector<PreCalc<T>> pre_calc(
        roi_bin_grid_h * roi_bin_grid_w * pooled_width * pooled_height);
    pre_calc_for_bilinear_interpolate(
        height,
        width,
        pooled_height,
        pooled_width,
        roi_bin_grid_h,
        roi_bin_grid_w,
        roi_start_h,
        roi_start_w,
        bin_size_h,
        bin_size_w,
        roi_bin_grid_h,
        roi_bin_grid_w,
        pre_calc);

    for (int c = 0; c < channels; c++) {
      int index_n_c = index_n + c * pooled_width * pooled_height;
      const T* offset_bottom_data =
          bottom_data + (roi_batch_ind * channels + c) * height * width;
      int pre_calc_index = 0;

      for (int ph = 0; ph < pooled_height; ph++) {
        for (int pw = 0; pw < pooled_width; pw++) {
          int index = index_n_c + ph * pooled_width + pw;

          T output_val = 0.;
          for (int iy = 0; iy < roi_bin_grid_h; iy++) {
            for (int ix = 0; ix < roi_bin_grid_w; ix++) {
              PreCalc<T> pc = pre_calc[pre_calc_index];
              output_val += pc.w1 * offset_bottom_data[pc.pos1] +
                  pc.w2 * offset_bottom_data[pc.pos2] +
                  pc.w3 * offset_bottom_data[pc.pos3] +
                  pc.w4 * offset_bottom_data[pc.pos4];

              pre_calc_index += 1;
            }
          }
          output_val /= count;

          top_data[index] = output_val;
        } // for pw
      } // for ph
    } // for c
  } // for n
}

at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width,
                                const int sampling_ratio) {
  AT_ASSERTM(!input.device().is_cuda(), "input must be a CPU tensor");
  AT_ASSERTM(!rois.device().is_cuda(), "rois must be a CPU tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;

  if (output.numel() == 0) {
    return output;
  }

  AT_DISPATCH_FLOATING_TYPES(input.scalar_type(), "ROIAlign_forward", [&] {
    ROIAlignForward_cpu_kernel<scalar_t>(
         output_size,
         input.data_ptr<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         rois.data_ptr<scalar_t>(),
         output.data_ptr<scalar_t>());
  });
  return output;
}
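
A hedged usage sketch (not part of the repository) showing how ROIAlign_forward_cpu can be called directly. Each row of rois is [batch_index, x1, y1, x2, y2] in input-image coordinates; the spatial_scale of 0.25 below assumes a stride-4 feature map:

// Hedged sketch; builds against libtorch plus this csrc tree.
#include <torch/torch.h>
#include "cpu/vision.h"

int main() {
  at::Tensor input = torch::rand({1, 256, 32, 32});  // NCHW feature map
  at::Tensor rois = torch::tensor({{0.f, 8.f, 8.f, 120.f, 120.f}});
  at::Tensor out = ROIAlign_forward_cpu(input, rois,
                                        /*spatial_scale=*/0.25f,
                                        /*pooled_height=*/7,
                                        /*pooled_width=*/7,
                                        /*sampling_ratio=*/2);
  // out has shape [num_rois, 256, 7, 7].
  return 0;
}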


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "cpu/vision.h"


template <typename scalar_t>
at::Tensor nms_cpu_kernel(const at::Tensor& dets,
                          const at::Tensor& scores,
                          const float threshold) {
  AT_ASSERTM(!dets.device().is_cuda(), "dets must be a CPU tensor");
  AT_ASSERTM(!scores.device().is_cuda(), "scores must be a CPU tensor");
  AT_ASSERTM(dets.scalar_type() == scores.scalar_type(), "dets should have the same scalar type as scores");

  if (dets.numel() == 0) {
    return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));
  }

  auto x1_t = dets.select(1, 0).contiguous();
  auto y1_t = dets.select(1, 1).contiguous();
  auto x2_t = dets.select(1, 2).contiguous();
  auto y2_t = dets.select(1, 3).contiguous();

  at::Tensor areas_t = (x2_t - x1_t + 1) * (y2_t - y1_t + 1);

  auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));

  auto ndets = dets.size(0);
  at::Tensor suppressed_t = at::zeros({ndets}, dets.options().dtype(at::kByte).device(at::kCPU));

  auto suppressed = suppressed_t.data_ptr<uint8_t>();
  auto order = order_t.data_ptr<int64_t>();
  auto x1 = x1_t.data_ptr<scalar_t>();
  auto y1 = y1_t.data_ptr<scalar_t>();
  auto x2 = x2_t.data_ptr<scalar_t>();
  auto y2 = y2_t.data_ptr<scalar_t>();
  auto areas = areas_t.data_ptr<scalar_t>();

  for (int64_t _i = 0; _i < ndets; _i++) {
    auto i = order[_i];
    if (suppressed[i] == 1)
      continue;
    auto ix1 = x1[i];
    auto iy1 = y1[i];
    auto ix2 = x2[i];
    auto iy2 = y2[i];
    auto iarea = areas[i];

    for (int64_t _j = _i + 1; _j < ndets; _j++) {
      auto j = order[_j];
      if (suppressed[j] == 1)
        continue;
      auto xx1 = std::max(ix1, x1[j]);
      auto yy1 = std::max(iy1, y1[j]);
      auto xx2 = std::min(ix2, x2[j]);
      auto yy2 = std::min(iy2, y2[j]);

      auto w = std::max(static_cast<scalar_t>(0), xx2 - xx1 + 1);
      auto h = std::max(static_cast<scalar_t>(0), yy2 - yy1 + 1);
      auto inter = w * h;
      auto ovr = inter / (iarea + areas[j] - inter);
      if (ovr >= threshold)
        suppressed[j] = 1;
    }
  }
  return at::nonzero(suppressed_t == 0).squeeze(1);
}

at::Tensor nms_cpu(const at::Tensor& dets,
               const at::Tensor& scores,
               const float threshold) {
  at::Tensor result;
  AT_DISPATCH_FLOATING_TYPES(dets.scalar_type(), "nms", [&] {
    result = nms_cpu_kernel<scalar_t>(dets, scores, threshold);
  });
  return result;
}
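
The suppression rule above is standard greedy NMS with the legacy integer-pixel convention (the "+ 1" in widths, heights, and areas):

$$
\mathrm{area}_i = (x_2^i - x_1^i + 1)(y_2^i - y_1^i + 1),
\qquad
\mathrm{ovr}(i, j) = \frac{\mathrm{inter}(i, j)}{\mathrm{area}_i + \mathrm{area}_j - \mathrm{inter}(i, j)},
$$

and any box whose IoU with an already-kept, higher-scoring box reaches the threshold is suppressed.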


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cpu/soft_nms.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "cpu/vision.h"


template <typename scalar_t>
std::pair<at::Tensor, at::Tensor> soft_nms_cpu_kernel(const at::Tensor& dets,
                                                      const at::Tensor& scores,
                                                      const float threshold,
                                                      const float sigma) {
  AT_ASSERTM(!dets.device().is_cuda(), "dets must be a CPU tensor");
  AT_ASSERTM(!scores.device().is_cuda(), "scores must be a CPU tensor");
  AT_ASSERTM(dets.scalar_type() == scores.scalar_type(), "dets should have the same scalar type as scores");

  if (dets.numel() == 0) {
    return std::make_pair(at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU)),
                          at::empty({0}, scores.options().dtype(at::kFloat).device(at::kCPU)));
  }

  auto x1_t = dets.select(1, 0).contiguous();
  auto y1_t = dets.select(1, 1).contiguous();
  auto x2_t = dets.select(1, 2).contiguous();
  auto y2_t = dets.select(1, 3).contiguous();

  auto scores_t = scores.clone();

  at::Tensor areas_t = (x2_t - x1_t + 1) * (y2_t - y1_t + 1);
  auto ndets = dets.size(0);
  auto inds_t = at::arange(ndets, dets.options().dtype(at::kLong).device(at::kCPU));

  auto x1 = x1_t.data_ptr<scalar_t>();
  auto y1 = y1_t.data_ptr<scalar_t>();
  auto x2 = x2_t.data_ptr<scalar_t>();
  auto y2 = y2_t.data_ptr<scalar_t>();
  auto s = scores_t.data_ptr<scalar_t>();
  auto inds = inds_t.data_ptr<int64_t>();
  auto areas = areas_t.data_ptr<scalar_t>();

  for (int64_t i = 0; i < ndets; i++) {

    auto ix1 = x1[i];
    auto iy1 = y1[i];
    auto ix2 = x2[i];
    auto iy2 = y2[i];
    auto is = s[i];
    auto ii = inds[i];
    auto iarea = areas[i];

    auto maxpos = scores_t.slice(0, i, ndets).argmax().item<int64_t>() + i;

    // add max box as a detection
    x1[i] = x1[maxpos];
    y1[i] = y1[maxpos];
    x2[i] = x2[maxpos];
    y2[i] = y2[maxpos];
    s[i] = s[maxpos];
    inds[i] = inds[maxpos];
    areas[i] = areas[maxpos];

    // swap ith box with position of max box
    x1[maxpos] = ix1;
    y1[maxpos] = iy1;
    x2[maxpos] = ix2;
    y2[maxpos] = iy2;
    s[maxpos] = is;
    inds[maxpos] = ii;
    areas[maxpos] = iarea;

    ix1 = x1[i];
    iy1 = y1[i];
    ix2 = x2[i];
    iy2 = y2[i];
    iarea = areas[i];

    // NMS iterations, note that ndets changes if detection boxes
    // fall below threshold
    for (int64_t j = i + 1; j < ndets; j++) {
      auto xx1 = std::max(ix1, x1[j]);
      auto yy1 = std::max(iy1, y1[j]);
      auto xx2 = std::min(ix2, x2[j]);
      auto yy2 = std::min(iy2, y2[j]);

      auto w = std::max(static_cast<scalar_t>(0), xx2 - xx1 + 1);
      auto h = std::max(static_cast<scalar_t>(0), yy2 - yy1 + 1);

      auto inter = w * h;
      auto ovr = inter / (iarea + areas[j] - inter);

      s[j] = s[j] * std::exp(- std::pow(ovr, 2.0) / sigma);

      // if the box score falls below the threshold, discard the box by
      // swapping in the last box and shrinking ndets
      if (s[j] < threshold) {
        x1[j] = x1[ndets - 1];
        y1[j] = y1[ndets - 1];
        x2[j] = x2[ndets - 1];
        y2[j] = y2[ndets - 1];
        s[j] = s[ndets - 1];
        inds[j] = inds[ndets - 1];
        areas[j] = areas[ndets - 1];
        j--;
        ndets--;
      }
    }
  }
  return std::make_pair(inds_t.slice(0, 0, ndets), scores_t.slice(0, 0, ndets));
}

std::pair<at::Tensor, at::Tensor> soft_nms_cpu(const at::Tensor& dets,
                                               const at::Tensor& scores,
                                               const float threshold,
                                               const float sigma) {
  std::pair<at::Tensor, at::Tensor> result;
  AT_DISPATCH_FLOATING_TYPES(dets.scalar_type(), "soft_nms", [&] {
    result = soft_nms_cpu_kernel<scalar_t>(dets, scores, threshold, sigma);
  });
  return result;
}
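
This is the Gaussian variant of Soft-NMS (Bodla et al., 2017): rather than hard suppression, each remaining box's score is decayed by its overlap with the current best box,

$$
s_j \leftarrow s_j \cdot \exp\!\left(-\frac{\mathrm{ovr}(i, j)^2}{\sigma}\right),
$$

and a box is discarded only once its decayed score drops below the threshold (hence the in-loop shrinking of ndets above).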

================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cpu/vision.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once
#include <torch/extension.h>


at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width,
                                const int sampling_ratio);


at::Tensor nms_cpu(const at::Tensor& dets,
                   const at::Tensor& scores,
                   const float threshold);


std::pair<at::Tensor, at::Tensor> soft_nms_cpu(const at::Tensor& dets,
                                               const at::Tensor& scores,
                                               const float threshold,
                                               const float sigma);

================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__device__ T bilinear_interpolate(const T* bottom_data,
    const int height, const int width,
    T y, T x,
    const int index /* index for debug only*/) {

  // deal with cases where the sampling point falls outside the feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    return 0;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  int y_low = (int) y;
  int x_low = (int) x;
  int y_high;
  int x_high;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;
  // do bilinear interpolation
  T v1 = bottom_data[y_low * width + x_low];
  T v2 = bottom_data[y_low * width + x_high];
  T v3 = bottom_data[y_high * width + x_low];
  T v4 = bottom_data[y_high * width + x_high];
  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  return val;
}

template <typename T>
__global__ void RoIAlignForward(const int nthreads, const T* bottom_data,
    const T spatial_scale, const int channels,
    const int height, const int width,
    const int pooled_height, const int pooled_width,
    const int sampling_ratio,
    const T* bottom_rois, T* top_data) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[1] * spatial_scale;
    T roi_start_h = offset_bottom_rois[2] * spatial_scale;
    T roi_end_w = offset_bottom_rois[3] * spatial_scale;
    T roi_end_h = offset_bottom_rois[4] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    const T* offset_bottom_data = bottom_data + (roi_batch_ind * channels + c) * height * width;

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    T output_val = 0.;
    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1
    {
      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
      for (int ix = 0; ix < roi_bin_grid_w; ix ++)
      {
        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);

        T val = bilinear_interpolate(offset_bottom_data, height, width, y, x, index);
        output_val += val;
      }
    }
    output_val /= count;

    top_data[index] = output_val;
  }
}


template <typename T>
__device__ void bilinear_interpolate_gradient(
    const int height, const int width,
    T y, T x,
    T & w1, T & w2, T & w3, T & w4,
    int & x_low, int & x_high, int & y_low, int & y_high,
    const int index /* index for debug only*/) {

  // deal with cases where the sampling point falls outside the feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    w1 = w2 = w3 = w4 = 0.;
    x_low = x_high = y_low = y_high = -1;
    return;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  y_low = (int) y;
  x_low = (int) x;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;

  // reference in forward
  // T v1 = bottom_data[y_low * width + x_low];
  // T v2 = bottom_data[y_low * width + x_high];
  // T v3 = bottom_data[y_high * width + x_low];
  // T v4 = bottom_data[y_high * width + x_high];
  // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  return;
}

template <typename T>
__global__ void RoIAlignBackwardFeature(const int nthreads, const T* top_diff,
    const int num_rois, const T spatial_scale,
    const int channels, const int height, const int width,
    const int pooled_height, const int pooled_width,
    const int sampling_ratio,
    T* bottom_diff,
    const T* bottom_rois) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[1] * spatial_scale;
    T roi_start_h = offset_bottom_rois[2] * spatial_scale;
    T roi_end_w = offset_bottom_rois[3] * spatial_scale;
    T roi_end_h = offset_bottom_rois[4] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    T* offset_bottom_diff = bottom_diff + (roi_batch_ind * channels + c) * height * width;

    int top_offset    = (n * channels + c) * pooled_height * pooled_width;
    const T* offset_top_diff = top_diff + top_offset;
    const T top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1
    {
      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
      for (int ix = 0; ix < roi_bin_grid_w; ix ++)
      {
        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);

        T w1, w2, w3, w4;
        int x_low, x_high, y_low, y_high;

        bilinear_interpolate_gradient(height, width, y, x,
            w1, w2, w3, w4,
            x_low, x_high, y_low, y_high,
            index);

        T g1 = top_diff_this_bin * w1 / count;
        T g2 = top_diff_this_bin * w2 / count;
        T g3 = top_diff_this_bin * w3 / count;
        T g4 = top_diff_this_bin * w4 / count;

        if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0)
        {
          atomicAdd(offset_bottom_diff + y_low * width + x_low, static_cast<T>(g1));
          atomicAdd(offset_bottom_diff + y_low * width + x_high, static_cast<T>(g2));
          atomicAdd(offset_bottom_diff + y_high * width + x_low, static_cast<T>(g3));
          atomicAdd(offset_bottom_diff + y_high * width + x_high, static_cast<T>(g4));
        } // if
      } // ix
    } // iy
  } // CUDA_1D_KERNEL_LOOP
} // RoIAlignBackwardFeature


at::Tensor ROIAlign_forward_cuda(const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int sampling_ratio) {
  AT_ASSERTM(input.device().is_cuda(), "input must be a CUDA tensor");
  AT_ASSERTM(rois.device().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(output_size, 512L), 4096L));
  dim3 block(512);

  if (output.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return output;
  }

  AT_DISPATCH_FLOATING_TYPES(input.scalar_type(), "ROIAlign_forward", [&] {
    RoIAlignForward<scalar_t><<<grid, block, 0, stream>>>(
         output_size,
         input.contiguous().data_ptr<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         rois.contiguous().data_ptr<scalar_t>(),
         output.data_ptr<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return output;
}

// TODO: remove the dependency on input and use its sizes instead -> saves memory
at::Tensor ROIAlign_backward_cuda(const at::Tensor& grad,
                                  const at::Tensor& rois,
                                  const float spatial_scale,
                                  const int pooled_height,
                                  const int pooled_width,
                                  const int batch_size,
                                  const int channels,
                                  const int height,
                                  const int width,
                                  const int sampling_ratio) {
  AT_ASSERTM(grad.device().is_cuda(), "grad must be a CUDA tensor");
  AT_ASSERTM(rois.device().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(grad.numel(), 512L), 4096L));
  dim3 block(512);

  // handle possibly empty gradients
  if (grad.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return grad_input;
  }

  AT_DISPATCH_FLOATING_TYPES(grad.scalar_type(), "ROIAlign_backward", [&] {
    RoIAlignBackwardFeature<scalar_t><<<grid, block, 0, stream>>>(
         grad.numel(),
         grad.contiguous().data_ptr<scalar_t>(),
         num_rois,
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         grad_input.data_ptr<scalar_t>(),
         rois.contiguous().data_ptr<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return grad_input;
}
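
Both kernels in this file share the same bilinear-interpolation scheme. For a sampling point (x, y) with fractional offsets l_y = y - y_low and l_x = x - x_low, the interpolated value over the four neighboring cells v_1..v_4 is

$$
f(y, x) = w_1 v_1 + w_2 v_2 + w_3 v_3 + w_4 v_4,
\qquad
w_1 = (1-l_y)(1-l_x),\;
w_2 = (1-l_y)\,l_x,\;
w_3 = l_y(1-l_x),\;
w_4 = l_y\,l_x,
$$

and the backward kernel scatters top_diff_this_bin * w_k / count back to the same four locations with atomicAdd.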


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

#include <cfloat>  // for FLT_MAX used below


// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__global__ void RoIPoolFForward(const int nthreads, const T* bottom_data,
    const T spatial_scale, const int channels, const int height,
    const int width, const int pooled_height, const int pooled_width,
    const T* bottom_rois, T* top_data, int* argmax_data) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];
    int roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    int roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    int roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    int roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    int roi_width = max(roi_end_w - roi_start_w + 1, 1);
    int roi_height = max(roi_end_h - roi_start_h + 1, 1);
    T bin_size_h = static_cast<T>(roi_height)
                       / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width)
                       / static_cast<T>(pooled_width);

    int hstart = static_cast<int>(floor(static_cast<T>(ph)
                                        * bin_size_h));
    int wstart = static_cast<int>(floor(static_cast<T>(pw)
                                        * bin_size_w));
    int hend = static_cast<int>(ceil(static_cast<T>(ph + 1)
                                     * bin_size_h));
    int wend = static_cast<int>(ceil(static_cast<T>(pw + 1)
                                     * bin_size_w));

    // Add roi offsets and clip to input boundaries
    hstart = min(max(hstart + roi_start_h, 0), height);
    hend = min(max(hend + roi_start_h, 0), height);
    wstart = min(max(wstart + roi_start_w, 0), width);
    wend = min(max(wend + roi_start_w, 0), width);
    bool is_empty = (hend <= hstart) || (wend <= wstart);

    // Define an empty pooling region to be zero
    T maxval = is_empty ? 0 : -FLT_MAX;
    // If nothing is pooled, argmax = -1 causes nothing to be backprop'd
    int maxidx = -1;
    const T* offset_bottom_data =
        bottom_data + (roi_batch_ind * channels + c) * height * width;
    for (int h = hstart; h < hend; ++h) {
      for (int w = wstart; w < wend; ++w) {
        int bottom_index = h * width + w;
        if (offset_bottom_data[bottom_index] > maxval) {
          maxval = offset_bottom_data[bottom_index];
          maxidx = bottom_index;
        }
      }
    }
    top_data[index] = maxval;
    argmax_data[index] = maxidx;
  }
}

template <typename T>
__global__ void RoIPoolFBackward(const int nthreads, const T* top_diff,
    const int* argmax_data, const int num_rois, const T spatial_scale,
    const int channels, const int height, const int width,
    const int pooled_height, const int pooled_width, T* bottom_diff,
    const T* bottom_rois) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];
    int bottom_offset = (roi_batch_ind * channels + c) * height * width;
    int top_offset    = (n * channels + c) * pooled_height * pooled_width;
    const T* offset_top_diff = top_diff + top_offset;
    T* offset_bottom_diff = bottom_diff + bottom_offset;
    const int* offset_argmax_data = argmax_data + top_offset;

    int argmax = offset_argmax_data[ph * pooled_width + pw];
    if (argmax != -1) {
      atomicAdd(
          offset_bottom_diff + argmax,
          static_cast<T>(offset_top_diff[ph * pooled_width + pw]));

    }
  }
}

std::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width) {
  AT_ASSERTM(input.device().is_cuda(), "input must be a CUDA tensor");
  AT_ASSERTM(rois.device().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;
  auto argmax = at::zeros({num_rois, channels, pooled_height, pooled_width}, input.options().dtype(at::kInt));

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(output_size, 512L), 4096L));
  dim3 block(512);

  if (output.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return std::make_tuple(output, argmax);
  }

  AT_DISPATCH_FLOATING_TYPES(input.scalar_type(), "ROIPool_forward", [&] {
    RoIPoolFForward<scalar_t><<<grid, block, 0, stream>>>(
         output_size,
         input.contiguous().data_ptr<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         rois.contiguous().data_ptr<scalar_t>(),
         output.data_ptr<scalar_t>(),
         argmax.data_ptr<int>());
  });
  THCudaCheck(cudaGetLastError());
  return std::make_tuple(output, argmax);
}

// TODO: remove the dependency on input and use its sizes instead -> saves memory
at::Tensor ROIPool_backward_cuda(const at::Tensor& grad,
                                 const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const at::Tensor& argmax,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int batch_size,
                                 const int channels,
                                 const int height,
                                 const int width) {
  AT_ASSERTM(grad.device().is_cuda(), "grad must be a CUDA tensor");
  AT_ASSERTM(rois.device().is_cuda(), "rois must be a CUDA tensor");
  // TODO add more checks

  auto num_rois = rois.size(0);
  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(grad.numel(), 512L), 4096L));
  dim3 block(512);

  // handle possibly empty gradients
  if (grad.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return grad_input;
  }

  AT_DISPATCH_FLOATING_TYPES(grad.scalar_type(), "ROIPool_backward", [&] {
    RoIPoolFBackward<scalar_t><<<grid, block, 0, stream>>>(
         grad.numel(),
         grad.contiguous().data_ptr<scalar_t>(),
         argmax.data_ptr<int>(),
         num_rois,
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         grad_input.data_ptr<scalar_t>(),
         rois.contiguous().data_ptr<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return grad_input;
}
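
For reference, the forward kernel max-pools output bin (ph, pw) over the window

$$
h \in \big[\lfloor ph \cdot bin_h \rfloor + roi_{start,h},\;\; \lceil (ph+1) \cdot bin_h \rceil + roi_{start,h}\big),
\qquad
w \in \big[\lfloor pw \cdot bin_w \rfloor + roi_{start,w},\;\; \lceil (pw+1) \cdot bin_w \rceil + roi_{start,w}\big),
$$

clamped to the feature-map extent; the backward kernel then routes each output gradient solely to the argmax location recorded in the forward pass.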


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
// This file is modified from  https://github.com/pytorch/pytorch/blob/master/modules/detectron/sigmoid_focal_loss_op.cu
// Cheng-Yang Fu
// cyfu@cs.unc.edu
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

#include <cfloat>

// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__global__ void SigmoidFocalLossForward(const int nthreads, 
    const T* logits,
    const int* targets,
    const int num_classes,
    const float gamma, 
    const float alpha,
    const int num, 
    T* losses) {
  CUDA_1D_KERNEL_LOOP(i, nthreads) {

    int n = i / num_classes;
    int d = i % num_classes; // current class index, e.g. [0, 79] for COCO
    int t = targets[n]; // target class label, e.g. [1, 80]; 0 is background

    // Decide it is positive or negative case. 
    T c1 = (t == (d+1)); 
    T c2 = (t >= 0 && t != (d + 1));

    T zn = (1.0 - alpha);
    T zp = (alpha);

    // p = 1 / (1 + expf(-x)), i.e. p = sigmoid(x)
    T p = 1. / (1. + expf(-logits[i]));

    // term1 = (1-p)**gamma * log(p)
    T term1 = powf((1. - p), gamma) * logf(max(p, FLT_MIN));

    // term2 = p**gamma * log(1-p), computed in a numerically stable form
    T term2 = powf(p, gamma) *
            (-1. * logits[i] * (logits[i] >= 0) -   
             logf(1. + expf(logits[i] - 2. * logits[i] * (logits[i] >= 0))));

    losses[i] = 0.0;
    losses[i] += -c1 * term1 * zp;
    losses[i] += -c2 * term2 * zn;

  } // CUDA_1D_KERNEL_LOOP
} // SigmoidFocalLossForward


template <typename T>
__global__ void SigmoidFocalLossBackward(const int nthreads,
                const T* logits,
                const int* targets,
                const T* d_losses,
                const int num_classes,
                const float gamma,
                const float alpha,
                const int num,
                T* d_logits) {
  CUDA_1D_KERNEL_LOOP(i, nthreads) {

    int n = i / num_classes;
    int d = i % num_classes; // current class index, e.g. [0, 79] for COCO
    int t = targets[n]; // target class label, e.g. [1, 80]; 0 is background

    // Decide it is positive or negative case. 
    T c1 = (t == (d+1));
    T c2 = (t >= 0 && t != (d + 1));

    T zn = (1.0 - alpha);
    T zp = (alpha);
    // p = 1 / (1 + expf(-x)), i.e. p = sigmoid(x)
    T p = 1. / (1. + expf(-logits[i]));

    // term1 = (1-p)**g * (1 - p - g*p*log(p))
    T term1 = powf((1. - p), gamma) *
                      (1. - p - (p * gamma * logf(max(p, FLT_MIN))));

    // term2 = (p**g) * (g*(1-p)*log(1-p) - p), with log(1-p) in stable form
    T term2 = powf(p, gamma) *
                  ((-1. * logits[i] * (logits[i] >= 0) -
                      logf(1. + expf(logits[i] - 2. * logits[i] * (logits[i] >= 0)))) *
                      (1. - p) * gamma - p);
    d_logits[i] = 0.0;
    d_logits[i] += -c1 * term1 * zp;
    d_logits[i] += -c2 * term2 * zn;
    d_logits[i] = d_logits[i] * d_losses[i];

  } // CUDA_1D_KERNEL_LOOP
} // SigmoidFocalLossBackward


at::Tensor SigmoidFocalLoss_forward_cuda(
    const at::Tensor& logits,
    const at::Tensor& targets,
    const int num_classes,
    const float gamma,
    const float alpha) {
  AT_ASSERTM(logits.device().is_cuda(), "logits must be a CUDA tensor");
  AT_ASSERTM(targets.device().is_cuda(), "targets must be a CUDA tensor");
  AT_ASSERTM(logits.dim() == 2, "logits should be NxClass");

  const int num_samples = logits.size(0);
	
  auto losses = at::empty({num_samples, logits.size(1)}, logits.options());
  auto losses_size = num_samples * logits.size(1);
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(losses_size, 512L), 4096L));
  dim3 block(512);

  if (losses.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return losses;
  }

  AT_DISPATCH_FLOATING_TYPES(logits.scalar_type(), "SigmoidFocalLoss_forward", [&] {
    SigmoidFocalLossForward<scalar_t><<<grid, block, 0, stream>>>(
         losses_size,
         logits.contiguous().data_ptr<scalar_t>(),
         targets.contiguous().data_ptr<int>(),
         num_classes,
         gamma,
         alpha,
         num_samples,
         losses.data_ptr<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return losses;
}


at::Tensor SigmoidFocalLoss_backward_cuda(
    const at::Tensor& logits,
    const at::Tensor& targets,
    const at::Tensor& d_losses,
    const int num_classes,
    const float gamma,
    const float alpha) {
  AT_ASSERTM(logits.device().is_cuda(), "logits must be a CUDA tensor");
  AT_ASSERTM(targets.device().is_cuda(), "targets must be a CUDA tensor");
  AT_ASSERTM(d_losses.device().is_cuda(), "d_losses must be a CUDA tensor");

  AT_ASSERTM(logits.dim() == 2, "logits should be NxClass");

  const int num_samples = logits.size(0);
  AT_ASSERTM(logits.size(1) == num_classes, "logits.size(1) should be num_classes");
	
  auto d_logits = at::zeros({num_samples, num_classes}, logits.options());
  auto d_logits_size = num_samples * logits.size(1);
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv(d_logits_size, 512L), 4096L));
  dim3 block(512);

  if (d_logits.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return d_logits;
  }

  AT_DISPATCH_FLOATING_TYPES(logits.scalar_type(), "SigmoidFocalLoss_backward", [&] {
    SigmoidFocalLossBackward<scalar_t><<<grid, block, 0, stream>>>(
         d_logits_size,
         logits.contiguous().data_ptr<scalar_t>(),
         targets.contiguous().data_ptr<int>(),
         d_losses.contiguous().data_ptr<scalar_t>(),
         num_classes,
         gamma,
         alpha,
         num_samples,
         d_logits.data_ptr<scalar_t>());
  });

  THCudaCheck(cudaGetLastError());
  return d_logits;
}
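
A short check that the backward kernel's term1/term2 are exact derivatives: with p = sigma(x) and dp/dx = p(1-p),

$$
\frac{d}{dx}\Big[(1-p)^{\gamma}\log p\Big]
= (1-p)^{\gamma}\big(1 - p - \gamma\,p\log p\big),
\qquad
\frac{d}{dx}\Big[p^{\gamma}\log(1-p)\Big]
= p^{\gamma}\big(\gamma\,(1-p)\log(1-p) - p\big),
$$

which are exactly term1 and term2; the incoming gradient d_losses[i] is applied by the final chain-rule multiplication.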



================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cuda/deform_conv_cuda.cu
================================================
// modify from
// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda.c

#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCDeviceUtils.cuh>

#include <vector>
#include <iostream>
#include <cmath>


void deformable_im2col(const at::Tensor data_im, const at::Tensor data_offset,
                       const int channels, const int height, const int width,
                       const int ksize_h, const int ksize_w, const int pad_h,
                       const int pad_w, const int stride_h, const int stride_w,
                       const int dilation_h, const int dilation_w,
                       const int parallel_imgs, const int deformable_group,
                       at::Tensor data_col);

void deformable_col2im(const at::Tensor data_col, const at::Tensor data_offset,
                       const int channels, const int height, const int width,
                       const int ksize_h, const int ksize_w, const int pad_h,
                       const int pad_w, const int stride_h, const int stride_w,
                       const int dilation_h, const int dilation_w,
                       const int parallel_imgs, const int deformable_group,
                       at::Tensor grad_im);

void deformable_col2im_coord(
    const at::Tensor data_col, const at::Tensor data_im,
    const at::Tensor data_offset, const int channels, const int height,
    const int width, const int ksize_h, const int ksize_w, const int pad_h,
    const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w, const int parallel_imgs,
    const int deformable_group, at::Tensor grad_offset);

void modulated_deformable_im2col_cuda(
    const at::Tensor data_im, const at::Tensor data_offset,
    const at::Tensor data_mask, const int batch_size, const int channels,
    const int height_im, const int width_im, const int height_col,
    const int width_col, const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w, const int deformable_group,
    at::Tensor data_col);

void modulated_deformable_col2im_cuda(
    const at::Tensor data_col, const at::Tensor data_offset,
    const at::Tensor data_mask, const int batch_size, const int channels,
    const int height_im, const int width_im, const int height_col,
    const int width_col, const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w, const int deformable_group,
    at::Tensor grad_im);

void modulated_deformable_col2im_coord_cuda(
    const at::Tensor data_col, const at::Tensor data_im,
    const at::Tensor data_offset, const at::Tensor data_mask,
    const int batch_size, const int channels, const int height_im,
    const int width_im, const int height_col, const int width_col,
    const int kernel_h, const int kernel_w, const int pad_h, const int pad_w,
    const int stride_h, const int stride_w, const int dilation_h,
    const int dilation_w, const int deformable_group, at::Tensor grad_offset,
    at::Tensor grad_mask);

void shape_check(at::Tensor input, at::Tensor offset, at::Tensor *gradOutput,
                 at::Tensor weight, int kH, int kW, int dH, int dW, int padH,
                 int padW, int dilationH, int dilationW, int group,
                 int deformable_group) 
{
  TORCH_CHECK(weight.ndimension() == 4,
           "4D weight tensor (nOutputPlane,nInputPlane,kH,kW) expected, "
           "but got: %s",
           weight.ndimension());

  TORCH_CHECK(weight.is_contiguous(), "weight tensor has to be contiguous");

  TORCH_CHECK(kW > 0 && kH > 0,
           "kernel size should be greater than zero, but got kH: %d kW: %d", kH,
           kW);

  TORCH_CHECK((weight.size(2) == kH && weight.size(3) == kW),
           "kernel size should be consistent with weight, ",
           "but got kH: %d kW: %d weight.size(2): %d, weight.size(3): %d", kH,
           kW, weight.size(2), weight.size(3));

  TORCH_CHECK(dW > 0 && dH > 0,
           "stride should be greater than zero, but got dH: %d dW: %d", dH, dW);

  TORCH_CHECK(
      dilationW > 0 && dilationH > 0,
      "dilation should be greater than 0, but got dilationH: %d dilationW: %d",
      dilationH, dilationW);

  int ndim = input.ndimension();
  int dimf = 0;
  int dimh = 1;
  int dimw = 2;

  if (ndim == 4) {
    dimf++;
    dimh++;
    dimw++;
  }

  TORCH_CHECK(ndim == 3 || ndim == 4, "3D or 4D input tensor expected but got: %s",
           ndim);

  long nInputPlane = weight.size(1) * group;
  long inputHeight = input.size(dimh);
  long inputWidth = input.size(dimw);
  long nOutputPlane = weight.size(0);
  long outputHeight =
      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;
  long outputWidth =
      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;

  TORCH_CHECK(nInputPlane % deformable_group == 0,
           "input channels must divide deformable group size");

  if (outputWidth < 1 || outputHeight < 1)
    AT_ERROR(
        "Given input size: (%ld x %ld x %ld). "
        "Calculated output size: (%ld x %ld x %ld). Output size is too small",
        nInputPlane, inputHeight, inputWidth, nOutputPlane, outputHeight,
        outputWidth);

  TORCH_CHECK(input.size(1) == nInputPlane,
           "invalid number of input planes, expected: %d, but got: %d",
           nInputPlane, input.size(1));

  TORCH_CHECK((inputHeight >= kH && inputWidth >= kW),
           "input image is smaller than kernel");

  TORCH_CHECK((offset.size(2) == outputHeight && offset.size(3) == outputWidth),
           "invalid spatial size of offset, expected height: %d width: %d, but "
           "got height: %d width: %d",
           outputHeight, outputWidth, offset.size(2), offset.size(3));

  TORCH_CHECK((offset.size(1) == deformable_group * 2 * kH * kW),
           "invalid number of channels of offset");

  if (gradOutput != NULL) {
    TORCH_CHECK(gradOutput->size(dimf) == nOutputPlane,
             "invalid number of gradOutput planes, expected: %d, but got: %d",
             nOutputPlane, gradOutput->size(dimf));

    TORCH_CHECK((gradOutput->size(dimh) == outputHeight &&
              gradOutput->size(dimw) == outputWidth),
             "invalid size of gradOutput, expected height: %d width: %d , but "
             "got height: %d width: %d",
             outputHeight, outputWidth, gradOutput->size(dimh),
             gradOutput->size(dimw));
  }
}

int deform_conv_forward_cuda(at::Tensor input, at::Tensor weight,
                             at::Tensor offset, at::Tensor output,
                             at::Tensor columns, at::Tensor ones, int kW,
                             int kH, int dW, int dH, int padW, int padH,
                             int dilationW, int dilationH, int group,
                             int deformable_group, int im2col_step) 
{
  // todo: resize columns to include im2col: done
  // todo: add im2col_step as input
  // todo: add new output buffer and transpose it to output (or directly
  //       transpose output)
  // todo: possibly change data indexing because of parallel_imgs

  shape_check(input, offset, NULL, weight, kH, kW, dH, dW, padH, padW,
              dilationH, dilationW, group, deformable_group);

  input = input.contiguous();
  offset = offset.contiguous();
  weight = weight.contiguous();

  int batch = 1;
  if (input.ndimension() == 3) {
    // Force batch
    batch = 0;
    input.unsqueeze_(0);
    offset.unsqueeze_(0);
  }

  // todo: assert batch size divisible by im2col_step

  long batchSize = input.size(0);
  long nInputPlane = input.size(1);
  long inputHeight = input.size(2);
  long inputWidth = input.size(3);

  long nOutputPlane = weight.size(0);

  long outputWidth =
      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;
  long outputHeight =
      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;

  TORCH_CHECK((offset.size(0) == batchSize), "invalid batch size of offset");

  output = output.view({batchSize / im2col_step, im2col_step, nOutputPlane,
                        outputHeight, outputWidth});
  columns = at::zeros(
      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},
      input.options());

  if (ones.ndimension() != 2 ||
      ones.size(0) * ones.size(1) < outputHeight * outputWidth) {
    ones = at::ones({outputHeight, outputWidth}, input.options());
  }

  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,
                      inputHeight, inputWidth});
  offset =
      offset.view({batchSize / im2col_step, im2col_step,
                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  at::Tensor output_buffer =
      at::zeros({batchSize / im2col_step, nOutputPlane,
                 im2col_step * outputHeight, outputWidth},
                output.options());

  output_buffer = output_buffer.view(
      {output_buffer.size(0), group, output_buffer.size(1) / group,
       output_buffer.size(2), output_buffer.size(3)});

  for (int elt = 0; elt < batchSize / im2col_step; elt++) {
    deformable_im2col(input[elt], offset[elt], nInputPlane, inputHeight,
                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,
                      dilationW, im2col_step, deformable_group, columns);

    columns = columns.view({group, columns.size(0) / group, columns.size(1)});
    weight = weight.view({group, weight.size(0) / group, weight.size(1),
                          weight.size(2), weight.size(3)});

    for (int g = 0; g < group; g++) {
      output_buffer[elt][g] = output_buffer[elt][g]
                                  .flatten(1)
                                  .addmm_(weight[g].flatten(1), columns[g])
                                  .view_as(output_buffer[elt][g]);
    }
  }

  output_buffer = output_buffer.view(
      {output_buffer.size(0), output_buffer.size(1) * output_buffer.size(2),
       output_buffer.size(3), output_buffer.size(4)});

  output_buffer = output_buffer.view({batchSize / im2col_step, nOutputPlane,
                                      im2col_step, outputHeight, outputWidth});
  output_buffer.transpose_(1, 2);
  output.copy_(output_buffer);
  output = output.view({batchSize, nOutputPlane, outputHeight, outputWidth});

  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});
  offset = offset.view(
      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  if (batch == 0) {
    output = output.view({nOutputPlane, outputHeight, outputWidth});
    input = input.view({nInputPlane, inputHeight, inputWidth});
    offset = offset.view({offset.size(1), offset.size(2), offset.size(3)});
  }

  return 1;
}

int deform_conv_backward_input_cuda(at::Tensor input, at::Tensor offset,
                                    at::Tensor gradOutput, at::Tensor gradInput,
                                    at::Tensor gradOffset, at::Tensor weight,
                                    at::Tensor columns, int kW, int kH, int dW,
                                    int dH, int padW, int padH, int dilationW,
                                    int dilationH, int group,
                                    int deformable_group, int im2col_step) 
{
  shape_check(input, offset, &gradOutput, weight, kH, kW, dH, dW, padH, padW,
              dilationH, dilationW, group, deformable_group);

  input = input.contiguous();
  offset = offset.contiguous();
  gradOutput = gradOutput.contiguous();
  weight = weight.contiguous();

  int batch = 1;

  if (input.ndimension() == 3) {
    // Force batch
    batch = 0;
    input = input.view({1, input.size(0), input.size(1), input.size(2)});
    offset = offset.view({1, offset.size(0), offset.size(1), offset.size(2)});
    gradOutput = gradOutput.view(
        {1, gradOutput.size(0), gradOutput.size(1), gradOutput.size(2)});
  }

  long batchSize = input.size(0);
  long nInputPlane = input.size(1);
  long inputHeight = input.size(2);
  long inputWidth = input.size(3);

  long nOutputPlane = weight.size(0);

  long outputWidth =
      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;
  long outputHeight =
      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;

  TORCH_CHECK((offset.size(0) == batchSize), "invalid batch size of offset");
  gradInput = gradInput.view({batchSize, nInputPlane, inputHeight, inputWidth});
  columns = at::zeros(
      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},
      input.options());

  // change order of grad output
  gradOutput = gradOutput.view({batchSize / im2col_step, im2col_step,
                                nOutputPlane, outputHeight, outputWidth});
  gradOutput.transpose_(1, 2);

  gradInput = gradInput.view({batchSize / im2col_step, im2col_step, nInputPlane,
                              inputHeight, inputWidth});
  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,
                      inputHeight, inputWidth});
  gradOffset = gradOffset.view({batchSize / im2col_step, im2col_step,
                                deformable_group * 2 * kH * kW, outputHeight,
                                outputWidth});
  offset =
      offset.view({batchSize / im2col_step, im2col_step,
                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  for (int elt = 0; elt < batchSize / im2col_step; elt++) {
    // divide into groups
    columns = columns.view({group, columns.size(0) / group, columns.size(1)});
    weight = weight.view({group, weight.size(0) / group, weight.size(1),
                          weight.size(2), weight.size(3)});
    gradOutput = gradOutput.view(
        {gradOutput.size(0), group, gradOutput.size(1) / group,
         gradOutput.size(2), gradOutput.size(3), gradOutput.size(4)});

    for (int g = 0; g < group; g++) {
      columns[g] = columns[g].addmm_(weight[g].flatten(1).transpose(0, 1),
                                     gradOutput[elt][g].flatten(1), 0.0f, 1.0f);
    }

    columns =
        columns.view({columns.size(0) * columns.size(1), columns.size(2)});
    gradOutput = gradOutput.view(
        {gradOutput.size(0), gradOutput.size(1) * gradOutput.size(2),
         gradOutput.size(3), gradOutput.size(4), gradOutput.size(5)});

    deformable_col2im_coord(columns, input[elt], offset[elt], nInputPlane,
                            inputHeight, inputWidth, kH, kW, padH, padW, dH, dW,
                            dilationH, dilationW, im2col_step, deformable_group,
                            gradOffset[elt]);

    deformable_col2im(columns, offset[elt], nInputPlane, inputHeight,
                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,
                      dilationW, im2col_step, deformable_group, gradInput[elt]);
  }

  gradOutput.transpose_(1, 2);
  gradOutput =
      gradOutput.view({batchSize, nOutputPlane, outputHeight, outputWidth});

  gradInput = gradInput.view({batchSize, nInputPlane, inputHeight, inputWidth});
  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});
  gradOffset = gradOffset.view(
      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});
  offset = offset.view(
      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  if (batch == 0) {
    gradOutput = gradOutput.view({nOutputPlane, outputHeight, outputWidth});
    input = input.view({nInputPlane, inputHeight, inputWidth});
    gradInput = gradInput.view({nInputPlane, inputHeight, inputWidth});
    offset = offset.view({offset.size(1), offset.size(2), offset.size(3)});
    gradOffset =
        gradOffset.view({offset.size(1), offset.size(2), offset.size(3)});
  }

  return 1;
}
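
// Summary of the backward-input pass above: for each im2col_step chunk, the
// per-group GEMM columns[g] = weight[g]^T @ gradOutput[elt][g] (addmm_ with
// beta = 0, alpha = 1) rebuilds the column buffer from the output gradient;
// deformable_col2im_coord then scatters it into gradOffset, and
// deformable_col2im scatters it into gradInput through the bilinear-sampling
// weights.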

int deform_conv_backward_parameters_cuda(
    at::Tensor input, at::Tensor offset, at::Tensor gradOutput,
    at::Tensor gradWeight,  // at::Tensor gradBias,
    at::Tensor columns, at::Tensor ones, int kW, int kH, int dW, int dH,
    int padW, int padH, int dilationW, int dilationH, int group,
    int deformable_group, float scale, int im2col_step) 
{
  // todo: transpose and reshape outGrad
  // todo: reshape columns
  // todo: add im2col_step as input

  shape_check(input, offset, &gradOutput, gradWeight, kH, kW, dH, dW, padH,
              padW, dilationH, dilationW, group, deformable_group);

  input = input.contiguous();
  offset = offset.contiguous();
  gradOutput = gradOutput.contiguous();

  int batch = 1;

  if (input.ndimension() == 3) {
    // Force batch
    batch = 0;
    input = input.view(
        at::IntList({1, input.size(0), input.size(1), input.size(2)}));
    gradOutput = gradOutput.view(
        {1, gradOutput.size(0), gradOutput.size(1), gradOutput.size(2)});
  }

  long batchSize = input.size(0);
  long nInputPlane = input.size(1);
  long inputHeight = input.size(2);
  long inputWidth = input.size(3);

  long nOutputPlane = gradWeight.size(0);

  long outputWidth =
      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;
  long outputHeight =
      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;

  TORCH_CHECK((offset.size(0) == batchSize), "invalid batch size of offset");

  columns = at::zeros(
      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},
      input.options());

  gradOutput = gradOutput.view({batchSize / im2col_step, im2col_step,
                                nOutputPlane, outputHeight, outputWidth});
  gradOutput.transpose_(1, 2);

  at::Tensor gradOutputBuffer = at::zeros_like(gradOutput);
  gradOutputBuffer =
      gradOutputBuffer.view({batchSize / im2col_step, nOutputPlane, im2col_step,
                             outputHeight, outputWidth});
  gradOutputBuffer.copy_(gradOutput);
  gradOutputBuffer =
      gradOutputBuffer.view({batchSize / im2col_step, nOutputPlane,
                             im2col_step * outputHeight, outputWidth});

  gradOutput.transpose_(1, 2);
  gradOutput =
      gradOutput.view({batchSize, nOutputPlane, outputHeight, outputWidth});

  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,
                      inputHeight, inputWidth});
  offset =
      offset.view({batchSize / im2col_step, im2col_step,
                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  for (int elt = 0; elt < batchSize / im2col_step; elt++) {
    deformable_im2col(input[elt], offset[elt], nInputPlane, inputHeight,
                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,
                      dilationW, im2col_step, deformable_group, columns);

    // divide into groups
    gradOutputBuffer = gradOutputBuffer.view(
        {gradOutputBuffer.size(0), group, gradOutputBuffer.size(1) / group,
         gradOutputBuffer.size(2), gradOutputBuffer.size(3)});
    columns = columns.view({group, columns.size(0) / group, columns.size(1)});
    gradWeight =
        gradWeight.view({group, gradWeight.size(0) / group, gradWeight.size(1),
                         gradWeight.size(2), gradWeight.size(3)});

    for (int g = 0; g < group; g++) {
      gradWeight[g] = gradWeight[g]
                          .flatten(1)
                          .addmm_(gradOutputBuffer[elt][g].flatten(1),
                                  columns[g].transpose(1, 0), 1.0, scale)
                          .view_as(gradWeight[g]);
    }
    gradOutputBuffer = gradOutputBuffer.view(
        {gradOutputBuffer.size(0),
         gradOutputBuffer.size(1) * gradOutputBuffer.size(2),
         gradOutputBuffer.size(3), gradOutputBuffer.size(4)});
    columns =
        columns.view({columns.size(0) * columns.size(1), columns.size(2)});
    gradWeight = gradWeight.view({gradWeight.size(0) * gradWeight.size(1),
                                  gradWeight.size(2), gradWeight.size(3),
                                  gradWeight.size(4)});
  }

  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});
  offset = offset.view(
      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});

  if (batch == 0) {
    gradOutput = gradOutput.view({nOutputPlane, outputHeight, outputWidth});
    input = input.view({nInputPlane, inputHeight, inputWidth});
  }

  return 1;
}
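
// Note on the weight gradient: deformable_im2col recomputes the column
// buffer from the saved input and offset, and the per-group GEMM
// gradWeight[g] += scale * gradOutputBuffer[elt][g] @ columns[g]^T
// (addmm_ with beta = 1, alpha = scale) accumulates across all im2col_step
// chunks, so gradWeight is expected to arrive zero-initialized.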

void modulated_deform_conv_cuda_forward(
    at::Tensor input, at::Tensor weight, at::Tensor bias, at::Tensor ones,
    at::Tensor offset, at::Tensor mask, at::Tensor output, at::Tensor columns,
    int kernel_h, int kernel_w, const int stride_h, const int stride_w,
    const int pad_h, const int pad_w, const int dilation_h,
    const int dilation_w, const int group, const int deformable_group,
    const bool with_bias) 
{
  TORCH_CHECK(input.is_contiguous(), "input tensor has to be contiguous");
  TORCH_CHECK(weight.is_contiguous(), "weight tensor has to be contiguous");

  const int batch = input.size(0);
  const int channels = input.size(1);
  const int height = input.size(2);
  const int width = input.size(3);

  const int channels_out = weight.size(0);
  const int channels_kernel = weight.size(1);
  const int kernel_h_ = weight.size(2);
  const int kernel_w_ = weight.size(3);

  if (kernel_h_ != kernel_h || kernel_w_ != kernel_w)
    AT_ERROR("Input shape and kernel shape won't match: (%d x %d vs %d x %d).",
             kernel_h, kernel_w, kernel_h_, kernel_w_);
  if (channels != channels_kernel * group)
    AT_ERROR("Input shape and kernel channels won't match: (%d vs %d).",
             channels, channels_kernel * group);

  const int height_out =
      (height + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;
  const int width_out =
      (width + 2 * pad_w - (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;

  if (ones.ndimension() != 2 ||
      ones.size(0) * ones.size(1) < height_out * width_out) {
    // Resize plane and fill with ones...
    ones = at::ones({height_out, width_out}, input.options());
  }

  // resize output
  output = output.view({batch, channels_out, height_out, width_out}).zero_();
  // resize temporary columns
  columns =
      at::zeros({channels * kernel_h * kernel_w, 1 * height_out * width_out},
                input.options());

  output = output.view({output.size(0), group, output.size(1) / group,
                        output.size(2), output.size(3)});

  for (int b = 0; b < batch; b++) {
    modulated_deformable_im2col_cuda(
        input[b], offset[b], mask[b], 1, channels, height, width, height_out,
        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,
        dilation_h, dilation_w, deformable_group, columns);

    // divide into groups
    weight = weight.view({group, weight.size(0) / group, weight.size(1),
                          weight.size(2), weight.size(3)});
    columns = columns.view({group, columns.size(0) / group, columns.size(1)});

    for (int g = 0; g < group; g++) {
      output[b][g] = output[b][g]
                         .flatten(1)
                         .addmm_(weight[g].flatten(1), columns[g])
                         .view_as(output[b][g]);
    }

    weight = weight.view({weight.size(0) * weight.size(1), weight.size(2),
                          weight.size(3), weight.size(4)});
    columns =
        columns.view({columns.size(0) * columns.size(1), columns.size(2)});
  }

  output = output.view({output.size(0), output.size(1) * output.size(2),
                        output.size(3), output.size(4)});

  if (with_bias) {
    output += bias.view({1, bias.size(0), 1, 1});
  }
}
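
// Forward recap: for every batch element, modulated_deformable_im2col_cuda
// gathers mask-weighted bilinear samples into `columns`, and the per-group
// GEMM output[b][g] = weight[g] @ columns[g] produces the convolution
// result; the bias is broadcast-added over the spatial dimensions at the end.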

void modulated_deform_conv_cuda_backward(
    at::Tensor input, at::Tensor weight, at::Tensor bias, at::Tensor ones,
    at::Tensor offset, at::Tensor mask, at::Tensor columns,
    at::Tensor grad_input, at::Tensor grad_weight, at::Tensor grad_bias,
    at::Tensor grad_offset, at::Tensor grad_mask, at::Tensor grad_output,
    int kernel_h, int kernel_w, int stride_h, int stride_w, int pad_h,
    int pad_w, int dilation_h, int dilation_w, int group, int deformable_group,
    const bool with_bias) 
{
  TORCH_CHECK(input.is_contiguous(), "input tensor has to be contiguous");
  TORCH_CHECK(weight.is_contiguous(), "weight tensor has to be contiguous");

  const int batch = input.size(0);
  const int channels = input.size(1);
  const int height = input.size(2);
  const int width = input.size(3);

  const int channels_kernel = weight.size(1);
  const int kernel_h_ = weight.size(2);
  const int kernel_w_ = weight.size(3);
  if (kernel_h_ != kernel_h || kernel_w_ != kernel_w)
    AT_ERROR("Input shape and kernel shape won't match: (%d x %d vs %d x %d).",
             kernel_h, kernel_w, kernel_h_, kernel_w_);
  if (channels != channels_kernel * group)
    AT_ERROR("Input shape and kernel channels won't match: (%d vs %d).",
             channels, channels_kernel * group);

  const int height_out =
      (height + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;
  const int width_out =
      (width + 2 * pad_w - (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;

  if (ones.ndimension() != 2 ||
      ones.size(0) * ones.size(1) < height_out * width_out) {
    // Resize plane and fill with ones...
    ones = at::ones({height_out, width_out}, input.options());
  }

  grad_input = grad_input.view({batch, channels, height, width});
  columns = at::zeros({channels * kernel_h * kernel_w, height_out * width_out},
                      input.options());

  grad_output =
      grad_output.view({grad_output.size(0), group, grad_output.size(1) / group,
                        grad_output.size(2), grad_output.size(3)});

  for (int b = 0; b < batch; b++) {
    // divide into groups
    columns = columns.view({group, columns.size(0) / group, columns.size(1)});
    weight = weight.view({group, weight.size(0) / group, weight.size(1),
                          weight.size(2), weight.size(3)});

    for (int g = 0; g < group; g++) {
      columns[g].addmm_(weight[g].flatten(1).transpose(0, 1),
                        grad_output[b][g].flatten(1), 0.0f, 1.0f);
    }

    columns =
        columns.view({columns.size(0) * columns.size(1), columns.size(2)});
    weight = weight.view({weight.size(0) * weight.size(1), weight.size(2),
                          weight.size(3), weight.size(4)});

    // gradient w.r.t. input coordinate data
    modulated_deformable_col2im_coord_cuda(
        columns, input[b], offset[b], mask[b], 1, channels, height, width,
        height_out, width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h,
        stride_w, dilation_h, dilation_w, deformable_group, grad_offset[b],
        grad_mask[b]);
    // gradient w.r.t. input data
    modulated_deformable_col2im_cuda(
        columns, offset[b], mask[b], 1, channels, height, width, height_out,
        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,
        dilation_h, dilation_w, deformable_group, grad_input[b]);

    // gradient w.r.t. weight, dWeight should accumulate across the batch and
    // group
    modulated_deformable_im2col_cuda(
        input[b], offset[b], mask[b], 1, channels, height, width, height_out,
        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,
        dilation_h, dilation_w, deformable_group, columns);

    columns = columns.view({group, columns.size(0) / group, columns.size(1)});
    grad_weight = grad_weight.view({group, grad_weight.size(0) / group,
                                    grad_weight.size(1), grad_weight.size(2),
                                    grad_weight.size(3)});
    if (with_bias)
      grad_bias = grad_bias.view({group, grad_bias.size(0) / group});

    for (int g = 0; g < group; g++) {
      grad_weight[g] =
          grad_weight[g]
              .flatten(1)
              .addmm_(grad_output[b][g].flatten(1), columns[g].transpose(0, 1))
              .view_as(grad_weight[g]);
      if (with_bias) {
        grad_bias[g] =
            grad_bias[g]
                .view({-1, 1})
                .addmm_(grad_output[b][g].flatten(1), ones.view({-1, 1}))
                .view(-1);
      }
    }

    columns =
        columns.view({columns.size(0) * columns.size(1), columns.size(2)});
    grad_weight = grad_weight.view({grad_weight.size(0) * grad_weight.size(1),
                                    grad_weight.size(2), grad_weight.size(3),
                                    grad_weight.size(4)});
    if (with_bias)
      grad_bias = grad_bias.view({grad_bias.size(0) * grad_bias.size(1)});
  }
  grad_output = grad_output.view({grad_output.size(0) * grad_output.size(1),
                                  grad_output.size(2), grad_output.size(3),
                                  grad_output.size(4)});
}
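
/*
 * Shape sketch for the modulated (DCNv2) entry points above, with
 * illustrative values not taken from this repo: input [N=2, C=64, H=56,
 * W=56], weight [C_out=64, C/group, 3, 3], stride 1, pad 1, dilation 1,
 * group=1, deformable_group=1 gives height_out = width_out = 56, so
 *   offset  [2, 2*3*3 = 18, 56, 56]   (one (dy, dx) pair per kernel tap)
 *   mask    [2, 3*3 = 9, 56, 56]      (one modulation scalar per tap)
 *   columns [64*3*3 = 576, 56*56 = 3136]
 *   output  [2, 64, 56, 56]
 * The backward pass reuses the same `columns` buffer for the offset/mask,
 * input, and weight gradients in turn.
 */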


================================================
FILE: GLIP/maskrcnn_benchmark/csrc/cuda/deform_conv_kernel_cuda.cu
================================================
/*!
 ******************* BEGIN Caffe Copyright Notice and Disclaimer ****************
 *
 * COPYRIGHT
 *
 * All contributions by the University of California:
 * Copyright (c) 2014-2017 The Regents of the University of California (Regents)
 * All rights reserved.
 *
 * All other contributions:
 * Copyright (c) 2014-2017, the respective contributors
 * All rights reserved.
 *
 * Caffe uses a shared copyright model: each contributor holds copyright over
 * their contributions to Caffe. The project versioning records all such
 * contribution and copyright details. If a contributor wants to further mark
 * their specific copyright on a particular contribution, they should indicate
 * their copyright solely in the commit message of the change when it is
 * committed.
 *
 * LICENSE
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright notice, this
 * list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright notice,
 * this list of conditions and the following disclaimer in the documentation
 * and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
 * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * CONTRIBUTION AGREEMENT
 *
 * By contributing to the BVLC/caffe repository through pull-request, comment,
 * or otherwise, the contributor releases their content to the
 * license and copyright terms herein.
 *
 ***************** END Caffe Copyright Notice and Disclaimer ********************
 *
 * Copyright (c) 2018 Microsoft
 * Licensed under The MIT License [see LICENSE for details]
 * \file modulated_deformable_im2col.cuh
 * \brief Function definitions of converting an image to
 * column matrix based on kernel, padding, dilation, and offset.
 * These functions are mainly used in deformable convolution operators.
 * \ref: https://arxiv.org/abs/1703.06211
 * \author Yuwen Xiong, Haozhi Qi, Jifeng Dai, Xizhou Zhu, Han Hu, Dazhi Cheng
 */

// modified from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda_kernel.cu


#include <ATen/ATen.h>
#include <THC/THCAtomics.cuh>
#include <stdio.h>
#include <math.h>
#include <float.h>

using namespace at;

#define CUDA_KERNEL_LOOP(i, n)                                 \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
       i += blockDim.x * gridDim.x)

const int CUDA_NUM_THREADS = 1024;
const int kMaxGridNum = 65535;
inline int GET_BLOCKS(const int N)
{
  return std::min(kMaxGridNum, (N + CUDA_NUM_THREADS - 1) / CUDA_NUM_THREADS);
}
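
// Worked example: for N = 1,000,000 elements, GET_BLOCKS returns
// min(65535, (1000000 + 1023) / 1024) = 977 blocks of 1024 threads.
// When N exceeds 65535 * 1024, the grid is clamped and CUDA_KERNEL_LOOP's
// grid-stride loop makes each thread process multiple elements.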

/*
const int CUDA_NUM_THREADS = 1024;

inline int GET_BLOCKS(const int N)
{
  return (N + CUDA_NUM_THREADS - 1) / CUDA_NUM_THREADS;
}*/

template <typename scalar_t>
__device__ scalar_t deformable_im2col_bilinear(const scalar_t *bottom_data, const int data_width,
                                               const int height, const int width, scalar_t h, scalar_t w)
{

  int h_low = floor(h);
  int w_low = floor(w);
  int h_high = h_low + 1;
  int w_high = w_low + 1;

  scalar_t lh = h - h_low;
  scalar_t lw = w - w_low;
  scalar_t hh = 1 - lh, hw = 1 - lw;

  scalar_t v1 = 0;
  if (h_low >= 0 && w_low >= 0)
    v1 = bottom_data[h_low * data_width + w_low];
  scalar_t v2 = 0;
  if (h_low >= 0 && w_high <= width - 1)
    v2 = bottom_data[h_low * data_width + w_high];
  scalar_t v3 = 0;
  if (h_high <= height - 1 && w_low >= 0)
    v3 = bottom_data[h_high * data_width + w_low];
  scalar_t v4 = 0;
  if (h_high <= height - 1 && w_high <= width - 1)
    v4 = bottom_data[h_high * data_width + w_high];

  scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;

  scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
  return val;
}
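
// Bilinear interpolation recap: the four weights are products of the
// fractional distances and sum to 1. E.g. for (h, w) = (1.3, 2.6):
// lh = 0.3, lw = 0.6, so w1..w4 = 0.28, 0.42, 0.12, 0.18 over pixels
// (1,2), (1,3), (2,2), (2,3); corners outside the image contribute 0.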

template <typename scalar_t>
__device__ scalar_t get_gradient_weight(scalar_t argmax_h, scalar_t argmax_w,
                                        const int h, const int w, const int height, const int width)
{

  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)
  {
    // sampling point falls outside the image: gradient weight is zero
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  scalar_t weight = 0;
  if (h == argmax_h_low && w == argmax_w_low)
    weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);
  if (h == argmax_h_low && w == argmax_w_high)
    weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);
  if (h == argmax_h_high && w == argmax_w_low)
    weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);
  if (h == argmax_h_high && w == argmax_w_high)
    weight = (argmax_h + 1 - h) * (argmax_w + 1 - w);
  return weight;
}

template <typename scalar_t>
__device__ scalar_t get_coordinate_weight(scalar_t argmax_h, scalar_t argmax_w,
                                          const int height, const int width, const scalar_t *im_data,
                                          const int data_width, const int bp_dir)
{

  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)
  {
    // sampling point falls outside the image: no coordinate gradient
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  scalar_t weight = 0;

  if (bp_dir == 0)
  {
    if (argmax_h_low >= 0 && argmax_w_low >= 0)
      weight += -1 * (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_low * data_width + argmax_w_low];
    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
      weight += -1 * (argmax_w - argmax_w_low) * im_data[argmax_h_low * data_width + argmax_w_high];
    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
      weight += (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_high * data_width + argmax_w_low];
    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
      weight += (argmax_w - argmax_w_low) * im_data[argmax_h_high * data_width + argmax_w_high];
  }
  else if (bp_dir == 1)
  {
    if (argmax_h_low >= 0 && argmax_w_low >= 0)
      weight += -1 * (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_low];
    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
      weight += (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_high];
    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
      weight += -1 * (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_low];
    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
      weight += (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_high];
  }

  return weight;
}
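
// get_coordinate_weight returns the analytic derivative of the bilinear
// sample with respect to the sampling coordinate: bp_dir == 0 gives
// d(val)/d(h), bp_dir == 1 gives d(val)/d(w). For example, the top-left
// pixel's weight is (h_low + 1 - h) * (w_low + 1 - w), whose h-derivative
// is -(w_low + 1 - w), matching the first branch above.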

template <typename scalar_t>
__global__ void deformable_im2col_gpu_kernel(const int n, const scalar_t *data_im, const scalar_t *data_offset,
                                             const int height, const int width, const int kernel_h, const int kernel_w,
                                             const int pad_h, const int pad_w, const int stride_h, const int stride_w,
                                             const int dilation_h, const int dilation_w, const int channel_per_deformable_group,
                                             const int batch_size, const int num_channels, const int deformable_group,
                                             const int height_col, const int width_col,
                                             scalar_t *data_col)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    // index is the linear index into the output (column) matrix
    const int w_col = index % width_col;
    const int h_col = (index / width_col) % height_col;
    const int b_col = (index / width_col / height_col) % batch_size;
    const int c_im = (index / width_col / height_col) / batch_size;
    const int c_col = c_im * kernel_h * kernel_w;

    // compute deformable group index
    const int deformable_group_index = c_im / channel_per_deformable_group;

    const int h_in = h_col * stride_h - pad_h;
    const int w_in = w_col * stride_w - pad_w;
    scalar_t *data_col_ptr = data_col + ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;
    //const scalar_t* data_im_ptr = data_im + ((b_col * num_channels + c_im) * height + h_in) * width + w_in;
    const scalar_t *data_im_ptr = data_im + (b_col * num_channels + c_im) * height * width;
    const scalar_t *data_offset_ptr = data_offset + (b_col * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;

    for (int i = 0; i < kernel_h; ++i)
    {
      for (int j = 0; j < kernel_w; ++j)
      {
        const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;
        const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col + w_col;
        const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];
        const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];
        scalar_t val = static_cast<scalar_t>(0);
        const scalar_t h_im = h_in + i * dilation_h + offset_h;
        const scalar_t w_im = w_in + j * dilation_w + offset_w;
        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)
        {
          //const scalar_t map_h = i * dilation_h + offset_h;
          //const scalar_t map_w = j * dilation_w + offset_w;
          //const int cur_height = height - h_in;
          //const int cur_width = width - w_in;
          //val = deformable_im2col_bilinear(data_im_ptr, width, cur_height, cur_width, map_h, map_w);
          val = deformable_im2col_bilinear(data_im_ptr, width, height, width, h_im, w_im);
        }
        *data_col_ptr = val;
        data_col_ptr += batch_size * height_col * width_col;
      }
    }
  }
}
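
// Thread mapping: each thread owns one (input channel, image, output row,
// output column) tuple and loops over the kernel_h * kernel_w taps, writing
// consecutive taps batch_size * height_col * width_col apart in data_col,
// so the column buffer is laid out as
// [channels * kernel_h * kernel_w, batch_size * height_col * width_col].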

void deformable_im2col(
    const at::Tensor data_im, const at::Tensor data_offset, const int channels,
    const int height, const int width, const int ksize_h, const int ksize_w,
    const int pad_h, const int pad_w, const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w, const int parallel_imgs,
    const int deformable_group, at::Tensor data_col)
{
  // num_axes should be smaller than block size
  // todo: check parallel_imgs is correctly passed in
  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;
  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;
  int num_kernels = channels * height_col * width_col * parallel_imgs;
  int channel_per_deformable_group = channels / deformable_group;

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      data_im.scalar_type(), "deformable_im2col_gpu", ([&] {
        const scalar_t *data_im_ = data_im.data_ptr<scalar_t>();
        const scalar_t *data_offset_ = data_offset.data_ptr<scalar_t>();
        scalar_t *data_col_ = data_col.data_ptr<scalar_t>();

        deformable_im2col_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS>>>(
            num_kernels, data_im_, data_offset_, height, width, ksize_h, ksize_w,
            pad_h, pad_w, stride_h, stride_w, dilation_h, dilation_w,
            channel_per_deformable_group, parallel_imgs, channels, deformable_group,
            height_col, width_col, data_col_);
      }));

  cudaError_t err = cudaGetLastError();
  if (err != cudaSuccess)
  {
    printf("error in deformable_im2col: %s\n", cudaGetErrorString(err));
  }
}
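
// Output-size arithmetic used above:
// height_col = (height + 2*pad_h - (dilation_h*(ksize_h-1)+1)) / stride_h + 1.
// E.g. height = 32, pad_h = 1, ksize_h = 3, dilation_h = 1, stride_h = 2
// gives (32 + 2 - 3)/2 + 1 = 16. One thread is launched per column-buffer
// element (channels * height_col * width_col * parallel_imgs in total).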

template <typename scalar_t>
__global__ void deformable_col2im_gpu_kernel(
    const int n, const scalar_t *data_col, const scalar_t *data_offset,
    const int channels, const int height, const int width,
    const int kernel_h, const int kernel_w,
    const int pad_h, const int pad_w,
    const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w,
    const int channel_per_deformable_group,
    const int batch_size, const int deformable_group,
    const int height_col, const int width_col,
    scalar_t *grad_im)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    const int j = (index / width_col / height_col / batch_size) % kernel_w;
    const int i = (index / width_col / height_col / batch_size / kernel_w) % kernel_h;
    const int c = index / width_col / height_col / batch_size / kernel_w / kernel_h;
    // the spatial location and batch index are decoded next

    const int deformable_group_index = c / channel_per_deformable_group;

    int w_out = index % width_col;
    int h_out = (index / width_col) % height_col;
    int b = (index / width_col / height_col) % batch_size;
    int w_in = w_out * stride_w - pad_w;
    int h_in = h_out * stride_h - pad_h;

    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) *
                                                        2 * kernel_h * kernel_w * height_col * width_col;
    const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
    const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;
    const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];
    const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];
    const scalar_t cur_inv_h_data = h_in + i * dilation_h + offset_h;
    const scalar_t cur_inv_w_data = w_in + j * dilation_w + offset_w;

    const scalar_t cur_top_grad = data_col[index];
    const int cur_h = (int)cur_inv_h_data;
    const int cur_w = (int)cur_inv_w_data;
    for (int dy = -2; dy <= 2; dy++)
    {
      for (int dx = -2; dx <= 2; dx++)
      {
        if (cur_h + dy >= 0 && cur_h + dy < height &&
            cur_w + dx >= 0 && cur_w + dx < width &&
            abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
            abs(cur_inv_w_data - (cur_w + dx)) < 1)
        {
          int cur_bottom_grad_pos = ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
          scalar_t weight = get_gradient_weight(cur_inv_h_data, cur_inv_w_data, cur_h + dy, cur_w + dx, height, width);
          atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);
        }
      }
    }
  }
}
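
// Scatter notes: (int) truncation of the fractional sampling coordinate plus
// the dy, dx in [-2, 2] window with the |coord - neighbor| < 1 guard visits
// exactly the integer pixels inside the bilinear footprint (including the
// negative-coordinate case, where truncation rounds toward zero); atomicAdd
// is required because different kernel taps and output locations can
// accumulate into the same input pixel of grad_im.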

void deformable_col2im(
    const at::Tensor data_col, const at::Tensor data_offset, const int channels,
    const int height, const int width, const int ksize_h,
    const int ksize_w, const int pad_h, const int pad_w,
    const int stride_h, const int stride_w,
    const int dilation_h, const int dilation_w,
    const int parallel_imgs, const int deformable_group,
    at::Tensor grad_im)
{

  // todo: make sure parallel_imgs is passed in correctly
  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;
  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;
  int num_kernels = channels * ksize_h * ksize_w * height_col * width_col * parallel_imgs;
  int channel_per_deformable_group = channels / deformable_group;

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      data_col.scalar_type(), "deformable_col2im_gpu", ([&] {
        const scalar_t *data_col_ = data_col.data_ptr<scalar_t>();
        const scalar_t *data_offset_ = data_offset.data_ptr<scalar_t>();
        scalar_t *grad_im_ = grad_im.data_ptr<scalar_t>();

        deformable_col2im_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS>>>(
            num_kernels, data_col_, data_offset_, channels, height, width, ksize_h,
            ksize_w, pad_h, pad_w, stride_h, stride_w,
            dilation_h, dilation_w, channel_per_deformable_group,
            parallel_imgs, deformable_group, height_col, width_col, grad_im_);
      }));

  cudaError_t err = cudaGetLastError();
  if (err != cudaSuccess)
  {
    printf("error in deformable_col2im: %s\n", cudaGetErrorString(err));
  }
}

template <typename scalar_t>
__global__ void deformable_col2im_coord_gpu_kernel(const int n, const scalar_t *data_col,
                                                   const scalar_t *data_im, const scalar_t *data_offset,
                                                   const int channels, const int height, const int width,
                                                   const int kernel_h, const int kernel_w,
                                                   const int pad_h, const int pad_w,
                                                   const int stride_h, const int stride_w,
                                                   const int dilation_h, const int dilation_w,
                                                   const int channel_per_deformable_group,
                                                   const int batch_size, const int offset_channels, const int deformable_group,
                                                   const int height_col, const int width_col, scalar_t *grad_offset)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    scalar_t val = 0;
    int w = index % width_col;
    int h = (index / width_col) % height_col;
    int c = (index / width_col / height_col) % offset_channels;
    int b = (index / width_col / height_col) / offset_channels;
    // each (h, w) offset channel pair belongs to one deformable group

    const int deformable_group_index = c / (2 * kernel_h * kernel_w);
    const int col_step = kernel_h * kernel_w;
    int cnt = 0;
    const scalar_t *data_col_ptr = data_col + deformable_group_index * channel_per_deformable_group *
                                                  batch_size * width_col * height_col;
    const scalar_t *data_im_ptr = data_im + (b * deformable_group + deformable_group_index) *
                                                channel_per_deformable_group / kernel_h / kernel_w * height * width;
    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) * 2 *
                                                        kernel_h * kernel_w * height_col * width_col;

    const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;

    for (int col_c = (offset_c / 2); col_c < channel_per_deformable_group; col_c += col_step)
    {
      const int col_pos = (((col_c * batch_size + b) * height_col) + h) * width_col + w;
      const int bp_dir = offset_c % 2;

      int j = (col_pos / width_col / height_col / batch_size) % kernel_w;
      int i = (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;
      int w_out = col_pos % width_col;
      int h_out = (col_pos / width_col) % height_col;
      int w_in = w_out * stride_w - pad_w;
      int h_in = h_out * stride_h - pad_h;
      const int data_offset_h_ptr = (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);
      const int data_offset_w_ptr = (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out);
      const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];
      const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];
      scalar_t inv_h = h_in + i * dilation_h + offset_h;
      scalar_t inv_w = w_in + j * dilation_w + offset_w;
      if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width)
      {
        inv_h = inv_w = -2;
      }
      const scalar_t weight = get_coordinate_weight(
          inv_h, inv_w,
          height, width, data_im_ptr + cnt * height * width, width, bp_dir);
      val += weight * data_col_ptr[col_pos];
      cnt += 1;
    }

    grad_offset[index] = val;
  }
}

void deformable_col2im_coord(
    const at::Tensor data_col, const at::Tensor data_im, const at::Tensor data_offset,
    const int channels, const int height, const int width, const int ksize_h,
    const int ksize_w, const int pad_h, const int pad_w, const int stride_h,
    const int stride_w, const int dilation_h, const int dilation_w,
    const int parallel_imgs, const int deformable_group, at::Tensor grad_offset)
{

  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;
  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;
  int num_kernels = height_col * width_col * 2 * ksize_h * ksize_w * deformable_group * parallel_imgs;
  int channel_per_deformable_group = channels * ksize_h * ksize_w / deformable_group;

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      data_col.scalar_type(), "deformable_col2im_coord_gpu", ([&] {
        const scalar_t *data_col_ = data_col.data_ptr<scalar_t>();
        const scalar_t *data_im_ = data_im.data_ptr<scalar_t>();
        const scalar_t *data_offset_ = data_offset.data_ptr<scalar_t>();
        scalar_t *grad_offset_ = grad_offset.data_ptr<scalar_t>();

        deformable_col2im_coord_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS>>>(
            num_kernels, data_col_, data_im_, data_offset_, channels, height, width,
            ksize_h, ksize_w, pad_h, pad_w, stride_h, stride_w,
            dilation_h, dilation_w, channel_per_deformable_group,
            parallel_imgs, 2 * ksize_h * ksize_w * deformable_group, deformable_group,
            height_col, width_col, grad_offset_);
      }));

  cudaError_t err = cudaGetLastError();
  if (err != cudaSuccess)
  {
    printf("error in deformable_col2im_coord: %s\n", cudaGetErrorString(err));
  }
}

template <typename scalar_t>
__device__ scalar_t dmcn_im2col_bilinear(const scalar_t *bottom_data, const int data_width,
                                         const int height, const int width, scalar_t h, scalar_t w)
{
  int h_low = floor(h);
  int w_low = floor(w);
  int h_high = h_low + 1;
  int w_high = w_low + 1;

  scalar_t lh = h - h_low;
  scalar_t lw = w - w_low;
  scalar_t hh = 1 - lh, hw = 1 - lw;

  scalar_t v1 = 0;
  if (h_low >= 0 && w_low >= 0)
    v1 = bottom_data[h_low * data_width + w_low];
  scalar_t v2 = 0;
  if (h_low >= 0 && w_high <= width - 1)
    v2 = bottom_data[h_low * data_width + w_high];
  scalar_t v3 = 0;
  if (h_high <= height - 1 && w_low >= 0)
    v3 = bottom_data[h_high * data_width + w_low];
  scalar_t v4 = 0;
  if (h_high <= height - 1 && w_high <= width - 1)
    v4 = bottom_data[h_high * data_width + w_high];

  scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;

  scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
  return val;
}

template <typename scalar_t>
__device__ scalar_t dmcn_get_gradient_weight(scalar_t argmax_h, scalar_t argmax_w,
                                             const int h, const int w, const int height, const int width)
{
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)
  {
    // sampling point falls outside the image: gradient weight is zero
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  scalar_t weight = 0;
  if (h == argmax_h_low && w == argmax_w_low)
    weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);
  if (h == argmax_h_low && w == argmax_w_high)
    weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);
  if (h == argmax_h_high && w == argmax_w_low)
    weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);
  if (h == argmax_h_high && w == argmax_w_high)
    weight = (argmax_h + 1 - h) * (argmax_w + 1 - w);
  return weight;
}

template <typename scalar_t>
__device__ scalar_t dmcn_get_coordinate_weight(scalar_t argmax_h, scalar_t argmax_w,
                                               const int height, const int width, const scalar_t *im_data,
                                               const int data_width, const int bp_dir)
{
  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)
  {
    // sampling point falls outside the image: no coordinate gradient
    return 0;
  }

  int argmax_h_low = floor(argmax_h);
  int argmax_w_low = floor(argmax_w);
  int argmax_h_high = argmax_h_low + 1;
  int argmax_w_high = argmax_w_low + 1;

  scalar_t weight = 0;

  if (bp_dir == 0)
  {
    if (argmax_h_low >= 0 && argmax_w_low >= 0)
      weight += -1 * (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_low * data_width + argmax_w_low];
    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
      weight += -1 * (argmax_w - argmax_w_low) * im_data[argmax_h_low * data_width + argmax_w_high];
    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
      weight += (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_high * data_width + argmax_w_low];
    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
      weight += (argmax_w - argmax_w_low) * im_data[argmax_h_high * data_width + argmax_w_high];
  }
  else if (bp_dir == 1)
  {
    if (argmax_h_low >= 0 && argmax_w_low >= 0)
      weight += -1 * (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_low];
    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
      weight += (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_high];
    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
      weight += -1 * (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_low];
    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
      weight += (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_high];
  }

  return weight;
}

template <typename scalar_t>
__global__ void modulated_deformable_im2col_gpu_kernel(const int n,
                                                       const scalar_t *data_im, const scalar_t *data_offset, const scalar_t *data_mask,
                                                       const int height, const int width, const int kernel_h, const int kernel_w,
                                                       const int pad_h, const int pad_w,
                                                       const int stride_h, const int stride_w,
                                                       const int dilation_h, const int dilation_w,
                                                       const int channel_per_deformable_group,
                                                       const int batch_size, const int num_channels, const int deformable_group,
                                                       const int height_col, const int width_col,
                                                       scalar_t *data_col)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    // index is the linear index into the output (column) matrix
    const int w_col = index % width_col;
    const int h_col = (index / width_col) % height_col;
    const int b_col = (index / width_col / height_col) % batch_size;
    const int c_im = (index / width_col / height_col) / batch_size;
    const int c_col = c_im * kernel_h * kernel_w;

    // compute deformable group index
    const int deformable_group_index = c_im / channel_per_deformable_group;

    const int h_in = h_col * stride_h - pad_h;
    const int w_in = w_col * stride_w - pad_w;

    scalar_t *data_col_ptr = data_col + ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;
    //const float* data_im_ptr = data_im + ((b_col * num_channels + c_im) * height + h_in) * width + w_in;
    const scalar_t *data_im_ptr = data_im + (b_col * num_channels + c_im) * height * width;
    const scalar_t *data_offset_ptr = data_offset + (b_col * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;

    const scalar_t *data_mask_ptr = data_mask + (b_col * deformable_group + deformable_group_index) * kernel_h * kernel_w * height_col * width_col;

    for (int i = 0; i < kernel_h; ++i)
    {
      for (int j = 0; j < kernel_w; ++j)
      {
        const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;
        const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col + w_col;
        const int data_mask_hw_ptr = ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;
        const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];
        const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];
        const scalar_t mask = data_mask_ptr[data_mask_hw_ptr];
        scalar_t val = static_cast<scalar_t>(0);
        const scalar_t h_im = h_in + i * dilation_h + offset_h;
        const scalar_t w_im = w_in + j * dilation_w + offset_w;
        //if (h_im >= 0 && w_im >= 0 && h_im < height && w_im < width) {
        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)
        {
          //const float map_h = i * dilation_h + offset_h;
          //const float map_w = j * dilation_w + offset_w;
          //const int cur_height = height - h_in;
          //const int cur_width = width - w_in;
          //val = dmcn_im2col_bilinear(data_im_ptr, width, cur_height, cur_width, map_h, map_w);
          val = dmcn_im2col_bilinear(data_im_ptr, width, height, width, h_im, w_im);
        }
        *data_col_ptr = val * mask;
        data_col_ptr += batch_size * height_col * width_col;
        //data_col_ptr += height_col * width_col;
      }
    }
  }
}
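
// The modulated (DCNv2) variant differs from deformable_im2col_gpu_kernel
// only in the final write: each bilinearly sampled value is scaled by the
// learned mask entry for its kernel tap, i.e. *data_col_ptr = val * mask.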

template <typename scalar_t>
__global__ void modulated_deformable_col2im_gpu_kernel(const int n,
                                                       const scalar_t *data_col, const scalar_t *data_offset, const scalar_t *data_mask,
                                                       const int channels, const int height, const int width,
                                                       const int kernel_h, const int kernel_w,
                                                       const int pad_h, const int pad_w,
                                                       const int stride_h, const int stride_w,
                                                       const int dilation_h, const int dilation_w,
                                                       const int channel_per_deformable_group,
                                                       const int batch_size, const int deformable_group,
                                                       const int height_col, const int width_col,
                                                       scalar_t *grad_im)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    const int j = (index / width_col / height_col / batch_size) % kernel_w;
    const int i = (index / width_col / height_col / batch_size / kernel_w) % kernel_h;
    const int c = index / width_col / height_col / batch_size / kernel_w / kernel_h;
    // the spatial location and batch index are decoded next

    const int deformable_group_index = c / channel_per_deformable_group;

    int w_out = index % width_col;
    int h_out = (index / width_col) % height_col;
    int b = (index / width_col / height_col) % batch_size;
    int w_in = w_out * stride_w - pad_w;
    int h_in = h_out * stride_h - pad_h;

    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;
    const scalar_t *data_mask_ptr = data_mask + (b * deformable_group + deformable_group_index) * kernel_h * kernel_w * height_col * width_col;
    const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
    const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;
    const int data_mask_hw_ptr = ((i * kernel_w + j) * height_col + h_out) * width_col + w_out;
    const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];
    const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];
    const scalar_t mask = data_mask_ptr[data_mask_hw_ptr];
    const scalar_t cur_inv_h_data = h_in + i * dilation_h + offset_h;
    const scalar_t cur_inv_w_data = w_in + j * dilation_w + offset_w;

    const scalar_t cur_top_grad = data_col[index] * mask;
    const int cur_h = (int)cur_inv_h_data;
    const int cur_w = (int)cur_inv_w_data;
    for (int dy = -2; dy <= 2; dy++)
    {
      for (int dx = -2; dx <= 2; dx++)
      {
        if (cur_h + dy >= 0 && cur_h + dy < height &&
            cur_w + dx >= 0 && cur_w + dx < width &&
            abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
            abs(cur_inv_w_data - (cur_w + dx)) < 1)
        {
          int cur_bottom_grad_pos = ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
          scalar_t weight = dmcn_get_gradient_weight(cur_inv_h_data, cur_inv_w_data, cur_h + dy, cur_w + dx, height, width);
          atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);
        }
      }
    }
  }
}

template <typename scalar_t>
__global__ void modulated_deformable_col2im_coord_gpu_kernel(const int n,
                                                             const scalar_t *data_col, const scalar_t *data_im,
                                                             const scalar_t *data_offset, const scalar_t *data_mask,
                                                             const int channels, const int height, const int width,
                                                             const int kernel_h, const int kernel_w,
                                                             const int pad_h, const int pad_w,
                                                             const int stride_h, const int stride_w,
                                                             const int dilation_h, const int dilation_w,
                                                             const int channel_per_deformable_group,
                                                             const int batch_size, const int offset_channels, const int deformable_group,
                                                             const int height_col, const int width_col,
                                                             scalar_t *grad_offset, scalar_t *grad_mask)
{
  CUDA_KERNEL_LOOP(index, n)
  {
    scalar_t val = 0, mval = 0;
    int w = index % width_col;
 
│   │   └── test/
│   │       ├── habitat_all_sensors_test.yaml
│   │       ├── habitat_mp3d_eqa_test.yaml
│   │       ├── habitat_mp3d_object_nav_test.yaml
│   │       ├── habitat_r2r_vln_test.yaml
│   │       └── new_keys_test.yaml
│   ├── docs/
│   │   ├── .gitignore
│   │   ├── build-public.sh
│   │   ├── build.sh
│   │   ├── conf-public.py
│   │   ├── conf.py
│   │   ├── docs.rst
│   │   └── pages/
│   │       ├── habitat-lab-demo.rst
│   │       ├── habitat-sim-demo.rst
│   │       ├── index.rst
│   │       ├── quickstart.rst
│   │       └── view-transform-warp.rst
│   ├── examples/
│   │   ├── __init__.py
│   │   ├── benchmark.py
│   │   ├── example.py
│   │   ├── example_pointnav.py
│   │   ├── interactive_play.py
│   │   ├── new_actions.py
│   │   ├── register_new_sensors_and_measures.py
│   │   ├── shortest_path_follower_example.py
│   │   ├── tutorials/
│   │   │   ├── colabs/
│   │   │   │   └── Habitat_Lab.ipynb
│   │   │   └── nb_python/
│   │   │       └── Habitat_Lab.py
│   │   ├── visualization_examples.py
│   │   ├── vln_benchmark.py
│   │   └── vln_reference_path_follower_example.py
│   ├── habitat/
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   └── default.py
│   │   ├── core/
│   │   │   ├── __init__.py
│   │   │   ├── agent.py
│   │   │   ├── benchmark.py
│   │   │   ├── challenge.py
│   │   │   ├── dataset.py
│   │   │   ├── embodied_task.py
│   │   │   ├── env.py
│   │   │   ├── environments.py
│   │   │   ├── logging.py
│   │   │   ├── registry.py
│   │   │   ├── simulator.py
│   │   │   ├── spaces.py
│   │   │   ├── utils.py
│   │   │   └── vector_env.py
│   │   ├── datasets/
│   │   │   ├── __init__.py
│   │   │   ├── eqa/
│   │   │   │   ├── __init__.py
│   │   │   │   └── mp3d_eqa_dataset.py
│   │   │   ├── object_nav/
│   │   │   │   ├── __init__.py
│   │   │   │   └── object_nav_dataset.py
│   │   │   ├── pointnav/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── pointnav_dataset.py
│   │   │   │   └── pointnav_generator.py
│   │   │   ├── rearrange/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── configs/
│   │   │   │   │   ├── pick/
│   │   │   │   │   │   └── counter.yaml
│   │   │   │   │   └── test_config.yaml
│   │   │   │   ├── generate_episode_inits.py
│   │   │   │   ├── rearrange_dataset.py
│   │   │   │   ├── rearrange_generator.py
│   │   │   │   ├── receptacle.py
│   │   │   │   └── samplers.py
│   │   │   ├── registration.py
│   │   │   ├── utils.py
│   │   │   └── vln/
│   │   │       ├── __init__.py
│   │   │       └── r2r_vln_dataset.py
│   │   ├── py.typed
│   │   ├── sims/
│   │   │   ├── __init__.py
│   │   │   ├── habitat_simulator/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── actions.py
│   │   │   │   ├── debug_visualizer.py
│   │   │   │   ├── habitat_simulator.py
│   │   │   │   └── sim_utilities.py
│   │   │   ├── pyrobot/
│   │   │   │   ├── __init__.py
│   │   │   │   └── pyrobot.py
│   │   │   └── registration.py
│   │   ├── tasks/
│   │   │   ├── __init__.py
│   │   │   ├── eqa/
│   │   │   │   ├── __init__.py
│   │   │   │   └── eqa.py
│   │   │   ├── nav/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── nav.py
│   │   │   │   ├── object_nav_task.py
│   │   │   │   └── shortest_path_follower.py
│   │   │   ├── rearrange/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── actions.py
│   │   │   │   ├── grip_actions.py
│   │   │   │   ├── marker_info.py
│   │   │   │   ├── policy_modules.py
│   │   │   │   ├── rearrange_grasp_manager.py
│   │   │   │   ├── rearrange_sensors.py
│   │   │   │   ├── rearrange_sim.py
│   │   │   │   ├── rearrange_task.py
│   │   │   │   ├── sub_tasks/
│   │   │   │   │   ├── pick_sensors.py
│   │   │   │   │   └── pick_task.py
│   │   │   │   └── utils.py
│   │   │   ├── registration.py
│   │   │   ├── utils.py
│   │   │   └── vln/
│   │   │       ├── __init__.py
│   │   │       └── vln.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── geometry_utils.py
│   │   │   ├── pickle5_multiprocessing.py
│   │   │   ├── profiling_wrapper.py
│   │   │   ├── test_utils.py
│   │   │   └── visualizations/
│   │   │       ├── __init__.py
│   │   │       ├── fog_of_war.py
│   │   │       ├── maps.py
│   │   │       └── utils.py
│   │   └── version.py
│   ├── habitat_baselines/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── agents/
│   │   │   ├── __init__.py
│   │   │   ├── benchmark_gym.py
│   │   │   ├── mp_agents.py
│   │   │   ├── ppo_agents.py
│   │   │   ├── simple_agents.py
│   │   │   └── slam_agents.py
│   │   ├── common/
│   │   │   ├── __init__.py
│   │   │   ├── base_il_trainer.py
│   │   │   ├── base_trainer.py
│   │   │   ├── baseline_registry.py
│   │   │   ├── environments.py
│   │   │   ├── obs_transformers.py
│   │   │   ├── rollout_storage.py
│   │   │   ├── tensor_dict.py
│   │   │   └── tensorboard_utils.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   ├── default.py
│   │   │   ├── eqa/
│   │   │   │   ├── il_eqa_cnn_pretrain.yaml
│   │   │   │   ├── il_pacman_nav.yaml
│   │   │   │   └── il_vqa.yaml
│   │   │   ├── imagenav/
│   │   │   │   ├── ddppo_imagenav_example.yaml
│   │   │   │   ├── ddppo_imagenav_gibson.yaml
│   │   │   │   └── ppo_imagenav_example.yaml
│   │   │   ├── objectnav/
│   │   │   │   ├── ddppo_objectnav.yaml
│   │   │   │   └── ddppo_objectnav_hm3d.yaml
│   │   │   ├── pointnav/
│   │   │   │   ├── ddppo_pointnav.yaml
│   │   │   │   ├── ppo_pointnav.yaml
│   │   │   │   ├── ppo_pointnav_example.yaml
│   │   │   │   └── ppo_pointnav_habitat_iccv19.yaml
│   │   │   ├── rearrange/
│   │   │   │   ├── rl_pick.yaml
│   │   │   │   ├── rl_pick_state.yaml
│   │   │   │   └── spap_pick.yaml
│   │   │   └── test/
│   │   │       ├── ddppo_imagenav_test.yaml
│   │   │       ├── ddppo_pointnav_test.yaml
│   │   │       ├── ppo_imagenav_test.yaml
│   │   │       └── ppo_pointnav_test.yaml
│   │   ├── il/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── metrics.py
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   └── models.py
│   │   │   ├── requirements.txt
│   │   │   └── trainers/
│   │   │       ├── __init__.py
│   │   │       ├── eqa_cnn_pretrain_trainer.py
│   │   │       ├── pacman_trainer.py
│   │   │       └── vqa_trainer.py
│   │   ├── motion_planning/
│   │   │   ├── __init__.py
│   │   │   ├── grasp_generator.py
│   │   │   ├── motion_plan.py
│   │   │   ├── mp_sim.py
│   │   │   ├── mp_spaces.py
│   │   │   └── robot_target.py
│   │   ├── py.typed
│   │   ├── rl/
│   │   │   ├── __init__.py
│   │   │   ├── ddppo/
│   │   │   │   ├── README.md
│   │   │   │   ├── __init__.py
│   │   │   │   ├── algo/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   └── ddppo.py
│   │   │   │   ├── data_generation/
│   │   │   │   │   ├── create_gibson_large_dataset.py
│   │   │   │   │   └── gibson_dset_with_qual.json
│   │   │   │   ├── ddp_utils.py
│   │   │   │   ├── multi_node_slurm.sh
│   │   │   │   ├── policy/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── resnet.py
│   │   │   │   │   ├── resnet_policy.py
│   │   │   │   │   └── running_mean_and_var.py
│   │   │   │   ├── requirements.txt
│   │   │   │   └── single_node.sh
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── rnn_state_encoder.py
│   │   │   │   └── simple_cnn.py
│   │   │   ├── ppo/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── policy.py
│   │   │   │   ├── ppo.py
│   │   │   │   └── ppo_trainer.py
│   │   │   └── requirements.txt
│   │   ├── run.py
│   │   ├── slambased/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── install_deps.sh
│   │   │   ├── mappers.py
│   │   │   ├── monodepth.py
│   │   │   ├── path_planners.py
│   │   │   ├── reprojection.py
│   │   │   ├── requirements.txt
│   │   │   └── utils.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── common.py
│   │       ├── env_utils.py
│   │       ├── gym_adapter.py
│   │       ├── gym_definitions.py
│   │       ├── render_wrapper.py
│   │       └── visualizations/
│   │           ├── __init__.py
│   │           └── utils.py
│   ├── mypy.ini
│   ├── pyproject.toml
│   ├── requirements.txt
│   ├── scripts/
│   │   └── generate_profile_shell_scripts.py
│   ├── setup.cfg
│   ├── setup.py
│   └── test/
│       ├── test_baseline_agents.py
│       ├── test_baseline_trainers.py
│       ├── test_config.py
│       ├── test_dataset.py
│       ├── test_ddppo_reduce.py
│       ├── test_demo_notebook.py
│       ├── test_examples.py
│       ├── test_gym_wrapper.py
│       ├── test_habitat_env.py
│       ├── test_habitat_example.py
│       ├── test_habitat_sim.py
│       ├── test_habitat_task.py
│       ├── test_install.py
│       ├── test_motion_plan.py
│       ├── test_mp3d_eqa.py
│       ├── test_object_nav_task.py
│       ├── test_pointnav_dataset.py
│       ├── test_pyrobot.py
│       ├── test_r2r_vln.py
│       ├── test_rearrange_task.py
│       ├── test_rnn_state_encoder.py
│       ├── test_sensors.py
│       ├── test_spaces.py
│       ├── test_tensor_dict.py
│       └── test_visual_utils.py
├── requirements.txt
├── scenegraph.py
├── segment_anything/
│   ├── .flake8
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── LICENSE
│   ├── README.md
│   ├── linter.sh
│   ├── notebooks/
│   │   ├── automatic_mask_generator_example.ipynb
│   │   ├── onnx_model_example.ipynb
│   │   └── predictor_example.ipynb
│   ├── scripts/
│   │   ├── amg.py
│   │   └── export_onnx_model.py
│   ├── segment_anything/
│   │   ├── __init__.py
│   │   ├── automatic_mask_generator.py
│   │   ├── build_sam.py
│   │   ├── build_sam_hq.py
│   │   ├── modeling/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── image_encoder.py
│   │   │   ├── mask_decoder.py
│   │   │   ├── mask_decoder_hq.py
│   │   │   ├── prompt_encoder.py
│   │   │   ├── sam.py
│   │   │   └── transformer.py
│   │   ├── predictor.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── amg.py
│   │       ├── onnx.py
│   │       └── transforms.py
│   ├── setup.cfg
│   └── setup.py
├── tools/
│   ├── agent.py
│   ├── download_mp.py
│   ├── matterport_category_mappings.tsv
│   ├── obj.npy
│   ├── replica.yaml
│   └── room.npy
└── utils/
    ├── __init__.py
    ├── image_process.py
    ├── utils_fmm/
    │   ├── __init__.py
    │   ├── control_helper.py
    │   ├── depth_utils.py
    │   ├── fmm_planner.py
    │   ├── mapping.py
    │   ├── model.py
    │   ├── pose_utils.py
    │   └── rotation_utils.py
    ├── utils_glip.py
    └── utils_scenegraph/
        ├── __init__.py
        ├── grounded_sam_demo.py
        ├── iou.py
        ├── mapping.py
        ├── slam_classes.py
        └── utils.py
SYMBOL INDEX (4564 symbols across 391 files)

FILE: GLIP/maskrcnn_benchmark/config/paths_catalog.py
  function try_to_find (line 7) | def try_to_find(file, return_dir=False, search_path=['./DATASET', './OUT...
  class DatasetCatalog (line 30) | class DatasetCatalog(object):
    method set (line 210) | def set(name, info):
    method get (line 214) | def get(name):
  class ModelCatalog (line 392) | class ModelCatalog(object):
    method get (line 416) | def get(name):
    method get_c2_imagenet_pretrained (line 424) | def get_c2_imagenet_pretrained(name):
    method get_c2_detectron_12_2017_baselines (line 432) | def get_c2_detectron_12_2017_baselines(name):
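
  A minimal usage sketch, not from the repo: it assumes GLIP/ is on sys.path so maskrcnn_benchmark imports, and uses the (name, info) signatures shown above; the entry schema here is illustrative, not the catalog's exact format.

    from maskrcnn_benchmark.config.paths_catalog import DatasetCatalog

    # Register a custom dataset entry, then look it up by name.
    DatasetCatalog.set("my_coco_val", {
        "factory": "COCODataset",
        "args": {"ann_file": "annotations/val.json", "root": "images/"},
    })
    print(DatasetCatalog.get("my_coco_val"))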

FILE: GLIP/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.cpp
  type PreCalc (line 6) | struct PreCalc {
  function pre_calc_for_bilinear_interpolate (line 18) | void pre_calc_for_bilinear_interpolate(
  function ROIAlignForward_cpu_kernel (line 114) | void ROIAlignForward_cpu_kernel(
  function ROIAlign_forward_cpu (line 221) | at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,

FILE: GLIP/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp
  function nms_cpu_kernel (line 6) | at::Tensor nms_cpu_kernel(const at::Tensor& dets,
  function nms_cpu (line 67) | at::Tensor nms_cpu(const at::Tensor& dets,

FILE: GLIP/maskrcnn_benchmark/csrc/cpu/soft_nms.cpp
  function soft_nms_cpu_kernel (line 6) | std::pair<at::Tensor, at::Tensor> soft_nms_cpu_kernel(const at::Tensor& ...
  function soft_nms_cpu (line 108) | std::pair<at::Tensor, at::Tensor> soft_nms_cpu(const at::Tensor& dets,

FILE: GLIP/maskrcnn_benchmark/csrc/deform_conv.h
  function deform_conv_forward (line 11) | int deform_conv_forward(
  function deform_conv_backward_input (line 45) | int deform_conv_backward_input(
  function deform_conv_backward_parameters (line 80) | int deform_conv_backward_parameters(
  function modulated_deform_conv_forward (line 115) | void modulated_deform_conv_forward(
  function modulated_deform_conv_backward (line 152) | void modulated_deform_conv_backward(

FILE: GLIP/maskrcnn_benchmark/csrc/deform_pool.h
  function deform_psroi_pooling_forward (line 11) | void deform_psroi_pooling_forward(
  function deform_psroi_pooling_backward (line 41) | void deform_psroi_pooling_backward(

FILE: GLIP/maskrcnn_benchmark/csrc/ml_nms.h
  function ml_nms (line 13) | float threshold) {  (indexer picked up the closing parameter line of the ml_nms declaration)

FILE: GLIP/maskrcnn_benchmark/csrc/vision.cpp
  function PYBIND11_MODULE (line 10) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: GLIP/maskrcnn_benchmark/data/build.py
  function build_dataset (line 21) | def build_dataset(cfg, dataset_list, transforms, dataset_catalog, is_tra...
  function build_dataset_by_group (line 124) | def build_dataset_by_group(dataset_list, transforms, dataset_catalog, is...
  function make_data_sampler (line 196) | def make_data_sampler(dataset, shuffle, distributed, num_replicas=None, ...
  function _quantize (line 207) | def _quantize(x, bins):
  function _compute_aspect_ratios (line 214) | def _compute_aspect_ratios(dataset):
  function make_batch_data_sampler (line 223) | def make_batch_data_sampler(
  function make_data_loader (line 244) | def make_data_loader(cfg, is_train=True, is_distributed=False, num_repli...
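
  Illustrative sketch (assumptions: the yacs cfg object exposed by maskrcnn_benchmark.config, and that the config alone is enough to build loaders; the config path does exist in this repo). Dataset names in the config are resolved through DatasetCatalog above.

    from maskrcnn_benchmark.config import cfg
    from maskrcnn_benchmark.data.build import make_data_loader

    cfg.merge_from_file("GLIP/configs/pretrain/glip_Swin_T_O365_GoldG.yaml")
    data_loader = make_data_loader(cfg, is_train=True, is_distributed=False)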

FILE: GLIP/maskrcnn_benchmark/data/collate_batch.py
  class BatchCollator (line 6) | class BatchCollator(object):
    method __init__ (line 13) | def __init__(self, size_divisible=0):
    method __call__ (line 16) | def __call__(self, batch):
  class BBoxAugCollator (line 70) | class BBoxAugCollator(object):
    method __call__ (line 77) | def __call__(self, batch):
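
  Usage sketch, not from the repo: BatchCollator pads each batch's images to a common size, rounded up to a multiple of size_divisible as FPN-style backbones expect; my_dataset is a hypothetical detection dataset like those indexed below.

    from torch.utils.data import DataLoader
    from maskrcnn_benchmark.data.collate_batch import BatchCollator

    loader = DataLoader(my_dataset, batch_size=4,
                        collate_fn=BatchCollator(size_divisible=32))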

FILE: GLIP/maskrcnn_benchmark/data/datasets/background.py
  class Background (line 11) | class Background(data.Dataset):
    method __init__ (line 21) | def __init__(self, ann_file, root, remove_images_without_annotations=N...
    method __getitem__ (line 28) | def __getitem__(self, index):
    method __len__ (line 48) | def __len__(self):
    method get_img_info (line 51) | def get_img_info(self, index):

FILE: GLIP/maskrcnn_benchmark/data/datasets/box_label_loader.py
  class LabelLoader (line 12) | class LabelLoader(object):
    method __init__ (line 13) | def __init__(self, labelmap, extra_fields=(), filter_duplicate_relatio...
    method __call__ (line 24) | def __call__(self, annotations, img_size, remove_empty=False, load_fie...
    method get_box_mask (line 60) | def get_box_mask(self, rect, img_size):
    method add_masks (line 72) | def add_masks(self, annotations, img_size):
    method add_classes (line 86) | def add_classes(self, annotations):
    method add_confidences (line 93) | def add_confidences(self, annotations):
    method add_attributes (line 102) | def add_attributes(self, annotations):
    method add_features (line 110) | def add_features(self, annotations):
    method add_scores_all (line 116) | def add_scores_all(self, annotations):
    method add_boxes_all (line 122) | def add_boxes_all(self, annotations):
    method relation_loader (line 128) | def relation_loader(self, relation_annos, target):
  class BoxLabelLoader (line 154) | class BoxLabelLoader(object):
    method __init__ (line 155) | def __init__(self, labelmap, extra_fields=(), ignore_attrs=(),
    method __call__ (line 165) | def __call__(self, annotations, img_size, remove_empty=True):
    method add_classes_with_ignore (line 197) | def add_classes_with_ignore(self, annotations):
    method add_masks (line 209) | def add_masks(self, annotations, img_size):
    method get_box_mask (line 223) | def get_box_mask(self, rect, img_size):
    method add_confidences (line 235) | def add_confidences(self, annotations):
    method add_attributes (line 246) | def add_attributes(self, annotations):

FILE: GLIP/maskrcnn_benchmark/data/datasets/caption.py
  class CaptionTSV (line 14) | class CaptionTSV(TSVYamlDataset):
    method __init__ (line 15) | def __init__(self,
    method __len__ (line 66) | def __len__(self):
    method pack_caption (line 69) | def pack_caption(self, positive_caption, negative_captions, original_t...
    method __get_negative_captions__ (line 108) | def __get_negative_captions__(self, idx, negative_size=7):
    method __getitem__ (line 117) | def __getitem__(self, idx):
    method convert_anno_from_v2_to_v1 (line 252) | def convert_anno_from_v2_to_v1(self, anno):
    method get_raw_image (line 270) | def get_raw_image(self, idx):
    method get_img_id (line 274) | def get_img_id(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/coco.py
  function _count_visible_keypoints (line 20) | def _count_visible_keypoints(anno):
  function _has_only_empty_bbox (line 24) | def _has_only_empty_bbox(anno):
  function has_valid_annotation (line 28) | def has_valid_annotation(anno):
  function pil_loader (line 46) | def pil_loader(path, retry=5):
  function rgb2id (line 58) | def rgb2id(color):
  class CocoDetection (line 66) | class CocoDetection(data.Dataset):
    method __init__ (line 78) | def __init__(self, root, annFile, transform=None, target_transform=None):
    method __getitem__ (line 86) | def __getitem__(self, index, return_meta=False):
    method __len__ (line 116) | def __len__(self):
    method __repr__ (line 119) | def __repr__(self):
  class COCODataset (line 130) | class COCODataset(CocoDetection):
    method __init__ (line 131) | def __init__(self, ann_file, root, remove_images_without_annotations, ...
    method categories (line 190) | def categories(self, no_background=True):
    method __getitem__ (line 198) | def __getitem__(self, idx):
    method get_img_info (line 265) | def get_img_info(self, index):
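
  Usage sketch, not from the repo: the paths are hypothetical, and the (image, target, index) return convention is the one used by upstream maskrcnn-benchmark, which this file extends.

    from maskrcnn_benchmark.data.datasets.coco import COCODataset

    ds = COCODataset(ann_file="annotations/instances_val2017.json",
                     root="val2017/",
                     remove_images_without_annotations=True)
    img, target, idx = ds[0]   # target: a BoxList of ground-truth boxes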

FILE: GLIP/maskrcnn_benchmark/data/datasets/coco_dt.py
  class CocoDetectionTSV (line 19) | class CocoDetectionTSV(ODTSVDataset):
    method __init__ (line 20) | def __init__(self,
    method __len__ (line 75) | def __len__(self):
    method categories (line 78) | def categories(self, no_background=True):
    method __getitem__ (line 87) | def __getitem__(self, idx):
    method get_raw_image (line 142) | def get_raw_image(self, idx):
    method get_img_id (line 146) | def get_img_id(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/concat_dataset.py
  class ConcatDataset (line 7) | class ConcatDataset(_ConcatDataset):
    method get_idxs (line 13) | def get_idxs(self, idx):
    method get_img_info (line 21) | def get_img_info(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/custom_distributed_sampler.py
  class DistributedSamplerChunkByNode (line 12) | class DistributedSamplerChunkByNode(torch.utils.data.Sampler):
    method __init__ (line 14) | def __init__(self,
    method __iter__ (line 98) | def __iter__(self):
    method generate_indices_within_range_with_rank (line 130) | def generate_indices_within_range_with_rank(self, seed, epoch, process...
    method __len__ (line 173) | def __len__(self) -> int:
    method set_epoch (line 176) | def set_epoch(self, epoch: int) -> None:

FILE: GLIP/maskrcnn_benchmark/data/datasets/duplicate_dataset.py
  function create_duplicate_dataset (line 11) | def create_duplicate_dataset(DatasetBaseClass):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/__init__.py
  function evaluate (line 10) | def evaluate(dataset, predictions, output_folder, **kwargs):
  function evaluate_mdetr (line 39) | def evaluate_mdetr(dataset, predictions, output_folder, cfg):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/box_aug.py
  function im_detect_bbox_aug (line 12) | def im_detect_bbox_aug(model, images, device, captions=None, positive_ma...
  function im_detect_bbox (line 65) | def im_detect_bbox(model, images, target_scale, target_max_size, device,
  function im_detect_bbox_hflip (line 94) | def im_detect_bbox_hflip(model, images, target_scale, target_max_size, d...
  function im_detect_bbox_scale (line 129) | def im_detect_bbox_scale(model, images, target_scale, target_max_size, d...
  function remove_boxes (line 150) | def remove_boxes(boxlist_ts, min_scale, max_scale):
  function merge_result_from_multi_scales (line 166) | def merge_result_from_multi_scales(boxlists):
  function boxlist_nms (line 209) | def boxlist_nms(boxlist, thresh, max_proposals=-1, score_field="scores",...
  function bbox_vote (line 241) | def bbox_vote(boxes, scores, vote_thresh):
  function soft_bbox_vote (line 290) | def soft_bbox_vote(boxes, scores, vote_thresh):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/coco/__init__.py
  function coco_evaluation (line 4) | def coco_evaluation(

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/coco/coco_eval.py
  function do_coco_evaluation (line 16) | def do_coco_evaluation(
  function prepare_for_tsv_detection (line 86) | def prepare_for_tsv_detection(predictions, dataset):
  function prepare_for_coco_detection (line 129) | def prepare_for_coco_detection(predictions, dataset):
  function prepare_for_coco_segmentation (line 160) | def prepare_for_coco_segmentation(predictions, dataset):
  function prepare_for_coco_keypoint (line 214) | def prepare_for_coco_keypoint(predictions, dataset):
  function evaluate_box_proposals (line 246) | def evaluate_box_proposals(
  function evaluate_predictions_on_coco (line 375) | def evaluate_predictions_on_coco(
  function summarize_per_category (line 400) | def summarize_per_category(coco_eval, csv_output=None):
  function filter_valid_keypoints (line 459) | def filter_valid_keypoints(coco_gt, coco_dt):
  class COCOResults (line 467) | class COCOResults(object):
    method __init__ (line 484) | def __init__(self, *iou_types):
    method update (line 494) | def update(self, coco_eval):
    method __repr__ (line 507) | def __repr__(self):
  function check_expected_results (line 512) | def check_expected_results(results, expected_results, sigma_tol):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/flickr/flickr_eval.py
  function get_sentence_data (line 21) | def get_sentence_data(filename) -> List[Dict[str, Any]]:
  function get_annotations (line 89) | def get_annotations(filename) -> Dict[str, Union[int, List[str], Dict[st...
  function box_area (line 152) | def box_area(boxes: np.array) -> np.array:
  function _box_inter_union (line 171) | def _box_inter_union(boxes1: np.array, boxes2: np.array) -> Tuple[np.arr...
  function box_iou (line 186) | def box_iou(boxes1: np.array, boxes2: np.array) -> np.array:
  function _merge_boxes (line 207) | def _merge_boxes(boxes: List[List[int]]) -> List[List[int]]:
  class RecallTracker (line 220) | class RecallTracker:
    method __init__ (line 223) | def __init__(self, topk: Sequence[int]):
    method add_positive (line 232) | def add_positive(self, k: int, category: str):
    method add_negative (line 239) | def add_negative(self, k: int, category: str):
    method report (line 245) | def report(self) -> Dict[int, Dict[str, float]]:
  class Flickr30kEntitiesRecallEvaluator (line 258) | class Flickr30kEntitiesRecallEvaluator:
    method __init__ (line 259) | def __init__(
    method evaluate (line 323) | def evaluate(self, predictions: List[Dict]):
  class FlickrEvaluator (line 393) | class FlickrEvaluator(object):
    method __init__ (line 394) | def __init__(
    method accumulate (line 410) | def accumulate(self):
    method update (line 413) | def update(self, predictions):
    method synchronize_between_processes (line 416) | def synchronize_between_processes(self):
    method summarize (line 420) | def summarize(self):
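
  Usage sketch, not from the repo: RecallTracker accumulates hits and misses per (k, category) pair, and report() returns the resulting recall ratios.

    from maskrcnn_benchmark.data.datasets.evaluation.flickr.flickr_eval import RecallTracker

    tracker = RecallTracker(topk=[1, 5])
    tracker.add_positive(1, "animals")   # a top-1 hit for this category
    tracker.add_negative(5, "animals")   # a top-5 miss
    print(tracker.report())              # recall per k and category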

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/lvis/lvis.py
  function _isArrayLike (line 14) | def _isArrayLike(obj):
  class LVIS (line 18) | class LVIS:
    method __init__ (line 19) | def __init__(self, annotation_path=None):
    method _load_json (line 41) | def _load_json(self, path):
    method _create_index (line 45) | def _create_index(self):
    method get_ann_ids (line 70) | def get_ann_ids(self, img_ids=None, cat_ids=None, area_rng=None):
    method get_cat_ids (line 106) | def get_cat_ids(self):
    method get_img_ids (line 113) | def get_img_ids(self):
    method _load_helper (line 120) | def _load_helper(self, _dict, ids):
    method load_anns (line 128) | def load_anns(self, ids=None):
    method load_cats (line 137) | def load_cats(self, ids):
    method load_imgs (line 147) | def load_imgs(self, ids):
    method download (line 156) | def download(self, save_dir, img_ids=None):
    method ann_to_rle (line 174) | def ann_to_rle(self, ann):
    method ann_to_mask (line 197) | def ann_to_mask(self, ann):
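
  Usage sketch, not from the repo: the annotation path is hypothetical; the accessor signatures are the ones shown above.

    from maskrcnn_benchmark.data.datasets.evaluation.lvis.lvis import LVIS

    lvis = LVIS("lvis_v1_val.json")
    img_ids = lvis.get_img_ids()
    anns = lvis.load_anns(lvis.get_ann_ids(img_ids=img_ids[:1]))
    print(len(img_ids), len(anns))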

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/lvis/lvis_eval.py
  function merge (line 21) | def merge(img_ids, eval_imgs):
  class Params (line 51) | class Params:
    method __init__ (line 52) | def __init__(self, iou_type):
  class LVISResults (line 78) | class LVISResults(LVIS):
    method __init__ (line 79) | def __init__(self, lvis_gt, results, max_dets=300):
    method limit_dets_per_image (line 136) | def limit_dets_per_image(self, anns, max_dets):
    method get_top_results (line 149) | def get_top_results(self, img_id, score_thrs):
  class LVISEval (line 155) | class LVISEval:
    method __init__ (line 156) | def __init__(self, lvis_gt, lvis_dt=None, iou_type="segm"):
    method _to_mask (line 195) | def _to_mask(self, anns, lvis):
    method _prepare (line 200) | def _prepare(self):
    method _prepare_freq_group (line 244) | def _prepare_freq_group(self):
    method evaluate (line 252) | def evaluate(self):
    method _get_gt_dt (line 279) | def _get_gt_dt(self, img_id, cat_id):
    method compute_iou (line 292) | def compute_iou(self, img_id, cat_id):
    method evaluate_img (line 318) | def evaluate_img(self, img_id, cat_id, area_rng):
    method accumulate (line 412) | def accumulate(self):
    method _summarize (line 525) | def _summarize(self, summary_type, iou_thr=None, area_rng="all", freq_...
    method summarize (line 550) | def summarize(self):
    method run (line 587) | def run(self):
    method print_results (line 593) | def print_results(self):
    method get_results (line 625) | def get_results(self):
  class LvisEvaluator (line 636) | class LvisEvaluator(object):
    method __init__ (line 637) | def __init__(self, lvis_gt, iou_types):
    method update (line 650) | def update(self, predictions):
    method synchronize_between_processes (line 669) | def synchronize_between_processes(self):
    method accumulate (line 674) | def accumulate(self):
    method summarize (line 678) | def summarize(self):
    method prepare (line 683) | def prepare(self, predictions, iou_type):
    method prepare_for_lvis_detection (line 693) | def prepare_for_lvis_detection(self, predictions):
    method prepare_for_lvis_segmentation (line 717) | def prepare_for_lvis_segmentation(self, predictions):
  function _merge_lists (line 752) | def _merge_lists(listA, listB, maxN, key):
  class LvisEvaluatorFixedAP (line 766) | class LvisEvaluatorFixedAP(object):
    method __init__ (line 767) | def __init__(self, gt: LVIS, topk=10000, fixed_ap=True):
    method update (line 775) | def update(self, predictions):
    method synchronize_between_processes (line 796) | def synchronize_between_processes(self):
    method prepare (line 806) | def prepare(self, predictions):
    method summarize (line 830) | def summarize(self):
    method _summarize_standard (line 839) | def _summarize_standard(self):
    method _summarize_fixed (line 845) | def _summarize_fixed(self):
  class LvisDumper (line 874) | class LvisDumper(object):
    method __init__ (line 875) | def __init__(self, topk=10000, fixed_ap=True, out_path="lvis_eval"):
    method update (line 886) | def update(self, predictions):
    method synchronize_between_processes (line 907) | def synchronize_between_processes(self):
    method prepare (line 917) | def prepare(self, predictions):
    method summarize (line 941) | def summarize(self):
    method _summarize_standard (line 950) | def _summarize_standard(self):
    method _summarize_fixed (line 958) | def _summarize_fixed(self):
  function convert_to_xywh (line 984) | def convert_to_xywh(boxes):
  function create_common_lvis_eval (line 989) | def create_common_lvis_eval(lvis_eval, img_ids, eval_imgs):
  function lvis_evaluation (line 997) | def lvis_evaluation():

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/od_to_grounding/__init__.py
  function od_to_grounding_evaluation (line 4) | def od_to_grounding_evaluation(

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/od_to_grounding/od_eval.py
  function do_od_evaluation (line 16) | def do_od_evaluation(
  function prepare_for_tsv_detection (line 86) | def prepare_for_tsv_detection(predictions, dataset):
  function prepare_for_coco_detection (line 129) | def prepare_for_coco_detection(predictions, dataset):
  function prepare_for_coco_segmentation (line 160) | def prepare_for_coco_segmentation(predictions, dataset):
  function prepare_for_coco_keypoint (line 214) | def prepare_for_coco_keypoint(predictions, dataset):
  function evaluate_box_proposals (line 246) | def evaluate_box_proposals(
  function evaluate_predictions_on_coco (line 375) | def evaluate_predictions_on_coco(
  function summarize_per_category (line 400) | def summarize_per_category(coco_eval, csv_output=None):
  function filter_valid_keypoints (line 459) | def filter_valid_keypoints(coco_gt, coco_dt):
  class COCOResults (line 467) | class COCOResults(object):
    method __init__ (line 484) | def __init__(self, *iou_types):
    method update (line 494) | def update(self, coco_eval):
    method __repr__ (line 507) | def __repr__(self):
  function check_expected_results (line 512) | def check_expected_results(results, expected_results, sigma_tol):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/vg/__init__.py
  function vg_evaluation (line 6) | def vg_evaluation(dataset, predictions, output_folder, box_only, eval_at...

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/vg/vg_eval.py
  function evaluate_box_proposals (line 14) | def evaluate_box_proposals(
  class VGResults (line 130) | class VGResults(object):
    method __init__ (line 137) | def __init__(self, iou_type, value):
  function do_vg_evaluation (line 145) | def do_vg_evaluation(dataset, predictions, output_folder, box_only, eval...
  function eval_detection_voc (line 253) | def eval_detection_voc(pred_boxlists, gt_boxlists, classes, iou_thresh=0...
  function calc_detection_voc_prec_rec (line 326) | def calc_detection_voc_prec_rec(pred_boxlists, gt_boxlists, classindex, ...
  function voc_ap (line 450) | def voc_ap(rec, prec, use_07_metric=False):
  function calc_detection_voc_ap (line 484) | def calc_detection_voc_ap(prec, rec, use_07_metric=False):
  function evaluate_box_proposals_for_relation (line 544) | def evaluate_box_proposals_for_relation(

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/voc/__init__.py
  function voc_evaluation (line 6) | def voc_evaluation(dataset, predictions, output_folder, box_only, **_):

FILE: GLIP/maskrcnn_benchmark/data/datasets/evaluation/voc/voc_eval.py
  function do_voc_evaluation (line 12) | def do_voc_evaluation(dataset, predictions, output_folder, logger):
  function eval_detection_voc (line 48) | def eval_detection_voc(pred_boxlists, gt_boxlists, iou_thresh=0.5, use_0...
  function calc_detection_voc_prec_rec (line 68) | def calc_detection_voc_prec_rec(gt_boxlists, pred_boxlists, iou_thresh=0...
  function calc_detection_voc_ap (line 160) | def calc_detection_voc_ap(prec, rec, use_07_metric=False):

FILE: GLIP/maskrcnn_benchmark/data/datasets/flickr.py
  class FlickrDataset (line 7) | class FlickrDataset(ModulatedDataset):

FILE: GLIP/maskrcnn_benchmark/data/datasets/gqa.py
  class GQADataset (line 10) | class GQADataset(ModulatedDataset):
  class GQAQuestionAnswering (line 14) | class GQAQuestionAnswering(torchvision.datasets.CocoDetection):
    method __init__ (line 15) | def __init__(self, img_folder, ann_file, transforms, return_masks, ret...
    method __getitem__ (line 25) | def __getitem__(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/imagenet.py
  function pil_loader (line 8) | def pil_loader(path):
  class ImageNet (line 14) | class ImageNet(data.Dataset):
    method __init__ (line 24) | def __init__(self, ann_file, root, remove_images_without_annotations=N...
    method select_class (line 42) | def select_class(self, cls):
    method __getitem__ (line 47) | def __getitem__(self, index):
    method __len__ (line 62) | def __len__(self):

FILE: GLIP/maskrcnn_benchmark/data/datasets/list_dataset.py
  class ListDataset (line 11) | class ListDataset(object):
    method __init__ (line 12) | def __init__(self, image_lists, transforms=None):
    method __getitem__ (line 16) | def __getitem__(self, item):
    method __len__ (line 28) | def __len__(self):
    method get_img_info (line 31) | def get_img_info(self, item):

FILE: GLIP/maskrcnn_benchmark/data/datasets/lvis.py
  function _isArrayLike (line 16) | def _isArrayLike(obj):
  class LVIS (line 20) | class LVIS:
    method __init__ (line 21) | def __init__(self, annotation_path=None):
    method _load_json (line 43) | def _load_json(self, path):
    method _create_index (line 47) | def _create_index(self):
    method get_ann_ids (line 72) | def get_ann_ids(self, img_ids=None, cat_ids=None, area_rng=None):
    method get_cat_ids (line 108) | def get_cat_ids(self):
    method get_img_ids (line 115) | def get_img_ids(self):
    method _load_helper (line 122) | def _load_helper(self, _dict, ids):
    method load_anns (line 130) | def load_anns(self, ids=None):
    method load_cats (line 139) | def load_cats(self, ids):
    method load_imgs (line 149) | def load_imgs(self, ids):
    method download (line 158) | def download(self, save_dir, img_ids=None):
    method ann_to_rle (line 176) | def ann_to_rle(self, ann):
    method ann_to_mask (line 199) | def ann_to_mask(self, ann):
  class LvisDetectionBase (line 211) | class LvisDetectionBase(torchvision.datasets.VisionDataset):
    method __init__ (line 212) | def __init__(self, root, annFile, transform=None, target_transform=Non...
    method __getitem__ (line 217) | def __getitem__(self, index):
    method __len__ (line 238) | def __len__(self):
  class LvisDetection (line 242) | class LvisDetection(LvisDetectionBase):
    method __init__ (line 243) | def __init__(self, img_folder, ann_file, transforms, return_masks=Fals...
    method __getitem__ (line 249) | def __getitem__(self, idx):
    method get_raw_image (line 258) | def get_raw_image(self, idx):
    method categories (line 262) | def categories(self):

FILE: GLIP/maskrcnn_benchmark/data/datasets/mixed.py
  class CustomCocoDetection (line 15) | class CustomCocoDetection(VisionDataset):
    method __init__ (line 31) | def __init__(
    method __getitem__ (line 60) | def __getitem__(self, index):
    method __len__ (line 84) | def __len__(self):
  class MixedDataset (line 88) | class MixedDataset(CustomCocoDetection):
    method __init__ (line 91) | def __init__(self,
    method __getitem__ (line 111) | def __getitem__(self, idx):
    method get_img_info (line 142) | def get_img_info(self, index):

FILE: GLIP/maskrcnn_benchmark/data/datasets/mixup.py
  class MixupDetection (line 8) | class MixupDetection(data.Dataset):
    method __init__ (line 21) | def __init__(self, dataset, mixup=None, preproc=None, *args):
    method set_mixup (line 28) | def set_mixup(self, mixup=None, *args):
    method __len__ (line 41) | def __len__(self):
    method __getitem__ (line 45) | def __getitem__(self, idx):
    method pull_item (line 86) | def pull_item(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/modulated_coco.py
  class CocoGrounding (line 22) | class CocoGrounding(torchvision.datasets.CocoDetection):
    method __init__ (line 23) | def __init__(self,
    method categories (line 116) | def categories(self, no_background=True):
    method get_box_mask (line 125) | def get_box_mask(self, rect, img_size, mode="poly"):
    method __getitem__ (line 130) | def __getitem__(self, idx):
    method get_img_info (line 232) | def get_img_info(self, index):
  class ModulatedDataset (line 238) | class ModulatedDataset(torchvision.datasets.CocoDetection):
    method __init__ (line 239) | def __init__(self,
    method __getitem__ (line 273) | def __getitem__(self, idx):
    method get_img_info (line 327) | def get_img_info(self, index):
  class CocoDetection (line 333) | class CocoDetection(data.Dataset):
    method __init__ (line 345) | def __init__(self, root, annFile, transform=None, target_transform=None):
    method __getitem__ (line 353) | def __getitem__(self, index, return_meta=False):
    method __len__ (line 383) | def __len__(self):
    method __repr__ (line 386) | def __repr__(self):
  class ConvertCocoPolysToMask (line 397) | class ConvertCocoPolysToMask(object):
    method __init__ (line 398) | def __init__(self, return_masks=False, return_tokens=False, tokenizer=...
    method get_box_mask (line 404) | def get_box_mask(self, rect, img_size, mode="poly"):
    method __call__ (line 409) | def __call__(self, image, target, ignore_box_screen=False, box_format=...
  function create_greenlight_map (line 523) | def create_greenlight_map(tok_list, tokenized):
  function create_positive_map_for_od_labels (line 561) | def create_positive_map_for_od_labels(tokenized, label_to_positions):
  function convert_coco_poly_to_mask (line 598) | def convert_coco_poly_to_mask(segmentations, height, width):
  function create_positive_map (line 615) | def create_positive_map(tokenized, tokens_positive):
  function pil_loader (line 645) | def pil_loader(path, retry=5):

FILE: GLIP/maskrcnn_benchmark/data/datasets/object365.py
  class Object365DetectionTSV (line 7) | class Object365DetectionTSV(CocoDetectionTSV):

FILE: GLIP/maskrcnn_benchmark/data/datasets/od_to_grounding.py
  function clean_name (line 9) | def clean_name(name):
  function sanity_check_target_after_processing (line 16) | def sanity_check_target_after_processing(target):
  function convert_od_to_grounding_simple (line 20) | def convert_od_to_grounding_simple(
  function check_for_positive_overflow (line 104) | def check_for_positive_overflow(target, ind_to_class, tokenizer, max_seq...
  function convert_object_detection_to_grounding_optimized_for_od (line 149) | def convert_object_detection_to_grounding_optimized_for_od(
  function generate_control_options_given_probabilities (line 336) | def generate_control_options_given_probabilities(

FILE: GLIP/maskrcnn_benchmark/data/datasets/phrasecut.py
  class PhrasecutDetection (line 7) | class PhrasecutDetection(ModulatedDataset):

FILE: GLIP/maskrcnn_benchmark/data/datasets/pseudo_data.py
  class PseudoData (line 15) | class PseudoData(TSVYamlDataset):
    method __init__ (line 16) | def __init__(self,
    method __len__ (line 72) | def __len__(self):
    method check_for_overlap (line 76) | def check_for_overlap(range1, range2):
    method divert_boxes (line 81) | def divert_boxes(self, anno):
    method __getitem__ (line 107) | def __getitem__(self, idx):
    method convert_anno_from_yiling_to_ours (line 202) | def convert_anno_from_yiling_to_ours(self, anno):
    method get_raw_image (line 219) | def get_raw_image(self, idx):
    method get_img_id (line 223) | def get_img_id(self, idx):

FILE: GLIP/maskrcnn_benchmark/data/datasets/refexp.py
  class RefExpDataset (line 14) | class RefExpDataset(ModulatedDataset):
  class RefExpEvaluator (line 18) | class RefExpEvaluator(object):
    method __init__ (line 19) | def __init__(self, refexp_gt, iou_types, k=(1, 5, 10), thresh_iou=0.5):
    method accumulate (line 29) | def accumulate(self):
    method update (line 32) | def update(self, predictions):
    method synchronize_between_processes (line 35) | def synchronize_between_processes(self):
    method summarize (line 42) | def summarize(self):

FILE: GLIP/maskrcnn_benchmark/data/datasets/tsv.py
  function load_linelist_file (line 16) | def load_linelist_file(linelist_file):
  function img_from_base64 (line 25) | def img_from_base64(imagestring):
  function load_from_yaml_file (line 33) | def load_from_yaml_file(yaml_file):
  function find_file_path_in_yaml (line 38) | def find_file_path_in_yaml(fname, root):
  function create_lineidx (line 50) | def create_lineidx(filein, idxout):
  function read_to_character (line 62) | def read_to_character(fp, c):
  class TSVFile (line 75) | class TSVFile(object):
    method __init__ (line 76) | def __init__(self, tsv_file, generate_lineidx=False):
    method __del__ (line 88) | def __del__(self):
    method __str__ (line 92) | def __str__(self):
    method __repr__ (line 95) | def __repr__(self):
    method num_rows (line 98) | def num_rows(self):
    method seek (line 102) | def seek(self, idx):
    method seek_first_column (line 113) | def seek_first_column(self, idx):
    method get_key (line 120) | def get_key(self, idx):
    method __getitem__ (line 123) | def __getitem__(self, index):
    method __len__ (line 126) | def __len__(self):
    method _ensure_lineidx_loaded (line 129) | def _ensure_lineidx_loaded(self):
    method _ensure_tsv_opened (line 135) | def _ensure_tsv_opened(self):
  class CompositeTSVFile (line 146) | class CompositeTSVFile():
    method __init__ (line 147) | def __init__(self, file_list, seq_file, root='.'):
    method get_key (line 159) | def get_key(self, index):
    method num_rows (line 164) | def num_rows(self):
    method __getitem__ (line 167) | def __getitem__(self, index):
    method __len__ (line 171) | def __len__(self):
    method initialize (line 174) | def initialize(self):
  function load_list_file (line 190) | def load_list_file(fname):
  class TSVDataset (line 199) | class TSVDataset(object):
    method __init__ (line 200) | def __init__(self, img_file, label_file=None, hw_file=None,
    method __len__ (line 226) | def __len__(self):
    method __getitem__ (line 235) | def __getitem__(self, idx):
    method get_line_no (line 250) | def get_line_no(self, idx):
    method get_image (line 253) | def get_image(self, idx):
    method get_annotations (line 266) | def get_annotations(self, idx):
    method get_target_from_annotations (line 275) | def get_target_from_annotations(self, annotations, img_size, idx):
    method apply_transforms (line 280) | def apply_transforms(self, image, target=None):
    method get_img_info (line 285) | def get_img_info(self, idx):
    method get_img_key (line 309) | def get_img_key(self, idx):
  class TSVYamlDataset (line 326) | class TSVYamlDataset(TSVDataset):
    method __init__ (line 330) | def __init__(self, yaml_file, root=None, replace_clean_label=False):
  class ODTSVDataset (line 354) | class ODTSVDataset(TSVYamlDataset):
    method __init__ (line 359) | def __init__(self, yaml_file, extra_fields=(), transforms=None,
    method get_target_from_annotations (line 411) | def get_target_from_annotations(self, annotations, img_size, idx):
    method apply_transforms (line 417) | def apply_transforms(self, img, target=None):
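
  Usage sketch, not from the repo, with a hypothetical path: a TSVFile pairs the .tsv with a .lineidx sidecar of byte offsets so seek(idx)/__getitem__ can jump straight to a row without scanning the file.

    from maskrcnn_benchmark.data.datasets.tsv import TSVFile

    tsv = TSVFile("train.tsv", generate_lineidx=True)
    print(tsv.num_rows())
    row = tsv[0]          # the tab-separated fields of row 0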

FILE: GLIP/maskrcnn_benchmark/data/datasets/vg.py
  class VGDetectionTSV (line 14) | class VGDetectionTSV(CocoDetectionTSV):
  function sort_key_by_val (line 18) | def sort_key_by_val(dic):
  function bbox_overlaps (line 23) | def bbox_overlaps(anchors, gt_boxes):
  function _box_filter (line 59) | def _box_filter(boxes, must_overlap=False):
  class VGTSVDataset (line 78) | class VGTSVDataset(TSVYamlDataset):
    method __init__ (line 83) | def __init__(self, yaml_file, extra_fields=None, transforms=None,
    method _get_freq_prior (line 167) | def _get_freq_prior(self, must_overlap=False):
    method relation_loader (line 204) | def relation_loader(self, relation_triplets, target):
    method get_target_from_annotations (line 228) | def get_target_from_annotations(self, annotations, img_size, idx):
    method get_groundtruth (line 241) | def get_groundtruth(self, idx, call=False):
    method apply_transforms (line 255) | def apply_transforms(self, img, target=None):
    method map_class_id_to_class_name (line 260) | def map_class_id_to_class_name(self, class_id):
    method map_attribute_id_to_attribute_name (line 263) | def map_attribute_id_to_attribute_name(self, attribute_id):
    method map_relation_id_to_relation_name (line 266) | def map_relation_id_to_relation_name(self, relation_id):

FILE: GLIP/maskrcnn_benchmark/data/datasets/voc.py
  class PascalVOCDataset (line 17) | class PascalVOCDataset(torch.utils.data.Dataset):
    method __init__ (line 43) | def __init__(self, data_dir, split, use_difficult=False, transforms=No...
    method __getitem__ (line 61) | def __getitem__(self, index):
    method __len__ (line 73) | def __len__(self):
    method get_groundtruth (line 76) | def get_groundtruth(self, index):
    method _preprocess_annotation (line 87) | def _preprocess_annotation(self, target):
    method get_img_info (line 126) | def get_img_info(self, index):
    method map_class_id_to_class_name (line 133) | def map_class_id_to_class_name(self, class_id):

FILE: GLIP/maskrcnn_benchmark/data/samplers/distributed.py
  class DistributedSampler (line 12) | class DistributedSampler(Sampler):
    method __init__ (line 27) | def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True...
    method __iter__ (line 45) | def __iter__(self):
    method __len__ (line 68) | def __len__(self):
    method set_epoch (line 71) | def set_epoch(self, epoch):
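
  Usage sketch, not from the repo: my_dataset is hypothetical, and passing explicit num_replicas/rank lets the sampler be exercised without initializing torch.distributed; set_epoch reseeds the shuffle each epoch.

    from torch.utils.data import DataLoader
    from maskrcnn_benchmark.data.samplers.distributed import DistributedSampler

    sampler = DistributedSampler(my_dataset, num_replicas=2, rank=0, shuffle=True)
    loader = DataLoader(my_dataset, batch_size=8, sampler=sampler)
    for epoch in range(2):
        sampler.set_epoch(epoch)   # vary shuffling across epochs
        for batch in loader:
            pass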

FILE: GLIP/maskrcnn_benchmark/data/samplers/grouped_batch_sampler.py
  class GroupedBatchSampler (line 9) | class GroupedBatchSampler(BatchSampler):
    method __init__ (line 24) | def __init__(self, sampler, group_ids, batch_size, drop_uneven=False):
    method _prepare_batches (line 40) | def _prepare_batches(self):
    method __iter__ (line 102) | def __iter__(self):
    method __len__ (line 111) | def __len__(self):

FILE: GLIP/maskrcnn_benchmark/data/samplers/iteration_based_batch_sampler.py
  class IterationBasedBatchSampler (line 5) | class IterationBasedBatchSampler(BatchSampler):
    method __init__ (line 11) | def __init__(self, batch_sampler, num_iterations, start_iter=0):
    method __iter__ (line 16) | def __iter__(self):
    method __len__ (line 30) | def __len__(self):
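
  Usage sketch, not from the repo: the wrapper re-iterates a finite BatchSampler until a fixed number of training iterations is produced, which is how iteration-based schedules are usually driven.

    from torch.utils.data.sampler import BatchSampler, SequentialSampler
    from maskrcnn_benchmark.data.samplers.iteration_based_batch_sampler import IterationBasedBatchSampler

    base = BatchSampler(SequentialSampler(range(10)), batch_size=4, drop_last=False)
    sampler = IterationBasedBatchSampler(base, num_iterations=100, start_iter=0)
    print(len(sampler))   # the iteration budget, not the dataset size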

FILE: GLIP/maskrcnn_benchmark/data/transforms/build.py
  function build_transforms (line 5) | def build_transforms(cfg, is_train=True):

FILE: GLIP/maskrcnn_benchmark/data/transforms/transforms.py
  function matrix_iou (line 12) | def matrix_iou(a, b, relative=False):
  class RACompose (line 29) | class RACompose(object):
    method __init__ (line 30) | def __init__(self, pre_transforms, rand_transforms, post_transforms, c...
    method __call__ (line 36) | def __call__(self, image, target):
    method __repr__ (line 47) | def __repr__(self):
  class Compose (line 64) | class Compose(object):
    method __init__ (line 65) | def __init__(self, transforms):
    method __call__ (line 68) | def __call__(self, image, target=None):
    method __repr__ (line 75) | def __repr__(self):
  class Resize (line 84) | class Resize(object):
    method __init__ (line 85) | def __init__(self, min_size, max_size, restrict=False):
    method get_size (line 93) | def get_size(self, image_size):
    method __call__ (line 117) | def __call__(self, image, target):
  class RandomHorizontalFlip (line 130) | class RandomHorizontalFlip(object):
    method __init__ (line 131) | def __init__(self, prob=0.5):
    method __call__ (line 134) | def __call__(self, image, target):
  class RandomVerticalFlip (line 145) | class RandomVerticalFlip(object):
    method __init__ (line 146) | def __init__(self, prob=0.5):
    method __call__ (line 149) | def __call__(self, image, target):
  class ToTensor (line 158) | class ToTensor(object):
    method __call__ (line 159) | def __call__(self, image, target):
  class Normalize (line 163) | class Normalize(object):
    method __init__ (line 164) | def __init__(self, mean, std, format='rgb'):
    method __call__ (line 169) | def __call__(self, image, target):
  class ColorJitter (line 178) | class ColorJitter(object):
    method __init__ (line 179) | def __init__(self,
    method __call__ (line 191) | def __call__(self, image, target):
  class RandomCrop (line 196) | class RandomCrop(object):
    method __init__ (line 197) | def __init__(self, prob=0.5, min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_c...
    method __call__ (line 203) | def __call__(self, img, target):
  class RandomAffine (line 254) | class RandomAffine(object):
    method __init__ (line 255) | def __init__(self, prob=0.5, degrees=(-10, 10), translate=(.1, .1), sc...
    method __call__ (line 264) | def __call__(self, img, targets=None):
  class RandomErasing (line 332) | class RandomErasing:
    method __init__ (line 333) | def __init__(self, prob=0.5, era_l=0.02, era_h=1/3, min_aspect=0.3,
    method _get_pixels (line 346) | def _get_pixels(self, patch_size):
    method __call__ (line 354) | def __call__(self, image, target):
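
  Usage sketch, not from the repo: unlike torchvision, these transforms take and return (image, target) pairs so boxes and masks stay aligned with the image; the mean values are the usual Caffe-style pixel means and, like the inputs, are illustrative.

    from maskrcnn_benchmark.data.transforms import transforms as T

    transform = T.Compose([
        T.Resize(min_size=800, max_size=1333),
        T.RandomHorizontalFlip(prob=0.5),
        T.ToTensor(),
        T.Normalize(mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0]),
    ])
    image, target = transform(pil_image, boxlist_target)   # hypothetical inputs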

FILE: GLIP/maskrcnn_benchmark/engine/alter_trainer.py
  function reduce_loss_dict (line 13) | def reduce_loss_dict(all_loss_dict):
  function do_train (line 44) | def do_train(

FILE: GLIP/maskrcnn_benchmark/engine/evolution.py
  function gather_candidates (line 25) | def gather_candidates(all_candidates):
  function gather_stats (line 31) | def gather_stats(all_candidates):
  function compute_on_dataset (line 39) | def compute_on_dataset(model, rngs, data_loader, device=cfg.MODEL.DEVICE):
  function bn_statistic (line 54) | def bn_statistic(model, rngs, data_loader, device=cfg.MODEL.DEVICE, max_...
  function inference (line 73) | def inference(
  function fitness (line 109) | def fitness(cfg, model, rngs, val_loaders):
  class EvolutionTrainer (line 129) | class EvolutionTrainer(object):
    method __init__ (line 130) | def __init__(self, cfg, model, flops_limit=None, is_distributed=True):
    method save_checkpoint (line 155) | def save_checkpoint(self):
    method load_checkpoint (line 168) | def load_checkpoint(self):
    method legal (line 179) | def legal(self, cand):
    method update_top_k (line 195) | def update_top_k(self, candidates, *, k, key, reverse=False):
    method eval_candidates (line 203) | def eval_candidates(self, train_loader, val_loader):
    method stack_random_cand (line 221) | def stack_random_cand(self, random_func, *, batchsize=10):
    method random_can (line 227) | def random_can(self, num):
    method get_mutation (line 242) | def get_mutation(self, k, mutation_num, m_prob):
    method get_crossover (line 268) | def get_crossover(self, k, crossover_num):
    method train (line 292) | def train(self, train_loader, val_loader):
    method save_candidates (line 319) | def save_candidates(self, cand, template):

FILE: GLIP/maskrcnn_benchmark/engine/inference.py
  function inference_default (line 20) | def inference_default(
  function clean_name (line 85) | def clean_name(name):
  function create_one_hot_dict (line 92) | def create_one_hot_dict(labels, no_minus_one_for_one_hot = False):
  function create_positive_dict (line 111) | def create_positive_dict(tokenized, tokens_positive, labels):
  function chunks (line 146) | def chunks(lst, n):
  function create_queries_and_maps_from_dataset (line 159) | def create_queries_and_maps_from_dataset(dataset, cfg):
  function create_queries_and_maps (line 192) | def create_queries_and_maps(labels, label_list, additional_labels = None...
  function create_positive_map_label_to_token_from_positive_map (line 262) | def create_positive_map_label_to_token_from_positive_map(positive_map, p...
  function _accumulate_predictions_from_multiple_gpus (line 270) | def _accumulate_predictions_from_multiple_gpus(predictions_per_gpu):
  function resize_box (line 291) | def resize_box(output, targets):
  function flickr_post_process (line 299) | def flickr_post_process(output, targets, positive_map_label_to_token, pl...
  function build_flickr_evaluator (line 317) | def build_flickr_evaluator(cfg):
  function build_lvis_evaluator (line 324) | def build_lvis_evaluator(ann_file, fixed_ap=True):
  function write_lvis_results (line 331) | def write_lvis_results(results, output_file_name):
  function write_flickr_results (line 345) | def write_flickr_results(results, output_file_name):
  function inference (line 360) | def inference(

FILE: GLIP/maskrcnn_benchmark/engine/predictor.py
  class COCODemo (line 19) | class COCODemo(object):
    method __init__ (line 105) | def __init__(
    method build_transform (line 139) | def build_transform(self):
    method inference (line 169) | def inference(self, image, debug=False):
    method run_on_opencv_image (line 187) | def run_on_opencv_image(self, image):
    method compute_prediction (line 212) | def compute_prediction(self, original_image):
    method select_top_predictions (line 258) | def select_top_predictions(self, predictions):
    method compute_colors_for_labels (line 297) | def compute_colors_for_labels(self, labels):
    method overlay_boxes (line 305) | def overlay_boxes(self, image, predictions):
    method overlay_scores (line 327) | def overlay_scores(self, image, predictions):
    method overlay_cboxes (line 348) | def overlay_cboxes(self, image, predictions):
    method overlay_centers (line 370) | def overlay_centers(self, image, predictions):
    method overlay_count (line 388) | def overlay_count(self, image, predictions):
    method overlay_mask (line 405) | def overlay_mask(self, image, predictions):
    method overlay_keypoints (line 431) | def overlay_keypoints(self, image, predictions):
    method create_mask_montage (line 441) | def create_mask_montage(self, image, predictions):
    method overlay_class_names (line 477) | def overlay_class_names(self, image, predictions, names=None):
  function vis_keypoints (line 505) | def vis_keypoints(img, kps, kp_thresh=0, alpha=0.7, names=None, connecti...

FILE: GLIP/maskrcnn_benchmark/engine/predictor_glip.py
  class GLIPDemo (line 29) | class GLIPDemo(object):
    method __init__ (line 30) | def __init__(self,
    method build_transform (line 61) | def build_transform(self):
    method build_tokenizer (line 91) | def build_tokenizer(self):
    method run_ner (line 106) | def run_ner(self, caption):
    method inference (line 130) | def inference(self, original_image, original_caption, thresh=None):
    method inference_batch (line 138) | def inference_batch(self, original_image_list, original_caption, thres...
    method run_on_web_image (line 148) | def run_on_web_image(self, original_image, original_caption, thresh=0.5):
    method compute_prediction (line 162) | def compute_prediction(self, original_image, original_caption):
    method compute_prediction_batch (line 208) | def compute_prediction_batch(self, original_image_list, original_capti...
    method _post_process_fixed_thresh (line 255) | def _post_process_fixed_thresh(self, predictions):
    method _post_process (line 273) | def _post_process(self, predictions, threshold=0.5):
    method compute_colors_for_labels (line 291) | def compute_colors_for_labels(self, labels):
    method overlay_boxes (line 299) | def overlay_boxes(self, image, predictions):
    method overlay_scores (line 313) | def overlay_scores(self, image, predictions):
    method overlay_entity_names (line 326) | def overlay_entity_names(self, image, predictions, names=None):
    method overlay_mask (line 352) | def overlay_mask(self, image, predictions):
    method create_mask_montage (line 373) | def create_mask_montage(self, image, predictions):
  function create_positive_map_label_to_token_from_positive_map (line 402) | def create_positive_map_label_to_token_from_positive_map(positive_map, p...
  function create_positive_map (line 409) | def create_positive_map(tokenized, tokens_positive):
  function find_noun_phrases (line 445) | def find_noun_phrases(caption: str) -> List[str]:
  function remove_punctuation (line 462) | def remove_punctuation(text: str) -> str:

FILE: GLIP/maskrcnn_benchmark/engine/singlepath_trainer.py
  function reduce_loss_dict (line 13) | def reduce_loss_dict(loss_dict):
  function do_train (line 38) | def do_train(

FILE: GLIP/maskrcnn_benchmark/engine/stage_trainer.py
  function reduce_loss_dict (line 13) | def reduce_loss_dict(all_loss_dict):
  function do_train (line 44) | def do_train(

FILE: GLIP/maskrcnn_benchmark/engine/trainer.py
  function reduce_loss_dict (line 20) | def reduce_loss_dict(loss_dict):
  function do_train (line 45) | def do_train(
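
Note: reduce_loss_dict appears in all four trainer variants; its job is to average the per-rank loss dict across processes so logging reflects the global loss rather than one GPU's. A minimal sketch of that pattern (assumes an initialized torch.distributed process group; the original reduces to rank 0 with dist.reduce, an all_reduce is shown here for simplicity):

    import torch
    import torch.distributed as dist

    def reduce_loss_dict(loss_dict):
        world_size = dist.get_world_size() if dist.is_initialized() else 1
        if world_size < 2:
            return loss_dict
        with torch.no_grad():
            names = sorted(loss_dict.keys())   # fixed key order on every rank
            stacked = torch.stack([loss_dict[k] for k in names], dim=0)
            dist.all_reduce(stacked)           # sum across ranks
            stacked /= world_size              # -> mean
            return {k: v for k, v in zip(names, stacked)}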

FILE: GLIP/maskrcnn_benchmark/layers/batch_norm.py
  class FrozenBatchNorm2d (line 9) | class FrozenBatchNorm2d(nn.Module):
    method __init__ (line 15) | def __init__(self, n):
    method forward (line 22) | def forward(self, x):
  class AllReduce (line 30) | class AllReduce(Function):
    method forward (line 32) | def forward(ctx, input):
    method backward (line 40) | def backward(ctx, grad_output):
  class NaiveSyncBatchNorm2d (line 45) | class NaiveSyncBatchNorm2d(nn.BatchNorm2d):
    method __init__ (line 73) | def __init__(self, *args, stats_mode="", **kwargs):
    method forward (line 78) | def forward(self, input):
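
Note: FrozenBatchNorm2d follows the standard maskrcnn-benchmark pattern of storing BN statistics and affine parameters as non-trainable buffers and folding them into a single scale-and-shift at forward time. A minimal sketch of that idea (like older versions of this layer, it omits the eps term):

    import torch
    from torch import nn

    class FrozenBatchNorm2d(nn.Module):
        """BatchNorm with statistics and affine parameters frozen as buffers."""
        def __init__(self, n):
            super().__init__()
            self.register_buffer("weight", torch.ones(n))
            self.register_buffer("bias", torch.zeros(n))
            self.register_buffer("running_mean", torch.zeros(n))
            self.register_buffer("running_var", torch.ones(n))

        def forward(self, x):
            # Fold normalization into one per-channel scale and shift.
            scale = self.weight * self.running_var.rsqrt()
            bias = self.bias - self.running_mean * scale
            return x * scale.reshape(1, -1, 1, 1) + bias.reshape(1, -1, 1, 1)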

FILE: GLIP/maskrcnn_benchmark/layers/deform_conv.py
  class DeformConvFunction (line 12) | class DeformConvFunction(Function):
    method forward (line 15) | def forward(
    method backward (line 75) | def backward(ctx, grad_output):
    method _output_size (line 137) | def _output_size(input, weight, padding, dilation, stride):
  class ModulatedDeformConvFunction (line 152) | class ModulatedDeformConvFunction(Function):
    method forward (line 155) | def forward(
    method backward (line 209) | def backward(ctx, grad_output):
    method _infer_shape (line 251) | def _infer_shape(ctx, input, weight):
  class DeformConv (line 267) | class DeformConv(nn.Module):
    method __init__ (line 269) | def __init__(
    method reset_parameters (line 306) | def reset_parameters(self):
    method forward (line 314) | def forward(self, input, offset):
    method __repr__ (line 319) | def __repr__(self):
  class ModulatedDeformConv (line 333) | class ModulatedDeformConv(nn.Module):
    method __init__ (line 335) | def __init__(
    method reset_parameters (line 369) | def reset_parameters(self):
    method forward (line 379) | def forward(self, input, offset, mask):
    method __repr__ (line 384) | def __repr__(self):
  class ModulatedDeformConvPack (line 398) | class ModulatedDeformConvPack(ModulatedDeformConv):
    method __init__ (line 400) | def __init__(self,
    method init_offset (line 424) | def init_offset(self):
    method forward (line 429) | def forward(self, input):
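
Note: DeformConvFunction / ModulatedDeformConvFunction bind the custom CUDA kernels under GLIP/maskrcnn_benchmark/csrc. For reference, torchvision ships an equivalent operator; a small usage sketch (shapes are illustrative, and zero offsets reduce it to an ordinary convolution):

    import torch
    from torchvision.ops import deform_conv2d

    x = torch.randn(1, 16, 32, 32)
    weight = torch.randn(8, 16, 3, 3)            # (out_ch, in_ch, kH, kW)
    # One (dy, dx) offset pair per kernel element and output location.
    offset = torch.zeros(1, 2 * 3 * 3, 32, 32)   # zero offsets = plain conv
    out = deform_conv2d(x, offset, weight, padding=1)
    print(out.shape)  # torch.Size([1, 8, 32, 32])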

FILE: GLIP/maskrcnn_benchmark/layers/deform_pool.py
  function add_conv (line 7) | def add_conv(in_ch, out_ch, ksize, stride, leaky=True):
  class upsample (line 31) | class upsample(nn.Module):
    method __init__ (line 34) | def __init__(self, size=None, scale_factor=None, mode='nearest', align...
    method forward (line 42) | def forward(self, input):
    method extra_repr (line 45) | def extra_repr(self):
  class SPPLayer (line 53) | class SPPLayer(nn.Module):
    method __init__ (line 54) | def __init__(self):
    method forward (line 57) | def forward(self, x):
  class DropBlock (line 65) | class DropBlock(nn.Module):
    method __init__ (line 66) | def __init__(self, block_size=7, keep_prob=0.9):
    method reset (line 75) | def reset(self, block_size, keep_prob):
    method calculate_gamma (line 83) | def calculate_gamma(self, x):
    method forward (line 87) | def forward(self, x):
  class resblock (line 109) | class resblock(nn.Module):
    method __init__ (line 118) | def __init__(self, ch, nblocks=1, shortcut=True):
    method forward (line 129) | def forward(self, x):
  class RFBblock (line 138) | class RFBblock(nn.Module):
    method __init__ (line 139) | def __init__(self,in_ch,residual=False):
    method forward (line 161) | def forward(self,x):
  class FeatureAdaption (line 172) | class FeatureAdaption(nn.Module):
    method __init__ (line 173) | def __init__(self, in_ch, out_ch, n_anchors, rfb=False, sep=False):
    method forward (line 187) | def forward(self, input, wh_pred):
  class ASFFmobile (line 200) | class ASFFmobile(nn.Module):
    method __init__ (line 201) | def __init__(self, level, rfb=False, vis=False):
    method forward (line 229) | def forward(self, x_level_0, x_level_1, x_level_2):
  class ASFF (line 268) | class ASFF(nn.Module):
    method __init__ (line 269) | def __init__(self, level, rfb=False, vis=False):
    method forward (line 296) | def forward(self, x_level_0, x_level_1, x_level_2):
  function make_divisible (line 333) | def make_divisible(v, divisor, min_value=None):
  class ConvBNReLU (line 353) | class ConvBNReLU(nn.Sequential):
    method __init__ (line 354) | def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, gro...
  function add_sepconv (line 362) | def add_sepconv(in_ch, out_ch, ksize, stride):
  class InvertedResidual (line 376) | class InvertedResidual(nn.Module):
    method __init__ (line 377) | def __init__(self, inp, oup, stride, expand_ratio):
    method forward (line 398) | def forward(self, x):
  class ressepblock (line 404) | class ressepblock(nn.Module):
    method __init__ (line 405) | def __init__(self, ch, out_ch, in_ch=None, shortcut=True):
    method forward (line 416) | def forward(self, x):
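
Note: make_divisible (line 333) is the standard TF/MobileNet channel-rounding helper: round to the nearest multiple of divisor without dropping more than 10% of the original value. A sketch of that convention:

    def make_divisible(v, divisor, min_value=None):
        if min_value is None:
            min_value = divisor
        new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
        # Ensure rounding down never removes more than 10% of the value.
        if new_v < 0.9 * v:
            new_v += divisor
        return new_v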

FILE: GLIP/maskrcnn_benchmark/layers/dropblock.py
  class DropBlock2D (line 6) | class DropBlock2D(nn.Module):
    method __init__ (line 27) | def __init__(self, drop_prob, block_size):
    method forward (line 33) | def forward(self, x):
    method _compute_block_mask (line 62) | def _compute_block_mask(self, mask):
    method _compute_gamma (line 75) | def _compute_gamma(self, x):
  class DropBlock3D (line 79) | class DropBlock3D(DropBlock2D):
    method __init__ (line 100) | def __init__(self, drop_prob, block_size):
    method forward (line 103) | def forward(self, x):
    method _compute_block_mask (line 132) | def _compute_block_mask(self, mask):
    method _compute_gamma (line 145) | def _compute_gamma(self, x):
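
Note: DropBlock2D zeroes contiguous block_size x block_size squares of the feature map rather than independent units. A functional sketch of the usual formulation, using the simplified gamma = drop_prob / block_size^2 from the DropBlock paper (training-mode only; `dropblock_2d` is a hypothetical helper, not this file's API):

    import torch
    import torch.nn.functional as F

    def dropblock_2d(x, drop_prob=0.1, block_size=7):
        if drop_prob == 0.0:
            return x
        gamma = drop_prob / (block_size ** 2)
        # Sample block centers, then dilate each into a full square via max-pool.
        mask = (torch.rand(x.shape[0], *x.shape[2:], device=x.device) < gamma).float()
        block_mask = F.max_pool2d(mask[:, None, :, :], kernel_size=block_size,
                                  stride=1, padding=block_size // 2)
        block_mask = 1 - block_mask
        # Rescale so the expected activation magnitude is preserved.
        return x * block_mask * block_mask.numel() / (block_mask.sum() + 1e-6)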

FILE: GLIP/maskrcnn_benchmark/layers/dyhead.py
  class Conv3x3Norm (line 9) | class Conv3x3Norm(torch.nn.Module):
    method __init__ (line 10) | def __init__(self,
    method forward (line 28) | def forward(self, input, **kwargs):
  class DyConv (line 35) | class DyConv(nn.Module):
    method __init__ (line 36) | def __init__(self,
    method init_weights (line 72) | def init_weights(self):
    method forward (line 85) | def forward(self, x):
  class DyHead (line 122) | class DyHead(nn.Module):
    method __init__ (line 123) | def __init__(self, cfg, in_channels):
    method forward (line 149) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/layers/dyrelu.py
  function _make_divisible (line 5) | def _make_divisible(v, divisor, min_value=None):
  class swish (line 15) | class swish(nn.Module):
    method forward (line 16) | def forward(self, x):
  class h_swish (line 20) | class h_swish(nn.Module):
    method __init__ (line 21) | def __init__(self, inplace=False):
    method forward (line 25) | def forward(self, x):
  class h_sigmoid (line 29) | class h_sigmoid(nn.Module):
    method __init__ (line 30) | def __init__(self, inplace=True, h_max=1):
    method forward (line 35) | def forward(self, x):
  class DYReLU (line 39) | class DYReLU(nn.Module):
    method __init__ (line 40) | def __init__(self, inp, oup, reduction=4, lambda_a=1.0, K2=True, use_b...
    method forward (line 78) | def forward(self, x):
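
Note: h_sigmoid and h_swish are the MobileNetV3 piecewise-linear activations that DYReLU builds on. Minimal sketches (the repo's h_sigmoid additionally exposes an h_max scale, omitted here):

    import torch.nn as nn
    import torch.nn.functional as F

    class HSigmoid(nn.Module):
        """Hard sigmoid: piecewise-linear approximation of sigmoid."""
        def forward(self, x):
            return F.relu6(x + 3.0) / 6.0

    class HSwish(nn.Module):
        """Hard swish: x * hard_sigmoid(x)."""
        def forward(self, x):
            return x * F.relu6(x + 3.0) / 6.0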

FILE: GLIP/maskrcnn_benchmark/layers/evonorm.py
  class EvoNorm2d (line 5) | class EvoNorm2d(nn.Module):
    method __init__ (line 8) | def __init__(self, num_features, eps=1e-5, nonlinearity=True, group=32):
    method reset_parameters (line 23) | def reset_parameters(self):
    method group_std (line 29) | def group_std(self, x, groups=32):
    method forward (line 35) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/layers/iou_loss.py
  class IOULoss (line 5) | class IOULoss(nn.Module):
    method __init__ (line 6) | def __init__(self, loss_type="iou"):
    method forward (line 10) | def forward(self, pred, target, weight=None):
  class IOUWHLoss (line 52) | class IOUWHLoss(nn.Module):  # used for anchor guiding
    method __init__ (line 53) | def __init__(self, reduction='none'):
    method forward (line 57) | def forward(self, pred, target):
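
Note: IOULoss here operates on FCOS-style (left, top, right, bottom) distance targets rather than corner boxes. A sketch of the "iou" variant under that convention (`iou_loss_ltrb` is a hypothetical name; the class also supports linear-IoU and GIoU flavors):

    import torch

    def iou_loss_ltrb(pred, target, eps=1e-7):
        # Areas from the four per-pixel distances.
        pred_area = (pred[:, 0] + pred[:, 2]) * (pred[:, 1] + pred[:, 3])
        target_area = (target[:, 0] + target[:, 2]) * (target[:, 1] + target[:, 3])
        # Intersection width/height: the smaller reach on each side.
        w_inter = torch.min(pred[:, 0], target[:, 0]) + torch.min(pred[:, 2], target[:, 2])
        h_inter = torch.min(pred[:, 1], target[:, 1]) + torch.min(pred[:, 3], target[:, 3])
        inter = w_inter.clamp(min=0) * h_inter.clamp(min=0)
        union = pred_area + target_area - inter
        iou = (inter + eps) / (union + eps)
        return -iou.log().mean()   # loss_type == "iou"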

FILE: GLIP/maskrcnn_benchmark/layers/misc.py
  class _NewEmptyTensorOp (line 17) | class _NewEmptyTensorOp(torch.autograd.Function):
    method forward (line 19) | def forward(ctx, x, new_shape):
    method backward (line 24) | def backward(ctx, grad):
  class Conv2d (line 29) | class Conv2d(torch.nn.Conv2d):
    method forward (line 30) | def forward(self, x):
  class ConvTranspose2d (line 45) | class ConvTranspose2d(torch.nn.ConvTranspose2d):
    method forward (line 46) | def forward(self, x):
  class BatchNorm2d (line 66) | class BatchNorm2d(torch.nn.BatchNorm2d):
    method forward (line 67) | def forward(self, x):
  function interpolate (line 75) | def interpolate(
  class Scale (line 113) | class Scale(torch.nn.Module):
    method __init__ (line 114) | def __init__(self, init_value=1.0):
    method forward (line 118) | def forward(self, input):
  class DFConv2d (line 122) | class DFConv2d(torch.nn.Module):
    method __init__ (line 124) | def __init__(
    method forward (line 181) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/layers/roi_align.py
  class _ROIAlign (line 10) | class _ROIAlign(Function):
    method forward (line 12) | def forward(ctx, input, roi, output_size, spatial_scale, sampling_ratio):
    method backward (line 25) | def backward(ctx, grad_output):
  class ROIAlign (line 51) | class ROIAlign(nn.Module):
    method __init__ (line 52) | def __init__(self, output_size, spatial_scale, sampling_ratio):
    method forward (line 58) | def forward(self, input, rois):
    method __repr__ (line 63) | def __repr__(self):
  class ROIAlignV2 (line 71) | class ROIAlignV2(nn.Module):
    method __init__ (line 72) | def __init__(self, output_size, spatial_scale, sampling_ratio):
    method forward (line 78) | def forward(self, input, rois):
    method __repr__ (line 83) | def __repr__(self):
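
Note: _ROIAlign wraps the custom C++/CUDA kernel from csrc; torchvision.ops.roi_align is the modern drop-in equivalent. A usage sketch (shapes and the 1/8 spatial scale are illustrative):

    import torch
    from torchvision.ops import roi_align

    features = torch.randn(1, 256, 50, 50)              # one FPN-level feature map
    # RoIs in (batch_index, x1, y1, x2, y2) format, in input-image coordinates.
    rois = torch.tensor([[0, 10.0, 10.0, 200.0, 150.0]])
    pooled = roi_align(features, rois, output_size=(7, 7),
                       spatial_scale=1.0 / 8, sampling_ratio=2)
    print(pooled.shape)  # torch.Size([1, 256, 7, 7])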

FILE: GLIP/maskrcnn_benchmark/layers/roi_pool.py
  class _ROIPool (line 11) | class _ROIPool(Function):
    method forward (line 13) | def forward(ctx, input, roi, output_size, spatial_scale):
    method backward (line 25) | def backward(ctx, grad_output):
  class ROIPool (line 49) | class ROIPool(nn.Module):
    method __init__ (line 50) | def __init__(self, output_size, spatial_scale):
    method forward (line 55) | def forward(self, input, rois):
    method __repr__ (line 58) | def __repr__(self):

FILE: GLIP/maskrcnn_benchmark/layers/se.py
  class SELayer (line 4) | class SELayer(nn.Module):
    method __init__ (line 5) | def __init__(self, channel, reduction=16):
    method forward (line 15) | def forward(self, x):
  class SEBlock (line 22) | class SEBlock(nn.Module):
    method __init__ (line 23) | def __init__(self, channels, reduction=16,
    method forward (line 41) | def forward(self, x):
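
Note: SELayer is the standard squeeze-and-excitation block: global average pooling, a bottleneck MLP, then sigmoid channel gates. A reference sketch:

    import torch
    from torch import nn

    class SELayer(nn.Module):
        def __init__(self, channel, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channel, channel // reduction, bias=False),
                nn.ReLU(inplace=True),
                nn.Linear(channel // reduction, channel, bias=False),
                nn.Sigmoid(),
            )

        def forward(self, x):
            b, c, _, _ = x.shape
            y = x.mean(dim=(2, 3))            # squeeze: (B, C)
            y = self.fc(y).view(b, c, 1, 1)   # excitation: per-channel gates
            return x * y                      # reweight feature maps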

FILE: GLIP/maskrcnn_benchmark/layers/set_loss.py
  function box_area (line 10) | def box_area(boxes):
  function box_iou (line 15) | def box_iou(boxes1, boxes2):
  function generalized_box_iou (line 31) | def generalized_box_iou(boxes1, boxes2):
  function dice_loss (line 55) | def dice_loss(inputs, targets, num_boxes):
  function sigmoid_focal_loss (line 73) | def sigmoid_focal_loss(inputs: torch.Tensor, targets: torch.Tensor, alph...
  class HungarianMatcher (line 115) | class HungarianMatcher(nn.Module):
    method __init__ (line 123) | def __init__(self, cost_class: float = 1, cost_bbox: float = 1, cost_g...
    method forward (line 145) | def forward(self, outputs, targets):
  class SetCriterion (line 217) | class SetCriterion(nn.Module):
    method __init__ (line 223) | def __init__(self, num_classes, matcher, weight_dict, eos_coef, losses,
    method loss_labels (line 248) | def loss_labels(self, outputs, targets, indices, num_boxes, log=False):
    method loss_boxes (line 283) | def loss_boxes(self, outputs, targets, indices, num_boxes):
    method _get_src_permutation_idx (line 306) | def _get_src_permutation_idx(self, indices):
    method _get_tgt_permutation_idx (line 312) | def _get_tgt_permutation_idx(self, indices):
    method get_loss (line 318) | def get_loss(self, loss, outputs, targets, indices, num_boxes, **kwargs):
    method forward (line 327) | def forward(self, outputs, targets, *argrs, **kwargs):
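
Note: generalized_box_iou is the DETR-style pairwise GIoU used by the HungarianMatcher's cost matrix. A reference sketch for (x1, y1, x2, y2) boxes (assumes well-formed boxes with positive area):

    import torch

    def generalized_box_iou(boxes1, boxes2):
        """Pairwise GIoU; returns an [N, M] matrix."""
        area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
        area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
        lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])   # [N, M, 2]
        rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[..., 0] * wh[..., 1]
        union = area1[:, None] + area2 - inter
        iou = inter / union
        # Smallest enclosing box of each pair.
        lt_c = torch.min(boxes1[:, None, :2], boxes2[:, :2])
        rb_c = torch.max(boxes1[:, None, 2:], boxes2[:, 2:])
        wh_c = (rb_c - lt_c).clamp(min=0)
        area_c = wh_c[..., 0] * wh_c[..., 1]
        return iou - (area_c - union) / area_c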

FILE: GLIP/maskrcnn_benchmark/layers/sigmoid_focal_loss.py
  class _SigmoidFocalLoss (line 11) | class _SigmoidFocalLoss(Function):
    method forward (line 13) | def forward(ctx, logits, targets, gamma, alpha):
    method backward (line 27) | def backward(ctx, d_loss):
  function sigmoid_focal_loss_cpu (line 42) | def sigmoid_focal_loss_cpu(logits, targets, gamma, alpha):
  class SigmoidFocalLoss (line 55) | class SigmoidFocalLoss(nn.Module):
    method __init__ (line 56) | def __init__(self, gamma, alpha):
    method forward (line 61) | def forward(self, logits, targets):
    method __repr__ (line 70) | def __repr__(self):
  function token_sigmoid_softmax_focal_loss (line 78) | def token_sigmoid_softmax_focal_loss(pred_logits, targets, alpha, gamma,...
  function token_sigmoid_binary_focal_loss_v2 (line 110) | def token_sigmoid_binary_focal_loss_v2(pred_logits, targets, alpha, gamm...
  function token_sigmoid_binary_focal_loss (line 130) | def token_sigmoid_binary_focal_loss(pred_logits, targets, alpha, gamma, ...
  class TokenSigmoidFocalLoss (line 174) | class TokenSigmoidFocalLoss(nn.Module):
    method __init__ (line 175) | def __init__(self, alpha, gamma):
    method forward (line 180) | def forward(self, logits, targets, text_masks=None, version="binary", ...
    method __repr__ (line 192) | def __repr__(self):
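
Note: the token_sigmoid_*_focal_loss variants all specialize the standard sigmoid focal loss to token-level alignment targets. The base formulation, for reference (equivalent to torchvision.ops.sigmoid_focal_loss with sum reduction):

    import torch
    import torch.nn.functional as F

    def sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        p = torch.sigmoid(logits)
        ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p_t = p * targets + (1 - p) * (1 - targets)   # prob of the true class
        loss = ce * (1 - p_t) ** gamma                # down-weight easy examples
        if alpha >= 0:
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            loss = alpha_t * loss
        return loss.sum()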

FILE: GLIP/maskrcnn_benchmark/layers/smooth_l1_loss.py
  function smooth_l1_loss (line 6) | def smooth_l1_loss(input, target, beta=1. / 9, size_average=True):
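
Note: this is the classic Huber-style smooth L1 with a beta knee (default 1/9): quadratic below beta, linear above. A self-contained sketch of that definition:

    import torch

    def smooth_l1_loss(input, target, beta=1.0 / 9, size_average=True):
        n = torch.abs(input - target)
        loss = torch.where(n < beta, 0.5 * n ** 2 / beta, n - 0.5 * beta)
        return loss.mean() if size_average else loss.sum()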

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/__init__.py
  function build_resnet_backbone (line 23) | def build_resnet_backbone(cfg):
  function build_resnet_c5_backbone (line 31) | def build_resnet_c5_backbone(cfg):
  function build_retinanet_swint_fpn_backbone (line 38) | def build_retinanet_swint_fpn_backbone(cfg):
  function build_swint_fpn_backbone (line 84) | def build_swint_fpn_backbone(cfg):
  function build_retinanet_cvt_fpn_backbone (line 128) | def build_retinanet_cvt_fpn_backbone(cfg):
  function build_eff_fpn_p6p7_backbone (line 170) | def build_eff_fpn_p6p7_backbone(cfg):
  function build_eff_fpn_p6p7_backbone (line 199) | def build_eff_fpn_p6p7_backbone(cfg):
  function build_efficientdet_backbone (line 220) | def build_efficientdet_backbone(cfg):
  function build_backbone (line 234) | def build_backbone(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/bifpn.py
  class BiFPN (line 7) | class BiFPN(nn.Module):
    method __init__ (line 8) | def __init__(self, in_channels_list, out_channels, first_time=False, e...
    method forward (line 120) | def forward(self, inputs):
    method _forward_fast_attention (line 151) | def _forward_fast_attention(self, inputs):
    method _forward (line 225) | def _forward(self, inputs):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/blocks.py
  class stem (line 5) | class stem(nn.Module):
    method __init__ (line 8) | def __init__(self, conv, inplanes, planes, stride=1, norm_layer=nn.Bat...
    method forward (line 15) | def forward(self, x):
  class basic (line 23) | class basic(nn.Module):
    method __init__ (line 27) | def __init__(self, conv, inplanes, planes, stride=1, midplanes=None, n...
    method forward (line 43) | def forward(self, x):
  class bottleneck (line 62) | class bottleneck(nn.Module):
    method __init__ (line 66) | def __init__(self, conv, inplanes, planes, stride=1, midplanes=None, n...
    method forward (line 84) | def forward(self, x):
  class invert (line 107) | class invert(nn.Module):
    method __init__ (line 108) | def __init__(self, conv, inp, oup, stride=1, expand_ratio=1, norm_laye...
    method forward (line 141) | def forward(self, x):
  function channel_shuffle (line 154) | def channel_shuffle(x, groups):
  class shuffle (line 165) | class shuffle(nn.Module):
    method __init__ (line 169) | def __init__(self, conv, inplanes, outplanes, stride=1, midplanes=None...
    method forward (line 201) | def forward(self, x):
  class shufflex (line 212) | class shufflex(nn.Module):
    method __init__ (line 216) | def __init__(self, conv, inplanes, outplanes, stride=1, midplanes=None...
    method forward (line 258) | def forward(self, x):
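
Note: channel_shuffle (line 154) is the ShuffleNet operation the shuffle/shufflex blocks rely on: reshape channels into (groups, channels_per_group), transpose, and flatten back so information mixes across groups. A reference sketch:

    import torch

    def channel_shuffle(x, groups):
        b, c, h, w = x.shape
        x = x.view(b, groups, c // groups, h, w)
        x = x.transpose(1, 2).contiguous()   # interleave channels across groups
        return x.view(b, c, h, w)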

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/efficientdet.py
  class MaxPool2dStaticSamePadding (line 32) | class MaxPool2dStaticSamePadding(nn.Module):
    method __init__ (line 38) | def __init__(self, kernel_size, stride):
    method forward (line 59) | def forward(self, x):
  class Conv2dStaticSamePadding (line 84) | class Conv2dStaticSamePadding(nn.Module):
    method __init__ (line 90) | def __init__(self, in_channels, out_channels, kernel_size, stride=1, b...
    method forward (line 124) | def forward(self, x):
  class SeparableConvBlock (line 149) | class SeparableConvBlock(nn.Module):
    method __init__ (line 154) | def __init__(self, in_channels, out_channels=None, norm=True, activati...
    method forward (line 177) | def forward(self, x):
  class BiFPN (line 190) | class BiFPN(nn.Module):
    method __init__ (line 195) | def __init__(self, num_channels, conv_channels, first_time=False,
    method forward (line 297) | def forward(self, inputs):
    method _forward_fast_attention (line 327) | def _forward_fast_attention(self, inputs):
    method _forward (line 433) | def _forward(self, inputs):
  class Regressor (line 484) | class Regressor(nn.Module):
    method __init__ (line 489) | def __init__(self, in_channels, num_anchors, num_layers, onnx_export=F...
    method forward (line 502) | def forward(self, inputs):
  class SwishImplementation (line 519) | class SwishImplementation(torch.autograd.Function):
    method forward (line 521) | def forward(ctx, i):
    method backward (line 527) | def backward(ctx, grad_output):
  class MemoryEfficientSwish (line 532) | class MemoryEfficientSwish(nn.Module):
    method forward (line 533) | def forward(self, x):
  class Swish (line 538) | class Swish(nn.Module):
    method forward (line 539) | def forward(self, x):
  class Classifier (line 542) | class Classifier(nn.Module):
    method __init__ (line 547) | def __init__(self, in_channels, num_anchors, num_classes, num_layers,
    method forward (line 567) | def forward(self, inputs):
  class Conv2dDynamicSamePadding (line 588) | class Conv2dDynamicSamePadding(nn.Conv2d):
    method __init__ (line 591) | def __init__(self, in_channels, out_channels, kernel_size, stride=1, d...
    method forward (line 596) | def forward(self, x):
  function get_same_padding_conv2d (line 609) | def get_same_padding_conv2d(image_size=None):
  function round_filters (line 619) | def round_filters(filters, global_params):
  function round_repeats (line 633) | def round_repeats(repeats, global_params):
  function drop_connect (line 640) | def drop_connect(inputs, p, training):
  class MBConvBlock (line 651) | class MBConvBlock(nn.Module):
    method __init__ (line 663) | def __init__(self, block_args, global_params):
    method forward (line 703) | def forward(self, inputs, drop_connect_rate=None):
    method set_swish (line 740) | def set_swish(self, memory_efficient=True):
  class BlockDecoder (line 744) | class BlockDecoder(object):
    method _decode_block_string (line 748) | def _decode_block_string(block_string):
    method _encode_block_string (line 775) | def _encode_block_string(block):
    method decode (line 792) | def decode(string_list):
    method encode (line 806) | def encode(blocks_args):
  function efficientnet (line 818) | def efficientnet(width_coefficient=None, depth_coefficient=None, dropout...
  function efficientnet_params (line 847) | def efficientnet_params(model_name):
  function get_model_params (line 865) | def get_model_params(model_name, override_params):
  function load_pretrained_weights (line 902) | def load_pretrained_weights(model, model_name, load_fc=True, advprop=Fal...
  class EfficientNet (line 919) | class EfficientNet(nn.Module):
    method __init__ (line 932) | def __init__(self, blocks_args=None, global_params=None):
    method set_swish (line 982) | def set_swish(self, memory_efficient=True):
    method extract_features (line 988) | def extract_features(self, inputs):
    method forward (line 1005) | def forward(self, inputs):
    method from_name (line 1019) | def from_name(cls, model_name, override_params=None):
    method from_pretrained (line 1025) | def from_pretrained(cls, model_name, load_weights=True, advprop=True, ...
    method get_image_size (line 1036) | def get_image_size(cls, model_name):
    method _check_model_name_is_valid (line 1042) | def _check_model_name_is_valid(cls, model_name):
  class EfficientNetD (line 1048) | class EfficientNetD(nn.Module):
    method __init__ (line 1053) | def __init__(self, compound_coef, load_weights=False):
    method forward (line 1063) | def forward(self, x):
  class Anchors (line 1087) | class Anchors(nn.Module):
    method __init__ (line 1092) | def __init__(self, anchor_scale=4., pyramid_levels=None, **kwargs):
    method forward (line 1108) | def forward(self, image, dtype=torch.float32, features=None):
    method get_key (line 1190) | def get_key(self, hint, image_shape):
  class EffNetFPN (line 1193) | class EffNetFPN(nn.Module):
    method __init__ (line 1194) | def __init__(self, compound_coef=0, start_from=3):
    method forward (line 1216) | def forward(self, inputs):
  class EfficientDetBackbone (line 1229) | class EfficientDetBackbone(nn.Module):
    method __init__ (line 1256) | def __init__(self, num_classes=80, compound_coef=0, load_weights=False,
    method freeze_bn (line 1293) | def freeze_bn(self):
    method forward (line 1298) | def forward(self, inputs):
    method init_backbone (line 1310) | def init_backbone(self, path):
  function init_weights (line 1318) | def init_weights(model):
  function calc_iou (line 1328) | def calc_iou(a, b):
  class BBoxTransform (line 1344) | class BBoxTransform(nn.Module):
    method forward (line 1345) | def forward(self, anchors, regression):
  class ClipBoxes (line 1377) | class ClipBoxes(nn.Module):
    method __init__ (line 1379) | def __init__(self):
    method forward (line 1382) | def forward(self, boxes, img):
  function postprocess2 (line 1393) | def postprocess2(x, anchors, regression, classification,
  function postprocess (line 1450) | def postprocess(x, anchors, regression, classification, regressBoxes, cl...
  function display (line 1490) | def display(preds, imgs, obj_list, imshow=True, imwrite=False):
  function calculate_focal_loss2 (line 1510) | def calculate_focal_loss2(classification, target_list, alpha, gamma):
  function calculate_focal_loss (line 1515) | def calculate_focal_loss(classification, targets, alpha, gamma):
  function calculate_giou (line 1534) | def calculate_giou(pred, gt):
  class FocalLoss (line 1556) | class FocalLoss(nn.Module):
    method __init__ (line 1557) | def __init__(self, alpha=0.25, gamma=2., cls_loss_type='FL', smooth_bc...
    method forward (line 1602) | def forward(self, classifications, regressions, anchor_info, annotatio...
  class ModelWithLoss (line 1761) | class ModelWithLoss(nn.Module):
    method __init__ (line 1762) | def __init__(self, model, criterion):
    method forward (line 1767) | def forward(self, *args):
  class TorchVisionNMS (line 1776) | class TorchVisionNMS(nn.Module):
    method __init__ (line 1777) | def __init__(self, iou_threshold):
    method forward (line 1781) | def forward(self, box, prob):
  class PostProcess (line 1785) | class PostProcess(nn.Module):
    method __init__ (line 1786) | def __init__(self, iou_threshold):
    method forward (line 1790) | def forward(self, x, anchors, regression,
  class InferenceModel (line 1846) | class InferenceModel(nn.Module):
    method __init__ (line 1847) | def __init__(self, model):
    method forward (line 1859) | def forward(self, sample):
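
Note: drop_connect (line 640) is EfficientNet's stochastic depth: during training, entire samples of a residual branch are zeroed with probability p and survivors are rescaled by 1/keep_prob. The canonical formulation:

    import torch

    def drop_connect(inputs, p, training):
        if not training:
            return inputs
        keep_prob = 1.0 - p
        batch = inputs.shape[0]
        random_tensor = keep_prob + torch.rand([batch, 1, 1, 1],
                                               dtype=inputs.dtype,
                                               device=inputs.device)
        binary_mask = torch.floor(random_tensor)   # 1 w.p. keep_prob, else 0
        return inputs / keep_prob * binary_mask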

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/efficientnet.py
  function round_channels (line 17) | def round_channels(channels,
  function calc_tf_padding (line 40) | def calc_tf_padding(x,
  class ConvBlock (line 71) | class ConvBlock(nn.Module):
    method __init__ (line 100) | def __init__(self,
    method forward (line 136) | def forward(self, x):
  function conv1x1_block (line 147) | def conv1x1_block(in_channels,
  function conv3x3_block (line 193) | def conv3x3_block(in_channels,
  function dwconv3x3_block (line 243) | def dwconv3x3_block(in_channels,
  function dwconv5x5_block (line 287) | def dwconv5x5_block(in_channels,
  class EffiDwsConvUnit (line 331) | class EffiDwsConvUnit(nn.Module):
    method __init__ (line 351) | def __init__(self,
    method forward (line 378) | def forward(self, x):
  class EffiInvResUnit (line 391) | class EffiInvResUnit(nn.Module):
    method __init__ (line 416) | def __init__(self,
    method forward (line 458) | def forward(self, x):
  class EffiInitBlock (line 473) | class EffiInitBlock(nn.Module):
    method __init__ (line 491) | def __init__(self,
    method forward (line 508) | def forward(self, x):
  class EfficientNet (line 515) | class EfficientNet(nn.Module):
    method __init__ (line 547) | def __init__(self,
    method _freeze_backbone (line 608) | def _freeze_backbone(self, freeze_at):
    method forward (line 616) | def forward(self, x):
  function get_efficientnet (line 625) | def get_efficientnet(cfg, version, tf_mode = True, bn_eps=1e-5, **kwargs):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/fbnet.py
  function _py2_round (line 23) | def _py2_round(x):
  function _get_divisible_by (line 27) | def _get_divisible_by(num, divisible_by, min_val):
  class Identity (line 34) | class Identity(nn.Module):
    method __init__ (line 35) | def __init__(self, C_in, C_out, stride):
    method forward (line 52) | def forward(self, x):
  class CascadeConv3x3 (line 60) | class CascadeConv3x3(nn.Sequential):
    method __init__ (line 61) | def __init__(self, C_in, C_out, stride):
    method forward (line 73) | def forward(self, x):
  class Shift (line 80) | class Shift(nn.Module):
    method __init__ (line 81) | def __init__(self, C, kernel_size, stride, padding):
    method forward (line 108) | def forward(self, x):
  class ShiftBlock5x5 (line 134) | class ShiftBlock5x5(nn.Sequential):
    method __init__ (line 135) | def __init__(self, C_in, C_out, expansion, stride):
    method forward (line 154) | def forward(self, x):
  class ChannelShuffle (line 161) | class ChannelShuffle(nn.Module):
    method __init__ (line 162) | def __init__(self, groups):
    method forward (line 166) | def forward(self, x):
  class ConvBNRelu (line 181) | class ConvBNRelu(nn.Sequential):
    method __init__ (line 182) | def __init__(
  class SEModule (line 240) | class SEModule(nn.Module):
    method __init__ (line 243) | def __init__(self, C):
    method forward (line 253) | def forward(self, x):
  class Upsample (line 257) | class Upsample(nn.Module):
    method __init__ (line 258) | def __init__(self, scale_factor, mode, align_corners=None):
    method forward (line 264) | def forward(self, x):
  function _get_upsample_op (line 271) | def _get_upsample_op(stride):
  class IRFBlock (line 288) | class IRFBlock(nn.Module):
    method __init__ (line 289) | def __init__(
    method forward (line 392) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/fpn.py
  class FPN (line 6) | class FPN(nn.Module):
    method __init__ (line 13) | def __init__(
    method forward (line 59) | def forward(self, x):
  class LastLevelMaxPool (line 132) | class LastLevelMaxPool(nn.Module):
    method forward (line 133) | def forward(self, x):
  class LastLevelP6P7 (line 137) | class LastLevelP6P7(nn.Module):
    method __init__ (line 141) | def __init__(self, in_channels, out_channels):
    method forward (line 150) | def forward(self, c5, p5):
  class SPPLayer (line 157) | class SPPLayer(nn.Module):
    method __init__ (line 158) | def __init__(self):
    method forward (line 161) | def forward(self, x):
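
Note: FPN.forward follows the usual top-down pathway: a 1x1 lateral conv per backbone level, nearest-neighbor upsampling of the coarser merged map, and a 3x3 output conv. A functional sketch of that loop (`fpn_forward` is a hypothetical helper; `features` is assumed ordered fine-to-coarse, e.g. C2..C5, with matching lists of conv modules):

    import torch.nn.functional as F

    def fpn_forward(features, inner_blocks, layer_blocks):
        last_inner = inner_blocks[-1](features[-1])
        results = [layer_blocks[-1](last_inner)]
        for feat, inner, layer in zip(features[:-1][::-1],
                                      inner_blocks[:-1][::-1],
                                      layer_blocks[:-1][::-1]):
            top_down = F.interpolate(last_inner, scale_factor=2, mode="nearest")
            last_inner = inner(feat) + top_down   # lateral + top-down merge
            results.insert(0, layer(last_inner))
        return results                            # [P2, P3, P4, P5]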

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/mixer.py
  class MixedOperationRandom (line 4) | class MixedOperationRandom(nn.Module):
    method __init__ (line 5) | def __init__(self, search_ops):
    method forward (line 10) | def forward(self, x, x_path=None):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/ops.py
  function conv7x7 (line 7) | def conv7x7(in_planes, out_planes, stride=1, groups=1, dilation=1):
  function conv5x5 (line 13) | def conv5x5(in_planes, out_planes, stride=1, groups=1, dilation=1):
  function conv3x3 (line 19) | def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
  function conv1x1 (line 25) | def conv1x1(in_planes, out_planes, stride=1):
  function maxpool (line 30) | def maxpool(**kwargs):
  function avgpool (line 34) | def avgpool(**kwargs):
  function dropout (line 37) | def dropout(prob):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/resnet.py
  class ResNet (line 81) | class ResNet(nn.Module):
    method __init__ (line 82) | def __init__(self, cfg):
    method _freeze_backbone (line 159) | def _freeze_backbone(self, freeze_at):
    method forward (line 170) | def forward(self, x):
  class ResNetHead (line 180) | class ResNetHead(nn.Module):
    method __init__ (line 181) | def __init__(
    method forward (line 226) | def forward(self, x):
  function _make_stage (line 232) | def _make_stage(
  class Bottleneck (line 278) | class Bottleneck(nn.Module):
    method __init__ (line 279) | def __init__(
    method forward (line 382) | def forward(self, x):
  class BaseStem (line 408) | class BaseStem(nn.Module):
    method __init__ (line 409) | def __init__(self, cfg, norm_func):
    method forward (line 435) | def forward(self, x):
  class BottleneckWithFixedBatchNorm (line 451) | class BottleneckWithFixedBatchNorm(Bottleneck):
    method __init__ (line 452) | def __init__(
  class StemWithFixedBatchNorm (line 478) | class StemWithFixedBatchNorm(BaseStem):
    method __init__ (line 479) | def __init__(self, cfg):
  class BottleneckWithBatchNorm (line 485) | class BottleneckWithBatchNorm(Bottleneck):
    method __init__ (line 486) | def __init__(
  class StemWithBatchNorm (line 512) | class StemWithBatchNorm(BaseStem):
    method __init__ (line 513) | def __init__(self, cfg):
  class BottleneckWithNaiveSyncBatchNorm (line 519) | class BottleneckWithNaiveSyncBatchNorm(Bottleneck):
    method __init__ (line 520) | def __init__(
  class StemWithNaiveSyncBatchNorm (line 546) | class StemWithNaiveSyncBatchNorm(BaseStem):
    method __init__ (line 547) | def __init__(self, cfg):
  class BottleneckWithSyncBatchNorm (line 553) | class BottleneckWithSyncBatchNorm(Bottleneck):
    method __init__ (line 554) | def __init__(
  class StemWithSyncBatchNorm (line 580) | class StemWithSyncBatchNorm(BaseStem):
    method __init__ (line 581) | def __init__(self, cfg):
  class BottleneckWithGN (line 587) | class BottleneckWithGN(Bottleneck):
    method __init__ (line 588) | def __init__(
  class StemWithGN (line 614) | class StemWithGN(BaseStem):
    method __init__ (line 615) | def __init__(self, cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/swint.py
  class Mlp (line 13) | class Mlp(nn.Module):
    method __init__ (line 16) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 25) | def forward(self, x):
  function window_partition (line 34) | def window_partition(x, window_size):
  function window_reverse (line 48) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 64) | class WindowAttention(nn.Module):
    method __init__ (line 77) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 111) | def forward(self, x, mask=None):
  class SwinTransformerBlock (line 145) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 162) | def __init__(self, dim, num_heads, window_size=7, shift_size=0,
    method forward (line 186) | def forward(self, x, mask_matrix):
  class PatchMerging (line 245) | class PatchMerging(nn.Module):
    method __init__ (line 252) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 258) | def forward(self, x, H, W):
  class BasicLayer (line 287) | class BasicLayer(nn.Module):
    method __init__ (line 305) | def __init__(self,
    method forward (line 347) | def forward(self, x, H, W):
  class PatchEmbed (line 389) | class PatchEmbed(nn.Module):
    method __init__ (line 398) | def __init__(self, patch_size=4, in_chans=3, embed_dim=96, norm_layer=...
    method forward (line 412) | def forward(self, x):
  class SwinTransformer (line 431) | class SwinTransformer(nn.Module):
    method __init__ (line 459) | def __init__(self,
    method _freeze_stages (line 556) | def _freeze_stages(self):
    method init_weights (line 573) | def init_weights(self, pretrained=None):
    method forward (line 591) | def forward(self, x):
    method train (line 617) | def train(self, mode=True):
  function build_swint_backbone (line 623) | def build_swint_backbone(cfg):
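
Note: window_partition / window_reverse are the shape gymnastics at the heart of Swin's windowed attention: tile (B, H, W, C) into non-overlapping window_size x window_size patches and back. Reference sketches (H and W are assumed divisible by window_size):

    import torch

    def window_partition(x, window_size):
        """(B, H, W, C) -> (num_windows * B, ws, ws, C)."""
        B, H, W, C = x.shape
        x = x.view(B, H // window_size, window_size,
                   W // window_size, window_size, C)
        return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(
            -1, window_size, window_size, C)

    def window_reverse(windows, window_size, H, W):
        """Inverse of window_partition: back to (B, H, W, C)."""
        B = int(windows.shape[0] / (H * W / window_size / window_size))
        x = windows.view(B, H // window_size, W // window_size,
                         window_size, window_size, -1)
        return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(B, H, W, -1)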

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/swint_v2.py
  class Mlp (line 14) | class Mlp(nn.Module):
    method __init__ (line 17) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 26) | def forward(self, x):
  function window_partition (line 35) | def window_partition(x, window_size):
  function window_reverse (line 49) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 65) | class WindowAttention(nn.Module):
    method __init__ (line 78) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 112) | def forward(self, x, mask=None):
  class SwinTransformerBlock (line 146) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 163) | def __init__(self, dim, num_heads, window_size=7, shift_size=0,
    method forward (line 194) | def forward(self, x, mask_matrix):
  class PatchMerging (line 253) | class PatchMerging(nn.Module):
    method __init__ (line 260) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 266) | def forward(self, x, H, W):
  class BasicLayer (line 295) | class BasicLayer(nn.Module):
    method __init__ (line 313) | def __init__(self,
    method forward (line 358) | def forward(self, x, H, W):
  class ConvEmbed (line 442) | class ConvEmbed(nn.Module):
    method __init__ (line 446) | def __init__(
    method forward (line 467) | def forward(self, x, H=None, W=None):
  class SwinTransformer (line 499) | class SwinTransformer(nn.Module):
    method __init__ (line 527) | def __init__(self,
    method _freeze_stages (line 635) | def _freeze_stages(self):
    method init_weights (line 652) | def init_weights(self, pretrained=None):
    method forward (line 670) | def forward(self, x):
    method train (line 697) | def train(self, mode=True):
  function build_swint_backbone (line 703) | def build_swint_backbone(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/swint_v2_vl.py
  class Mlp (line 15) | class Mlp(nn.Module):
    method __init__ (line 18) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 27) | def forward(self, x):
  function window_partition (line 36) | def window_partition(x, window_size):
  function window_reverse (line 50) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 66) | class WindowAttention(nn.Module):
    method __init__ (line 79) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 125) | def forward(self, x, mask=None, x_text=None, mask_text=None):
  class SwinTransformerBlock (line 215) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 232) | def __init__(self, dim, num_heads, window_size=7, shift_size=0,
    method forward (line 276) | def forward(self, x, mask_matrix, x_text, mask_text):
  class PatchMerging (line 347) | class PatchMerging(nn.Module):
    method __init__ (line 354) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 360) | def forward(self, x, H, W):
  class BasicLayer (line 389) | class BasicLayer(nn.Module):
    method __init__ (line 407) | def __init__(self,
    method forward (line 456) | def forward(self, x, H, W, x_text=None, mask_text=None):
  class ConvEmbed (line 542) | class ConvEmbed(nn.Module):
    method __init__ (line 546) | def __init__(
    method forward (line 567) | def forward(self, x, H=None, W=None):
  class SwinTransformer (line 599) | class SwinTransformer(nn.Module):
    method __init__ (line 627) | def __init__(self,
    method _freeze_stages (line 744) | def _freeze_stages(self):
    method init_weights (line 761) | def init_weights(self, pretrained=None):
    method forward (line 779) | def forward(self, inputs):
    method train (line 822) | def train(self, mode=True):
  function build_swint_backbone (line 828) | def build_swint_backbone(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/backbone/swint_vl.py
  class Mlp (line 14) | class Mlp(nn.Module):
    method __init__ (line 17) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 26) | def forward(self, x):
  function window_partition (line 35) | def window_partition(x, window_size):
  function window_reverse (line 49) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 65) | class WindowAttention(nn.Module):
    method __init__ (line 78) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 124) | def forward(self, x, mask=None, x_text=None, mask_text=None):
  class SwinTransformerBlock (line 214) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 231) | def __init__(self, dim, num_heads, window_size=7, shift_size=0,
    method forward (line 264) | def forward(self, x, mask_matrix, x_text, mask_text):
  class PatchMerging (line 335) | class PatchMerging(nn.Module):
    method __init__ (line 342) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 348) | def forward(self, x, H, W):
  class BasicLayer (line 377) | class BasicLayer(nn.Module):
    method __init__ (line 395) | def __init__(self,
    method forward (line 441) | def forward(self, x, H, W, x_text=None, mask_text=None):
  class PatchEmbed (line 485) | class PatchEmbed(nn.Module):
    method __init__ (line 494) | def __init__(self, patch_size=4, in_chans=3, embed_dim=96, norm_layer=...
    method forward (line 508) | def forward(self, x):
  class SwinTransformer (line 527) | class SwinTransformer(nn.Module):
    method __init__ (line 555) | def __init__(self,
    method _freeze_stages (line 661) | def _freeze_stages(self):
    method init_weights (line 678) | def init_weights(self, pretrained=None):
    method forward (line 696) | def forward(self, inputs):
    method train (line 739) | def train(self, mode=True):
  function build_swint_backbone (line 745) | def build_swint_backbone(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py
  class BalancedPositiveNegativeSampler (line 5) | class BalancedPositiveNegativeSampler(object):
    method __init__ (line 10) | def __init__(self, batch_size_per_image, positive_fraction):
    method __call__ (line 19) | def __call__(self, matched_idxs):
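
Note: BalancedPositiveNegativeSampler caps the positives at batch_size_per_image * positive_fraction per image and fills the remainder with randomly permuted negatives. A per-image sketch (`sample_balanced` is a hypothetical name; matched_idxs is assumed to use >= 1 for positives and 0 for background, as in the loss evaluators above):

    import torch

    def sample_balanced(matched_idxs, batch_size_per_image=256,
                        positive_fraction=0.5):
        positive = torch.nonzero(matched_idxs >= 1, as_tuple=False).squeeze(1)
        negative = torch.nonzero(matched_idxs == 0, as_tuple=False).squeeze(1)
        num_pos = min(positive.numel(),
                      int(batch_size_per_image * positive_fraction))
        num_neg = min(negative.numel(), batch_size_per_image - num_pos)
        pos_idx = positive[torch.randperm(positive.numel())[:num_pos]]
        neg_idx = negative[torch.randperm(negative.numel())[:num_neg]]
        return pos_idx, neg_idx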

FILE: GLIP/maskrcnn_benchmark/modeling/box_coder.py
  class BoxCoder (line 7) | class BoxCoder(object):
    method __init__ (line 13) | def __init__(self, weights, bbox_xform_clip=math.log(1000. / 16)):
    method encode (line 22) | def encode(self, reference_boxes, proposals):
    method decode (line 52) | def decode(self, rel_codes, boxes):
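
Note: BoxCoder implements the Faster R-CNN (dx, dy, dw, dh) parameterization, with per-coordinate weights and a log-space clip on decode. A sketch of encode under this codebase's legacy +1 width/height convention (`encode_boxes` is a hypothetical name):

    import torch

    def encode_boxes(reference_boxes, proposals, weights=(10.0, 10.0, 5.0, 5.0)):
        wx, wy, ww, wh = weights
        ex_w = proposals[:, 2] - proposals[:, 0] + 1
        ex_h = proposals[:, 3] - proposals[:, 1] + 1
        ex_cx = proposals[:, 0] + 0.5 * ex_w
        ex_cy = proposals[:, 1] + 0.5 * ex_h
        gt_w = reference_boxes[:, 2] - reference_boxes[:, 0] + 1
        gt_h = reference_boxes[:, 3] - reference_boxes[:, 1] + 1
        gt_cx = reference_boxes[:, 0] + 0.5 * gt_w
        gt_cy = reference_boxes[:, 1] + 0.5 * gt_h
        # Center offsets normalized by proposal size; sizes in log space.
        return torch.stack([
            wx * (gt_cx - ex_cx) / ex_w,
            wy * (gt_cy - ex_cy) / ex_h,
            ww * torch.log(gt_w / ex_w),
            wh * torch.log(gt_h / ex_h),
        ], dim=1)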

FILE: GLIP/maskrcnn_benchmark/modeling/detector/__init__.py
  function build_detection_model (line 9) | def build_detection_model(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py
  class GeneralizedRCNN (line 17) | class GeneralizedRCNN(nn.Module):
    method __init__ (line 27) | def __init__(self, cfg):
    method train (line 45) | def train(self, mode=True):
    method forward (line 70) | def forward(self, images, targets=None):

FILE: GLIP/maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py
  function random_word (line 26) | def random_word(input_ids, mask_token_id, vocabs, padding_token_id, gree...
  class GeneralizedVLRCNN (line 63) | class GeneralizedVLRCNN(nn.Module):
    method __init__ (line 73) | def __init__(self, cfg):
    method train (line 129) | def train(self, mode=True):
    method forward (line 170) | def forward(self,

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/backbone.py
  function build_bert_backbone (line 13) | def build_bert_backbone(cfg):
  function build_bert_backbone (line 20) | def build_bert_backbone(cfg):
  function build_rnn_backbone (line 27) | def build_rnn_backbone(cfg):
  function build_clip_backbone (line 34) | def build_clip_backbone(cfg):
  function build_backbone (line 40) | def build_backbone(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/bert_model.py
  class BertEncoder (line 10) | class BertEncoder(nn.Module):
    method __init__ (line 11) | def __init__(self, cfg):
    method forward (line 32) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/build.py
  function build_tokenizer (line 4) | def build_tokenizer(tokenizer_name):

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/clip_model.py
  class LayerNorm (line 16) | class LayerNorm(nn.Module):
    method __init__ (line 17) | def __init__(self, hidden_size, eps=1e-12):
    method forward (line 25) | def forward(self, x):
  class QuickGELU (line 34) | class QuickGELU(nn.Module):
    method forward (line 35) | def forward(self, x: torch.Tensor):
  class ResidualAttentionBlock (line 39) | class ResidualAttentionBlock(nn.Module):
    method __init__ (line 40) | def __init__(self,
    method attention (line 58) | def attention(self, x: torch.Tensor, key_padding_mask: torch.Tensor = ...
    method forward (line 63) | def forward(self, x: torch.Tensor, key_padding_mask: torch.Tensor = No...
  class CLIPTransformer (line 69) | class CLIPTransformer(nn.Module):
    method __init__ (line 70) | def __init__(self, cfg):
    method build_attention_mask (line 113) | def build_attention_mask(self):
    method _init_weights (line 121) | def _init_weights(self, m):
    method resize_pos_embed_1d (line 129) | def resize_pos_embed_1d(self, posemb, shape_new):
    method init_weights (line 140) | def init_weights(self, pretrained="", pretrained_layers=[], verbose=Fa...
    method no_weight_decay (line 165) | def no_weight_decay(self):
    method forward (line 171) | def forward(self, text):
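
Note: QuickGELU is CLIP's fast GELU approximation:

    import torch
    from torch import nn

    class QuickGELU(nn.Module):
        """x * sigmoid(1.702 * x), a cheap approximation of GELU."""
        def forward(self, x: torch.Tensor):
            return x * torch.sigmoid(1.702 * x)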

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/hfpt_tokenizer.py
  class HFPTTokenizer (line 7) | class HFPTTokenizer(object):
    method __init__ (line 8) | def __init__(self, pt_name=None):
    method get_eot_token (line 36) | def get_eot_token(self):
    method get_sot_token (line 39) | def get_sot_token(self):
    method get_eot_token_list (line 42) | def get_eot_token_list(self):
    method get_sot_token_list (line 45) | def get_sot_token_list(self):
    method get_tokenizer_obj (line 48) | def get_tokenizer_obj(self):
    method check_added_tokens (line 53) | def check_added_tokens(self):
    method tokenize (line 56) | def tokenize(self, texts: Union[str, List[str]], context_length: int =...
    method get_vocab_size (line 95) | def get_vocab_size(self):
    method __call__ (line 98) | def __call__(self, texts: Union[str, List[str]], context_length: int =...

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/rnn_model.py
  class RNNEnoder (line 7) | class RNNEnoder(nn.Module):
    method __init__ (line 8) | def __init__(self, cfg):
    method forward (line 36) | def forward(self, input, mask=None):
    method encode (line 49) | def encode(self, input_labels):
    method sort_inputs (line 103) | def sort_inputs(self, input_labels):  # sort input labels by descending

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/simple_tokenizer.py
  function default_bpe (line 14) | def default_bpe():
  function bytes_to_unicode (line 19) | def bytes_to_unicode():
  function get_pairs (line 41) | def get_pairs(word):
  function basic_clean (line 53) | def basic_clean(text):
  function whitespace_clean (line 59) | def whitespace_clean(text):
  class SimpleTokenizer (line 65) | class SimpleTokenizer(object):
    method __init__ (line 66) | def __init__(self, bpe_path: str = default_bpe()):
    method bpe (line 85) | def bpe(self, token):
    method encode (line 126) | def encode(self, text):
    method decode (line 134) | def decode(self, tokens):
    method get_vocab_size (line 139) | def get_vocab_size(self):
    method get_eot_token (line 142) | def get_eot_token(self):
    method get_sot_token (line 145) | def get_sot_token(self):
    method check_added_tokens (line 148) | def check_added_tokens(self):
    method get_tokenizer_obj (line 151) | def get_tokenizer_obj(self):
    method tokenize (line 154) | def tokenize(self, texts: Union[str, List[str]], context_length: int =...
    method __call__ (line 172) | def __call__(self, texts: Union[str, List[str]], context_length: int =...

FILE: GLIP/maskrcnn_benchmark/modeling/language_backbone/word_utils.py
  class Dictionary (line 15) | class Dictionary(object):
    method __init__ (line 16) | def __init__(self):
    method add_word (line 20) | def add_word(self, word):
    method __len__ (line 26) | def __len__(self):
    method __getitem__ (line 29) | def __getitem__(self, a):
    method __contains__ (line 39) | def __contains__(self, word):
  class Corpus (line 43) | class Corpus(object):
    method __init__ (line 44) | def __init__(self):
    method set_max_len (line 47) | def set_max_len(self, value):
    method load_file (line 50) | def load_file(self, filename):
    method add_to_corpus (line 58) | def add_to_corpus(self, line):
    method tokenize (line 67) | def tokenize(self, line, max_len=20):
    method __len__ (line 99) | def __len__(self):

FILE: GLIP/maskrcnn_benchmark/modeling/make_layers.py
  function get_group_gn (line 14) | def get_group_gn(dim, dim_per_gp, num_groups):
  function group_norm (line 31) | def group_norm(out_channels, affine=True, divisor=1):
  function make_conv3x3 (line 44) | def make_conv3x3(
  function make_fc (line 80) | def make_fc(dim_in, hidden_dim, use_gn=False):
  function conv_with_kaiming_uniform (line 95) | def conv_with_kaiming_uniform(use_gn=False, use_relu=False, use_dyrelu=F...

FILE: GLIP/maskrcnn_benchmark/modeling/matcher.py
  class Matcher (line 5) | class Matcher(object):
    method __init__ (line 23) | def __init__(self, high_threshold, low_threshold, allow_low_quality_ma...
    method __call__ (line 42) | def __call__(self, match_quality_matrix):
    method set_low_quality_matches_ (line 86) | def set_low_quality_matches_(self, matches, all_matches, match_quality...
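
Note: Matcher assigns each proposal its best-overlapping ground truth, then marks proposals below low_threshold as background (-1) and those between the thresholds as ignored (-2). A sketch of __call__ without the allow_low_quality_matches branch (`match` is a hypothetical name; match_quality_matrix is assumed to be a [num_gt, num_proposals] IoU matrix):

    def match(match_quality_matrix, high=0.7, low=0.3):
        # Best ground truth per proposal.
        matched_vals, matches = match_quality_matrix.max(dim=0)
        matches[matched_vals < low] = -1                             # background
        matches[(matched_vals >= low) & (matched_vals < high)] = -2  # ignored
        return matches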

FILE: GLIP/maskrcnn_benchmark/modeling/poolers.py
  class LevelMapper (line 11) | class LevelMapper(object):
    method __init__ (line 16) | def __init__(self, k_min, k_max, canonical_scale=224, canonical_level=...
    method __call__ (line 31) | def __call__(self, boxlists):
  class Pooler (line 45) | class Pooler(nn.Module):
    method __init__ (line 55) | def __init__(self, output_size, scales, sampling_ratio, use_v2=False):
    method convert_to_roi_format (line 82) | def convert_to_roi_format(self, boxes):
    method forward (line 95) | def forward(self, x, boxes):
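
Note: LevelMapper routes each RoI to an FPN level via the FPN-paper heuristic k = floor(k0 + log2(sqrt(area) / 224)), clamped to [k_min, k_max]. A sketch (`map_boxes_to_fpn_levels` is a hypothetical name; it returns indices into the per-level pooler list):

    import torch

    def map_boxes_to_fpn_levels(boxes, k_min=2, k_max=5,
                                canonical_scale=224, canonical_level=4):
        w = boxes[:, 2] - boxes[:, 0]
        h = boxes[:, 3] - boxes[:, 1]
        s = torch.sqrt(w * h)                       # box scale
        target_lvls = torch.floor(
            canonical_level + torch.log2(s / canonical_scale + 1e-6))
        target_lvls = torch.clamp(target_lvls, min=k_min, max=k_max)
        return (target_lvls - k_min).to(torch.int64)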

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/__init__.py
  class CombinedROIHeads (line 9) | class CombinedROIHeads(torch.nn.ModuleDict):
    method __init__ (line 15) | def __init__(self, cfg, heads):
    method forward (line 23) | def forward(self, features, proposals, targets=None, language_dict_fea...
  function build_roi_heads (line 64) | def build_roi_heads(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py
  class ROIBoxHead (line 11) | class ROIBoxHead(torch.nn.Module):
    method __init__ (line 16) | def __init__(self, cfg):
    method forward (line 25) | def forward(self, features, proposals, targets=None):
  function build_roi_box_head (line 69) | def build_roi_box_head(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py
  class PostProcessor (line 12) | class PostProcessor(nn.Module):
    method __init__ (line 19) | def __init__(
    method forward (line 38) | def forward(self, x, boxes):
    method prepare_boxlist (line 87) | def prepare_boxlist(self, boxes, scores, image_shape, extra_field={}):
    method filter_results (line 108) | def filter_results(self, boxlist, num_classes):
  function make_roi_box_post_processor (line 164) | def make_roi_box_post_processor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py
  class FastRCNNLossComputation (line 15) | class FastRCNNLossComputation(object):
    method __init__ (line 21) | def __init__(self, proposal_matcher, fg_bg_sampler, box_coder):
    method match_targets_to_proposals (line 32) | def match_targets_to_proposals(self, proposal, target):
    method prepare_targets (line 54) | def prepare_targets(self, proposals, targets):
    method subsample (line 86) | def subsample(self, proposals, targets):
    method __call__ (line 123) | def __call__(self, class_logits, box_regression):
  function make_roi_box_loss_evaluator (line 171) | def make_roi_box_loss_evaluator(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/roi_box_feature_extractors.py
  class LightheadFeatureExtractor (line 15) | class LightheadFeatureExtractor(nn.Module):
    method __init__ (line 16) | def __init__(self, cfg):
    method forward (line 46) | def forward(self, x, proposals):
  class ResNet50Conv5ROIFeatureExtractor (line 66) | class ResNet50Conv5ROIFeatureExtractor(nn.Module):
    method __init__ (line 67) | def __init__(self, config):
    method forward (line 94) | def forward(self, x, proposals):
  class FPN2MLPFeatureExtractor (line 101) | class FPN2MLPFeatureExtractor(nn.Module):
    method __init__ (line 106) | def __init__(self, cfg):
    method forward (line 124) | def forward(self, x, proposals):
  class FPNXconv1fcFeatureExtractor (line 135) | class FPNXconv1fcFeatureExtractor(nn.Module):
    method __init__ (line 140) | def __init__(self, cfg):
    method forward (line 189) | def forward(self, x, proposals):
  function make_roi_box_feature_extractor (line 197) | def make_roi_box_feature_extractor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/roi_box_predictors.py
  class FastRCNNPredictor (line 5) | class FastRCNNPredictor(nn.Module):
    method __init__ (line 6) | def __init__(self, config, pretrained=None):
    method forward (line 25) | def forward(self, x):
  class FPNPredictor (line 33) | class FPNPredictor(nn.Module):
    method __init__ (line 34) | def __init__(self, cfg):
    method forward (line 47) | def forward(self, x):
  function make_roi_box_predictor (line 60) | def make_roi_box_predictor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/inference.py
  class KeypointPostProcessor (line 10) | class KeypointPostProcessor(nn.Module):
    method __init__ (line 11) | def __init__(self, keypointer=None):
    method forward (line 15) | def forward(self, x, boxes):
  function heatmaps_to_keypoints (line 40) | def heatmaps_to_keypoints(maps, rois):
  class Keypointer (line 97) | class Keypointer(object):
    method __init__ (line 103) | def __init__(self, padding=0):
    method __call__ (line 106) | def __call__(self, masks, boxes):
  function make_roi_keypoint_post_processor (line 118) | def make_roi_keypoint_post_processor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/keypoint_head.py
  class ROIKeypointHead (line 9) | class ROIKeypointHead(torch.nn.Module):
    method __init__ (line 10) | def __init__(self, cfg):
    method forward (line 18) | def forward(self, features, proposals, targets=None):
  function build_roi_keypoint_head (line 49) | def build_roi_keypoint_head(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/loss.py
  function project_keypoints_to_heatmap (line 17) | def project_keypoints_to_heatmap(keypoints, proposals, discretization_si...
  function cat_boxlist_with_keypoints (line 24) | def cat_boxlist_with_keypoints(boxlists):
  function _within_box (line 39) | def _within_box(points, boxes):
  class KeypointRCNNLossComputation (line 54) | class KeypointRCNNLossComputation(object):
    method __init__ (line 55) | def __init__(self, proposal_matcher, fg_bg_sampler, discretization_size):
    method match_targets_to_proposals (line 66) | def match_targets_to_proposals(self, proposal, target):
    method prepare_targets (line 79) | def prepare_targets(self, proposals, targets):
    method subsample (line 111) | def subsample(self, proposals, targets):
    method __call__ (line 145) | def __call__(self, proposals, keypoint_logits):
  function make_roi_keypoint_loss_evaluator (line 172) | def make_roi_keypoint_loss_evaluator(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/roi_keypoint_feature_extractors.py
  class KeypointRCNNFeatureExtractor (line 10) | class KeypointRCNNFeatureExtractor(nn.Module):
    method __init__ (line 11) | def __init__(self, cfg):
    method forward (line 37) | def forward(self, x, proposals):
  class KeypointRCNNFeature2XZoomExtractor (line 43) | class KeypointRCNNFeature2XZoomExtractor(nn.Module):
    method __init__ (line 44) | def __init__(self, cfg):
    method forward (line 79) | def forward(self, x, proposals):
  function make_roi_keypoint_feature_extractor (line 92) | def make_roi_keypoint_feature_extractor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/roi_keypoint_predictors.py
  class KeypointRCNNPredictor (line 7) | class KeypointRCNNPredictor(nn.Module):
    method __init__ (line 8) | def __init__(self, cfg):
    method forward (line 26) | def forward(self, x):
  function make_roi_keypoint_predictor (line 37) | def make_roi_keypoint_predictor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/hourglass.py
  class Residual (line 6) | class Residual(nn.Module):
    method __init__ (line 7) | def __init__(self, inp_dim, out_dim, use_gn=False):
    method forward (line 22) | def forward(self, x):
  class Hourglass (line 41) | class Hourglass(nn.Module):
    method __init__ (line 42) | def __init__(self, n, f, gn=False, increase=0):
    method forward (line 58) | def forward(self, x):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/inference.py
  function convert_mask_grounding_to_od_logits (line 10) | def convert_mask_grounding_to_od_logits(logits, positive_map_label_to_to...
  class MaskPostProcessor (line 20) | class MaskPostProcessor(nn.Module):
    method __init__ (line 31) | def __init__(self, masker=None, mdetr_style_aggregate_class_num=None, ...
    method forward (line 37) | def forward(self, x, boxes, positive_map_label_to_token=None):
  class MaskPostProcessorCOCOFormat (line 81) | class MaskPostProcessorCOCOFormat(MaskPostProcessor):
    method forward (line 88) | def forward(self, x, boxes, positive_map_label_to_token=None, vl_versi...
  function expand_boxes (line 108) | def expand_boxes(boxes, scale):
  function expand_masks (line 125) | def expand_masks(mask, padding):
  function paste_mask_in_image (line 135) | def paste_mask_in_image(mask, box, im_h, im_w, thresh=0.5, padding=1):
  class Masker (line 174) | class Masker(object):
    method __init__ (line 180) | def __init__(self, threshold=0.5, padding=1):
    method forward_single_image (line 184) | def forward_single_image(self, masks, boxes):
    method __call__ (line 197) | def __call__(self, masks, boxes):
  function make_roi_mask_post_processor (line 214) | def make_roi_mask_post_processor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/loss.py
  function project_masks_on_boxes (line 11) | def project_masks_on_boxes(segmentation_masks, proposals, discretization...
  class MaskRCNNLossComputation (line 47) | class MaskRCNNLossComputation(object):
    method __init__ (line 48) | def __init__(self, proposal_matcher, discretization_size, vl_version=F...
    method match_targets_to_proposals (line 58) | def match_targets_to_proposals(self, proposal, target):
    method prepare_targets (line 74) | def prepare_targets(self, proposals, targets):
    method __call__ (line 123) | def __call__(self, proposals, mask_logits, targets):
  function make_roi_mask_loss_evaluator (line 167) | def make_roi_mask_loss_evaluator(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/mask_head.py
  function keep_only_positive_boxes (line 13) | def keep_only_positive_boxes(boxes):
  class ROIMaskHead (line 36) | class ROIMaskHead(torch.nn.Module):
    method __init__ (line 37) | def __init__(self, cfg):
    method forward (line 45) | def forward(self, features, proposals, targets=None,
  function build_roi_mask_head (line 87) | def build_roi_mask_head(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/roi_mask_feature_extractors.py
  class MaskRCNNFPNFeatureExtractor (line 13) | class MaskRCNNFPNFeatureExtractor(nn.Module):
    method __init__ (line 18) | def __init__(self, cfg):
    method forward (line 53) | def forward(self, x, proposals):
  class HourglassFPNFeatureExtractor (line 62) | class HourglassFPNFeatureExtractor(nn.Module):
    method __init__ (line 67) | def __init__(self, cfg):
    method forward (line 99) | def forward(self, x, proposals):
  function make_roi_mask_feature_extractor (line 115) | def make_roi_mask_feature_extractor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/roi_mask_predictors.py
  class MaskRCNNC4Predictor (line 11) | class MaskRCNNC4Predictor(nn.Module):
    method __init__ (line 12) | def __init__(self, cfg):
    method forward (line 38) | def forward(self, x):
  class VLMaskRCNNC4Predictor (line 43) | class VLMaskRCNNC4Predictor(nn.Module):
    method __init__ (line 44) | def __init__(self, cfg):
    method forward (line 75) | def forward(self, x, language_dict_features):
  function make_roi_mask_predictor (line 109) | def make_roi_mask_predictor(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/__init__.py
  function build_rpn (line 19) | def build_rpn(cfg):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/anchor_generator.py
  class BufferList (line 12) | class BufferList(nn.Module):
    method __init__ (line 17) | def __init__(self, buffers=None):
    method extend (line 22) | def extend(self, buffers):
    method __len__ (line 28) | def __len__(self):
    method __iter__ (line 31) | def __iter__(self):
  class AnchorGenerator (line 35) | class AnchorGenerator(nn.Module):
    method __init__ (line 41) | def __init__(
    method num_anchors_per_location (line 70) | def num_anchors_per_location(self):
    method grid_anchors (line 73) | def grid_anchors(self, grid_sizes):
    method add_visibility_to (line 97) | def add_visibility_to(self, boxlist):
    method forward (line 112) | def forward(self, image_list, feature_maps):
  function make_anchor_generator (line 139) | def make_anchor_generator(config):
  function make_anchor_generator_complex (line 157) | def make_anchor_generator_complex(config):
  class CenterAnchorGenerator (line 184) | class CenterAnchorGenerator(nn.Module):
    method __init__ (line 190) | def __init__(
    method add_visibility_to (line 208) | def add_visibility_to(self, boxlist):
    method forward (line 223) | def forward(self, centers, image_sizes, feature_maps):
  function make_center_anchor_generator (line 276) | def make_center_anchor_generator(config):
  function generate_anchors (line 356) | def generate_anchors(
  function _generate_anchors (line 370) | def _generate_anchors(base_size, scales, aspect_ratios):
  function _whctrs (line 382) | def _whctrs(anchor):
  function _mkanchors (line 391) | def _mkanchors(ws, hs, x_ctr, y_ctr):
  function _ratio_enum (line 408) | def _ratio_enum(anchor, ratios):
  function _scale_enum (line 419) | def _scale_enum(anchor, scales):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/atss.py
  class BoxCoder (line 14) | class BoxCoder(object):
    method __init__ (line 16) | def __init__(self, cfg):
    method encode (line 19) | def encode(self, gt_boxes, anchors):
    method decode (line 41) | def decode(self, preds, anchors):
  class ATSSHead (line 75) | class ATSSHead(torch.nn.Module):
    method __init__ (line 76) | def __init__(self, cfg):
    method forward (line 171) | def forward(self, x):
  class ATSSModule (line 188) | class ATSSModule(torch.nn.Module):
    method __init__ (line 190) | def __init__(self, cfg):
    method forward (line 200) | def forward(self, images, features, targets=None):
    method _forward_train (line 209) | def _forward_train(self, box_cls, box_regression, centerness, targets,...
    method _forward_test (line 231) | def _forward_test(self, box_cls, box_regression, centerness, anchors):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/dyhead.py
  class h_sigmoid (line 16) | class h_sigmoid(nn.Module):
    method __init__ (line 17) | def __init__(self, inplace=True, h_max=1):
    method forward (line 22) | def forward(self, x):
  class BoxCoder (line 26) | class BoxCoder(object):
    method __init__ (line 28) | def __init__(self, cfg):
    method encode (line 31) | def encode(self, gt_boxes, anchors):
    method decode (line 52) | def decode(self, preds, anchors):
  class Conv3x3Norm (line 85) | class Conv3x3Norm(torch.nn.Module):
    method __init__ (line 86) | def __init__(self,
    method forward (line 122) | def forward(self, input, **kwargs):
  class DyConv (line 129) | class DyConv(torch.nn.Module):
    method __init__ (line 130) | def __init__(self,
    method init_weights (line 166) | def init_weights(self):
    method forward (line 179) | def forward(self, x):
  class DyHead (line 217) | class DyHead(torch.nn.Module):
    method __init__ (line 218) | def __init__(self, cfg):
    method extract_feature (line 286) | def extract_feature(self, x):
    method forward (line 293) | def forward(self, x):
  class DyHeadModule (line 327) | class DyHeadModule(torch.nn.Module):
    method __init__ (line 329) | def __init__(self, cfg):
    method forward (line 339) | def forward(self, images, features, targets=None):
    method _forward_train (line 348) | def _forward_train(self, box_cls, box_regression, centerness, targets,...
    method _forward_test (line 375) | def _forward_test(self, box_cls, box_regression, centerness, anchors):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/fcos.py
  class FCOSHead (line 14) | class FCOSHead(torch.nn.Module):
    method __init__ (line 15) | def __init__(self, cfg):
    method forward (line 100) | def forward(self, x):
  class FCOSModule (line 126) | class FCOSModule(torch.nn.Module):
    method __init__ (line 132) | def __init__(self, cfg):
    method forward (line 152) | def forward(self, images, features, targets=None):
    method _forward_train (line 180) | def _forward_train(self, locations, box_cls, box_regression, centernes...
    method _forward_test (line 199) | def _forward_test(self, locations, box_cls, box_regression, centerness...
    method compute_locations (line 208) | def compute_locations(self, features):
    method compute_locations_per_level (line 219) | def compute_locations_per_level(self, h, w, stride, device):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/inference.py
  class RPNPostProcessor (line 16) | class RPNPostProcessor(torch.nn.Module):
    method __init__ (line 22) | def __init__(
    method add_gt_proposals (line 56) | def add_gt_proposals(self, proposals, targets):
    method forward_for_single_feature_map (line 79) | def forward_for_single_feature_map(self, anchors, objectness, box_regr...
    method forward (line 133) | def forward(self, anchors, objectness, box_regression, targets=None):
    method select_over_all_levels (line 162) | def select_over_all_levels(self, boxlists):
  function make_rpn_postprocessor (line 192) | def make_rpn_postprocessor(config, rpn_box_coder, is_train):
  class RetinaPostProcessor (line 217) | class RetinaPostProcessor(torch.nn.Module):
    method __init__ (line 223) | def __init__(
    method forward_for_single_feature_map (line 255) | def forward_for_single_feature_map(self, anchors, box_cls, box_regress...
    method select_over_all_levels (line 325) | def select_over_all_levels(self, boxlists):
    method forward (line 370) | def forward(self, anchors, objectness, box_regression, targets=None):
  function make_retina_postprocessor (line 394) | def make_retina_postprocessor(config, rpn_box_coder, is_train):
  class FCOSPostProcessor (line 414) | class FCOSPostProcessor(torch.nn.Module):
    method __init__ (line 420) | def __init__(
    method forward_for_single_feature_map (line 449) | def forward_for_single_feature_map(
    method forward (line 517) | def forward(self, locations, box_cls, box_regression, centerness, imag...
    method select_over_all_levels (line 547) | def select_over_all_levels(self, boxlists):
  function make_fcos_postprocessor (line 569) | def make_fcos_postprocessor(config, is_train=False):
  class ATSSPostProcessor (line 592) | class ATSSPostProcessor(torch.nn.Module):
    method __init__ (line 593) | def __init__(
    method forward_for_single_feature_map (line 620) | def forward_for_single_feature_map(self, box_regression, centerness, a...
    method forward (line 713) | def forward(self, box_regression, centerness, anchors,
    method select_over_all_levels (line 747) | def select_over_all_levels(self, boxlists):
  function convert_grounding_to_od_logits (line 771) | def convert_grounding_to_od_logits(logits, box_cls, positive_map, score_...
  function convert_grounding_to_od_logits_v2 (line 792) | def convert_grounding_to_od_logits_v2(logits, num_class, positive_map, s...
  function make_atss_postprocessor (line 825) | def make_atss_postprocessor(config, box_coder, is_train=False):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/loss.py
  class RPNLossComputation (line 28) | class RPNLossComputation(object):
    method __init__ (line 33) | def __init__(self, proposal_matcher, fg_bg_sampler, box_coder):
    method match_targets_to_anchors (line 45) | def match_targets_to_anchors(self, anchor, target):
    method prepare_targets (line 64) | def prepare_targets(self, anchors, targets):
    method __call__ (line 95) | def __call__(self, anchors, objectness, box_regression, targets):
  class FocalLossComputation (line 156) | class FocalLossComputation(object):
    method __init__ (line 161) | def __init__(self, proposal_matcher, box_coder,
    method match_targets_to_anchors (line 180) | def match_targets_to_anchors(self, anchor, target, copied_fields=[]):
    method prepare_targets (line 194) | def prepare_targets(self, anchors, targets):
    method __call__ (line 230) | def __call__(self, anchors, box_cls, box_regression, targets):
  class FCOSLossComputation (line 270) | class FCOSLossComputation(object):
    method __init__ (line 275) | def __init__(self, cfg):
    method get_sample_region (line 291) | def get_sample_region(self, gt, strides, num_points_per, gt_xs, gt_ys,...
    method prepare_targets (line 338) | def prepare_targets(self, points, targets):
    method compute_targets_for_locations (line 384) | def compute_targets_for_locations(self, locations, targets, object_siz...
    method compute_centerness_targets (line 443) | def compute_centerness_targets(self, reg_targets):
    method __call__ (line 451) | def __call__(self, locations, box_cls, box_regression, centerness, tar...
  class ATSSLossComputation (line 518) | class ATSSLossComputation(torch.nn.Module):
    method __init__ (line 520) | def __init__(self, cfg, box_coder):
    method NllSoftMaxLoss (line 582) | def NllSoftMaxLoss(self, logits, target):
    method ContrastiveAlignLoss (line 587) | def ContrastiveAlignLoss(self, logits, positive_map):
    method GIoULoss (line 610) | def GIoULoss(self, pred, target, anchor, weight=None):
    method prepare_targets (line 653) | def prepare_targets(self, targets, anchors, tokenized=None, positive_m...
    method compute_centerness_targets (line 832) | def compute_centerness_targets(self, reg_targets, anchors):
    method __call__ (line 848) | def __call__(self, box_cls, box_regression, centerness, targets, anchors,
  function generate_anchor_labels (line 1202) | def generate_anchor_labels(matched_targets):
  function make_focal_loss_evaluator (line 1207) | def make_focal_loss_evaluator(cfg, box_coder):
  function make_rpn_loss_evaluator (line 1229) | def make_rpn_loss_evaluator(cfg, box_coder):
  function make_fcos_loss_evaluator (line 1244) | def make_fcos_loss_evaluator(cfg):
  function make_atss_loss_evaluator (line 1249) | def make_atss_loss_evaluator(cfg, box_coder):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/modeling_bert.py
  function clamp_values (line 34) | def clamp_values(vector, min_val = -50000, max_val = 50000):
  class BertSelfAttention (line 39) | class BertSelfAttention(nn.Module):
    method __init__ (line 40) | def __init__(self, config, clamp_min_for_underflow=False, clamp_max_fo...
    method transpose_for_scores (line 66) | def transpose_for_scores(self, x):
    method forward (line 71) | def forward(
  class BertSelfOutput (line 179) | class BertSelfOutput(nn.Module):
    method __init__ (line 180) | def __init__(self, config):
    method forward (line 186) | def forward(self, hidden_states, input_tensor):
  class BertAttention (line 193) | class BertAttention(nn.Module):
    method __init__ (line 194) | def __init__(self, config, clamp_min_for_underflow=False, clamp_max_fo...
    method prune_heads (line 200) | def prune_heads(self, heads):
    method forward (line 218) | def forward(
  class BertIntermediate (line 242) | class BertIntermediate(nn.Module):
    method __init__ (line 243) | def __init__(self, config):
    method forward (line 251) | def forward(self, hidden_states):
  class BertOutput (line 259) | class BertOutput(nn.Module):
    method __init__ (line 260) | def __init__(self, config):
    method forward (line 266) | def forward(self, hidden_states, input_tensor):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/retina.py
  class RetinaNetHead (line 15) | class RetinaNetHead(torch.nn.Module):
    method __init__ (line 20) | def __init__(self, cfg):
    method forward (line 84) | def forward(self, x):
  class RetinaNetModule (line 93) | class RetinaNetModule(torch.nn.Module):
    method __init__ (line 99) | def __init__(self, cfg):
    method forward (line 118) | def forward(self, images, features, targets=None):
    method _forward_train (line 141) | def _forward_train(self, anchors, box_cls, box_regression, targets):
    method _forward_test (line 152) | def _forward_test(self, anchors, box_cls, box_regression):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/rpn.py
  class mRPNHead (line 14) | class mRPNHead(nn.Module):
    method __init__ (line 19) | def __init__(self, cfg, in_channels, num_anchors):
    method forward (line 36) | def forward(self, x):
  class RPNHead (line 47) | class RPNHead(nn.Module):
    method __init__ (line 52) | def __init__(self, cfg, in_channels, num_anchors):
    method forward (line 72) | def forward(self, x):
  class RPNModule (line 82) | class RPNModule(torch.nn.Module):
    method __init__ (line 88) | def __init__(self, cfg):
    method forward (line 114) | def forward(self, images, features, targets=None):
    method _forward_train (line 137) | def _forward_train(self, anchors, objectness, rpn_box_regression, targ...
    method _forward_test (line 160) | def _forward_test(self, anchors, objectness, rpn_box_regression):

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/transformer.py
  function _get_clones (line 9) | def _get_clones(module, N):
  function _get_activation_fn (line 13) | def _get_activation_fn(activation):
  class TransformerEncoderLayer (line 24) | class TransformerEncoderLayer(nn.Module):
    method __init__ (line 25) | def __init__(self, d_model, nhead, dim_feedforward=2048, dropout=0.1,
    method forward (line 42) | def forward(self, src,

FILE: GLIP/maskrcnn_benchmark/modeling/rpn/vldyhead.py
  class h_sigmoid (line 28) | class h_sigmoid(nn.Module):
    method __init__ (line 29) | def __init__(self, inplace=True, h_max=1):
    method forward (line 34) | def forward(self, x):
  class BoxCoder (line 38) | class BoxCoder(object):
    method __init__ (line 40) | def __init__(self, cfg):
    method encode (line 43) | def encode(self, gt_boxes, anchors):
    method decode (line 64) | def decode(self, preds, anchors):
  class Conv3x3Norm (line 97) | class Conv3x3Norm(torch.nn.Module):
    method __init__ (line 98) | def __init__(self,
    method forward (line 134) | def forward(self, input, **kwargs):
  class DyConv (line 141) | class DyConv(torch.nn.Module):
    method __init__ (line 142) | def __init__(self,
    method init_weights (line 178) | def init_weights(self):
    method forward (line 191) | def forward(self, inputs):
  class BertEncoderLayer (line 236) | class BertEncoderLayer(BertPreTrainedModel):
    method __init__ (line 237) | def __init__(self, config,  clamp_min_for_underflow = False, clamp_max...
    method forward (line 250) | def forward(self, inputs):
    method feed_forward_chunk (line 284) | def feed_forward_chunk(self, attention_output):
  class CLIPTransformerLayer (line 290) | class CLIPTransformerLayer(nn.Module):
    method __init__ (line 291) | def __init__(self, config):
    method _init_weights (line 310) | def _init_weights(self, m):
    method attention (line 318) | def attention(self, x: torch.Tensor, key_padding_mask: torch.Tensor = ...
    method forward (line 323) | def forward(self, inputs):
  class DummyLayer (line 342) | class DummyLayer(nn.Module):
    method __init__ (line 343) | def __init__(self):
    method forward (line 346) | def forward(self, inputs):
  class VLFuse (line 350) | class VLFuse(torch.nn.Module):
    method __init__ (line 355) | def __init__(self, cfg):
    method init_configs (line 423) | def init_configs(self, cfg):
    method forward (line 447) | def forward(self, x):
  class VLDyHead (line 560) | class VLDyHead(torch.nn.Module):
    method __init__ (line 561) | def __init__(self, cfg):
    method forward (line 731) | def forward(self, x, language_dict_features=None, embedding=None, swin...
  class VLDyHeadModule (line 862) | class VLDyHeadModule(torch.nn.Module):
    method __init__ (line 864) | def __init__(self, cfg):
    method forward (line 892) | def forward(self, images, features, targets=None,
    method _forward_train (line 950) | def _forward_train(self, box_cls, box_regression, centerness, targets,...
    method _forward_test (line 1022) | def _forward_test(self, box_regression, centerness, anchors,

FILE: GLIP/maskrcnn_benchmark/modeling/utils.py
  function cat (line 9) | def cat(tensors, dim=0):
  function permute_and_flatten (line 19) | def permute_and_flatten(layer, N, A, C, H, W):
  function concat_box_prediction_layers (line 26) | def concat_box_prediction_layers(box_regression, box_cls=None, token_log...
  function round_channels (line 75) | def round_channels(channels, divisor=8):

FILE: GLIP/maskrcnn_benchmark/solver/build.py
  function make_optimizer (line 8) | def make_optimizer(cfg, model):
  function make_lr_scheduler (line 58) | def make_lr_scheduler(cfg, optimizer):

FILE: GLIP/maskrcnn_benchmark/solver/lr_scheduler.py
  class WarmupMultiStepLR (line 11) | class WarmupMultiStepLR(torch.optim.lr_scheduler._LRScheduler):
    method __init__ (line 12) | def __init__(
    method get_lr (line 40) | def get_lr(self):
  class WarmupCosineAnnealingLR (line 56) | class WarmupCosineAnnealingLR(torch.optim.lr_scheduler._LRScheduler):
    method __init__ (line 57) | def __init__(
    method get_lr (line 82) | def get_lr(self):
  class WarmupReduceLROnPlateau (line 104) | class WarmupReduceLROnPlateau(torch.optim.lr_scheduler.ReduceLROnPlateau):
    method __init__ (line 105) | def __init__(
    method step (line 140) | def step(self, metrics=None):

FILE: GLIP/maskrcnn_benchmark/structures/bounding_box.py
  class BoxList (line 9) | class BoxList(object):
    method __init__ (line 19) | def __init__(self, bbox, image_size, mode="xyxy"):
    method _jit_unwrap (line 43) | def _jit_unwrap(self):
    method _jit_wrap (line 48) | def _jit_wrap(self, input_stream):
    method add_field (line 57) | def add_field(self, field, field_data):
    method get_field (line 60) | def get_field(self, field):
    method has_field (line 63) | def has_field(self, field):
    method fields (line 66) | def fields(self):
    method _copy_extra_fields (line 69) | def _copy_extra_fields(self, bbox):
    method convert (line 73) | def convert(self, mode):
    method _split_into_xyxy (line 94) | def _split_into_xyxy(self):
    method resize (line 110) | def resize(self, size, *args, **kwargs):
    method transpose (line 148) | def transpose(self, method):
    method crop (line 186) | def crop(self, box):
    method to (line 213) | def to(self, device):
    method __getitem__ (line 221) | def __getitem__(self, item):
    method __len__ (line 227) | def __len__(self):
    method clip_to_image (line 230) | def clip_to_image(self, remove_empty=True):
    method area (line 243) | def area(self):
    method copy_with_fields (line 256) | def copy_with_fields(self, fields):
    method __repr__ (line 264) | def __repr__(self):
    method concate_box_list (line 273) | def concate_box_list(list_of_boxes):
  function _onnx_clip_boxes_to_image (line 287) | def _onnx_clip_boxes_to_image(boxes, size):

FILE: GLIP/maskrcnn_benchmark/structures/boxlist_ops.py
  function boxlist_nms (line 10) | def boxlist_nms(boxlist, nms_thresh, max_proposals=-1, score_field="scor...
  function boxlist_ml_nms (line 35) | def boxlist_ml_nms(boxlist, nms_thresh, max_proposals=-1,
  function remove_small_boxes (line 78) | def remove_small_boxes(boxlist, min_size):
  function boxlist_iou (line 97) | def boxlist_iou(boxlist1, boxlist2):
  function _cat (line 136) | def _cat(tensors, dim=0):
  function cat_boxlist (line 148) | def cat_boxlist(bboxes):
  function getUnionBBox (line 177) | def getUnionBBox(aBB, bBB, margin = 10):

FILE: GLIP/maskrcnn_benchmark/structures/image_list.py
  class ImageList (line 7) | class ImageList(object):
    method __init__ (line 15) | def __init__(self, tensors, image_sizes):
    method to (line 24) | def to(self, *args, **kwargs):
  function to_image_list (line 29) | def to_image_list(tensors, size_divisible=0):

FILE: GLIP/maskrcnn_benchmark/structures/keypoint.py
  class Keypoints (line 9) | class Keypoints(object):
    method __init__ (line 10) | def __init__(self, keypoints, size, mode=None):
    method crop (line 27) | def crop(self, box):
    method resize (line 30) | def resize(self, size, *args, **kwargs):
    method transpose (line 41) | def transpose(self, method):
    method to (line 62) | def to(self, *args, **kwargs):
    method __getitem__ (line 70) | def __getitem__(self, item):
    method add_field (line 76) | def add_field(self, field, field_data):
    method get_field (line 79) | def get_field(self, field):
    method __repr__ (line 82) | def __repr__(self):
  class PersonKeypoints (line 90) | class PersonKeypoints(Keypoints):
    method __init__ (line 121) | def __init__(self, *args, **kwargs):
    method to_coco_format (line 133) | def to_coco_format(self):
    method _create_flip_indices (line 144) | def _create_flip_indices(self, names, flip_map):
    method _kp_connections (line 152) | def _kp_connections(self, keypoints):
  function keypoints_to_heat_map (line 178) | def keypoints_to_heat_map(keypoints, rois, heatmap_size):

FILE: GLIP/maskrcnn_benchmark/structures/segmentation_mask.py
  class Mask (line 11) | class Mask(object):
    method __init__ (line 18) | def __init__(self, masks, size, mode):
    method transpose (line 23) | def transpose(self, method):
    method crop (line 41) | def crop(self, box):
    method resize (line 47) | def resize(self, size, *args, **kwargs):
  class Polygons (line 51) | class Polygons(object):
    method __init__ (line 58) | def __init__(self, polygons, size, mode):
    method transpose (line 69) | def transpose(self, method):
    method crop (line 92) | def crop(self, box):
    method resize (line 108) | def resize(self, size, *args, **kwargs):
    method convert (line 125) | def convert(self, mode):
    method __repr__ (line 137) | def __repr__(self):
  class SegmentationMask (line 146) | class SegmentationMask(object):
    method __init__ (line 151) | def __init__(self, polygons, size, mode=None):
    method transpose (line 165) | def transpose(self, method):
    method crop (line 176) | def crop(self, box):
    method resize (line 183) | def resize(self, size, *args, **kwargs):
    method to (line 189) | def to(self, *args, **kwargs):
    method __getitem__ (line 192) | def __getitem__(self, item):
    method __iter__ (line 206) | def __iter__(self):
    method __repr__ (line 209) | def __repr__(self):

FILE: GLIP/maskrcnn_benchmark/utils/amp.py
  function nullcontext (line 4) | def nullcontext(enter_result=None, **kwargs):

FILE: GLIP/maskrcnn_benchmark/utils/big_model_loading.py
  function tf2th (line 8) | def tf2th(conv_weights):
  function _rename_conv_weights_for_deformable_conv_layers (line 15) | def _rename_conv_weights_for_deformable_conv_layers(state_dict, cfg):
  function load_big_format (line 47) | def load_big_format(cfg, f):

FILE: GLIP/maskrcnn_benchmark/utils/c2_model_loading.py
  function _rename_basic_resnet_weights (line 12) | def _rename_basic_resnet_weights(layer_keys):
  function _rename_fpn_weights (line 64) | def _rename_fpn_weights(layer_keys, stage_names):
  function _rename_weights_for_resnet (line 84) | def _rename_weights_for_resnet(weights, stage_names):
  function _load_c2_pickled_weights (line 135) | def _load_c2_pickled_weights(file_path):
  function _rename_conv_weights_for_deformable_conv_layers (line 148) | def _rename_conv_weights_for_deformable_conv_layers(state_dict, cfg):
  function load_resnet_c2_format (line 193) | def load_resnet_c2_format(cfg, f):
  function load_c2_format (line 206) | def load_c2_format(cfg, f):

FILE: GLIP/maskrcnn_benchmark/utils/checkpoint.py
  class Checkpointer (line 15) | class Checkpointer(object):
    method __init__ (line 16) | def __init__(
    method save (line 34) | def save(self, name, **kwargs):
    method load (line 59) | def load(self, f=None, force=False, keyword="model", skip_optimizer =F...
    method has_checkpoint (line 93) | def has_checkpoint(self):
    method get_checkpoint_file (line 97) | def get_checkpoint_file(self):
    method tag_last_checkpoint (line 109) | def tag_last_checkpoint(self, last_filename):
    method _load_file (line 114) | def _load_file(self, f):
    method _load_model (line 117) | def _load_model(self, checkpoint, keyword="model"):
  class DetectronCheckpointer (line 121) | class DetectronCheckpointer(Checkpointer):
    method __init__ (line 122) | def __init__(
    method _load_file (line 137) | def _load_file(self, f):

FILE: GLIP/maskrcnn_benchmark/utils/collect_env.py
  function get_pil_version (line 7) | def get_pil_version():
  function collect_env_info (line 11) | def collect_env_info():

FILE: GLIP/maskrcnn_benchmark/utils/comm.py
  function get_world_size (line 15) | def get_world_size():
  function get_rank (line 23) | def get_rank():
  function is_main_process (line 31) | def is_main_process():
  function synchronize (line 35) | def synchronize():
  function all_gather (line 50) | def all_gather(data):
  function reduce_dict (line 93) | def reduce_dict(input_dict, average=True):
  function broadcast_data (line 122) | def broadcast_data(data):
  function reduce_sum (line 137) | def reduce_sum(tensor):
  function shared_random_seed (line 146) | def shared_random_seed():

FILE: GLIP/maskrcnn_benchmark/utils/cv2_util.py
  function findContours (line 8) | def findContours(*args, **kwargs):

FILE: GLIP/maskrcnn_benchmark/utils/dist.py
  function _get_global_gloo_group (line 20) | def _get_global_gloo_group():
  function all_gather (line 32) | def all_gather(data):
  function reduce_dict (line 92) | def reduce_dict(input_dict, average=True):
  function setup_for_distributed (line 119) | def setup_for_distributed(is_master):
  function is_dist_avail_and_initialized (line 135) | def is_dist_avail_and_initialized():
  function get_world_size (line 147) | def get_world_size():
  function get_rank (line 157) | def get_rank():
  function get_local_rank (line 167) | def get_local_rank() -> int:
  function get_local_size (line 180) | def get_local_size() -> int:
  function is_main_process (line 193) | def is_main_process():
  function save_on_master (line 198) | def save_on_master(*args, **kwargs):
  function init_distributed_mode (line 204) | def init_distributed_mode(args):

FILE: GLIP/maskrcnn_benchmark/utils/ema.py
  class ModelEma (line 6) | class ModelEma:
    method __init__ (line 7) | def __init__(self, model, decay=0.9999, device=''):
    method load_checkpoint (line 18) | def load_checkpoint(self, checkpoint):
    method state_dict (line 33) | def state_dict(self):
    method update (line 36) | def update(self, model):

FILE: GLIP/maskrcnn_benchmark/utils/env.py
  function setup_environment (line 7) | def setup_environment():
  function setup_custom_environment (line 20) | def setup_custom_environment(custom_module_path):

FILE: GLIP/maskrcnn_benchmark/utils/flops.py
  function profile (line 15) | def profile(model, input_size, custom_ops={}, device="cpu", verbose=Fals...
  function count_conv2d (line 79) | def count_conv2d(m, x, y):
  function count_convtranspose2d (line 100) | def count_convtranspose2d(m, x, y):
  function count_bn (line 123) | def count_bn(m, x, y):
  function count_relu (line 131) | def count_relu(m, x, y):
  function count_softmax (line 138) | def count_softmax(m, x, y):
  function count_maxpool (line 148) | def count_maxpool(m, x, y):
  function count_adap_maxpool (line 155) | def count_adap_maxpool(m, x, y):
  function count_avgpool (line 163) | def count_avgpool(m, x, y):
  function count_adap_avgpool (line 172) | def count_adap_avgpool(m, x, y):
  function count_linear (line 182) | def count_linear(m, x, y):
  function count_LastLevelMaxPool (line 191) | def count_LastLevelMaxPool(m, x, y):
  function count_ROIAlign (line 197) | def count_ROIAlign(m, x, y):

FILE: GLIP/maskrcnn_benchmark/utils/fuse_helper.py
  class BertPredictionHeadTransform (line 10) | class BertPredictionHeadTransform(nn.Module):
    method __init__ (line 11) | def __init__(self, config):
    method forward (line 20) | def forward(self, hidden_states):
  class BertLMPredictionHead (line 27) | class BertLMPredictionHead(nn.Module):
    method __init__ (line 28) | def __init__(self, config):
    method forward (line 41) | def forward(self, hidden_states):
  class FeatureResizer (line 46) | class FeatureResizer(nn.Module):
    method __init__ (line 52) | def __init__(self, input_feat_size, output_feat_size, dropout, do_ln=T...
    method forward (line 60) | def forward(self, encoder_features):
  function _make_conv (line 68) | def _make_conv(input_dim, output_dim, k, stride=1):
  function _make_mlp (line 77) | def _make_mlp(input_dim, output_dim, drop):
  function _make_coord (line 87) | def _make_coord(batch, height, width):
  function l1norm (line 106) | def l1norm(X, dim, eps=1e-8):
  function l2norm (line 114) | def l2norm(X, dim, eps=1e-8):
  function func_attention (line 122) | def func_attention(query, context, smooth=1, raw_feature_norm="softmax",...
  class BiMultiHeadAttention (line 171) | class BiMultiHeadAttention(nn.Module):
    method __init__ (line 172) | def __init__(self, v_dim, l_dim, embed_dim, num_heads, dropout=0.1, cf...
    method _shape (line 201) | def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int):
    method _reset_parameters (line 204) | def _reset_parameters(self):
    method forward (line 218) | def forward(self, v, l, attention_mask_l=None):
  class BiAttentionBlock (line 307) | class BiAttentionBlock(nn.Module):
    method __init__ (line 308) | def __init__(self, v_dim, l_dim, embed_dim, num_heads, hidden_dim=None...
    method forward (line 335) | def forward(self, v, l, attention_mask_l=None, dummy_tensor=None):
  class BiAttentionBlockForCheckpoint (line 344) | class BiAttentionBlockForCheckpoint(nn.Module):
    method __init__ (line 345) | def __init__(self, v_dim, l_dim, embed_dim, num_heads, hidden_dim=None...
    method forward (line 377) | def forward(self, q0, q1, q2, q3, q4, l, attention_mask_l=None, dummy_...
    method single_attention_call (line 419) | def single_attention_call(self, v, l, attention_mask_l=None, dummy_ten...
  class MultiHeadAttention (line 430) | class MultiHeadAttention(nn.Module):
    method __init__ (line 435) | def __init__(self, q_dim, k_dim, embed_dim, num_heads, dropout=0.1,
    method _shape (line 460) | def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int):
    method _reset_parameters (line 463) | def _reset_parameters(self):
    method forward (line 473) | def forward(self, q, k, v, attention_mask=None, return_attention=False):
  class AttentionMLP (line 543) | class AttentionMLP(nn.Module):
    method __init__ (line 544) | def __init__(self, q_dim, hidden_dim, dropout=0.1):
    method forward (line 552) | def forward(self, hidden_states):
  class AttentionT2I (line 559) | class AttentionT2I(nn.Module):
    method __init__ (line 560) | def __init__(self, q_dim, k_dim, embed_dim, num_heads, hidden_dim=None...
    method forward (line 591) | def forward(self, q0, q1, q2, q3, q4, k, v, attention_mask, dummy_arg=...

FILE: GLIP/maskrcnn_benchmark/utils/imports.py
  function import_file (line 11) | def import_file(module_name, file_path, make_importable=False):
  function import_file (line 21) | def import_file(module_name, file_path, make_importable=None):

FILE: GLIP/maskrcnn_benchmark/utils/logger.py
  function setup_logger (line 7) | def setup_logger(name, save_dir, distributed_rank):

FILE: GLIP/maskrcnn_benchmark/utils/mdetr_dist.py
  function _get_global_gloo_group (line 21) | def _get_global_gloo_group():
  function all_gather (line 32) | def all_gather(data):
  function reduce_dict (line 92) | def reduce_dict(input_dict, average=True):
  function setup_for_distributed (line 119) | def setup_for_distributed(is_master):
  function is_dist_avail_and_initialized (line 135) | def is_dist_avail_and_initialized():
  function get_world_size (line 147) | def get_world_size():
  function get_rank (line 157) | def get_rank():
  function get_local_rank (line 167) | def get_local_rank() -> int:
  function get_local_size (line 180) | def get_local_size() -> int:
  function is_main_process (line 193) | def is_main_process():
  function save_on_master (line 198) | def save_on_master(*args, **kwargs):
  function init_distributed_mode (line 204) | def init_distributed_mode(args):

FILE: GLIP/maskrcnn_benchmark/utils/metric_logger.py
  class SmoothedValue (line 11) | class SmoothedValue(object):
    method __init__ (line 16) | def __init__(self, window_size=20):
    method update (line 22) | def update(self, value):
    method median (line 31) | def median(self):
    method avg (line 36) | def avg(self):
    method global_avg (line 41) | def global_avg(self):
  class AverageMeter (line 45) | class AverageMeter(object):
    method __init__ (line 48) | def __init__(self):
    method reset (line 51) | def reset(self):
    method update (line 57) | def update(self, val, n=1):
  class MetricLogger (line 64) | class MetricLogger(object):
    method __init__ (line 65) | def __init__(self, delimiter="\t"):
    method update (line 69) | def update(self, **kwargs):
    method __getattr__ (line 76) | def __getattr__(self, attr):
    method __str__ (line 84) | def __str__(self):
  class TensorboardLogger (line 94) | class TensorboardLogger(MetricLogger):
    method __init__ (line 95) | def __init__(self,
    method _get_tensorboard_writer (line 105) | def _get_tensorboard_writer(log_dir):
    method update (line 121) | def update(self, **kwargs):

FILE: GLIP/maskrcnn_benchmark/utils/miscellaneous.py
  function mkdir (line 6) | def mkdir(path):
  function save_config (line 14) | def save_config(cfg, path):

FILE: GLIP/maskrcnn_benchmark/utils/model_serialization.py
  function resize_2d (line 9) | def resize_2d(posemb, shape_new):
  function align_and_update_state_dicts (line 20) | def align_and_update_state_dicts(model_state_dict, loaded_state_dict, re...
  function strip_prefix_if_present (line 103) | def strip_prefix_if_present(state_dict, prefix):
  function load_state_dict (line 112) | def load_state_dict(model, loaded_state_dict):
  function _group_checkpoint_keys (line 123) | def _group_checkpoint_keys(keys):
  function _group_to_str (line 143) | def _group_to_str(group):

FILE: GLIP/maskrcnn_benchmark/utils/model_zoo.py
  function cache_url (line 20) | def cache_url(url, model_dir='model', progress=True):

FILE: GLIP/maskrcnn_benchmark/utils/pretrain_model_loading.py
  function _remove_bn_statics (line 7) | def _remove_bn_statics(state_dict):
  function _rename_conv_weights_for_deformable_conv_layers (line 17) | def _rename_conv_weights_for_deformable_conv_layers(state_dict, cfg):
  function load_pretrain_format (line 44) | def load_pretrain_format(cfg, f):

FILE: GLIP/maskrcnn_benchmark/utils/registry.py
  function _register_generic (line 4) | def _register_generic(module_dict, module_name, module):
  class Registry (line 9) | class Registry(dict):
    method __init__ (line 31) | def __init__(self, *args, **kwargs):
    method register (line 34) | def register(self, module_name, module=None):

FILE: GLIP/maskrcnn_benchmark/utils/shallow_contrastive_loss_helper.py
  function normalized_positive_map (line 5) | def normalized_positive_map(positive_map):
  function pad_tensor_given_dim_length (line 13) | def pad_tensor_given_dim_length(tensor, dim, length, padding_value=0, ba...
  function pad_random_negative_tensor_given_length (line 23) | def pad_random_negative_tensor_given_length(positive_tensor, negative_pa...
  function gather_tensors (line 28) | def gather_tensors(tensor):
  function convert_to_roi_format (line 53) | def convert_to_roi_format(boxes):

FILE: GLIP/maskrcnn_benchmark/utils/stats.py
  function get_model_complexity_info (line 18) | def get_model_complexity_info(model, input_res,
  function flops_to_string (line 58) | def flops_to_string(flops, units='GMac', precision=2):
  function params_to_string (line 79) | def params_to_string(params_num, units=None, precision=2):
  function accumulate_flops (line 96) | def accumulate_flops(self):
  function print_model_with_flops (line 106) | def print_model_with_flops(model, total_flops, total_params, units='GMac',
  function get_model_parameters_number (line 150) | def get_model_parameters_number(model):
  function add_flops_counting_methods (line 155) | def add_flops_counting_methods(net_main_module):
  function compute_average_flops_cost (line 169) | def compute_average_flops_cost(self):
  function start_flops_count (line 191) | def start_flops_count(self, **kwargs):
  function stop_flops_count (line 233) | def stop_flops_count(self):
  function reset_flops_count (line 246) | def reset_flops_count(self):
  function empty_flops_counter_hook (line 259) | def empty_flops_counter_hook(module, input, output):
  function upsample_flops_counter_hook (line 263) | def upsample_flops_counter_hook(module, input, output):
  function relu_flops_counter_hook (line 272) | def relu_flops_counter_hook(module, input, output):
  function linear_flops_counter_hook (line 277) | def linear_flops_counter_hook(module, input, output):
  function pool_flops_counter_hook (line 285) | def pool_flops_counter_hook(module, input, output):
  function bn_flops_counter_hook (line 290) | def bn_flops_counter_hook(module, input, output):
  function conv_flops_counter_hook (line 299) | def conv_flops_counter_hook(conv_module, input, output):
  function batch_counter_hook (line 330) | def batch_counter_hook(module, input, output):
  function rnn_flops (line 343) | def rnn_flops(flops, rnn_module, w_ih, w_hh, input_size):
  function rnn_flops_counter_hook (line 368) | def rnn_flops_counter_hook(rnn_module, input, output):
  function rnn_cell_flops_counter_hook (line 401) | def rnn_cell_flops_counter_hook(rnn_cell_module, input, output):
  function add_batch_counter_variables_or_reset (line 418) | def add_batch_counter_variables_or_reset(module):
  function add_batch_counter_hook_function (line 423) | def add_batch_counter_hook_function(module):
  function remove_batch_counter_hook_function (line 431) | def remove_batch_counter_hook_function(module):
  function add_flops_counter_variable_or_reset (line 437) | def add_flops_counter_variable_or_reset(module):
  function is_supported_instance (line 499) | def is_supported_instance(module):
  function remove_flops_counter_hook_function (line 506) | def remove_flops_counter_hook_function(module):

FILE: GLIP/setup.py
  function get_extensions (line 17) | def get_extensions():

FILE: GLIP/tools/cityscapes/convert_cityscapes_to_coco.py
  function parse_args (line 38) | def parse_args():
  function convert_coco_stuff_mat (line 53) | def convert_coco_stuff_mat(data_dir, out_dir):
  function getLabelID (line 94) | def getLabelID(self, instID):
  function convert_cityscapes_instance_only (line 101) | def convert_cityscapes_instance_only(

FILE: GLIP/tools/cityscapes/instances2dict_with_polygons.py
  function instances2dict_with_polygons (line 19) | def instances2dict_with_polygons(imageFileList, verbose=False):
  function main (line 72) | def main(argv):

FILE: GLIP/tools/eval_all.py
  function main (line 25) | def main():

FILE: GLIP/tools/finetune.py
  function removekey (line 35) | def removekey(d, prefix):
  function train (line 47) | def train(cfg, local_rank, distributed, zero_shot, skip_optimizer_resume...
  function test (line 177) | def test(cfg, model, distributed, verbose=False):
  function tuning_highlevel_override (line 215) | def tuning_highlevel_override(cfg,):
  function report_freeze_options (line 262) | def report_freeze_options(cfg):
  function main (line 271) | def main():

FILE: GLIP/tools/test_grounding_net.py
  function init_distributed_mode (line 31) | def init_distributed_mode(args):
  function setup_for_distributed (line 58) | def setup_for_distributed(is_master):
  function main (line 73) | def main():

FILE: GLIP/tools/test_net.py
  function run_test (line 22) | def run_test(cfg, model, distributed, log_dir):
  function main (line 57) | def main():

FILE: GLIP/tools/train_net.py
  function train (line 34) | def train(cfg, local_rank, distributed, use_tensorboard=False,):
  function setup_for_distributed (line 145) | def setup_for_distributed(is_master):
  function main (line 159) | def main():

FILE: GroundingDINO/demo/gradio_app.py
  function load_model_hf (line 42) | def load_model_hf(model_config_path, repo_id, filename, device='cpu'):
  function image_transform_grounding (line 54) | def image_transform_grounding(init_image):
  function image_transform_grounding_for_vis (line 63) | def image_transform_grounding_for_vis(init_image):
  function run_grounding (line 72) | def run_grounding(input_image, grounding_caption, box_threshold, text_th...

FILE: GroundingDINO/demo/inference_on_a_image.py
  function plot_boxes_to_image (line 16) | def plot_boxes_to_image(image_pil, tgt):
  function load_image (line 57) | def load_image(image_path):
  function load_model (line 72) | def load_model(model_config_path, model_checkpoint_path, cpu_only=False):
  function get_grounding_output (line 83) | def get_grounding_output(model, image, caption, box_threshold, text_thre...

FILE: GroundingDINO/groundingdino/datasets/transforms.py
  function crop (line 17) | def crop(image, target, region):
  function hflip (line 68) | def hflip(image, target):
  function resize (line 87) | def resize(image, target, size, max_size=None):
  function pad (line 149) | def pad(image, target, padding):
  class ResizeDebug (line 162) | class ResizeDebug(object):
    method __init__ (line 163) | def __init__(self, size):
    method __call__ (line 166) | def __call__(self, img, target):
  class RandomCrop (line 170) | class RandomCrop(object):
    method __init__ (line 171) | def __init__(self, size):
    method __call__ (line 174) | def __call__(self, img, target):
  class RandomSizeCrop (line 179) | class RandomSizeCrop(object):
    method __init__ (line 180) | def __init__(self, min_size: int, max_size: int, respect_boxes: bool =...
    method __call__ (line 187) | def __call__(self, img: PIL.Image.Image, target: dict):
  class CenterCrop (line 204) | class CenterCrop(object):
    method __init__ (line 205) | def __init__(self, size):
    method __call__ (line 208) | def __call__(self, img, target):
  class RandomHorizontalFlip (line 216) | class RandomHorizontalFlip(object):
    method __init__ (line 217) | def __init__(self, p=0.5):
    method __call__ (line 220) | def __call__(self, img, target):
  class RandomResize (line 226) | class RandomResize(object):
    method __init__ (line 227) | def __init__(self, sizes, max_size=None):
    method __call__ (line 232) | def __call__(self, img, target=None):
  class RandomPad (line 237) | class RandomPad(object):
    method __init__ (line 238) | def __init__(self, max_pad):
    method __call__ (line 241) | def __call__(self, img, target):
  class RandomSelect (line 247) | class RandomSelect(object):
    method __init__ (line 253) | def __init__(self, transforms1, transforms2, p=0.5):
    method __call__ (line 258) | def __call__(self, img, target):
  class ToTensor (line 264) | class ToTensor(object):
    method __call__ (line 265) | def __call__(self, img, target):
  class RandomErasing (line 269) | class RandomErasing(object):
    method __init__ (line 270) | def __init__(self, *args, **kwargs):
    method __call__ (line 273) | def __call__(self, img, target):
  class Normalize (line 277) | class Normalize(object):
    method __init__ (line 278) | def __init__(self, mean, std):
    method __call__ (line 282) | def __call__(self, image, target=None):
  class Compose (line 296) | class Compose(object):
    method __init__ (line 297) | def __init__(self, transforms):
    method __call__ (line 300) | def __call__(self, image, target):
    method __repr__ (line 305) | def __repr__(self):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/backbone/backbone.py
  class FrozenBatchNorm2d (line 33) | class FrozenBatchNorm2d(torch.nn.Module):
    method __init__ (line 42) | def __init__(self, n):
    method _load_from_state_dict (line 49) | def _load_from_state_dict(
    method forward (line 60) | def forward(self, x):
  class BackboneBase (line 73) | class BackboneBase(nn.Module):
    method __init__ (line 74) | def __init__(
    method forward (line 107) | def forward(self, tensor_list: NestedTensor):
  class Backbone (line 119) | class Backbone(BackboneBase):
    method __init__ (line 122) | def __init__(
  class Joiner (line 146) | class Joiner(nn.Sequential):
    method __init__ (line 147) | def __init__(self, backbone, position_embedding):
    method forward (line 150) | def forward(self, tensor_list: NestedTensor):
  function build_backbone (line 162) | def build_backbone(args):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/backbone/position_encoding.py
  class PositionEmbeddingSine (line 30) | class PositionEmbeddingSine(nn.Module):
    method __init__ (line 36) | def __init__(self, num_pos_feats=64, temperature=10000, normalize=Fals...
    method forward (line 47) | def forward(self, tensor_list: NestedTensor):
  class PositionEmbeddingSineHW (line 78) | class PositionEmbeddingSineHW(nn.Module):
    method __init__ (line 84) | def __init__(
    method forward (line 98) | def forward(self, tensor_list: NestedTensor):
  class PositionEmbeddingLearned (line 134) | class PositionEmbeddingLearned(nn.Module):
    method __init__ (line 139) | def __init__(self, num_pos_feats=256):
    method reset_parameters (line 145) | def reset_parameters(self):
    method forward (line 149) | def forward(self, tensor_list: NestedTensor):
  function build_position_encoding (line 171) | def build_position_encoding(args):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/backbone/swin_transformer.py
  class Mlp (line 24) | class Mlp(nn.Module):
    method __init__ (line 27) | def __init__(
    method forward (line 38) | def forward(self, x):
  function window_partition (line 47) | def window_partition(x, window_size):
  function window_reverse (line 61) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 77) | class WindowAttention(nn.Module):
    method __init__ (line 90) | def __init__(
    method forward (line 134) | def forward(self, x, mask=None):
  class SwinTransformerBlock (line 177) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 194) | def __init__(
    method forward (line 238) | def forward(self, x, mask_matrix):
  class PatchMerging (line 301) | class PatchMerging(nn.Module):
    method __init__ (line 308) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 314) | def forward(self, x, H, W):
  class BasicLayer (line 343) | class BasicLayer(nn.Module):
    method __init__ (line 361) | def __init__(
    method forward (line 409) | def forward(self, x, H, W):
  class PatchEmbed (line 459) | class PatchEmbed(nn.Module):
    method __init__ (line 468) | def __init__(self, patch_size=4, in_chans=3, embed_dim=96, norm_layer=...
    method forward (line 482) | def forward(self, x):
  class SwinTransformer (line 501) | class SwinTransformer(nn.Module):
    method __init__ (line 530) | def __init__(
    method _freeze_stages (line 636) | def _freeze_stages(self):
    method forward_raw (line 678) | def forward_raw(self, x):
    method forward (line 712) | def forward(self, tensor_list: NestedTensor):
    method train (line 756) | def train(self, mode=True):
  function build_swin_transformer (line 762) | def build_swin_transformer(modelname, pretrain_img_size, **kw):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/bertwarper.py
  class BertModelWarper (line 17) | class BertModelWarper(nn.Module):
    method __init__ (line 18) | def __init__(self, bert_model):
    method forward (line 31) | def forward(
  class TextEncoderShell (line 169) | class TextEncoderShell(nn.Module):
    method __init__ (line 170) | def __init__(self, text_encoder):
    method forward (line 175) | def forward(self, **kw):
  function generate_masks_with_special_tokens (line 180) | def generate_masks_with_special_tokens(tokenized, special_tokens_list, t...
  function generate_masks_with_special_tokens_and_transfer_map (line 224) | def generate_masks_with_special_tokens_and_transfer_map(tokenized, speci...

FILE: GroundingDINO/groundingdino/models/GroundingDINO/csrc/MsDeformAttn/ms_deform_attn.h
  function namespace (line 19) | namespace groundingdino {

FILE: GroundingDINO/groundingdino/models/GroundingDINO/csrc/MsDeformAttn/ms_deform_attn_cpu.cpp
  type groundingdino (line 16) | namespace groundingdino {
    function ms_deform_attn_cpu_forward (line 18) | at::Tensor
    function ms_deform_attn_cpu_backward (line 30) | std::vector<at::Tensor>

FILE: GroundingDINO/groundingdino/models/GroundingDINO/csrc/MsDeformAttn/ms_deform_attn_cpu.h
  function namespace (line 14) | namespace groundingdino {

FILE: GroundingDINO/groundingdino/models/GroundingDINO/csrc/MsDeformAttn/ms_deform_attn_cuda.h
  function namespace (line 14) | namespace groundingdino {

FILE: GroundingDINO/groundingdino/models/GroundingDINO/csrc/vision.cpp
  type groundingdino (line 5) | namespace groundingdino {
    function get_cuda_version (line 11) | std::string get_cuda_version() {
    function get_compiler_version (line 32) | std::string get_compiler_version() {
    function PYBIND11_MODULE (line 53) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: GroundingDINO/groundingdino/models/GroundingDINO/fuse_modules.py
  class FeatureResizer (line 14) | class FeatureResizer(nn.Module):
    method __init__ (line 20) | def __init__(self, input_feat_size, output_feat_size, dropout, do_ln=T...
    method forward (line 28) | def forward(self, encoder_features):
  function l1norm (line 36) | def l1norm(X, dim, eps=1e-8):
  function l2norm (line 43) | def l2norm(X, dim, eps=1e-8):
  function func_attention (line 50) | def func_attention(query, context, smooth=1, raw_feature_norm="softmax",...
  class BiMultiHeadAttention (line 99) | class BiMultiHeadAttention(nn.Module):
    method __init__ (line 100) | def __init__(self, v_dim, l_dim, embed_dim, num_heads, dropout=0.1, cf...
    method _shape (line 129) | def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int):
    method _reset_parameters (line 132) | def _reset_parameters(self):
    method forward (line 146) | def forward(self, v, l, attention_mask_v=None, attention_mask_l=None):
  class BiAttentionBlock (line 252) | class BiAttentionBlock(nn.Module):
    method __init__ (line 253) | def __init__(
    method forward (line 286) | def forward(self, v, l, attention_mask_v=None, attention_mask_l=None):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/groundingdino.py
  class GroundingDINO (line 51) | class GroundingDINO(nn.Module):
    method __init__ (line 54) | def __init__(
    method _reset_parameters (line 204) | def _reset_parameters(self):
    method init_ref_points (line 210) | def init_ref_points(self, use_num_queries):
    method forward (line 213) | def forward(self, samples: NestedTensor, targets: List = None, **kw):
    method _set_aux_loss (line 353) | def _set_aux_loss(self, outputs_class, outputs_coord):
  function build_groundingdino (line 364) | def build_groundingdino(args):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py
  function _is_power_of_2 (line 35) | def _is_power_of_2(n):
  class MultiScaleDeformableAttnFunction (line 41) | class MultiScaleDeformableAttnFunction(Function):
    method forward (line 43) | def forward(
    method backward (line 72) | def backward(ctx, grad_output):
  function multi_scale_deformable_attn_pytorch (line 93) | def multi_scale_deformable_attn_pytorch(
  class MultiScaleDeformableAttention (line 136) | class MultiScaleDeformableAttention(nn.Module):
    method __init__ (line 154) | def __init__(
    method _reset_parameters (line 194) | def _reset_parameters(self):
    method init_weights (line 197) | def init_weights(self):
    method freeze_sampling_offsets (line 222) | def freeze_sampling_offsets(self):
    method freeze_attention_weights (line 227) | def freeze_attention_weights(self):
    method forward (line 232) | def forward(
  function create_dummy_class (line 362) | def create_dummy_class(klass, dependency, message=""):
  function create_dummy_func (line 391) | def create_dummy_func(func, dependency, message=""):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/transformer.py
  class Transformer (line 40) | class Transformer(nn.Module):
    method __init__ (line 41) | def __init__(
    method _reset_parameters (line 189) | def _reset_parameters(self):
    method get_valid_ratio (line 199) | def get_valid_ratio(self, mask):
    method init_ref_points (line 208) | def init_ref_points(self, use_num_queries):
    method forward (line 211) | def forward(self, srcs, masks, refpoint_embed, pos_embeds, tgt, attn_m...
  class TransformerEncoder (line 406) | class TransformerEncoder(nn.Module):
    method __init__ (line 407) | def __init__(
    method get_reference_points (line 466) | def get_reference_points(spatial_shapes, valid_ratios, device):
    method forward (line 482) | def forward(
  class TransformerDecoder (line 599) | class TransformerDecoder(nn.Module):
    method __init__ (line 600) | def __init__(
    method forward (line 634) | def forward(
  class DeformableTransformerEncoderLayer (line 739) | class DeformableTransformerEncoderLayer(nn.Module):
    method __init__ (line 740) | def __init__(
    method with_pos_embed (line 772) | def with_pos_embed(tensor, pos):
    method forward_ffn (line 775) | def forward_ffn(self, src):
    method forward (line 781) | def forward(
  class DeformableTransformerDecoderLayer (line 803) | class DeformableTransformerDecoderLayer(nn.Module):
    method __init__ (line 804) | def __init__(
    method rm_self_attn_modules (line 853) | def rm_self_attn_modules(self):
    method with_pos_embed (line 859) | def with_pos_embed(tensor, pos):
    method forward_ffn (line 862) | def forward_ffn(self, tgt):
    method forward (line 869) | def forward(
  function build_transformer (line 931) | def build_transformer(args):

FILE: GroundingDINO/groundingdino/models/GroundingDINO/transformer_vanilla.py
  class TextTransformer (line 33) | class TextTransformer(nn.Module):
    method __init__ (line 34) | def __init__(self, num_layers, d_model=256, nheads=8, dim_feedforward=...
    method forward (line 47) | def forward(self, memory_text: torch.Tensor, text_attention_mask: torc...
  class TransformerEncoderLayer (line 72) | class TransformerEncoderLayer(nn.Module):
    method __init__ (line 73) | def __init__(
    method with_pos_embed (line 98) | def with_pos_embed(self, tensor, pos: Optional[Tensor]):
    method forward (line 101) | def forward(

FILE: GroundingDINO/groundingdino/models/GroundingDINO/utils.py
  function _get_clones (line 16) | def _get_clones(module, N, layer_share=False):
  function get_sine_pos_embed (line 24) | def get_sine_pos_embed(
  function gen_encoder_output_proposals (line 56) | def gen_encoder_output_proposals(
  class RandomBoxPerturber (line 120) | class RandomBoxPerturber:
    method __init__ (line 121) | def __init__(
    method __call__ (line 128) | def __call__(self, refanchors: Tensor) -> Tensor:
  function sigmoid_focal_loss (line 139) | def sigmoid_focal_loss(
  class MLP (line 172) | class MLP(nn.Module):
    method __init__ (line 175) | def __init__(self, input_dim, hidden_dim, output_dim, num_layers):
    method forward (line 183) | def forward(self, x):
  function _get_activation_fn (line 189) | def _get_activation_fn(activation, d_model=256, batch_dim=0):
  function gen_sineembed_for_position (line 205) | def gen_sineembed_for_position(pos_tensor):
  class ContrastiveEmbed (line 235) | class ContrastiveEmbed(nn.Module):
    method __init__ (line 236) | def __init__(self, max_text_len=256):
    method forward (line 244) | def forward(self, x, text_dict):

FILE: GroundingDINO/groundingdino/models/__init__.py
  function build_model (line 11) | def build_model(args):

FILE: GroundingDINO/groundingdino/models/registry.py
  class Registry (line 18) | class Registry(object):
    method __init__ (line 19) | def __init__(self, name):
    method __repr__ (line 23) | def __repr__(self):
    method __len__ (line 29) | def __len__(self):
    method name (line 33) | def name(self):
    method module_dict (line 37) | def module_dict(self):
    method get (line 40) | def get(self, key):
    method registe_with_name (line 43) | def registe_with_name(self, module_name=None, force=False):
    method register (line 46) | def register(self, module_build_function, module_name=None, force=False):

FILE: GroundingDINO/groundingdino/util/box_ops.py
  function box_cxcywh_to_xyxy (line 9) | def box_cxcywh_to_xyxy(x):
  function box_xyxy_to_cxcywh (line 15) | def box_xyxy_to_cxcywh(x):
  function box_iou (line 22) | def box_iou(boxes1, boxes2):
  function generalized_box_iou (line 39) | def generalized_box_iou(boxes1, boxes2):
  function box_iou_pairwise (line 66) | def box_iou_pairwise(boxes1, boxes2):
  function generalized_box_iou_pairwise (line 82) | def generalized_box_iou_pairwise(boxes1, boxes2):
  function masks_to_boxes (line 107) | def masks_to_boxes(masks):

FILE: GroundingDINO/groundingdino/util/get_tokenlizer.py
  function get_tokenlizer (line 4) | def get_tokenlizer(text_encoder_type, bert_base_uncased_path):
  function get_pretrained_language_model (line 27) | def get_pretrained_language_model(text_encoder_type, bert_base_uncased_p...
  function is_bert_model_use_local_path (line 36) | def is_bert_model_use_local_path(bert_base_uncased_path):

FILE: GroundingDINO/groundingdino/util/inference.py
  function preprocess_caption (line 22) | def preprocess_caption(caption: str) -> str:
  function load_model (line 29) | def load_model(model_config_path: str, model_checkpoint_path: str, devic...
  function load_image (line 39) | def load_image(image_path: str) -> Tuple[np.array, torch.Tensor]:
  function predict (line 53) | def predict(
  function annotate (line 88) | def annotate(image_source: np.ndarray, boxes: torch.Tensor, logits: torc...
  class Model (line 111) | class Model:
    method __init__ (line 113) | def __init__(
    method predict_with_caption (line 126) | def predict_with_caption(
    method predict_with_classes (line 167) | def predict_with_classes(
    method preprocess_image (line 213) | def preprocess_image(image_bgr: np.ndarray) -> torch.Tensor:
    method post_process_result (line 226) | def post_process_result(
    method phrases2classes (line 238) | def phrases2classes(phrases: List[str], classes: List[str]) -> np.ndar...
    method find_index (line 249) | def find_index(string, lst):

FILE: GroundingDINO/groundingdino/util/logger.py
  class _ColorfulFormatter (line 10) | class _ColorfulFormatter(logging.Formatter):
    method __init__ (line 11) | def __init__(self, *args, **kwargs):
    method formatMessage (line 18) | def formatMessage(self, record):
  function setup_logger (line 32) | def setup_logger(output=None, distributed_rank=0, *, color=True, name="i...
  function _cached_log_stream (line 92) | def _cached_log_stream(filename):

FILE: GroundingDINO/groundingdino/util/misc.py
  class SmoothedValue (line 33) | class SmoothedValue(object):
    method __init__ (line 38) | def __init__(self, window_size=20, fmt=None):
    method update (line 46) | def update(self, value, n=1):
    method synchronize_between_processes (line 51) | def synchronize_between_processes(self):
    method median (line 65) | def median(self):
    method avg (line 72) | def avg(self):
    method global_avg (line 77) | def global_avg(self):
    method max (line 85) | def max(self):
    method value (line 89) | def value(self):
    method __str__ (line 92) | def __str__(self):
  function _get_global_gloo_group (line 103) | def _get_global_gloo_group():
  function all_gather_cpu (line 115) | def all_gather_cpu(data):
  function all_gather (line 173) | def all_gather(data):
  function reduce_dict (line 220) | def reduce_dict(input_dict, average=True):
  class MetricLogger (line 247) | class MetricLogger(object):
    method __init__ (line 248) | def __init__(self, delimiter="\t"):
    method update (line 252) | def update(self, **kwargs):
    method __getattr__ (line 259) | def __getattr__(self, attr):
    method __str__ (line 266) | def __str__(self):
    method synchronize_between_processes (line 275) | def synchronize_between_processes(self):
    method add_meter (line 279) | def add_meter(self, name, meter):
    method log_every (line 282) | def log_every(self, iterable, print_freq, header=None, logger=None):
  function get_sha (line 362) | def get_sha():
  function collate_fn (line 383) | def collate_fn(batch):
  function _max_by_axis (line 390) | def _max_by_axis(the_list):
  class NestedTensor (line 399) | class NestedTensor(object):
    method __init__ (line 400) | def __init__(self, tensors, mask: Optional[Tensor]):
    method imgsize (line 416) | def imgsize(self):
    method to (line 425) | def to(self, device):
    method to_img_list_single (line 436) | def to_img_list_single(self, tensor, mask):
    method to_img_list (line 443) | def to_img_list(self):
    method device (line 460) | def device(self):
    method decompose (line 463) | def decompose(self):
    method __repr__ (line 466) | def __repr__(self):
    method shape (line 470) | def shape(self):
  function nested_tensor_from_tensor_list (line 474) | def nested_tensor_from_tensor_list(tensor_list: List[Tensor]):
  function _onnx_nested_tensor_from_tensor_list (line 502) | def _onnx_nested_tensor_from_tensor_list(tensor_list: List[Tensor]) -> N...
  function setup_for_distributed (line 532) | def setup_for_distributed(is_master):
  function is_dist_avail_and_initialized (line 548) | def is_dist_avail_and_initialized():
  function get_world_size (line 556) | def get_world_size():
  function get_rank (line 562) | def get_rank():
  function is_main_process (line 568) | def is_main_process():
  function save_on_master (line 572) | def save_on_master(*args, **kwargs):
  function init_distributed_mode (line 577) | def init_distributed_mode(args):
  function accuracy (line 638) | def accuracy(output, target, topk=(1,)):
  function accuracy_onehot (line 657) | def accuracy_onehot(pred, gt):
  function interpolate (line 669) | def interpolate(input, size=None, scale_factor=None, mode="nearest", ali...
  class color_sys (line 687) | class color_sys:
    method __init__ (line 688) | def __init__(self, num_colors) -> None:
    method __call__ (line 700) | def __call__(self, idx):
  function inverse_sigmoid (line 704) | def inverse_sigmoid(x, eps=1e-3):
  function clean_state_dict (line 711) | def clean_state_dict(state_dict):

FILE: GroundingDINO/groundingdino/util/slconfig.py
  function check_file_exist (line 21) | def check_file_exist(filename, msg_tmpl='file "{}" does not exist'):
  class ConfigDict (line 26) | class ConfigDict(Dict):
    method __missing__ (line 27) | def __missing__(self, name):
    method __getattr__ (line 30) | def __getattr__(self, name):
  class SLConfig (line 42) | class SLConfig(object):
    method _validate_py_syntax (line 68) | def _validate_py_syntax(filename):
    method _file2dict (line 77) | def _file2dict(filename):
    method _merge_a_into_b (line 140) | def _merge_a_into_b(a, b):
    method fromfile (line 184) | def fromfile(filename):
    method __init__ (line 188) | def __init__(self, cfg_dict=None, cfg_text=None, filename=None):
    method filename (line 209) | def filename(self):
    method text (line 213) | def text(self):
    method pretty_text (line 217) | def pretty_text(self):
    method __repr__ (line 310) | def __repr__(self):
    method __len__ (line 313) | def __len__(self):
    method __getattr__ (line 316) | def __getattr__(self, name):
    method __getitem__ (line 329) | def __getitem__(self, name):
    method __setattr__ (line 332) | def __setattr__(self, name, value):
    method __setitem__ (line 337) | def __setitem__(self, name, value):
    method __iter__ (line 342) | def __iter__(self):
    method dump (line 345) | def dump(self, file=None):
    method merge_from_dict (line 353) | def merge_from_dict(self, options):
    method __setstate__ (line 386) | def __setstate__(self, state):
    method copy (line 389) | def copy(self):
    method deepcopy (line 392) | def deepcopy(self):
  class DictAction (line 396) | class DictAction(Action):
    method _parse_int_float_bool (line 404) | def _parse_int_float_bool(val):
    method __call__ (line 419) | def __call__(self, parser, namespace, values, option_string=None):

FILE: GroundingDINO/groundingdino/util/slio.py
  class BaseFileHandler (line 23) | class BaseFileHandler(metaclass=ABCMeta):
    method load_from_fileobj (line 25) | def load_from_fileobj(self, file, **kwargs):
    method dump_to_fileobj (line 29) | def dump_to_fileobj(self, obj, file, **kwargs):
    method dump_to_str (line 33) | def dump_to_str(self, obj, **kwargs):
    method load_from_path (line 36) | def load_from_path(self, filepath, mode="r", **kwargs):
    method dump_to_path (line 40) | def dump_to_path(self, obj, filepath, mode="w", **kwargs):
  class JsonHandler (line 45) | class JsonHandler(BaseFileHandler):
    method load_from_fileobj (line 46) | def load_from_fileobj(self, file):
    method dump_to_fileobj (line 49) | def dump_to_fileobj(self, obj, file, **kwargs):
    method dump_to_str (line 52) | def dump_to_str(self, obj, **kwargs):
  class PickleHandler (line 56) | class PickleHandler(BaseFileHandler):
    method load_from_fileobj (line 57) | def load_from_fileobj(self, file, **kwargs):
    method load_from_path (line 60) | def load_from_path(self, filepath, **kwargs):
    method dump_to_str (line 63) | def dump_to_str(self, obj, **kwargs):
    method dump_to_fileobj (line 67) | def dump_to_fileobj(self, obj, file, **kwargs):
    method dump_to_path (line 71) | def dump_to_path(self, obj, filepath, **kwargs):
  class YamlHandler (line 75) | class YamlHandler(BaseFileHandler):
    method load_from_fileobj (line 76) | def load_from_fileobj(self, file, **kwargs):
    method dump_to_fileobj (line 80) | def dump_to_fileobj(self, obj, file, **kwargs):
    method dump_to_str (line 84) | def dump_to_str(self, obj, **kwargs):
  function is_str (line 102) | def is_str(x):
  function slload (line 110) | def slload(file, file_format=None, **kwargs):
  function sldump (line 143) | def sldump(obj, file=None, file_format=None, **kwargs):

FILE: GroundingDINO/groundingdino/util/time_counter.py
  class TimeCounter (line 5) | class TimeCounter:
    method __init__ (line 6) | def __init__(self) -> None:
    method clear (line 9) | def clear(self):
    method timeit (line 13) | def timeit(self, name):
  class TimeHolder (line 19) | class TimeHolder:
    method __init__ (line 20) | def __init__(self) -> None:
    method update (line 23) | def update(self, _timedict: dict):
    method final_res (line 29) | def final_res(self):
    method __str__ (line 32) | def __str__(self):
  class AverageMeter (line 36) | class AverageMeter(object):
    method __init__ (line 39) | def __init__(self, name, fmt=":f", val_only=False):
    method reset (line 45) | def reset(self):
    method update (line 51) | def update(self, val, n=1):
    method __str__ (line 57) | def __str__(self):

FILE: GroundingDINO/groundingdino/util/utils.py
  function slprint (line 15) | def slprint(x, name="x"):
  function clean_state_dict (line 29) | def clean_state_dict(state_dict):
  function renorm (line 38) | def renorm(
  class CocoClassMapper (line 66) | class CocoClassMapper:
    method __init__ (line 67) | def __init__(self) -> None:
    method origin2compact (line 153) | def origin2compact(self, idx):
    method compact2origin (line 156) | def compact2origin(self, idx):
  function to_device (line 160) | def to_device(item, device):
  function get_gaussian_mean (line 174) | def get_gaussian_mean(x, axis, other_axis, softmax=True):
  function get_expected_points_from_map (line 200) | def get_expected_points_from_map(hm, softmax=True):
  class Embedder (line 222) | class Embedder:
    method __init__ (line 223) | def __init__(self, **kwargs):
    method create_embedding_fn (line 227) | def create_embedding_fn(self):
    method embed (line 251) | def embed(self, inputs):
  function get_embedder (line 255) | def get_embedder(multires, i=0):
  class APOPMeter (line 275) | class APOPMeter:
    method __init__ (line 276) | def __init__(self) -> None:
    method update (line 282) | def update(self, pred, gt):
    method update_cm (line 293) | def update_cm(self, tp, fp, tn, fn):
  function inverse_sigmoid (line 300) | def inverse_sigmoid(x, eps=1e-5):
  function get_raw_dict (line 307) | def get_raw_dict(args):
  function stat_tensors (line 325) | def stat_tensors(tensor):
  class NiceRepr (line 340) | class NiceRepr:
    method __nice__ (line 374) | def __nice__(self):
    method __repr__ (line 384) | def __repr__(self):
    method __str__ (line 394) | def __str__(self):
  function ensure_rng (line 405) | def ensure_rng(rng=None):
  function random_boxes (line 436) | def random_boxes(num=1, scale=1, rng=None):
  class ModelEma (line 473) | class ModelEma(torch.nn.Module):
    method __init__ (line 474) | def __init__(self, model, decay=0.9997, device=None):
    method _update (line 487) | def _update(self, model, update_fn):
    method update (line 496) | def update(self, model):
    method set (line 499) | def set(self, model):
  class BestMetricSingle (line 503) | class BestMetricSingle:
    method __init__ (line 504) | def __init__(self, init_res=0.0, better="large") -> None:
    method isbetter (line 512) | def isbetter(self, new_res, old_res):
    method update (line 518) | def update(self, new_res, ep):
    method __str__ (line 525) | def __str__(self) -> str:
    method __repr__ (line 528) | def __repr__(self) -> str:
    method summary (line 531) | def summary(self) -> dict:
  class BestMetricHolder (line 538) | class BestMetricHolder:
    method __init__ (line 539) | def __init__(self, init_res=0.0, better="large", use_ema=False) -> None:
    method update (line 546) | def update(self, new_res, epoch, is_ema=False):
    method summary (line 560) | def summary(self):
    method __repr__ (line 570) | def __repr__(self) -> str:
    method __str__ (line 573) | def __str__(self) -> str:
  function targets_to (line 577) | def targets_to(targets: List[Dict[str, Any]], device):
  function get_phrases_from_posmap (line 599) | def get_phrases_from_posmap(

FILE: GroundingDINO/groundingdino/util/visualizer.py
  function renorm (line 22) | def renorm(
  class ColorMap (line 50) | class ColorMap:
    method __init__ (line 51) | def __init__(self, basergb=[255, 255, 0]):
    method __call__ (line 54) | def __call__(self, attnmap):
  function rainbow_text (line 66) | def rainbow_text(x, y, ls, lc, **kw):
  class COCOVisualizer (line 95) | class COCOVisualizer:
    method __init__ (line 96) | def __init__(self, coco=None, tokenlizer=None) -> None:
    method visualize (line 99) | def visualize(self, img, tgt, caption=None, dpi=180, savedir="vis"):
    method addtgt (line 135) | def addtgt(self, tgt):
    method showAnns (line 225) | def showAnns(self, anns, draw_bbox=False):

FILE: GroundingDINO/groundingdino/util/vl_utils.py
  function create_positive_map_from_span (line 8) | def create_positive_map_from_span(tokenized, token_span, max_text_len=256):
  function build_captions_and_token_span (line 49) | def build_captions_and_token_span(cat_list, force_lowercase):
  function build_id2posspan_and_caption (line 90) | def build_id2posspan_and_caption(category_dict: dict):

FILE: GroundingDINO/setup.py
  function write_version_file (line 44) | def write_version_file():
  function get_extensions (line 56) | def get_extensions():
  function parse_requirements (line 114) | def parse_requirements(fname="requirements.txt", with_version=True):

FILE: SG_Nav.py
  class SG_Nav_Agent (line 40) | class SG_Nav_Agent():
    method __init__ (line 41) | def __init__(self, task_config, args=None):
    method add_predicates (line 142) | def add_predicates(self, model):
    method add_rules (line 156) | def add_rules(self, model):
    method reset (line 164) | def reset(self):
    method detect_objects (line 224) | def detect_objects(self, observations):
    method act (line 356) | def act(self, observations):
    method not_use_random_goal (line 508) | def not_use_random_goal(self):
    method get_glip_real_label (line 512) | def get_glip_real_label(self, prediction):
    method fbe (line 525) | def fbe(self, traversible, start):
    method get_goal_gps (line 572) | def get_goal_gps(self, observations, angle, distance):
    method get_relative_goal_gps (line 582) | def get_relative_goal_gps(self, observations, goal_gps=None):
    method init_map (line 592) | def init_map(self):
    method update_map (line 611) | def update_map(self, observations):
    method update_free_map (line 617) | def update_free_map(self, observations):
    method update_room_map (line 624) | def update_room_map(self, observations, room_prediction_result):
    method get_traversible (line 636) | def get_traversible(self, map_pred, pose_pred):
    method _plan (line 680) | def _plan(self, traversible, goal_map, agent_pose, start, start_o, goa...
    method _get_stg (line 746) | def _get_stg(self, traversible, start, goal, goal_found):
    method set_random_goal (line 781) | def set_random_goal(self):
    method update_metrics (line 796) | def update_metrics(self, metrics):
    method visualize (line 804) | def visualize(self, traversible, observations, number_action):
    method save_video (line 840) | def save_video(self):
    method visualize_agent_and_goal (line 852) | def visualize_agent_and_goal(self, map):
  function main (line 864) | def main():

FILE: habitat-lab/examples/benchmark.py
  class ForwardOnlyAgent (line 13) | class ForwardOnlyAgent(habitat.Agent):
    method reset (line 14) | def reset(self):
    method act (line 17) | def act(self, observations):
  function main (line 22) | def main():

FILE: habitat-lab/examples/example.py
  function example (line 10) | def example():

FILE: habitat-lab/examples/example_pointnav.py
  function example (line 10) | def example():

FILE: habitat-lab/examples/interactive_play.py
  function step_env (line 66) | def step_env(env, action_name, action_args, args):
  function get_input_vel_ctlr (line 70) | def get_input_vel_ctlr(skip_pygame, arm_action, g_args, prev_obs, env):
  function get_wrapped_prop (line 209) | def get_wrapped_prop(venv, prop):
  function play_env (line 220) | def play_env(env, args, config):
  function has_pygame (line 323) | def has_pygame():

FILE: habitat-lab/examples/new_actions.py
  class NoisyStrafeActuationSpec (line 28) | class NoisyStrafeActuationSpec:
  function _strafe_impl (line 35) | def _strafe_impl(
  class NoisyStrafeLeft (line 62) | class NoisyStrafeLeft(habitat_sim.SceneNodeControl):
    method __call__ (line 63) | def __call__(
  class NoisyStrafeRight (line 78) | class NoisyStrafeRight(habitat_sim.SceneNodeControl):
    method __call__ (line 79) | def __call__(
  class NoNoiseStrafe (line 96) | class NoNoiseStrafe(HabitatSimV1ActionSpaceConfiguration):
    method get (line 97) | def get(self):
  class NoiseStrafe (line 113) | class NoiseStrafe(HabitatSimV1ActionSpaceConfiguration):
    method get (line 114) | def get(self):
  class StrafeLeft (line 130) | class StrafeLeft(SimulatorTaskAction):
    method _get_uuid (line 131) | def _get_uuid(self, *args, **kwargs) -> str:
    method step (line 134) | def step(self, *args, **kwargs):
  class StrafeRight (line 139) | class StrafeRight(SimulatorTaskAction):
    method _get_uuid (line 140) | def _get_uuid(self, *args, **kwargs) -> str:
    method step (line 143) | def step(self, *args, **kwargs):
  function main (line 147) | def main():

FILE: habitat-lab/examples/register_new_sensors_and_measures.py
  class EpisodeInfoExample (line 18) | class EpisodeInfoExample(habitat.Measure):
    method __init__ (line 19) | def __init__(self, sim, config, **kwargs: Any):
    method _get_uuid (line 26) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method reset_metric (line 30) | def reset_metric(self, *args: Any, episode, **kwargs: Any):
    method update_metric (line 37) | def update_metric(self, *args: Any, episode, action, **kwargs: Any):
  class AgentPositionSensor (line 45) | class AgentPositionSensor(habitat.Sensor):
    method __init__ (line 46) | def __init__(self, sim, config, **kwargs: Any):
    method _get_uuid (line 54) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method _get_sensor_type (line 58) | def _get_sensor_type(self, *args: Any, **kwargs: Any):
    method _get_observation_space (line 62) | def _get_observation_space(self, *args: Any, **kwargs: Any):
    method get_observation (line 71) | def get_observation(
  function main (line 77) | def main():

FILE: habitat-lab/examples/shortest_path_follower_example.py
  class SimpleRLEnv (line 25) | class SimpleRLEnv(habitat.RLEnv):
    method get_reward_range (line 26) | def get_reward_range(self):
    method get_reward (line 29) | def get_reward(self, observations):
    method get_done (line 32) | def get_done(self, observations):
    method get_info (line 35) | def get_info(self, observations):
  function draw_top_down_map (line 39) | def draw_top_down_map(info, output_size):
  function shortest_path_example (line 45) | def shortest_path_example():
  function main (line 85) | def main():

FILE: habitat-lab/examples/tutorials/nb_python/Habitat_Lab.py
  function display_sample (line 76) | def display_sample(
  class NewNavigationTask (line 254) | class NewNavigationTask(NavigationTask):
    method __init__ (line 255) | def __init__(self, config, sim, dataset):
    method _check_episode_is_active (line 259) | def _check_episode_is_active(self, *args, **kwargs):
  class AgentPositionSensor (line 313) | class AgentPositionSensor(habitat.Sensor):
    method __init__ (line 314) | def __init__(self, sim, config, **kwargs):
    method _get_uuid (line 319) | def _get_uuid(self, *args, **kwargs):
    method _get_sensor_type (line 323) | def _get_sensor_type(self, *args, **kwargs):
    method _get_observation_space (line 327) | def _get_observation_space(self, *args, **kwargs):
    method get_observation (line 336) | def get_observation(self, observations, *args, episode, **kwargs):
  class ForwardOnlyAgent (line 383) | class ForwardOnlyAgent(habitat.Agent):
    method __init__ (line 384) | def __init__(self, success_distance, goal_sensor_uuid):
    method reset (line 388) | def reset(self):
    method is_goal_reached (line 391) | def is_goal_reached(self, observations):
    method act (line 395) | def act(self, observations):

FILE: habitat-lab/examples/visualization_examples.py
  function example_pointnav_draw_target_birdseye_view (line 22) | def example_pointnav_draw_target_birdseye_view():
  function example_pointnav_draw_target_birdseye_view_agent_on_border (line 48) | def example_pointnav_draw_target_birdseye_view_agent_on_border():
  function example_get_topdown_map (line 82) | def example_get_topdown_map():
  function main (line 101) | def main():

FILE: habitat-lab/examples/vln_benchmark.py
  function reference_path_benchmark (line 17) | def reference_path_benchmark(config, num_episodes=None):
  function main (line 61) | def main():

FILE: habitat-lab/examples/vln_reference_path_follower_example.py
  function save_map (line 28) | def save_map(observations, info, images):
  function reference_path_example (line 38) | def reference_path_example(mode):

FILE: habitat-lab/habitat/config/default.py
  class Config (line 13) | class Config(yacs.config.CfgNode):
    method __init__ (line 14) | def __init__(self, *args, **kwargs):
  function get_config (line 446) | def get_config(

FILE: habitat-lab/habitat/core/agent.py
  class Agent (line 16) | class Agent:
    method reset (line 22) | def reset(self) -> None:
    method act (line 26) | def act(

FILE: habitat-lab/habitat/core/benchmark.py
  class Benchmark (line 29) | class Benchmark:
    method __init__ (line 32) | def __init__(
    method remote_evaluate (line 52) | def remote_evaluate(
    method local_evaluate (line 126) | def local_evaluate(
    method evaluate (line 208) | def evaluate(

FILE: habitat-lab/habitat/core/challenge.py
  class Challenge (line 13) | class Challenge(Benchmark):
    method __init__ (line 14) | def __init__(self, eval_remote=False, split_l=-1, split_r=-1):
    method submit (line 18) | def submit(self, agent):

FILE: habitat-lab/habitat/core/dataset.py
  class Episode (line 40) | class Episode:
    method _reset_shortest_path_cache_hook (line 79) | def _reset_shortest_path_cache_hook(
    method __getstate__ (line 85) | def __getstate__(self):
    method __setstate__ (line 92) | def __setstate__(self, state):
  class Dataset (line 100) | class Dataset(Generic[T]):
    method scene_from_scene_path (line 105) | def scene_from_scene_path(scene_path: str) -> str:
    method get_scenes_to_load (line 116) | def get_scenes_to_load(cls, config: Config) -> List[str]:
    method build_content_scenes_filter (line 130) | def build_content_scenes_filter(cls, config) -> Callable[[T], bool]:
    method num_episodes (line 145) | def num_episodes(self) -> int:
    method scene_ids (line 150) | def scene_ids(self) -> List[str]:
    method get_scene_episodes (line 154) | def get_scene_episodes(self, scene_id: str) -> List[T]:
    method get_episodes (line 164) | def get_episodes(self, indexes: List[int]) -> List[T]:
    method get_episode_iterator (line 172) | def get_episode_iterator(self, *args: Any, **kwargs: Any) -> Iterator[T]:
    method to_json (line 186) | def to_json(self) -> str:
    method from_json (line 201) | def from_json(
    method filter_episodes (line 215) | def filter_episodes(self, filter_fn: Callable[[T], bool]) -> "Dataset":
    method get_splits (line 230) | def get_splits(
  class EpisodeIterator (line 328) | class EpisodeIterator(Iterator[T]):
    method __init__ (line 356) | def __init__(
    method __iter__ (line 425) | def __iter__(self) -> "EpisodeIterator":
    method __next__ (line 428) | def __next__(self) -> Episode:
    method _forced_scene_switch (line 457) | def _forced_scene_switch(self) -> None:
    method _shuffle (line 472) | def _shuffle(self) -> None:
    method _group_scenes (line 486) | def _group_scenes(
    method step_taken (line 505) | def step_taken(self) -> None:
    method _randomize_value (line 509) | def _randomize_value(value: int, value_range: float) -> int:
    method _set_shuffle_intervals (line 514) | def _set_shuffle_intervals(self) -> None:
    method _forced_scene_switch_if (line 527) | def _forced_scene_switch_if(self) -> None:

FILE: habitat-lab/habitat/core/embodied_task.py
  class Action (line 21) | class Action:
    method __init__ (line 30) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method reset (line 33) | def reset(self, *args: Any, **kwargs: Any) -> None:
    method step (line 40) | def step(self, *args: Any, **kwargs: Any) -> Observations:
    method action_space (line 51) | def action_space(self) -> Space:
  class SimulatorTaskAction (line 56) | class SimulatorTaskAction(Action):
    method __init__ (line 61) | def __init__(
    method action_space (line 68) | def action_space(self):
    method reset (line 71) | def reset(self, *args: Any, **kwargs: Any) -> None:
    method step (line 74) | def step(self, *args: Any, **kwargs: Any) -> Observations:
  class Measure (line 79) | class Measure:
    method __init__ (line 98) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method _get_uuid (line 102) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method reset_metric (line 105) | def reset_metric(self, *args: Any, **kwargs: Any) -> None:
    method update_metric (line 111) | def update_metric(self, *args: Any, **kwargs: Any) -> None:
    method get_metric (line 117) | def get_metric(self):
  class Metrics (line 125) | class Metrics(dict):
    method __init__ (line 128) | def __init__(self, measures: Dict[str, Measure]) -> None:
  class Measurements (line 140) | class Measurements:
    method __init__ (line 147) | def __init__(self, measures: Iterable[Measure]) -> None:
    method reset_measures (line 160) | def reset_measures(self, *args: Any, **kwargs: Any) -> None:
    method update_measures (line 164) | def update_measures(self, *args: Any, **kwargs: Any) -> None:
    method get_metrics (line 168) | def get_metrics(self) -> Metrics:
    method _get_measure_index (line 174) | def _get_measure_index(self, measure_name):
    method check_measure_dependencies (line 177) | def check_measure_dependencies(
  class EmbodiedTask (line 200) | class EmbodiedTask:
    method __init__ (line 226) | def __init__(
    method _init_entities (line 260) | def _init_entities(
    method reset (line 283) | def reset(self, episode: Episode):
    method _step_single_action (line 298) | def _step_single_action(
    method step (line 320) | def step(self, action: Dict[str, Any], episode: Episode):
    method get_action_name (line 352) | def get_action_name(self, action_index: Union[int, np.integer]):
    method action_space (line 358) | def action_space(self) -> Space:
    method overwrite_sim_config (line 366) | def overwrite_sim_config(
    method _check_episode_is_active (line 377) | def _check_episode_is_active(
    method is_episode_active (line 387) | def is_episode_active(self):
    method seed (line 390) | def seed(self, seed: int) -> None:

FILE: habitat-lab/habitat/core/env.py
  class Env (line 26) | class Env:
    method __init__ (line 57) | def __init__(
    method _setup_episode_iterator (line 129) | def _setup_episode_iterator(self):
    method current_episode (line 141) | def current_episode(self) -> Episode:
    method current_episode (line 146) | def current_episode(self, episode: Episode) -> None:
    method episode_iterator (line 154) | def episode_iterator(self) -> Iterator[Episode]:
    method episode_iterator (line 158) | def episode_iterator(self, new_iter: Iterator[Episode]) -> None:
    method episodes (line 164) | def episodes(self) -> List[Episode]:
    method episodes (line 172) | def episodes(self, episodes: List[Episode]) -> None:
    method sim (line 186) | def sim(self) -> Simulator:
    method episode_start_time (line 190) | def episode_start_time(self) -> Optional[float]:
    method episode_over (line 194) | def episode_over(self) -> bool:
    method task (line 198) | def task(self) -> EmbodiedTask:
    method _elapsed_seconds (line 202) | def _elapsed_seconds(self) -> float:
    method get_metrics (line 208) | def get_metrics(self) -> Metrics:
    method _past_limit (line 211) | def _past_limit(self) -> bool:
    method _reset_stats (line 220) | def _reset_stats(self) -> None:
    method reset (line 225) | def reset(self) -> Observations:
    method _update_step_stats (line 261) | def _update_step_stats(self) -> None:
    method step (line 272) | def step(
    method _seed_numba (line 316) | def _seed_numba(seed: int):
    method seed (line 320) | def seed(self, seed: int) -> None:
    method reconfigure (line 327) | def reconfigure(self, config: Config) -> None:
    method render (line 338) | def render(self, mode="rgb") -> np.ndarray:
    method close (line 341) | def close(self) -> None:
    method __enter__ (line 344) | def __enter__(self):
    method __exit__ (line 347) | def __exit__(self, exc_type, exc_val, exc_tb):
  class RLEnv (line 351) | class RLEnv(gym.Env):
    method __init__ (line 365) | def __init__(
    method config (line 381) | def config(self) -> Config:
    method habitat_env (line 385) | def habitat_env(self) -> Env:
    method episodes (line 389) | def episodes(self) -> List[Episode]:
    method episodes (line 393) | def episodes(self, episodes: List[Episode]) -> None:
    method current_episode (line 397) | def current_episode(self) -> Episode:
    method reset (line 401) | def reset(self) -> Observations:
    method get_reward_range (line 404) | def get_reward_range(self):
    method get_reward (line 411) | def get_reward(self, observations: Observations) -> Any:
    method get_done (line 421) | def get_done(self, observations: Observations) -> bool:
    method get_info (line 432) | def get_info(self, observations) -> Dict[Any, Any]:
    method step (line 441) | def step(self, *args, **kwargs) -> Tuple[Observations, Any, bool, dict]:
    method seed (line 454) | def seed(self, seed: Optional[int] = None) -> None:
    method render (line 457) | def render(self, mode: str = "rgb") -> np.ndarray:
    method close (line 460) | def close(self) -> None:
    method __enter__ (line 463) | def __enter__(self):
    method __exit__ (line 466) | def __exit__(self, exc_type, exc_val, exc_tb):

FILE: habitat-lab/habitat/core/environments.py
  function get_env_class (line 21) | def get_env_class(env_name: str) -> Type[habitat.RLEnv]:
  class RearrangeRLEnv (line 34) | class RearrangeRLEnv(habitat.RLEnv):
    method __init__ (line 35) | def __init__(self, config: Config, dataset: Optional[Dataset] = None):
    method reset (line 40) | def reset(self):
    method step (line 44) | def step(self, *args, **kwargs):
    method get_reward_range (line 47) | def get_reward_range(self):
    method get_reward (line 51) | def get_reward(self, observations):
    method _episode_success (line 62) | def _episode_success(self):
    method get_done (line 65) | def get_done(self, observations):
    method get_info (line 73) | def get_info(self, observations):
  class NavRLEnv (line 78) | class NavRLEnv(habitat.RLEnv):
    method __init__ (line 79) | def __init__(self, config: Config, dataset: Optional[Dataset] = None):
    method reset (line 86) | def reset(self):
    method step (line 93) | def step(self, *args, **kwargs):
    method get_reward_range (line 96) | def get_reward_range(self):
    method get_reward (line 102) | def get_reward(self, observations):
    method _episode_success (line 115) | def _episode_success(self):
    method get_done (line 118) | def get_done(self, observations):
    method get_info (line 124) | def get_info(self, observations):

FILE: habitat-lab/habitat/core/logging.py
  class HabitatLogger (line 10) | class HabitatLogger(logging.Logger):
    method __init__ (line 11) | def __init__(
    method add_filehandler (line 31) | def add_filehandler(self, log_filename):

FILE: habitat-lab/habitat/core/registry.py
  class Registry (line 43) | class Registry(metaclass=Singleton):
    method _register_impl (line 47) | def _register_impl(
    method register_task (line 72) | def register_task(cls, to_register=None, *, name: Optional[str] = None):
    method register_simulator (line 101) | def register_simulator(
    method register_sensor (line 132) | def register_sensor(cls, to_register=None, *, name: Optional[str] = No...
    method register_measure (line 144) | def register_measure(cls, to_register=None, *, name: Optional[str] = N...
    method register_task_action (line 156) | def register_task_action(
    method register_dataset (line 173) | def register_dataset(cls, to_register=None, *, name: Optional[str] = N...
    method register_action_space_configuration (line 185) | def register_action_space_configuration(
    method register_env (line 202) | def register_env(cls, to_register=None, *, name: Optional[str] = None):
    method _get_impl (line 216) | def _get_impl(cls, _type: str, name: str) -> Type:
    method get_task (line 220) | def get_task(cls, name: str) -> Type[EmbodiedTask]:
    method get_task_action (line 224) | def get_task_action(cls, name: str) -> Type[Action]:
    method get_simulator (line 228) | def get_simulator(cls, name: str) -> Type[Simulator]:
    method get_sensor (line 232) | def get_sensor(cls, name: str) -> Type[Sensor]:
    method get_measure (line 236) | def get_measure(cls, name: str) -> Type[Measure]:
    method get_dataset (line 240) | def get_dataset(cls, name: str) -> Type[Dataset]:
    method get_action_space_configuration (line 244) | def get_action_space_configuration(
    method get_env (line 250) | def get_env(cls, name: str) -> Type["RLEnv"]:

FILE: habitat-lab/habitat/core/simulator.py
  class ActionSpaceConfiguration (line 38) | class ActionSpaceConfiguration(metaclass=abc.ABCMeta):
    method get (line 42) | def get(self) -> Any:
  class SensorTypes (line 46) | class SensorTypes(Enum):
  class Sensor (line 65) | class Sensor(metaclass=abc.ABCMeta):
    method __init__ (line 83) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method _get_uuid (line 93) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method _get_sensor_type (line 96) | def _get_sensor_type(self, *args: Any, **kwargs: Any) -> SensorTypes:
    method _get_observation_space (line 99) | def _get_observation_space(self, *args: Any, **kwargs: Any) -> Space:
    method get_observation (line 103) | def get_observation(self, *args: Any, **kwargs: Any) -> Any:
  class Observations (line 111) | class Observations(Dict[str, Any]):
    method __init__ (line 114) | def __init__(
  class RGBSensor (line 130) | class RGBSensor(Sensor, metaclass=abc.ABCMeta):
    method __init__ (line 131) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method _get_uuid (line 134) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method _get_sensor_type (line 137) | def _get_sensor_type(self, *args: Any, **kwargs: Any) -> SensorTypes:
    method _get_observation_space (line 140) | def _get_observation_space(self, *args: Any, **kwargs: Any) -> Space:
    method get_observation (line 143) | def get_observation(self, *args: Any, **kwargs: Any) -> VisualObservat...
  class DepthSensor (line 147) | class DepthSensor(Sensor, metaclass=abc.ABCMeta):
    method __init__ (line 148) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method _get_uuid (line 151) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    method _get_sensor_type (line 154) | def _get_sensor_type(self, *args: Any, **kwargs: Any) -> SensorTypes:
    method _get_observation_space (line 157) | def _get_observation_space(self, *args: Any, **kwargs: Any) -> Space:
    method get_observation (line 160) | def get_observation(self, *args: Any, **kwargs: Any) -> VisualObservat...
  class SemanticSensor (line 164) | class SemanticSensor(Sensor):
    method __init__ (line 165) | def __init__(self, *args: Any, **kwargs: Any) -> None:
    method _get_uuid (line 168) | def _get_uuid(self, *args: Any, **kwargs: Any) -> str:
    metho
Copy disabled (too large) Download .json
Condensed preview — 618 files, each showing path, character count, and a content snippet. Download the .json file for the full structured content (17,289K chars).
[
  {
    "path": ".gitignore",
    "chars": 246,
    "preview": "**/__pycache__/\nglip_large_model.pth\n/data\n/GLIP/build\n/GLIP/maskrcnn_benchmark.egg-info\n/segment_anything/segment_anyth"
  },
  {
    "path": "GLIP/CODE_OF_CONDUCT.md",
    "chars": 444,
    "preview": "# Microsoft Open Source Code of Conduct\n\nThis project has adopted the [Microsoft Open Source Code of Conduct](https://op"
  },
  {
    "path": "GLIP/DATA.md",
    "chars": 4347,
    "preview": "We provide guidance for preparing the data used by GLIP. Note that not all data are needed for a specific experiments. P"
  },
  {
    "path": "GLIP/LICENSE",
    "chars": 1141,
    "preview": "    MIT License\n\n    Copyright (c) Microsoft Corporation.\n\n    Permission is hereby granted, free of charge, to any pers"
  },
  {
    "path": "GLIP/README.md",
    "chars": 14131,
    "preview": "# GLIP: Grounded Language-Image Pre-training  \r\n\r\n<img src=\"docs/main_model.png\" width=\"800\"> \r\n\r\n## Updates\r\n04/30/2022"
  },
  {
    "path": "GLIP/SECURITY.md",
    "chars": 2780,
    "preview": "<!-- BEGIN MICROSOFT SECURITY.MD V0.0.5 BLOCK -->\n\n## Security\n\nMicrosoft takes the security of our software products an"
  },
  {
    "path": "GLIP/SUPPORT.md",
    "chars": 1315,
    "preview": "# TODO: The maintainer of this repo has not yet edited this file\r\n\r\n**REPO OWNER**: Do you want Customer Service & Suppo"
  },
  {
    "path": "GLIP/configs/flickr/test.yaml",
    "chars": 441,
    "preview": "MODEL:\n  ATSS:\n    NUM_CLASSES: 8 # Placeholder\n  FCOS:\n    NUM_CLASSES: 8 # Placeholder\n  ROI_BOX_HEAD:\n    NUM_CLASSES"
  },
  {
    "path": "GLIP/configs/flickr/val.yaml",
    "chars": 531,
    "preview": "MODEL:\n  ATSS:\n    NUM_CLASSES: 8 # Placeholder\n  FCOS:\n    NUM_CLASSES: 8 # Placeholder\n  ROI_BOX_HEAD:\n    NUM_CLASSES"
  },
  {
    "path": "GLIP/configs/lvis/minival.yaml",
    "chars": 688,
    "preview": "MODEL:\n  ATSS:\n    NUM_CLASSES: 8 # these fields are not used; just a placeholder\n  FCOS:\n    NUM_CLASSES: 8\n  ROI_BOX_H"
  },
  {
    "path": "GLIP/configs/lvis/val.yaml",
    "chars": 678,
    "preview": "MODEL:\n  ATSS:\n    NUM_CLASSES: 8 # these fields are not used; just a placeholder\n  FCOS:\n    NUM_CLASSES: 8\n  ROI_BOX_H"
  },
  {
    "path": "GLIP/configs/odinw/Aquarium_Aquarium_Combined.v2-raw-1024.coco.yaml",
    "chars": 1497,
    "preview": "DATALOADER:\r\n  ASPECT_RATIO_GROUPING: false\r\n  SIZE_DIVISIBILITY: 32\r\n\r\nDATASETS:\r\n  GENERAL_COPY: 16\r\n  CAPTION_PROMPT:"
  },
  {
    "path": "GLIP/configs/pretrain/glip_Swin_L.yaml",
    "chars": 2778,
    "preview": "MODEL:\n  META_ARCHITECTURE: \"GeneralizedVLRCNN\"\n  WEIGHT: \"swin_large_patch4_window12_384_22k.pth\"\n  RPN_ONLY: True\n  RP"
  },
  {
    "path": "GLIP/configs/pretrain/glip_Swin_T_O365_GoldG.yaml",
    "chars": 2271,
    "preview": "MODEL:\n  META_ARCHITECTURE: \"GeneralizedVLRCNN\"\n  WEIGHT: \"swin_tiny_patch4_window7_224.pth\"\n  RPN_ONLY: True\n  RPN_ARCH"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/__init__.py",
    "chars": 73,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/config/__init__.py",
    "chars": 144,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .defaults import _C as cfg\r\nfrom .paths_ca"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/config/defaults.py",
    "chars": 33374,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport os\r\n\r\nfrom yacs.config import CfgNode as"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/config/paths_catalog.py",
    "chars": 17564,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"Centralized catalog of paths.\"\"\"\r\n\r\nimport o"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/ROIAlign.h",
    "chars": 1704,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n\r\n#include \"cpu/vision.h\"\r\n\r\n#if"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/ROIPool.h",
    "chars": 1682,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n\r\n#include \"cpu/vision.h\"\r\n\r\n#if"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/SigmoidFocalLoss.h",
    "chars": 1088,
    "preview": "#pragma once\r\n\r\n#include \"cpu/vision.h\"\r\n\r\n#ifdef WITH_CUDA\r\n#include \"cuda/vision.h\"\r\n#endif\r\n\r\n// Interface for Python"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.cpp",
    "chars": 8219,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include \"cpu/vision.h\"\r\n\r\n// implementation t"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp",
    "chars": 2575,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include \"cpu/vision.h\"\r\n\r\n\r\ntemplate <typenam"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cpu/soft_nms.cpp",
    "chars": 4031,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include \"cpu/vision.h\"\r\n\r\n\r\ntemplate <typenam"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cpu/vision.h",
    "chars": 897,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include <torch/extension.h>\r\n\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.cu",
    "chars": 12707,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include <ATen/ATen.h>\r\n#include <ATen/cuda/CU"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.cu",
    "chars": 8099,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include <ATen/ATen.h>\r\n#include <ATen/cuda/CU"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu",
    "chars": 5956,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n// This file is modified from  https://github."
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/deform_conv_cuda.cu",
    "chars": 29406,
    "preview": "// modify from\r\n// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/de"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/deform_conv_kernel_cuda.cu",
    "chars": 43374,
    "preview": "/*!\r\n ******************* BEGIN Caffe Copyright Notice and Disclaimer ****************\r\n *\r\n * COPYRIGHT\r\n *\r\n * All con"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/deform_pool_cuda.cu",
    "chars": 3791,
    "preview": "// modify from\r\n// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/mo"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/deform_pool_kernel_cuda.cu",
    "chars": 16452,
    "preview": "/*!\r\n * Copyright (c) 2017 Microsoft\r\n * Licensed under The MIT License [see LICENSE for details]\r\n * \\file deformable_p"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/ml_nms.cu",
    "chars": 5158,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include <ATen/ATen.h>\r\n#include <ATen/cuda/CU"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/nms.cu",
    "chars": 4991,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include <ATen/ATen.h>\r\n#include <ATen/cuda/CU"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/cuda/vision.h",
    "chars": 5775,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include <torch/extension.h>\r\n\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/deform_conv.h",
    "chars": 4645,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include \"cpu/vision.h\"\r\n\r\n#ifde"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/deform_pool.h",
    "chars": 1844,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include \"cpu/vision.h\"\r\n\r\n#ifde"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/ml_nms.h",
    "chars": 791,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include \"cpu/vision.h\"\r\n\r\n#ifde"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/nms.h",
    "chars": 1255,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#pragma once\r\n#include \"cpu/vision.h\"\r\n\r\n#ifde"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/csrc/vision.cpp",
    "chars": 1667,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n#include \"nms.h\"\r\n#include \"ml_nms.h\"\r\n#includ"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/__init__.py",
    "chars": 110,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .build import make_data_loader\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/build.py",
    "chars": 23115,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport bisect\r\nimport copy\r\nimport logging\r\nimp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/collate_batch.py",
    "chars": 3953,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom maskrcnn_benchmark.structure"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/__init__.py",
    "chars": 1010,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .coco import COCODataset\r\nfrom .voc import"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/background.py",
    "chars": 1629,
    "preview": "import os\r\nimport os.path\r\nimport json\r\nfrom PIL import Image\r\n\r\nimport torch\r\nimport torchvision\r\nimport torch.utils.da"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/box_label_loader.py",
    "chars": 11465,
    "preview": "import torch\r\nimport numpy as np\r\nimport math\r\nimport base64\r\nimport collections\r\nimport pycocotools.mask as mask_utils\r"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/caption.py",
    "chars": 13391,
    "preview": "import torch\r\nimport torch.distributed as dist\r\nimport time\r\nfrom torchvision.ops import nms\r\nimport random\r\nimport nump"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/coco.py",
    "chars": 10605,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport os\r\nimport os.path\r\nimport math\r\nfrom PI"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/coco_dt.py",
    "chars": 6284,
    "preview": "\"\"\"\r\nCOCO dataset which returns image_id for evaluation.\r\n\r\nMostly copy-paste from https://github.com/pytorch/vision/blo"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/concat_dataset.py",
    "chars": 789,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport bisect\r\n\r\nfrom torch.utils.data.dataset "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/custom_distributed_sampler.py",
    "chars": 7961,
    "preview": "import math\r\nfrom typing import TypeVar, Optional, Iterator\r\n\r\nimport torch\r\nfrom torch.utils.data import Sampler, Datas"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/duplicate_dataset.py",
    "chars": 894,
    "preview": "import math\r\nfrom typing import TypeVar, Optional, Iterator\r\n\r\nimport torch\r\nfrom torch.utils.data import Sampler, Datas"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/__init__.py",
    "chars": 2317,
    "preview": "from maskrcnn_benchmark.data import datasets\r\n\r\nfrom .coco import coco_evaluation\r\nfrom .voc import voc_evaluation\r\nfrom"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/box_aug.py",
    "chars": 13633,
    "preview": "import torch\r\nimport numpy as np\r\n\r\nfrom maskrcnn_benchmark.config import cfg\r\nfrom maskrcnn_benchmark.data import trans"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/coco/__init__.py",
    "chars": 536,
    "preview": "from .coco_eval import do_coco_evaluation\r\n\r\n\r\ndef coco_evaluation(\r\n    dataset,\r\n    predictions,\r\n    output_folder,\r"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/coco/coco_eval.py",
    "chars": 19712,
    "preview": "import logging\r\nimport tempfile\r\nimport os\r\nimport torch\r\nimport numpy as np\r\nimport json\r\n\r\nfrom collections import Ord"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/flickr/__init__.py",
    "chars": 42,
    "preview": "from .flickr_eval import FlickrEvaluator\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/flickr/flickr_eval.py",
    "chars": 16992,
    "preview": "from maskrcnn_benchmark.structures.boxlist_ops import boxlist_iou\r\nfrom maskrcnn_benchmark.structures.bounding_box impor"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/lvis/_change_lvis_annotation.py",
    "chars": 319,
    "preview": "path = \"DATASET/coco/annotations/lvis_v1_minival.json\"\r\nimport json\r\nwith open(path) as f:\r\n    all = json.load(f)\r\n\r\nfo"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/lvis/lvis.py",
    "chars": 6832,
    "preview": "# Copyright (c) Aishwarya Kamath & Nicolas Carion. Licensed under the Apache License 2.0. All Rights Reserved\r\n# Copyrig"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/lvis/lvis_eval.py",
    "chars": 38463,
    "preview": "# Copyright (c) Aishwarya Kamath & Nicolas Carion. Licensed under the Apache License 2.0. All Rights Reserved\r\n# Copyrig"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/od_eval.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/od_to_grounding/__init__.py",
    "chars": 568,
    "preview": "from .od_eval import do_od_evaluation\r\n\r\n\r\ndef od_to_grounding_evaluation(\r\n        dataset,\r\n        predictions,\r\n    "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/od_to_grounding/od_eval.py",
    "chars": 19712,
    "preview": "import logging\r\nimport tempfile\r\nimport os\r\nimport torch\r\nimport numpy as np\r\nimport json\r\n\r\nfrom collections import Ord"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/vg/__init__.py",
    "chars": 514,
    "preview": "import logging\r\n\r\nfrom .vg_eval import do_vg_evaluation\r\n\r\n\r\ndef vg_evaluation(dataset, predictions, output_folder, box_"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/vg/vg_eval.py",
    "chars": 27419,
    "preview": "# A modification version from chainercv repository.\r\n# (See https://github.com/chainer/chainercv/blob/master/chainercv/e"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/voc/__init__.py",
    "chars": 521,
    "preview": "import logging\r\n\r\nfrom .voc_eval import do_voc_evaluation\r\n\r\n\r\ndef voc_evaluation(dataset, predictions, output_folder, b"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/evaluation/voc/voc_eval.py",
    "chars": 8369,
    "preview": "# A modification version from chainercv repository.\r\n# (See https://github.com/chainer/chainercv/blob/master/chainercv/e"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/flickr.py",
    "chars": 199,
    "preview": "import torch\r\nimport torchvision\r\nimport torch.utils.data as data\r\nfrom maskrcnn_benchmark.data.datasets.modulated_coco "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/gqa.py",
    "chars": 3753,
    "preview": "import json\r\nfrom pathlib import Path\r\n\r\nimport torch\r\nimport torchvision\r\n\r\nfrom .modulated_coco import ConvertCocoPoly"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/imagenet.py",
    "chars": 2027,
    "preview": "import os\r\nimport os.path\r\nimport json\r\nfrom PIL import Image\r\n\r\nimport torch.utils.data as data\r\n\r\ndef pil_loader(path)"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/list_dataset.py",
    "chars": 979,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nSimple dataset class that wraps a list of "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/lvis.py",
    "chars": 9136,
    "preview": "# Copyright (c) Aishwarya Kamath & Nicolas Carion. Licensed under the Apache License 2.0. All Rights Reserved\r\n# Copyrig"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/mixed.py",
    "chars": 5431,
    "preview": "import os\r\nimport os.path\r\nfrom pathlib import Path\r\nfrom typing import Any, Callable, Optional, Tuple\r\n\r\nimport torch\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/mixup.py",
    "chars": 5008,
    "preview": "\"\"\"Mixup detection dataset wrapper.\"\"\"\r\nfrom __future__ import absolute_import\r\nimport numpy as np\r\nimport torch\r\nimport"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/modulated_coco.py",
    "chars": 27167,
    "preview": "import logging\r\nimport os\r\nimport os.path\r\nimport math\r\nfrom PIL import Image, ImageDraw\r\n\r\nimport random\r\nimport numpy "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/object365.py",
    "chars": 200,
    "preview": "import torch\r\nimport torchvision\r\nimport torch.utils.data as data\r\nfrom maskrcnn_benchmark.data.datasets.coco_dt import "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/od_to_grounding.py",
    "chars": 14681,
    "preview": "import numpy as np\r\nimport random\r\nimport re\r\nimport torch\r\nimport pdb\r\nimport logging\r\n\r\n\r\ndef clean_name(name):\r\n    n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/phrasecut.py",
    "chars": 196,
    "preview": "import torch\nimport torchvision\nimport torch.utils.data as data\nfrom maskrcnn_benchmark.data.datasets.modulated_coco imp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/pseudo_data.py",
    "chars": 9133,
    "preview": "import torch\nimport torch.distributed as dist\nimport time\nfrom torchvision.ops import nms\nimport random\nimport numpy as "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/refexp.py",
    "chars": 3358,
    "preview": "import copy\r\nfrom collections import defaultdict\r\nfrom pathlib import Path\r\n\r\nimport torch\r\nimport torch.utils.data\r\n\r\ni"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/tsv.py",
    "chars": 15152,
    "preview": "import os\r\nimport os.path as op\r\nimport json\r\n# import logging\r\nimport base64\r\nimport yaml\r\nimport errno\r\nimport io\r\nimp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/vg.py",
    "chars": 11011,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport collections\r\nimport json\r\nimport os.path"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/datasets/voc.py",
    "chars": 4255,
    "preview": "import os\r\n\r\nimport torch\r\nimport torch.utils.data\r\nfrom PIL import Image\r\nimport sys\r\n\r\nif sys.version_info[0] == 2:\r\n "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/samplers/__init__.py",
    "chars": 334,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .distributed import DistributedSampler\r\nfr"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/samplers/distributed.py",
    "chars": 2866,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n# Code is copy-pasted exactly as in torch.utils"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/samplers/grouped_batch_sampler.py",
    "chars": 4960,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport itertools\r\n\r\nimport torch\r\nfrom torch.ut"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/samplers/iteration_based_batch_sampler.py",
    "chars": 1195,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom torch.utils.data.sampler import BatchSampl"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/transforms/__init__.py",
    "chars": 292,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .transforms import Compose\r\nfrom .transfor"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/transforms/build.py",
    "chars": 1521,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom . import transforms as T\r\n\r\n\r\ndef build_tr"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/data/transforms/transforms.py",
    "chars": 14395,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport cv2\r\nimport random\r\nimport numpy as np\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/__init__.py",
    "chars": 73,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/alter_trainer.py",
    "chars": 4558,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport datetime\r\nimport logging\r\nimport time\r\n\r"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/evolution.py",
    "chars": 13044,
    "preview": "\r\nimport time\r\nimport pickle\r\nimport logging\r\nimport os\r\nimport numpy as np\r\nimport torch\r\nimport torch.nn as nn\r\n\r\n\r\nfr"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/inference.py",
    "chars": 21833,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport datetime\r\nimport logging\r\nimport time\r\ni"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/predictor.py",
    "chars": 20913,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport cv2\r\nimport torch\r\nimport numpy as np\r\nf"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/predictor_glip.py",
    "chars": 18786,
    "preview": "import cv2\nimport torch\nimport re\nimport numpy as np\nfrom typing import List, Union\nimport nltk\nimport inflect\nfrom tran"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/singlepath_trainer.py",
    "chars": 5076,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport datetime\r\nimport logging\r\nimport time\r\ni"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/stage_trainer.py",
    "chars": 7454,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport datetime\r\nimport logging\r\nimport time\r\n\r"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/engine/trainer.py",
    "chars": 16079,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport datetime\r\nimport logging\r\nimport sys\r\nim"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/__init__.py",
    "chars": 1544,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\nfrom .batch_norm import FrozenB"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/batch_norm.py",
    "chars": 5041,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\n\r\nimport to"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/deform_conv.py",
    "chars": 14640,
    "preview": "import torch\r\nimport math\r\nfrom torch import nn\r\nfrom torch.nn import init\r\nfrom torch.nn.modules.utils import _pair\r\nfr"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/deform_pool.py",
    "chars": 17502,
    "preview": "import torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\nfrom .deform_conv import DeformConv2d\r\n\r\ndef add"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/dropblock.py",
    "chars": 4584,
    "preview": "import torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\n\r\n\r\nclass DropBlock2D(nn.Module):\r\n    r\"\"\"Randomly"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/dyhead.py",
    "chars": 5205,
    "preview": "import torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\n\r\nfrom .deform_conv import ModulatedDeformConv\r\nfro"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/dyrelu.py",
    "chars": 3956,
    "preview": "import torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\ndef _make_divisible(v, divisor, min_value=None):"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/evonorm.py",
    "chars": 1344,
    "preview": "import torch\r\nimport torch.nn as nn\r\n\r\n\r\nclass EvoNorm2d(nn.Module):\r\n    __constants__ = ['num_features', 'eps', 'nonli"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/iou_loss.py",
    "chars": 2931,
    "preview": "import torch\r\nfrom torch import nn\r\n\r\n\r\nclass IOULoss(nn.Module):\r\n    def __init__(self, loss_type=\"iou\"):\r\n        sup"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/misc.py",
    "chars": 6871,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nhelper class that supports empty tensors o"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/nms.py",
    "chars": 325,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom maskrcnn_benchmark import _C\r\n\r\ntry:\r\n    "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/roi_align.py",
    "chars": 3033,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\nfrom torch."
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/roi_pool.py",
    "chars": 1918,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\nfrom torch."
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/se.py",
    "chars": 1821,
    "preview": "from torch import nn\r\n\r\n\r\nclass SELayer(nn.Module):\r\n    def __init__(self, channel, reduction=16):\r\n        super(SELay"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/set_loss.py",
    "chars": 16913,
    "preview": "import torch\r\nimport torch.nn.functional as F\r\nimport torch.distributed as dist\r\nfrom torch import nn\r\n\r\nfrom scipy.opti"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/sigmoid_focal_loss.py",
    "chars": 7532,
    "preview": "import torch\r\nfrom torch import nn\r\nimport torch.nn.functional as F\r\nfrom torch.autograd import Function\r\nfrom torch.aut"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/layers/smooth_l1_loss.py",
    "chars": 497,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\n\r\n# TODO maybe push this to nn?"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/__init__.py",
    "chars": 9052,
    "preview": "from collections import OrderedDict\r\n\r\nfrom torch import nn\r\n\r\nfrom maskrcnn_benchmark.modeling import registry\r\nfrom ma"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/bifpn.py",
    "chars": 12109,
    "preview": "import torch.nn as nn\r\nimport torch\r\n\r\nfrom maskrcnn_benchmark.layers import swish\r\n\r\n\r\nclass BiFPN(nn.Module):\r\n    def"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/blocks.py",
    "chars": 8678,
    "preview": "import torch.nn as nn\r\nfrom .ops import *\r\n\r\n\r\nclass stem(nn.Module):\r\n    num_layer = 1\r\n\r\n    def __init__(self, conv,"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/efficientdet.py",
    "chars": 80575,
    "preview": "import torch\r\nimport re\r\nimport numpy as np\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\nimport logging\r\nimp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/efficientnet.py",
    "chars": 23139,
    "preview": "\"\"\"\r\n    EfficientNet for ImageNet-1K, implemented in PyTorch.\r\n    Original papers:\r\n    - 'EfficientNet: Rethinking Mo"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/fbnet.py",
    "chars": 15879,
    "preview": "\"\"\"\r\nFBNet model builder\r\n\"\"\"\r\n\r\nfrom __future__ import absolute_import, division, print_function, unicode_literals\r\n\r\ni"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/fpn.py",
    "chars": 7046,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nimport torch.nn.functional as F\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/mixer.py",
    "chars": 1040,
    "preview": "import torch\r\nfrom torch import nn\r\n\r\nclass MixedOperationRandom(nn.Module):\r\n    def __init__(self, search_ops):\r\n     "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/ops.py",
    "chars": 2732,
    "preview": "import math\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\n\r\ndef conv7x7(in_planes, out_planes"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/resnet.py",
    "chars": 20973,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nVariant of the resnet module that takes cf"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/swint.py",
    "chars": 25846,
    "preview": "# --------------------------------------------------------\n# Swin Transformer\n# modified from https://github.com/SwinTra"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/swint_v2.py",
    "chars": 28474,
    "preview": "# --------------------------------------------------------\n# Swin Transformer\n# modified from https://github.com/SwinTra"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/swint_v2_vl.py",
    "chars": 35807,
    "preview": "# --------------------------------------------------------\r\n# Swin Transformer\r\n# modified from https://github.com/SwinT"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/backbone/swint_vl.py",
    "chars": 32089,
    "preview": "# --------------------------------------------------------\n# Swin Transformer\n# modified from https://github.com/SwinTra"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py",
    "chars": 2784,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\n\r\nclass BalancedPositiveNegativ"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/box_coder.py",
    "chars": 3462,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport math\r\n\r\nimport torch\r\n\r\n\r\nclass BoxCoder"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/detector/__init__.py",
    "chars": 420,
    "preview": "from .generalized_rcnn import GeneralizedRCNN\r\nfrom .generalized_vl_rcnn import GeneralizedVLRCNN\r\n\r\n_DETECTION_META_ARC"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py",
    "chars": 4915,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nImplements the Generalized R-CNN framework"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py",
    "chars": 14299,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nImplements the Generalized VL R-CNN framew"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/__init__.py",
    "chars": 234,
    "preview": "from .backbone import build_backbone as build_language_backbone\r\nfrom .build import build_tokenizer\r\n\r\nfrom .hfpt_tokeni"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/backbone.py",
    "chars": 1389,
    "preview": "from collections import OrderedDict\r\nimport torch\r\nfrom torch import nn\r\n\r\nfrom maskrcnn_benchmark.modeling import regis"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/bert_model.py",
    "chars": 3313,
    "preview": "from copy import deepcopy\r\nimport numpy as np\r\nimport torch\r\nfrom torch import nn\r\n\r\n# from pytorch_pretrained_bert.mode"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/build.py",
    "chars": 567,
    "preview": "from .simple_tokenizer import SimpleTokenizer\r\n\r\n\r\ndef build_tokenizer(tokenizer_name):\r\n    tokenizer = None\r\n    if to"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/clip_model.py",
    "chars": 7903,
    "preview": "from collections import OrderedDict\r\nimport logging\r\nimport os\r\n\r\nimport torch\r\nfrom torch import nn\r\nimport torch.nn.fu"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/hfpt_tokenizer.py",
    "chars": 3286,
    "preview": "from typing import Union, List\r\n\r\nfrom transformers import AutoTokenizer\r\nimport torch\r\n\r\n\r\nclass HFPTTokenizer(object):"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/rnn_model.py",
    "chars": 5807,
    "preview": "from copy import deepcopy\r\nimport numpy as np\r\nimport torch\r\nfrom torch import nn\r\n\r\n\r\nclass RNNEnoder(nn.Module):\r\n    "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/simple_tokenizer.py",
    "chars": 6090,
    "preview": "import gzip\r\nimport html\r\nimport os\r\nfrom functools import lru_cache\r\n\r\nimport ftfy\r\nimport regex as re\r\nfrom typing imp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/test_clip_tokenizer.py",
    "chars": 308,
    "preview": "from maskrcnn_benchmark.modeling.language_backbone import build_tokenizer\r\n\r\nif __name__ == '__main__':\r\n\r\n    tokenizer"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/language_backbone/word_utils.py",
    "chars": 3279,
    "preview": "\"\"\"\r\nLanguage-related data loading helper functions and class wrappers.\r\n\"\"\"\r\n\r\nimport re\r\nimport torch\r\nimport codecs\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/make_layers.py",
    "chars": 3829,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nMiscellaneous utility functions\r\n\"\"\"\r\n\r\nim"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/matcher.py",
    "chars": 5438,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\n\r\nclass Matcher(object):\r\n    \""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/poolers.py",
    "chars": 4535,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nimport torch.nn.functional as F\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/registry.py",
    "chars": 259,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\r\nfrom maskrcnn_benchmark.utils.registry import"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/__init__.py",
    "chars": 3667,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\nfrom .box_head.box_head import "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py",
    "chars": 2992,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\n\r\nfrom .roi"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py",
    "chars": 7586,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nimport torch.nn.functional as F\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py",
    "chars": 7500,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch.nn import functional a"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/roi_box_feature_extractors.py",
    "chars": 7493,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\nfrom torch."
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/box_head/roi_box_predictors.py",
    "chars": 2128,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom torch import nn\r\n\r\n\r\nclass FastRCNNPredict"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/inference.py",
    "chars": 4542,
    "preview": "import cv2\r\nimport numpy as np\r\nimport torch\r\nfrom torch import nn\r\n\r\nfrom maskrcnn_benchmark.structures.bounding_box im"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/keypoint_head.py",
    "chars": 2003,
    "preview": "import torch\r\n\r\nfrom .roi_keypoint_feature_extractors import make_roi_keypoint_feature_extractor\r\nfrom .roi_keypoint_pre"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/loss.py",
    "chars": 7284,
    "preview": "import torch\r\nfrom torch.nn import functional as F\r\n\r\nfrom maskrcnn_benchmark.modeling.matcher import Matcher\r\n\r\nfrom ma"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/roi_keypoint_feature_extractors.py",
    "chars": 3915,
    "preview": "from torch import nn\r\nfrom torch.nn import functional as F\r\n\r\nfrom maskrcnn_benchmark.modeling.poolers import Pooler\r\n\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/keypoint_head/roi_keypoint_predictors.py",
    "chars": 1251,
    "preview": "from torch import nn\r\nfrom torch.nn import functional as F\r\n\r\nfrom maskrcnn_benchmark import layers\r\n\r\n\r\nclass KeypointR"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/hourglass.py",
    "chars": 2156,
    "preview": "from torch import nn\n\nfrom maskrcnn_benchmark.modeling.make_layers import make_conv3x3\n\n\nclass Residual(nn.Module):\n    "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/inference.py",
    "chars": 8026,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport numpy as np\r\nimport torch\r\nfrom torch im"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/loss.py",
    "chars": 7470,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch.nn import functional a"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/mask_head.py",
    "chars": 3490,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\n\r\nfrom mask"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/roi_mask_feature_extractors.py",
    "chars": 4143,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom torch import nn\r\nfrom torch.nn import func"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/roi_heads/mask_head/roi_mask_predictors.py",
    "chars": 4961,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nfrom torch import nn\r\nfrom torch."
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/__init__.py",
    "chars": 859,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n# from .rpn import build_rpn\r\nfrom .rpn import "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/anchor_generator.py",
    "chars": 16408,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport math\r\n\r\nimport numpy as np\r\nimport torch"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/atss.py",
    "chars": 8913,
    "preview": "import math\r\nimport torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\n\r\nfrom .inference import make_atss_pos"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/dyhead.py",
    "chars": 15342,
    "preview": "import math\r\nimport torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\n\r\nfrom .inference import make_atss_pos"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/fcos.py",
    "chars": 8718,
    "preview": "import math\r\nimport torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\n\r\nfrom maskrcnn_benchmark.modeling imp"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/inference.py",
    "chars": 33226,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport logging\r\n\r\nimport torch\r\n\r\nfrom maskrcnn"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/loss.py",
    "chars": 63682,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nThis file contains specific functions for "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/modeling_bert.py",
    "chars": 12874,
    "preview": "# coding=utf-8\r\n# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.\r\n# Copyright (c) 201"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/retina.py",
    "chars": 5641,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport math\r\nimport torch\r\nimport torch.nn.func"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/rpn.py",
    "chars": 6760,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nimport torch.nn.functional as F\r\n"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/transformer.py",
    "chars": 1913,
    "preview": "import torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn, Tensor\r\n\r\nimport copy\r\nfrom typing import Optional,"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/rpn/vldyhead.py",
    "chars": 47631,
    "preview": "import torch\r\nimport torch.nn.functional as F\r\nfrom torch import nn\r\nfrom collections import defaultdict\r\n\r\nfrom .infere"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/modeling/utils.py",
    "chars": 2817,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\n\"\"\"\r\nMiscellaneous utility functions\r\n\"\"\"\r\n\r\nim"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/solver/__init__.py",
    "chars": 191,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom .build import make_optimizer\r\nfrom .build "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/solver/build.py",
    "chars": 4573,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\nimport itertools\r\n\r\nfrom .lr_sche"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/solver/lr_scheduler.py",
    "chars": 5963,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom bisect import bisect_right\r\n\r\nimport math\r"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/bounding_box.py",
    "chars": 12331,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\n# transpose\r\nFLIP_LEFT_RIGHT = "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/boxlist_ops.py",
    "chars": 5764,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\nfrom .bounding_box import BoxLi"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/image_list.py",
    "chars": 2488,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nfrom __future__ import division\r\n\r\nimport torch"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/keypoint.py",
    "chars": 7555,
    "preview": "import torch\r\nfrom maskrcnn_benchmark.config import cfg\r\n\r\n# transpose\r\nFLIP_LEFT_RIGHT = 0\r\nFLIP_TOP_BOTTOM = 1\r\n\r\n\r\ncl"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/structures/segmentation_mask.py",
    "chars": 7179,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport torch\r\n\r\nimport pycocotools.mask as mask"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/README.md",
    "chars": 180,
    "preview": "# Utility functions\r\n\r\nThis folder contain utility functions that are not used in the\r\ncore library, but are useful for "
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/amp.py",
    "chars": 433,
    "preview": "from contextlib import contextmanager\r\n\r\n@contextmanager\r\ndef nullcontext(enter_result=None, **kwargs):\r\n    yield enter"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/big_model_loading.py",
    "chars": 3268,
    "preview": "import numpy as np\r\nimport torch\r\nimport torch.nn as nn\r\n\r\nfrom collections import OrderedDict\r\n\r\n\r\ndef tf2th(conv_weigh"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/c2_model_loading.py",
    "chars": 8603,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport logging\r\nimport pickle\r\nfrom collections"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/checkpoint.py",
    "chars": 6280,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport logging\r\nimport os\r\n\r\nimport torch\r\n\r\nfr"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/collect_env.py",
    "chars": 352,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\r\nimport PIL\r\n\r\nfrom torch.utils.collect_env impo"
  },
  {
    "path": "GLIP/maskrcnn_benchmark/utils/comm.py",
    "chars": 4579,
    "preview": "\"\"\"\r\nThis file contains primitives for multi-gpu communication.\r\nThis is useful when doing distributed training.\r\n\"\"\"\r\n\r"
  }
]

// ... and 418 more files (download for full content)

About this extraction

This page contains the full source code of the bagh2178/SG-Nav GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 618 files (16.2 MB), approximately 4.3M tokens, and a symbol index with 4564 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.
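
Because the per-file listing above is a plain JSON array of {"path", "chars", "preview"} objects, it can be inspected or filtered before being handed to a model. The following is a minimal Python sketch, assuming the array has been saved locally under the hypothetical name manifest.json; it totals the extracted characters, groups them by top-level directory, and lists the largest files (e.g. to decide what to feed an LLM first).

import json
from collections import Counter

# Load the manifest: a JSON array of {"path", "chars", "preview"} objects.
# "manifest.json" is a hypothetical local filename, not part of the extraction.
with open("manifest.json", encoding="utf-8") as f:
    entries = json.load(f)

# Total extracted characters across the listed files.
total_chars = sum(e["chars"] for e in entries)
print(f"{len(entries)} files, {total_chars:,} chars")

# Character counts grouped by top-level directory, largest first.
by_dir = Counter()
for e in entries:
    top = e["path"].split("/", 1)[0]
    by_dir[top] += e["chars"]
for top, chars in by_dir.most_common():
    print(f"{top:40s} {chars:>10,}")

# The ten largest files by character count.
for e in sorted(entries, key=lambda e: e["chars"], reverse=True)[:10]:
    print(f"{e['chars']:>8,}  {e['path']}")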

Extracted by GitExtract, a free GitHub repo-to-text converter for AI. Built by Nikandr Surkov.
