[
  {
    "path": ".gitignore",
    "content": "*.pyc\n.ipynb_checkpoints\nlib/build\nlib/pycocotools/_mask.c\nlib/pycocotools/_mask.so\n"
  },
  {
    "path": ".gitmodules",
    "content": "[submodule \"caffe-fast-rcnn\"]\n\tpath = caffe-fast-rcnn\n\turl = https://github.com/rbgirshick/caffe-fast-rcnn.git\n\tbranch = fast-rcnn\n"
  },
  {
    "path": "LICENSE",
    "content": "Faster R-CNN\n\nThe MIT License (MIT)\n\nCopyright (c) 2015 Microsoft Corporation\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n\n************************************************************************\n\nTHIRD-PARTY SOFTWARE NOTICES AND INFORMATION\n\nThis project, Faster R-CNN, incorporates material from the project(s)\nlisted below (collectively, \"Third Party Code\").  Microsoft is not the\noriginal author of the Third Party Code.  The original copyright notice\nand license under which Microsoft received such Third Party Code are set\nout below. This Third Party Code is licensed to you under their original\nlicense terms set forth below.  
Microsoft reserves all other rights not\nexpressly granted, whether by implication, estoppel or otherwise.\n\n1.\tCaffe, (https://github.com/BVLC/caffe/)\n\nCOPYRIGHT\n\nAll contributions by the University of California:\nCopyright (c) 2014, 2015, The Regents of the University of California (Regents)\nAll rights reserved.\n\nAll other contributions:\nCopyright (c) 2014, 2015, the respective contributors\nAll rights reserved.\n\nCaffe uses a shared copyright model: each contributor holds copyright\nover their contributions to Caffe. The project versioning records all\nsuch contribution and copyright details. If a contributor wants to\nfurther mark their specific copyright on a particular contribution,\nthey should indicate their copyright solely in the commit message of\nthe change when it is committed.\n\nThe BSD 2-Clause License\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions\nare met:\n\n1. Redistributions of source code must retain the above copyright notice,\nthis list of conditions and the following disclaimer.\n\n2. Redistributions in binary form must reproduce the above copyright\nnotice, this list of conditions and the following disclaimer in the\ndocumentation and/or other materials provided with the distribution.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n\"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\nLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\nA PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\nHOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\nSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED\nTO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR\nPROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF\nLIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING\nNEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\nSOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n************END OF THIRD-PARTY SOFTWARE NOTICES AND INFORMATION**********\n"
  },
  {
    "path": "README.md",
    "content": "# py-faster-rcnn has been deprecated. Please see [Detectron](https://github.com/facebookresearch/Detectron), which includes an implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870).\n\n### Disclaimer\n\nThe official Faster R-CNN code (written in MATLAB) is available [here](https://github.com/ShaoqingRen/faster_rcnn).\nIf your goal is to reproduce the results in our NIPS 2015 paper, please use the [official code](https://github.com/ShaoqingRen/faster_rcnn).\n\nThis repository contains a Python *reimplementation* of the MATLAB code.\nThis Python implementation is built on a fork of [Fast R-CNN](https://github.com/rbgirshick/fast-rcnn).\nThere are slight differences between the two implementations.\nIn particular, this Python port\n - is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)\n - gives similar, but not exactly the same, mAP as the MATLAB version\n - is *not compatible* with models trained using the MATLAB code due to the minor implementation differences\n - **includes approximate joint training** that is 1.5x faster than alternating optimization (for VGG16) -- see these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more information\n\n# *Faster* R-CNN: Towards Real-Time Object Detection with Region Proposal Networks\n\nBy Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)\n\nThis Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.\n\nPlease see the official [README.md](https://github.com/ShaoqingRen/faster_rcnn/blob/master/README.md) for more details.\n\nFaster R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1506.01497) and was subsequently published in NIPS 2015.\n\n### License\n\nFaster R-CNN is released under the MIT License (refer to the LICENSE file for details).\n\n### Citing Faster R-CNN\n\nIf 
you find Faster R-CNN useful in your research, please consider citing:\n\n    @inproceedings{renNIPS15fasterrcnn,\n        Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},\n        Title = {Faster {R-CNN}: Towards Real-Time Object Detection\n                 with Region Proposal Networks},\n        Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},\n        Year = {2015}\n    }\n\n### Contents\n1. [Requirements: software](#requirements-software)\n2. [Requirements: hardware](#requirements-hardware)\n3. [Basic installation](#installation-sufficient-for-the-demo)\n4. [Demo](#demo)\n5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)\n6. [Usage](#usage)\n\n### Requirements: software\n\n**NOTE** If you are having issues compiling and you are using a recent version of CUDA/cuDNN, please consult [this issue](https://github.com/rbgirshick/py-faster-rcnn/issues/509?_pjax=%23js-repo-pjax-container#issuecomment-284133868) for a workaround\n\n1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))\n\n  **Note:** Caffe *must* be built with support for Python layers!\n\n  ```make\n  # In your Makefile.config, make sure to have this line uncommented\n  WITH_PYTHON_LAYER := 1\n  # Unrelatedly, it's also recommended that you use CUDNN\n  USE_CUDNN := 1\n  ```\n\n  You can download my [Makefile.config](https://dl.dropboxusercontent.com/s/6joa55k64xo2h68/Makefile.config?dl=0) for reference.\n2. Python packages you might not have: `cython`, `python-opencv`, `easydict`\n3. [Optional] MATLAB is required for **official** PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.\n\n### Requirements: hardware\n\n1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices\n2. 
For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)\n3. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)\n\n### Installation (sufficient for the demo)\n\n1. Clone the Faster R-CNN repository\n  ```Shell\n  # Make sure to clone with --recursive\n  git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git\n  ```\n\n2. We'll call the directory that you cloned Faster R-CNN into `FRCN_ROOT`\n\n   *Ignore notes 1 and 2 if you followed step 1 above.*\n\n   **Note 1:** If you didn't clone Faster R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:\n    ```Shell\n    git submodule update --init --recursive\n    ```\n    **Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `faster-rcnn` branch (or equivalent detached state). This will happen automatically *if you followed step 1 instructions*.\n\n3. Build the Cython modules\n    ```Shell\n    cd $FRCN_ROOT/lib\n    make\n    ```\n\n4. Build Caffe and pycaffe\n    ```Shell\n    cd $FRCN_ROOT/caffe-fast-rcnn\n    # Now follow the Caffe installation instructions here:\n    #   http://caffe.berkeleyvision.org/installation.html\n\n    # If you're experienced with Caffe and have all of the requirements installed\n    # and your Makefile.config in place, then simply do:\n    make -j8 && make pycaffe\n    ```\n\n5. Download pre-computed Faster R-CNN detectors\n    ```Shell\n    cd $FRCN_ROOT\n    ./data/scripts/fetch_faster_rcnn_models.sh\n    ```\n\n    This will populate the `$FRCN_ROOT/data` folder with `faster_rcnn_models`. 
See `data/README.md` for details.\n    These models were trained on VOC 2007 trainval.\n\n### Demo\n\n*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.\n\nTo run the demo\n```Shell\ncd $FRCN_ROOT\n./tools/demo.py\n```\nThe demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.\n\n### Beyond the demo: installation for training and testing models\n1. Download the training, validation, test data and VOCdevkit\n\n\t```Shell\n\twget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar\n\twget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar\n\twget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar\n\t```\n\n2. Extract all of these tars into one directory named `VOCdevkit`\n\n\t```Shell\n\ttar xvf VOCtrainval_06-Nov-2007.tar\n\ttar xvf VOCtest_06-Nov-2007.tar\n\ttar xvf VOCdevkit_08-Jun-2007.tar\n\t```\n\n3. It should have this basic structure\n\n\t```Shell\n  \t$VOCdevkit/                           # development kit\n  \t$VOCdevkit/VOCcode/                   # VOC utility code\n  \t$VOCdevkit/VOC2007                    # image sets, annotations, etc.\n  \t# ... and several other directories ...\n  \t```\n\n4. Create symlinks for the PASCAL VOC dataset\n\n\t```Shell\n    cd $FRCN_ROOT/data\n    ln -s $VOCdevkit VOCdevkit2007\n    ```\n    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.\n5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012\n6. [Optional] If you want to use COCO, please see some notes under `data/README.md`\n7. 
Follow the next section to download pre-trained ImageNet models\n\n### Download pre-trained ImageNet models\n\nPre-trained ImageNet models can be downloaded for the two networks described in the paper: ZF and VGG16.\n\n```Shell\ncd $FRCN_ROOT\n./data/scripts/fetch_imagenet_models.sh\n```\nVGG16 comes from the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but is provided here for your convenience.\nZF was trained at MSRA.\n\n### Usage\n\nTo train and test a Faster R-CNN detector using the **alternating optimization** algorithm from our NIPS 2015 paper, use `experiments/scripts/faster_rcnn_alt_opt.sh`.\nOutput is written underneath `$FRCN_ROOT/output`.\n\n```Shell\ncd $FRCN_ROOT\n./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [DATASET] [--set ...]\n# GPU_ID is the GPU you want to train on\n# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use\n# DATASET is pascal_voc (the only dataset this script supports)\n# --set ... allows you to specify fast_rcnn.config options, e.g.\n#   --set EXP_DIR seed_rng1701 RNG_SEED 1701\n```\n\n(\"alt opt\" refers to the alternating optimization training algorithm described in the NIPS paper.)\n\nTo train and test a Faster R-CNN detector using the **approximate joint training** method, use `experiments/scripts/faster_rcnn_end2end.sh`.\nOutput is written underneath `$FRCN_ROOT/output`.\n\n```Shell\ncd $FRCN_ROOT\n./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [DATASET] [--set ...]\n# GPU_ID is the GPU you want to train on\n# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use\n# DATASET in {pascal_voc, coco} is the dataset to train and test on\n# --set ... allows you to specify fast_rcnn.config options, e.g.\n#   --set EXP_DIR seed_rng1701 RNG_SEED 1701\n```\n\nThis method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. 
See these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more details.\n\nArtifacts generated by the scripts in `tools` are written underneath `$FRCN_ROOT/output`.\n\nTrained Fast R-CNN networks are saved under:\n\n```\noutput/<experiment directory>/<dataset name>/\n```\n\nTest outputs are saved under:\n\n```\noutput/<experiment directory>/<dataset name>/<network snapshot name>/\n```\n"
  },
  {
    "path": "data/.gitignore",
    "content": "selective_search*\nimagenet_models*\nfast_rcnn_models*\nVOCdevkit*\ncache\n"
  },
  {
    "path": "data/README.md",
    "content": "This directory holds (*after you download them*):\n- Caffe models pre-trained on ImageNet\n- Faster R-CNN models\n- Symlinks to datasets\n\nTo download Caffe models (ZF, VGG16) pre-trained on ImageNet, run:\n\n```\n./data/scripts/fetch_imagenet_models.sh\n```\n\nThis script will populate `data/imagenet_models`.\n\nTo download Faster R-CNN models trained on VOC 2007, run:\n\n```\n./data/scripts/fetch_faster_rcnn_models.sh\n```\n\nThis script will populate `data/faster_rcnn_models`.\n\nIn order to train and test with PASCAL VOC, you will need to establish symlinks.\nFrom the `data` directory (`cd data`):\n\n```\n# For VOC 2007\nln -s /your/path/to/VOC2007/VOCdevkit VOCdevkit2007\n\n# For VOC 2012\nln -s /your/path/to/VOC2012/VOCdevkit VOCdevkit2012\n```\n\nInstall the MS COCO dataset at /path/to/coco\n\n```\nln -s /path/to/coco coco\n```\n\nFor COCO with Fast R-CNN, place object proposals under `coco_proposals` (inside\nthe `data` directory). You can obtain proposals on COCO from Jan Hosang at\nhttps://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/object-recognition-and-scene-understanding/how-good-are-detection-proposals-really/.\nFor COCO, using MCG is recommended over selective search. MCG boxes can be downloaded\nfrom http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/.\nUse the tool `lib/datasets/tools/mcg_munge.py` to convert the downloaded MCG data\ninto the same file layout as those from Jan Hosang.\n\nSince you'll likely be experimenting with multiple installs of Fast/er R-CNN in\nparallel, you'll probably want to keep all of this data in a shared place and\nuse symlinks. 
On my system I create the following symlinks inside `data`:\n\n```\n# data/cache holds various outputs created by the datasets package\nln -s /data/fast_rcnn_shared/cache\n\n# move the imagenet_models to a shared location and symlink to them\nln -s /data/fast_rcnn_shared/imagenet_models\n\n# move the selective search data to a shared location and symlink to it\n# (only applicable to Fast R-CNN training)\nln -s /data/fast_rcnn_shared/selective_search_data\n\nln -s /data/VOC2007/VOCdevkit VOCdevkit2007\nln -s /data/VOC2012/VOCdevkit VOCdevkit2012\n```\n\nAnnotations for the 5k image 'minival' subset of COCO val2014 that I like to use\ncan be found at https://dl.dropboxusercontent.com/s/o43o90bna78omob/instances_minival2014.json.zip?dl=0.\nAnnotations for COCO val2014 (set) minus minival (~35k images) can be found at\nhttps://dl.dropboxusercontent.com/s/s3tw5zcg7395368/instances_valminusminival2014.json.zip?dl=0.\n"
  },
  {
    "path": "data/pylintrc",
    "content": "[TYPECHECK]\n\nignored-modules = numpy, numpy.random, cv2\n"
  },
  {
    "path": "data/scripts/fetch_faster_rcnn_models.sh",
    "content": "#!/bin/bash\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )/../\" && pwd )\"\ncd $DIR\n\nFILE=faster_rcnn_models.tgz\nURL=https://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz?dl=0\nCHECKSUM=ac116844f66aefe29587214272054668\n\nif [ -f $FILE ]; then\n  echo \"File already exists. Checking md5...\"\n  os=`uname -s`\n  if [ \"$os\" = \"Linux\" ]; then\n    checksum=`md5sum $FILE | awk '{ print $1 }'`\n  elif [ \"$os\" = \"Darwin\" ]; then\n    checksum=`cat $FILE | md5`\n  fi\n  if [ \"$checksum\" = \"$CHECKSUM\" ]; then\n    echo \"Checksum is correct. No need to download.\"\n    exit 0\n  else\n    echo \"Checksum is incorrect. Need to download again.\"\n  fi\nfi\n\necho \"Downloading Faster R-CNN demo models (695M)...\"\n\nwget $URL -O $FILE\n\necho \"Unzipping...\"\n\ntar zxvf $FILE\n\necho \"Done. Please run this command again to verify that checksum = $CHECKSUM.\"\n"
  },
  {
    "path": "data/scripts/fetch_imagenet_models.sh",
    "content": "#!/bin/bash\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )/../\" && pwd )\"\ncd $DIR\n\nFILE=imagenet_models.tgz\nURL=https://dl.dropbox.com/s/gstw7122padlf0l/imagenet_models.tgz?dl=0\nCHECKSUM=ed34ca912d6782edfb673a8c3a0bda6d\n\nif [ -f $FILE ]; then\n  echo \"File already exists. Checking md5...\"\n  os=`uname -s`\n  if [ \"$os\" = \"Linux\" ]; then\n    checksum=`md5sum $FILE | awk '{ print $1 }'`\n  elif [ \"$os\" = \"Darwin\" ]; then\n    checksum=`cat $FILE | md5`\n  fi\n  if [ \"$checksum\" = \"$CHECKSUM\" ]; then\n    echo \"Checksum is correct. No need to download.\"\n    exit 0\n  else\n    echo \"Checksum is incorrect. Need to download again.\"\n  fi\nfi\n\necho \"Downloading pretrained ImageNet models (1G)...\"\n\nwget $URL -O $FILE\n\necho \"Unzipping...\"\n\ntar zxvf $FILE\n\necho \"Done. Please run this command again to verify that checksum = $CHECKSUM.\"\n"
  },
  {
    "path": "data/scripts/fetch_selective_search_data.sh",
    "content": "#!/bin/bash\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )/../\" && pwd )\"\ncd $DIR\n\nFILE=selective_search_data.tgz\nURL=https://dl.dropboxusercontent.com/s/orrt7o6bp6ae0tc/selective_search_data.tgz?dl=0\nCHECKSUM=7078c1db87a7851b31966b96774cd9b9\n\nif [ -f $FILE ]; then\n  echo \"File already exists. Checking md5...\"\n  os=`uname -s`\n  if [ \"$os\" = \"Linux\" ]; then\n    checksum=`md5sum $FILE | awk '{ print $1 }'`\n  elif [ \"$os\" = \"Darwin\" ]; then\n    checksum=`cat $FILE | md5`\n  fi\n  if [ \"$checksum\" = \"$CHECKSUM\" ]; then\n    echo \"Checksum is correct. No need to download.\"\n    exit 0\n  else\n    echo \"Checksum is incorrect. Need to download again.\"\n  fi\nfi\n\necho \"Downloading precomputed selective search boxes (0.5G)...\"\n\nwget $URL -O $FILE\n\necho \"Unzipping...\"\n\ntar zxvf $FILE\n\necho \"Done. Please run this command again to verify that checksum = $CHECKSUM.\"\n"
  },
  {
    "path": "experiments/README.md",
    "content": "Scripts are under `experiments/scripts`.\n\nEach script saves a log file under `experiments/logs`.\n\nConfiguration override files used in the experiments are stored in `experiments/cfgs`.\n"
  },
  {
    "path": "experiments/cfgs/faster_rcnn_alt_opt.yml",
    "content": "EXP_DIR: faster_rcnn_alt_opt\nTRAIN:\n  BG_THRESH_LO: 0.0\nTEST:\n  HAS_RPN: True\n"
  },
  {
    "path": "experiments/cfgs/faster_rcnn_end2end.yml",
    "content": "EXP_DIR: faster_rcnn_end2end\nTRAIN:\n  HAS_RPN: True\n  IMS_PER_BATCH: 1\n  BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True\n  RPN_POSITIVE_OVERLAP: 0.7\n  RPN_BATCHSIZE: 256\n  PROPOSAL_METHOD: gt\n  BG_THRESH_LO: 0.0\nTEST:\n  HAS_RPN: True\n"
  },
  {
    "path": "experiments/logs/.gitignore",
    "content": "*.txt*\n"
  },
  {
    "path": "experiments/scripts/fast_rcnn.sh",
    "content": "#!/bin/bash\n# Usage:\n# ./experiments/scripts/fast_rcnn.sh GPU NET DATASET [options args to {train,test}_net.py]\n# DATASET is either pascal_voc or coco.\n#\n# Example:\n# ./experiments/scripts/fast_rcnn.sh 0 VGG_CNN_M_1024 pascal_voc \\\n#   --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES \"[400, 500, 600, 700]\"\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nNET=$2\nNET_lc=${NET,,}\nDATASET=$3\n\narray=( $@ )\nlen=${#array[@]}\nEXTRA_ARGS=${array[@]:3:$len}\nEXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}\n\ncase $DATASET in\n  pascal_voc)\n    TRAIN_IMDB=\"voc_2007_trainval\"\n    TEST_IMDB=\"voc_2007_test\"\n    PT_DIR=\"pascal_voc\"\n    ITERS=40000\n    ;;\n  coco)\n    TRAIN_IMDB=\"coco_2014_train\"\n    TEST_IMDB=\"coco_2014_minival\"\n    PT_DIR=\"coco\"\n    ITERS=280000\n    ;;\n  *)\n    echo \"No dataset given\"\n    exit 1\n    ;;\nesac\n\nLOG=\"experiments/logs/fast_rcnn_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`\"\nexec &> >(tee -a \"$LOG\")\necho Logging output to \"$LOG\"\n\ntime ./tools/train_net.py --gpu ${GPU_ID} \\\n  --solver models/${PT_DIR}/${NET}/fast_rcnn/solver.prototxt \\\n  --weights data/imagenet_models/${NET}.v2.caffemodel \\\n  --imdb ${TRAIN_IMDB} \\\n  --iters ${ITERS} \\\n  ${EXTRA_ARGS}\n\nset +x\nNET_FINAL=`grep -B 1 \"done solving\" ${LOG} | grep \"Wrote snapshot\" | awk '{print $4}'`\nset -x\n\ntime ./tools/test_net.py --gpu ${GPU_ID} \\\n  --def models/${PT_DIR}/${NET}/fast_rcnn/test.prototxt \\\n  --net ${NET_FINAL} \\\n  --imdb ${TEST_IMDB} \\\n  ${EXTRA_ARGS}\n"
  },
  {
    "path": "experiments/scripts/faster_rcnn_alt_opt.sh",
    "content": "#!/bin/bash\n# Usage:\n# ./experiments/scripts/faster_rcnn_alt_opt.sh GPU NET DATASET [options args to {train,test}_net.py]\n# DATASET is only pascal_voc for now\n#\n# Example:\n# ./experiments/scripts/faster_rcnn_alt_opt.sh 0 VGG_CNN_M_1024 pascal_voc \\\n#   --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES \"[400, 500, 600, 700]\"\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nNET=$2\nNET_lc=${NET,,}\nDATASET=$3\n\narray=( $@ )\nlen=${#array[@]}\nEXTRA_ARGS=${array[@]:3:$len}\nEXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}\n\ncase $DATASET in\n  pascal_voc)\n    TRAIN_IMDB=\"voc_2007_trainval\"\n    TEST_IMDB=\"voc_2007_test\"\n    PT_DIR=\"pascal_voc\"\n    ITERS=40000\n    ;;\n  coco)\n    echo \"Not implemented: use experiments/scripts/faster_rcnn_end2end.sh for coco\"\n    exit 1\n    ;;\n  *)\n    echo \"No dataset given\"\n    exit 1\n    ;;\nesac\n\nLOG=\"experiments/logs/faster_rcnn_alt_opt_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`\"\nexec &> >(tee -a \"$LOG\")\necho Logging output to \"$LOG\"\n\ntime ./tools/train_faster_rcnn_alt_opt.py --gpu ${GPU_ID} \\\n  --net_name ${NET} \\\n  --weights data/imagenet_models/${NET}.v2.caffemodel \\\n  --imdb ${TRAIN_IMDB} \\\n  --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \\\n  ${EXTRA_ARGS}\n\nset +x\nNET_FINAL=`grep \"Final model:\" ${LOG} | awk '{print $3}'`\nset -x\n\ntime ./tools/test_net.py --gpu ${GPU_ID} \\\n  --def models/${PT_DIR}/${NET}/faster_rcnn_alt_opt/faster_rcnn_test.pt \\\n  --net ${NET_FINAL} \\\n  --imdb ${TEST_IMDB} \\\n  --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \\\n  ${EXTRA_ARGS}\n"
  },
  {
    "path": "experiments/scripts/faster_rcnn_end2end.sh",
    "content": "#!/bin/bash\n# Usage:\n# ./experiments/scripts/faster_rcnn_end2end.sh GPU NET DATASET [options args to {train,test}_net.py]\n# DATASET is either pascal_voc or coco.\n#\n# Example:\n# ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc \\\n#   --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES \"[400, 500, 600, 700]\"\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nNET=$2\nNET_lc=${NET,,}\nDATASET=$3\n\narray=( $@ )\nlen=${#array[@]}\nEXTRA_ARGS=${array[@]:3:$len}\nEXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}\n\ncase $DATASET in\n  pascal_voc)\n    TRAIN_IMDB=\"voc_2007_trainval\"\n    TEST_IMDB=\"voc_2007_test\"\n    PT_DIR=\"pascal_voc\"\n    ITERS=70000\n    ;;\n  coco)\n    # This is a very long and slow training schedule\n    # You can probably use fewer iterations and reduce the\n    # time to the LR drop (set in the solver to 350,000 iterations).\n    TRAIN_IMDB=\"coco_2014_train\"\n    TEST_IMDB=\"coco_2014_minival\"\n    PT_DIR=\"coco\"\n    ITERS=490000\n    ;;\n  *)\n    echo \"No dataset given\"\n    exit 1\n    ;;\nesac\n\nLOG=\"experiments/logs/faster_rcnn_end2end_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`\"\nexec &> >(tee -a \"$LOG\")\necho Logging output to \"$LOG\"\n\ntime ./tools/train_net.py --gpu ${GPU_ID} \\\n  --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt \\\n  --weights data/imagenet_models/${NET}.v2.caffemodel \\\n  --imdb ${TRAIN_IMDB} \\\n  --iters ${ITERS} \\\n  --cfg experiments/cfgs/faster_rcnn_end2end.yml \\\n  ${EXTRA_ARGS}\n\nset +x\nNET_FINAL=`grep -B 1 \"done solving\" ${LOG} | grep \"Wrote snapshot\" | awk '{print $4}'`\nset -x\n\ntime ./tools/test_net.py --gpu ${GPU_ID} \\\n  --def models/${PT_DIR}/${NET}/faster_rcnn_end2end/test.prototxt \\\n  --net ${NET_FINAL} \\\n  --imdb ${TEST_IMDB} \\\n  --cfg experiments/cfgs/faster_rcnn_end2end.yml \\\n  ${EXTRA_ARGS}\n"
  },
  {
    "path": "lib/Makefile",
    "content": "all:\n\tpython setup.py build_ext --inplace\n\trm -rf build\n"
  },
  {
    "path": "lib/datasets/VOCdevkit-matlab-wrapper/get_voc_opts.m",
    "content": "function VOCopts = get_voc_opts(path)\n\ntmp = pwd;\ncd(path);\ntry\n  addpath('VOCcode');\n  VOCinit;\ncatch\n  rmpath('VOCcode');\n  cd(tmp);\n  error(sprintf('VOCcode directory not found under %s', path));\nend\nrmpath('VOCcode');\ncd(tmp);\n"
  },
  {
    "path": "lib/datasets/VOCdevkit-matlab-wrapper/voc_eval.m",
    "content": "function res = voc_eval(path, comp_id, test_set, output_dir)\n\nVOCopts = get_voc_opts(path);\nVOCopts.testset = test_set;\n\nfor i = 1:length(VOCopts.classes)\n  cls = VOCopts.classes{i};\n  res(i) = voc_eval_cls(cls, VOCopts, comp_id, output_dir);\nend\n\nfprintf('\\n~~~~~~~~~~~~~~~~~~~~\\n');\nfprintf('Results:\\n');\naps = [res(:).ap]';\nfprintf('%.1f\\n', aps * 100);\nfprintf('%.1f\\n', mean(aps) * 100);\nfprintf('~~~~~~~~~~~~~~~~~~~~\\n');\n\nfunction res = voc_eval_cls(cls, VOCopts, comp_id, output_dir)\n\ntest_set = VOCopts.testset;\nyear = VOCopts.dataset(4:end);\n\naddpath(fullfile(VOCopts.datadir, 'VOCcode'));\n\nres_fn = sprintf(VOCopts.detrespath, comp_id, cls);\n\nrecall = [];\nprec = [];\nap = 0;\nap_auc = 0;\n\ndo_eval = (str2num(year) <= 2007) | ~strcmp(test_set, 'test');\nif do_eval\n  % Bug in VOCevaldet requires that tic has been called first\n  tic;\n  [recall, prec, ap] = VOCevaldet(VOCopts, comp_id, cls, true);\n  ap_auc = xVOCap(recall, prec);\n\n  % force plot limits\n  ylim([0 1]);\n  xlim([0 1]);\n\n  print(gcf, '-djpeg', '-r0', ...\n        [output_dir '/' cls '_pr.jpg']);\nend\nfprintf('!!! %s : %.4f %.4f\\n', cls, ap, ap_auc);\n\nres.recall = recall;\nres.prec = prec;\nres.ap = ap;\nres.ap_auc = ap_auc;\n\nsave([output_dir '/' cls '_pr.mat'], ...\n     'res', 'recall', 'prec', 'ap', 'ap_auc');\n\nrmpath(fullfile(VOCopts.datadir, 'VOCcode'));\n"
  },
  {
    "path": "lib/datasets/VOCdevkit-matlab-wrapper/xVOCap.m",
    "content": "function ap = xVOCap(rec,prec)\r\n% From the PASCAL VOC 2011 devkit\r\n\r\nmrec=[0 ; rec ; 1];\r\nmpre=[0 ; prec ; 0];\r\nfor i=numel(mpre)-1:-1:1\r\n    mpre(i)=max(mpre(i),mpre(i+1));\r\nend\r\ni=find(mrec(2:end)~=mrec(1:end-1))+1;\r\nap=sum((mrec(i)-mrec(i-1)).*mpre(i));\r\n"
  },
  {
    "path": "lib/datasets/__init__.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n"
  },
  {
    "path": "lib/datasets/coco.py",
    "content": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nfrom datasets.imdb import imdb\nimport datasets.ds_utils as ds_utils\nfrom fast_rcnn.config import cfg\nimport os.path as osp\nimport sys\nimport os\nimport numpy as np\nimport scipy.sparse\nimport scipy.io as sio\nimport cPickle\nimport json\nimport uuid\n# COCO API\nfrom pycocotools.coco import COCO\nfrom pycocotools.cocoeval import COCOeval\nfrom pycocotools import mask as COCOmask\n\ndef _filter_crowd_proposals(roidb, crowd_thresh):\n    \"\"\"\n    Finds proposals that are inside crowd regions and marks them with\n    overlap = -1 (for all gt rois), which means they will be excluded from\n    training.\n    \"\"\"\n    for ix, entry in enumerate(roidb):\n        overlaps = entry['gt_overlaps'].toarray()\n        crowd_inds = np.where(overlaps.max(axis=1) == -1)[0]\n        non_gt_inds = np.where(entry['gt_classes'] == 0)[0]\n        if len(crowd_inds) == 0 or len(non_gt_inds) == 0:\n            continue\n        iscrowd = [int(True) for _ in xrange(len(crowd_inds))]\n        crowd_boxes = ds_utils.xyxy_to_xywh(entry['boxes'][crowd_inds, :])\n        non_gt_boxes = ds_utils.xyxy_to_xywh(entry['boxes'][non_gt_inds, :])\n        ious = COCOmask.iou(non_gt_boxes, crowd_boxes, iscrowd)\n        bad_inds = np.where(ious.max(axis=1) > crowd_thresh)[0]\n        overlaps[non_gt_inds[bad_inds], :] = -1\n        roidb[ix]['gt_overlaps'] = scipy.sparse.csr_matrix(overlaps)\n    return roidb\n\nclass coco(imdb):\n    def __init__(self, image_set, year):\n        imdb.__init__(self, 'coco_' + year + '_' + image_set)\n        # COCO specific config options\n        self.config = {'top_k' : 2000,\n                       'use_salt' : True,\n                       'cleanup' : True,\n                       'crowd_thresh' : 0.7,\n            
           'min_size' : 2}\n        # name, paths\n        self._year = year\n        self._image_set = image_set\n        self._data_path = osp.join(cfg.DATA_DIR, 'coco')\n        # load COCO API, classes, class <-> id mappings\n        self._COCO = COCO(self._get_ann_file())\n        cats = self._COCO.loadCats(self._COCO.getCatIds())\n        self._classes = tuple(['__background__'] + [c['name'] for c in cats])\n        self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))\n        self._class_to_coco_cat_id = dict(zip([c['name'] for c in cats],\n                                              self._COCO.getCatIds()))\n        self._image_index = self._load_image_set_index()\n        # Default to roidb handler\n        self.set_proposal_method('selective_search')\n        self.competition_mode(False)\n\n        # Some image sets are \"views\" (i.e. subsets) into others.\n        # For example, minival2014 is a random 5000 image subset of val2014.\n        # This mapping tells us where the view's images and proposals come from.\n        self._view_map = {\n            'minival2014' : 'val2014',          # 5k val2014 subset\n            'valminusminival2014' : 'val2014',  # val2014 \\setminus minival2014\n        }\n        coco_name = image_set + year  # e.g., \"val2014\"\n        self._data_name = (self._view_map[coco_name]\n                           if self._view_map.has_key(coco_name)\n                           else coco_name)\n        # Dataset splits that have ground-truth annotations (test splits\n        # do not have gt annotations)\n        self._gt_splits = ('train', 'val', 'minival')\n\n    def _get_ann_file(self):\n        prefix = 'instances' if self._image_set.find('test') == -1 \\\n                             else 'image_info'\n        return osp.join(self._data_path, 'annotations',\n                        prefix + '_' + self._image_set + self._year + '.json')\n\n    def _load_image_set_index(self):\n        \"\"\"\n        Load 
image ids.\n        \"\"\"\n        image_ids = self._COCO.getImgIds()\n        return image_ids\n\n    def _get_widths(self):\n        anns = self._COCO.loadImgs(self._image_index)\n        widths = [ann['width'] for ann in anns]\n        return widths\n\n    def image_path_at(self, i):\n        \"\"\"\n        Return the absolute path to image i in the image sequence.\n        \"\"\"\n        return self.image_path_from_index(self._image_index[i])\n\n    def image_path_from_index(self, index):\n        \"\"\"\n        Construct an image path from the image's \"index\" identifier.\n        \"\"\"\n        # Example image path for index=119993:\n        #   images/train2014/COCO_train2014_000000119993.jpg\n        file_name = ('COCO_' + self._data_name + '_' +\n                     str(index).zfill(12) + '.jpg')\n        image_path = osp.join(self._data_path, 'images',\n                              self._data_name, file_name)\n        assert osp.exists(image_path), \\\n                'Path does not exist: {}'.format(image_path)\n        return image_path\n\n    def selective_search_roidb(self):\n        return self._roidb_from_proposals('selective_search')\n\n    def edge_boxes_roidb(self):\n        return self._roidb_from_proposals('edge_boxes_AR')\n\n    def mcg_roidb(self):\n        return self._roidb_from_proposals('MCG')\n\n    def _roidb_from_proposals(self, method):\n        \"\"\"\n        Creates a roidb from pre-computed proposals of a particular methods.\n        \"\"\"\n        top_k = self.config['top_k']\n        cache_file = osp.join(self.cache_path, self.name +\n                              '_{:s}_top{:d}'.format(method, top_k) +\n                              '_roidb.pkl')\n\n        if osp.exists(cache_file):\n            with open(cache_file, 'rb') as fid:\n                roidb = cPickle.load(fid)\n            print '{:s} {:s} roidb loaded from {:s}'.format(self.name, method,\n                                                            
cache_file)\n            return roidb\n\n        if self._image_set in self._gt_splits:\n            gt_roidb = self.gt_roidb()\n            method_roidb = self._load_proposals(method, gt_roidb)\n            roidb = imdb.merge_roidbs(gt_roidb, method_roidb)\n            # Make sure we don't use proposals that are contained in crowds\n            roidb = _filter_crowd_proposals(roidb, self.config['crowd_thresh'])\n        else:\n            roidb = self._load_proposals(method, None)\n        with open(cache_file, 'wb') as fid:\n            cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)\n        print 'wrote {:s} roidb to {:s}'.format(method, cache_file)\n        return roidb\n\n    def _load_proposals(self, method, gt_roidb):\n        \"\"\"\n        Load pre-computed proposals in the format provided by Jan Hosang:\n        http://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-\n          computing/research/object-recognition-and-scene-understanding/how-\n          good-are-detection-proposals-really/\n        For MCG, use boxes from http://www.eecs.berkeley.edu/Research/Projects/\n          CS/vision/grouping/mcg/ and convert the file layout using\n        lib/datasets/tools/mcg_munge.py.\n        \"\"\"\n        box_list = []\n        top_k = self.config['top_k']\n        valid_methods = [\n            'MCG',\n            'selective_search',\n            'edge_boxes_AR',\n            'edge_boxes_70']\n        assert method in valid_methods\n\n        print 'Loading {} boxes'.format(method)\n        for i, index in enumerate(self._image_index):\n            if i % 1000 == 0:\n                print '{:d} / {:d}'.format(i + 1, len(self._image_index))\n\n            box_file = osp.join(\n                cfg.DATA_DIR, 'coco_proposals', method, 'mat',\n                self._get_box_file(index))\n\n            raw_data = sio.loadmat(box_file)['boxes']\n            boxes = np.maximum(raw_data - 1, 0).astype(np.uint16)\n            if method == 'MCG':\n  
              # Boxes from the MCG website are in (y1, x1, y2, x2) order\n                boxes = boxes[:, (1, 0, 3, 2)]\n            # Remove duplicate boxes and very small boxes and then take top k\n            keep = ds_utils.unique_boxes(boxes)\n            boxes = boxes[keep, :]\n            keep = ds_utils.filter_small_boxes(boxes, self.config['min_size'])\n            boxes = boxes[keep, :]\n            boxes = boxes[:top_k, :]\n            box_list.append(boxes)\n            # Sanity check\n            im_ann = self._COCO.loadImgs(index)[0]\n            width = im_ann['width']\n            height = im_ann['height']\n            ds_utils.validate_boxes(boxes, width=width, height=height)\n        return self.create_roidb_from_box_list(box_list, gt_roidb)\n\n    def gt_roidb(self):\n        \"\"\"\n        Return the database of ground-truth regions of interest.\n        This function loads/saves from/to a cache file to speed up future calls.\n        \"\"\"\n        cache_file = osp.join(self.cache_path, self.name + '_gt_roidb.pkl')\n        if osp.exists(cache_file):\n            with open(cache_file, 'rb') as fid:\n                roidb = cPickle.load(fid)\n            print '{} gt roidb loaded from {}'.format(self.name, cache_file)\n            return roidb\n\n        gt_roidb = [self._load_coco_annotation(index)\n                    for index in self._image_index]\n\n        with open(cache_file, 'wb') as fid:\n            cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)\n        print 'wrote gt roidb to {}'.format(cache_file)\n        return gt_roidb\n\n    def _load_coco_annotation(self, index):\n        \"\"\"\n        Loads COCO bounding-box instance annotations. Crowd instances are\n        handled by marking their overlaps (with all categories) to -1. 
This\n        overlap value means that crowd \"instances\" are excluded from training.\n        \"\"\"\n        im_ann = self._COCO.loadImgs(index)[0]\n        width = im_ann['width']\n        height = im_ann['height']\n\n        annIds = self._COCO.getAnnIds(imgIds=index, iscrowd=None)\n        objs = self._COCO.loadAnns(annIds)\n        # Sanitize bboxes -- some are invalid\n        valid_objs = []\n        for obj in objs:\n            x1 = np.max((0, obj['bbox'][0]))\n            y1 = np.max((0, obj['bbox'][1]))\n            x2 = np.min((width - 1, x1 + np.max((0, obj['bbox'][2] - 1))))\n            y2 = np.min((height - 1, y1 + np.max((0, obj['bbox'][3] - 1))))\n            if obj['area'] > 0 and x2 >= x1 and y2 >= y1:\n                obj['clean_bbox'] = [x1, y1, x2, y2]\n                valid_objs.append(obj)\n        objs = valid_objs\n        num_objs = len(objs)\n\n        boxes = np.zeros((num_objs, 4), dtype=np.uint16)\n        gt_classes = np.zeros((num_objs), dtype=np.int32)\n        overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)\n        seg_areas = np.zeros((num_objs), dtype=np.float32)\n\n        # Lookup table to map from COCO category ids to our internal class\n        # indices\n        coco_cat_id_to_class_ind = dict([(self._class_to_coco_cat_id[cls],\n                                          self._class_to_ind[cls])\n                                         for cls in self._classes[1:]])\n\n        for ix, obj in enumerate(objs):\n            cls = coco_cat_id_to_class_ind[obj['category_id']]\n            boxes[ix, :] = obj['clean_bbox']\n            gt_classes[ix] = cls\n            seg_areas[ix] = obj['area']\n            if obj['iscrowd']:\n                # Set overlap to -1 for all classes for crowd objects\n                # so they will be excluded during training\n                overlaps[ix, :] = -1.0\n            else:\n                overlaps[ix, cls] = 1.0\n\n        ds_utils.validate_boxes(boxes, 
width=width, height=height)\n        overlaps = scipy.sparse.csr_matrix(overlaps)\n        return {'boxes' : boxes,\n                'gt_classes': gt_classes,\n                'gt_overlaps' : overlaps,\n                'flipped' : False,\n                'seg_areas' : seg_areas}\n\n    def _get_box_file(self, index):\n        # first 14 chars / first 22 chars / all chars + .mat\n        # COCO_val2014_0/COCO_val2014_000000447/COCO_val2014_000000447991.mat\n        file_name = ('COCO_' + self._data_name +\n                     '_' + str(index).zfill(12) + '.mat')\n        return osp.join(file_name[:14], file_name[:22], file_name)\n\n    def _print_detection_eval_metrics(self, coco_eval):\n        IoU_lo_thresh = 0.5\n        IoU_hi_thresh = 0.95\n        def _get_thr_ind(coco_eval, thr):\n            ind = np.where((coco_eval.params.iouThrs > thr - 1e-5) &\n                           (coco_eval.params.iouThrs < thr + 1e-5))[0][0]\n            iou_thr = coco_eval.params.iouThrs[ind]\n            assert np.isclose(iou_thr, thr)\n            return ind\n\n        ind_lo = _get_thr_ind(coco_eval, IoU_lo_thresh)\n        ind_hi = _get_thr_ind(coco_eval, IoU_hi_thresh)\n        # precision has dims (iou, recall, cls, area range, max dets)\n        # area range index 0: all area ranges\n        # max dets index 2: 100 per image\n        precision = \\\n            coco_eval.eval['precision'][ind_lo:(ind_hi + 1), :, :, 0, 2]\n        ap_default = np.mean(precision[precision > -1])\n        print ('~~~~ Mean and per-category AP @ IoU=[{:.2f},{:.2f}] '\n               '~~~~').format(IoU_lo_thresh, IoU_hi_thresh)\n        print '{:.1f}'.format(100 * ap_default)\n        for cls_ind, cls in enumerate(self.classes):\n            if cls == '__background__':\n                continue\n            # minus 1 because of __background__\n            precision = coco_eval.eval['precision'][ind_lo:(ind_hi + 1), :, cls_ind - 1, 0, 2]\n            ap = np.mean(precision[precision > -1])\n  
          print '{:.1f}'.format(100 * ap)\n\n        print '~~~~ Summary metrics ~~~~'\n        coco_eval.summarize()\n\n    def _do_detection_eval(self, res_file, output_dir):\n        ann_type = 'bbox'\n        coco_dt = self._COCO.loadRes(res_file)\n        coco_eval = COCOeval(self._COCO, coco_dt)\n        coco_eval.params.useSegm = (ann_type == 'segm')\n        coco_eval.evaluate()\n        coco_eval.accumulate()\n        self._print_detection_eval_metrics(coco_eval)\n        eval_file = osp.join(output_dir, 'detection_results.pkl')\n        with open(eval_file, 'wb') as fid:\n            cPickle.dump(coco_eval, fid, cPickle.HIGHEST_PROTOCOL)\n        print 'Wrote COCO eval results to: {}'.format(eval_file)\n\n    def _coco_results_one_category(self, boxes, cat_id):\n        results = []\n        for im_ind, index in enumerate(self.image_index):\n            dets = boxes[im_ind].astype(np.float)\n            if dets == []:\n                continue\n            scores = dets[:, -1]\n            xs = dets[:, 0]\n            ys = dets[:, 1]\n            ws = dets[:, 2] - xs + 1\n            hs = dets[:, 3] - ys + 1\n            results.extend(\n              [{'image_id' : index,\n                'category_id' : cat_id,\n                'bbox' : [xs[k], ys[k], ws[k], hs[k]],\n                'score' : scores[k]} for k in xrange(dets.shape[0])])\n        return results\n\n    def _write_coco_results_file(self, all_boxes, res_file):\n        # [{\"image_id\": 42,\n        #   \"category_id\": 18,\n        #   \"bbox\": [258.15,41.29,348.26,243.78],\n        #   \"score\": 0.236}, ...]\n        results = []\n        for cls_ind, cls in enumerate(self.classes):\n            if cls == '__background__':\n                continue\n            print 'Collecting {} results ({:d}/{:d})'.format(cls, cls_ind,\n                                                          self.num_classes - 1)\n            coco_cat_id = self._class_to_coco_cat_id[cls]\n            
results.extend(self._coco_results_one_category(all_boxes[cls_ind],\n                                                           coco_cat_id))\n        print 'Writing results json to {}'.format(res_file)\n        with open(res_file, 'w') as fid:\n            json.dump(results, fid)\n\n    def evaluate_detections(self, all_boxes, output_dir):\n        res_file = osp.join(output_dir, ('detections_' +\n                                         self._image_set +\n                                         self._year +\n                                         '_results'))\n        if self.config['use_salt']:\n            res_file += '_{}'.format(str(uuid.uuid4()))\n        res_file += '.json'\n        self._write_coco_results_file(all_boxes, res_file)\n        # Only do evaluation on non-test sets\n        if self._image_set.find('test') == -1:\n            self._do_detection_eval(res_file, output_dir)\n        # Optionally cleanup results json file\n        if self.config['cleanup']:\n            os.remove(res_file)\n\n    def competition_mode(self, on):\n        if on:\n            self.config['use_salt'] = False\n            self.config['cleanup'] = False\n        else:\n            self.config['use_salt'] = True\n            self.config['cleanup'] = True\n"
  },
  {
    "path": "lib/datasets/ds_utils.py",
    "content": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\n\ndef unique_boxes(boxes, scale=1.0):\n    \"\"\"Return indices of unique boxes.\"\"\"\n    v = np.array([1, 1e3, 1e6, 1e9])\n    hashes = np.round(boxes * scale).dot(v)\n    _, index = np.unique(hashes, return_index=True)\n    return np.sort(index)\n\ndef xywh_to_xyxy(boxes):\n    \"\"\"Convert [x y w h] box format to [x1 y1 x2 y2] format.\"\"\"\n    return np.hstack((boxes[:, 0:2], boxes[:, 0:2] + boxes[:, 2:4] - 1))\n\ndef xyxy_to_xywh(boxes):\n    \"\"\"Convert [x1 y1 x2 y2] box format to [x y w h] format.\"\"\"\n    return np.hstack((boxes[:, 0:2], boxes[:, 2:4] - boxes[:, 0:2] + 1))\n\ndef validate_boxes(boxes, width=0, height=0):\n    \"\"\"Check that a set of boxes is valid.\"\"\"\n    x1 = boxes[:, 0]\n    y1 = boxes[:, 1]\n    x2 = boxes[:, 2]\n    y2 = boxes[:, 3]\n    assert (x1 >= 0).all()\n    assert (y1 >= 0).all()\n    assert (x2 >= x1).all()\n    assert (y2 >= y1).all()\n    assert (x2 < width).all()\n    assert (y2 < height).all()\n\ndef filter_small_boxes(boxes, min_size):\n    \"\"\"Return indices of boxes with x2 - x1 and y2 - y1 both >= min_size.\"\"\"\n    w = boxes[:, 2] - boxes[:, 0]\n    h = boxes[:, 3] - boxes[:, 1]\n    # Apply the same inclusive threshold to width and height\n    keep = np.where((w >= min_size) & (h >= min_size))[0]\n    return keep\n"
  },
  {
    "path": "lib/datasets/factory.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Factory method for easily getting imdbs by name.\"\"\"\n\n__sets = {}\n\nfrom datasets.pascal_voc import pascal_voc\nfrom datasets.coco import coco\nimport numpy as np\n\n# Set up voc_<year>_<split> using selective search \"fast\" mode\nfor year in ['2007', '2012']:\n    for split in ['train', 'val', 'trainval', 'test']:\n        name = 'voc_{}_{}'.format(year, split)\n        __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))\n\n# Set up coco_2014_<split>\nfor year in ['2014']:\n    for split in ['train', 'val', 'minival', 'valminusminival']:\n        name = 'coco_{}_{}'.format(year, split)\n        __sets[name] = (lambda split=split, year=year: coco(split, year))\n\n# Set up coco_2015_<split>\nfor year in ['2015']:\n    for split in ['test', 'test-dev']:\n        name = 'coco_{}_{}'.format(year, split)\n        __sets[name] = (lambda split=split, year=year: coco(split, year))\n\ndef get_imdb(name):\n    \"\"\"Get an imdb (image database) by name.\"\"\"\n    if not __sets.has_key(name):\n        raise KeyError('Unknown dataset: {}'.format(name))\n    return __sets[name]()\n\ndef list_imdbs():\n    \"\"\"List all registered imdbs.\"\"\"\n    return __sets.keys()\n"
  },
  {
    "path": "lib/datasets/imdb.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport os\nimport os.path as osp\nimport PIL\nfrom utils.cython_bbox import bbox_overlaps\nimport numpy as np\nimport scipy.sparse\nfrom fast_rcnn.config import cfg\n\nclass imdb(object):\n    \"\"\"Image database.\"\"\"\n\n    def __init__(self, name):\n        self._name = name\n        self._num_classes = 0\n        self._classes = []\n        self._image_index = []\n        self._obj_proposer = 'selective_search'\n        self._roidb = None\n        self._roidb_handler = self.default_roidb\n        # Use this dict for storing dataset specific config options\n        self.config = {}\n\n    @property\n    def name(self):\n        return self._name\n\n    @property\n    def num_classes(self):\n        return len(self._classes)\n\n    @property\n    def classes(self):\n        return self._classes\n\n    @property\n    def image_index(self):\n        return self._image_index\n\n    @property\n    def roidb_handler(self):\n        return self._roidb_handler\n\n    @roidb_handler.setter\n    def roidb_handler(self, val):\n        self._roidb_handler = val\n\n    def set_proposal_method(self, method):\n        method = eval('self.' 
+ method + '_roidb')\n        self.roidb_handler = method\n\n    @property\n    def roidb(self):\n        # A roidb is a list of dictionaries, each with the following keys:\n        #   boxes\n        #   gt_overlaps\n        #   gt_classes\n        #   flipped\n        if self._roidb is not None:\n            return self._roidb\n        self._roidb = self.roidb_handler()\n        return self._roidb\n\n    @property\n    def cache_path(self):\n        cache_path = osp.abspath(osp.join(cfg.DATA_DIR, 'cache'))\n        if not os.path.exists(cache_path):\n            os.makedirs(cache_path)\n        return cache_path\n\n    @property\n    def num_images(self):\n      return len(self.image_index)\n\n    def image_path_at(self, i):\n        raise NotImplementedError\n\n    def default_roidb(self):\n        raise NotImplementedError\n\n    def evaluate_detections(self, all_boxes, output_dir=None):\n        \"\"\"\n        all_boxes is a list of length number-of-classes.\n        Each list element is a list of length number-of-images.\n        Each of those list elements is either an empty list []\n        or a numpy array of detection.\n\n        all_boxes[class][image] = [] or np.array of shape #dets x 5\n        \"\"\"\n        raise NotImplementedError\n\n    def _get_widths(self):\n      return [PIL.Image.open(self.image_path_at(i)).size[0]\n              for i in xrange(self.num_images)]\n\n    def append_flipped_images(self):\n        num_images = self.num_images\n        widths = self._get_widths()\n        for i in xrange(num_images):\n            boxes = self.roidb[i]['boxes'].copy()\n            oldx1 = boxes[:, 0].copy()\n            oldx2 = boxes[:, 2].copy()\n            boxes[:, 0] = widths[i] - oldx2 - 1\n            boxes[:, 2] = widths[i] - oldx1 - 1\n            assert (boxes[:, 2] >= boxes[:, 0]).all()\n            entry = {'boxes' : boxes,\n                     'gt_overlaps' : self.roidb[i]['gt_overlaps'],\n                     'gt_classes' : 
self.roidb[i]['gt_classes'],\n                     'flipped' : True}\n            self.roidb.append(entry)\n        self._image_index = self._image_index * 2\n\n    def evaluate_recall(self, candidate_boxes=None, thresholds=None,\n                        area='all', limit=None):\n        \"\"\"Evaluate detection proposal recall metrics.\n\n        Returns:\n            results: dictionary of results with keys\n                'ar': average recall\n                'recalls': vector recalls at each IoU overlap threshold\n                'thresholds': vector of IoU overlap thresholds\n                'gt_overlaps': vector of all ground-truth overlaps\n        \"\"\"\n        # Record max overlap value for each gt box\n        # Return vector of overlap values\n        areas = { 'all': 0, 'small': 1, 'medium': 2, 'large': 3,\n                  '96-128': 4, '128-256': 5, '256-512': 6, '512-inf': 7}\n        area_ranges = [ [0**2, 1e5**2],    # all\n                        [0**2, 32**2],     # small\n                        [32**2, 96**2],    # medium\n                        [96**2, 1e5**2],   # large\n                        [96**2, 128**2],   # 96-128\n                        [128**2, 256**2],  # 128-256\n                        [256**2, 512**2],  # 256-512\n                        [512**2, 1e5**2],  # 512-inf\n                      ]\n        assert areas.has_key(area), 'unknown area range: {}'.format(area)\n        area_range = area_ranges[areas[area]]\n        gt_overlaps = np.zeros(0)\n        num_pos = 0\n        for i in xrange(self.num_images):\n            # Checking for max_overlaps == 1 avoids including crowd annotations\n            # (...pretty hacking :/)\n            max_gt_overlaps = self.roidb[i]['gt_overlaps'].toarray().max(axis=1)\n            gt_inds = np.where((self.roidb[i]['gt_classes'] > 0) &\n                               (max_gt_overlaps == 1))[0]\n            gt_boxes = self.roidb[i]['boxes'][gt_inds, :]\n            gt_areas = 
self.roidb[i]['seg_areas'][gt_inds]\n            valid_gt_inds = np.where((gt_areas >= area_range[0]) &\n                                     (gt_areas <= area_range[1]))[0]\n            gt_boxes = gt_boxes[valid_gt_inds, :]\n            num_pos += len(valid_gt_inds)\n\n            if candidate_boxes is None:\n                # If candidate_boxes is not supplied, the default is to use the\n                # non-ground-truth boxes from this roidb\n                non_gt_inds = np.where(self.roidb[i]['gt_classes'] == 0)[0]\n                boxes = self.roidb[i]['boxes'][non_gt_inds, :]\n            else:\n                boxes = candidate_boxes[i]\n            if boxes.shape[0] == 0:\n                continue\n            if limit is not None and boxes.shape[0] > limit:\n                boxes = boxes[:limit, :]\n\n            overlaps = bbox_overlaps(boxes.astype(np.float),\n                                     gt_boxes.astype(np.float))\n\n            _gt_overlaps = np.zeros((gt_boxes.shape[0]))\n            for j in xrange(gt_boxes.shape[0]):\n                # find which proposal box maximally covers each gt box\n                argmax_overlaps = overlaps.argmax(axis=0)\n                # and get the iou amount of coverage for each gt box\n                max_overlaps = overlaps.max(axis=0)\n                # find which gt box is 'best' covered (i.e. 
'best' = most iou)\n                gt_ind = max_overlaps.argmax()\n                gt_ovr = max_overlaps.max()\n                assert(gt_ovr >= 0)\n                # find the proposal box that covers the best covered gt box\n                box_ind = argmax_overlaps[gt_ind]\n                # record the iou coverage of this gt box\n                _gt_overlaps[j] = overlaps[box_ind, gt_ind]\n                assert(_gt_overlaps[j] == gt_ovr)\n                # mark the proposal box and the gt box as used\n                overlaps[box_ind, :] = -1\n                overlaps[:, gt_ind] = -1\n            # append recorded iou coverage level\n            gt_overlaps = np.hstack((gt_overlaps, _gt_overlaps))\n\n        gt_overlaps = np.sort(gt_overlaps)\n        if thresholds is None:\n            step = 0.05\n            thresholds = np.arange(0.5, 0.95 + 1e-5, step)\n        recalls = np.zeros_like(thresholds)\n        # compute recall for each iou threshold\n        for i, t in enumerate(thresholds):\n            recalls[i] = (gt_overlaps >= t).sum() / float(num_pos)\n        # ar = 2 * np.trapz(recalls, thresholds)\n        ar = recalls.mean()\n        return {'ar': ar, 'recalls': recalls, 'thresholds': thresholds,\n                'gt_overlaps': gt_overlaps}\n\n    def create_roidb_from_box_list(self, box_list, gt_roidb):\n        assert len(box_list) == self.num_images, \\\n                'Number of boxes must match number of ground-truth images'\n        roidb = []\n        for i in xrange(self.num_images):\n            boxes = box_list[i]\n            num_boxes = boxes.shape[0]\n            overlaps = np.zeros((num_boxes, self.num_classes), dtype=np.float32)\n\n            if gt_roidb is not None and gt_roidb[i]['boxes'].size > 0:\n                gt_boxes = gt_roidb[i]['boxes']\n                gt_classes = gt_roidb[i]['gt_classes']\n                gt_overlaps = bbox_overlaps(boxes.astype(np.float),\n                                            
gt_boxes.astype(np.float))\n                argmaxes = gt_overlaps.argmax(axis=1)\n                maxes = gt_overlaps.max(axis=1)\n                I = np.where(maxes > 0)[0]\n                overlaps[I, gt_classes[argmaxes[I]]] = maxes[I]\n\n            overlaps = scipy.sparse.csr_matrix(overlaps)\n            roidb.append({\n                'boxes' : boxes,\n                'gt_classes' : np.zeros((num_boxes,), dtype=np.int32),\n                'gt_overlaps' : overlaps,\n                'flipped' : False,\n                'seg_areas' : np.zeros((num_boxes,), dtype=np.float32),\n            })\n        return roidb\n\n    @staticmethod\n    def merge_roidbs(a, b):\n        assert len(a) == len(b)\n        for i in xrange(len(a)):\n            a[i]['boxes'] = np.vstack((a[i]['boxes'], b[i]['boxes']))\n            a[i]['gt_classes'] = np.hstack((a[i]['gt_classes'],\n                                            b[i]['gt_classes']))\n            a[i]['gt_overlaps'] = scipy.sparse.vstack([a[i]['gt_overlaps'],\n                                                       b[i]['gt_overlaps']])\n            a[i]['seg_areas'] = np.hstack((a[i]['seg_areas'],\n                                           b[i]['seg_areas']))\n        return a\n\n    def competition_mode(self, on):\n        \"\"\"Turn competition mode on or off.\"\"\"\n        pass\n"
  },
  {
    "path": "lib/datasets/pascal_voc.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport os\nfrom datasets.imdb import imdb\nimport datasets.ds_utils as ds_utils\nimport xml.etree.ElementTree as ET\nimport numpy as np\nimport scipy.sparse\nimport scipy.io as sio\nimport utils.cython_bbox\nimport cPickle\nimport subprocess\nimport uuid\nfrom voc_eval import voc_eval\nfrom fast_rcnn.config import cfg\n\nclass pascal_voc(imdb):\n    def __init__(self, image_set, year, devkit_path=None):\n        imdb.__init__(self, 'voc_' + year + '_' + image_set)\n        self._year = year\n        self._image_set = image_set\n        self._devkit_path = self._get_default_path() if devkit_path is None \\\n                            else devkit_path\n        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)\n        self._classes = ('__background__', # always index 0\n                         'aeroplane', 'bicycle', 'bird', 'boat',\n                         'bottle', 'bus', 'car', 'cat', 'chair',\n                         'cow', 'diningtable', 'dog', 'horse',\n                         'motorbike', 'person', 'pottedplant',\n                         'sheep', 'sofa', 'train', 'tvmonitor')\n        self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))\n        self._image_ext = '.jpg'\n        self._image_index = self._load_image_set_index()\n        # Default to roidb handler\n        self._roidb_handler = self.selective_search_roidb\n        self._salt = str(uuid.uuid4())\n        self._comp_id = 'comp4'\n\n        # PASCAL specific config options\n        self.config = {'cleanup'     : True,\n                       'use_salt'    : True,\n                       'use_diff'    : False,\n                       'matlab_eval' : False,\n                       
'rpn_file'    : None,\n                       'min_size'    : 2}\n\n        assert os.path.exists(self._devkit_path), \\\n                'VOCdevkit path does not exist: {}'.format(self._devkit_path)\n        assert os.path.exists(self._data_path), \\\n                'Path does not exist: {}'.format(self._data_path)\n\n    def image_path_at(self, i):\n        \"\"\"\n        Return the absolute path to image i in the image sequence.\n        \"\"\"\n        return self.image_path_from_index(self._image_index[i])\n\n    def image_path_from_index(self, index):\n        \"\"\"\n        Construct an image path from the image's \"index\" identifier.\n        \"\"\"\n        image_path = os.path.join(self._data_path, 'JPEGImages',\n                                  index + self._image_ext)\n        assert os.path.exists(image_path), \\\n                'Path does not exist: {}'.format(image_path)\n        return image_path\n\n    def _load_image_set_index(self):\n        \"\"\"\n        Load the indexes listed in this dataset's image set file.\n        \"\"\"\n        # Example path to image set file:\n        # self._devkit_path + /VOCdevkit2007/VOC2007/ImageSets/Main/val.txt\n        image_set_file = os.path.join(self._data_path, 'ImageSets', 'Main',\n                                      self._image_set + '.txt')\n        assert os.path.exists(image_set_file), \\\n                'Path does not exist: {}'.format(image_set_file)\n        with open(image_set_file) as f:\n            image_index = [x.strip() for x in f.readlines()]\n        return image_index\n\n    def _get_default_path(self):\n        \"\"\"\n        Return the default path where PASCAL VOC is expected to be installed.\n        \"\"\"\n        return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)\n\n    def gt_roidb(self):\n        \"\"\"\n        Return the database of ground-truth regions of interest.\n\n        This function loads/saves from/to a cache file to speed up future calls.\n        
\"\"\"\n        cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')\n        if os.path.exists(cache_file):\n            with open(cache_file, 'rb') as fid:\n                roidb = cPickle.load(fid)\n            print '{} gt roidb loaded from {}'.format(self.name, cache_file)\n            return roidb\n\n        gt_roidb = [self._load_pascal_annotation(index)\n                    for index in self.image_index]\n        with open(cache_file, 'wb') as fid:\n            cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)\n        print 'wrote gt roidb to {}'.format(cache_file)\n\n        return gt_roidb\n\n    def selective_search_roidb(self):\n        \"\"\"\n        Return the database of selective search regions of interest.\n        Ground-truth ROIs are also included.\n\n        This function loads/saves from/to a cache file to speed up future calls.\n        \"\"\"\n        cache_file = os.path.join(self.cache_path,\n                                  self.name + '_selective_search_roidb.pkl')\n\n        if os.path.exists(cache_file):\n            with open(cache_file, 'rb') as fid:\n                roidb = cPickle.load(fid)\n            print '{} ss roidb loaded from {}'.format(self.name, cache_file)\n            return roidb\n\n        if int(self._year) == 2007 or self._image_set != 'test':\n            gt_roidb = self.gt_roidb()\n            ss_roidb = self._load_selective_search_roidb(gt_roidb)\n            roidb = imdb.merge_roidbs(gt_roidb, ss_roidb)\n        else:\n            roidb = self._load_selective_search_roidb(None)\n        with open(cache_file, 'wb') as fid:\n            cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)\n        print 'wrote ss roidb to {}'.format(cache_file)\n\n        return roidb\n\n    def rpn_roidb(self):\n        if int(self._year) == 2007 or self._image_set != 'test':\n            gt_roidb = self.gt_roidb()\n            rpn_roidb = self._load_rpn_roidb(gt_roidb)\n            roidb = 
imdb.merge_roidbs(gt_roidb, rpn_roidb)\n        else:\n            roidb = self._load_rpn_roidb(None)\n\n        return roidb\n\n    def _load_rpn_roidb(self, gt_roidb):\n        filename = self.config['rpn_file']\n        print 'loading {}'.format(filename)\n        assert os.path.exists(filename), \\\n               'rpn data not found at: {}'.format(filename)\n        with open(filename, 'rb') as f:\n            box_list = cPickle.load(f)\n        return self.create_roidb_from_box_list(box_list, gt_roidb)\n\n    def _load_selective_search_roidb(self, gt_roidb):\n        filename = os.path.abspath(os.path.join(cfg.DATA_DIR,\n                                                'selective_search_data',\n                                                self.name + '.mat'))\n        assert os.path.exists(filename), \\\n               'Selective search data not found at: {}'.format(filename)\n        raw_data = sio.loadmat(filename)['boxes'].ravel()\n\n        box_list = []\n        for i in xrange(raw_data.shape[0]):\n            boxes = raw_data[i][:, (1, 0, 3, 2)] - 1\n            keep = ds_utils.unique_boxes(boxes)\n            boxes = boxes[keep, :]\n            keep = ds_utils.filter_small_boxes(boxes, self.config['min_size'])\n            boxes = boxes[keep, :]\n            box_list.append(boxes)\n\n        return self.create_roidb_from_box_list(box_list, gt_roidb)\n\n    def _load_pascal_annotation(self, index):\n        \"\"\"\n        Load image and bounding boxes info from XML file in the PASCAL VOC\n        format.\n        \"\"\"\n        filename = os.path.join(self._data_path, 'Annotations', index + '.xml')\n        tree = ET.parse(filename)\n        objs = tree.findall('object')\n        if not self.config['use_diff']:\n            # Exclude the samples labeled as difficult\n            non_diff_objs = [\n                obj for obj in objs if int(obj.find('difficult').text) == 0]\n            # if len(non_diff_objs) != len(objs):\n            #     print 
'Removed {} difficult objects'.format(\n            #         len(objs) - len(non_diff_objs))\n            objs = non_diff_objs\n        num_objs = len(objs)\n\n        boxes = np.zeros((num_objs, 4), dtype=np.uint16)\n        gt_classes = np.zeros((num_objs), dtype=np.int32)\n        overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)\n        # \"Seg\" area for pascal is just the box area\n        seg_areas = np.zeros((num_objs), dtype=np.float32)\n\n        # Load object bounding boxes into a data frame.\n        for ix, obj in enumerate(objs):\n            bbox = obj.find('bndbox')\n            # Make pixel indexes 0-based\n            x1 = float(bbox.find('xmin').text) - 1\n            y1 = float(bbox.find('ymin').text) - 1\n            x2 = float(bbox.find('xmax').text) - 1\n            y2 = float(bbox.find('ymax').text) - 1\n            cls = self._class_to_ind[obj.find('name').text.lower().strip()]\n            boxes[ix, :] = [x1, y1, x2, y2]\n            gt_classes[ix] = cls\n            overlaps[ix, cls] = 1.0\n            seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)\n\n        overlaps = scipy.sparse.csr_matrix(overlaps)\n\n        return {'boxes' : boxes,\n                'gt_classes': gt_classes,\n                'gt_overlaps' : overlaps,\n                'flipped' : False,\n                'seg_areas' : seg_areas}\n\n    def _get_comp_id(self):\n        comp_id = (self._comp_id + '_' + self._salt if self.config['use_salt']\n            else self._comp_id)\n        return comp_id\n\n    def _get_voc_results_file_template(self):\n        # VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt\n        filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'\n        path = os.path.join(\n            self._devkit_path,\n            'results',\n            'VOC' + self._year,\n            'Main',\n            filename)\n        return path\n\n    def _write_voc_results_file(self, all_boxes):\n        for cls_ind, 
cls in enumerate(self.classes):\n            if cls == '__background__':\n                continue\n            print 'Writing {} VOC results file'.format(cls)\n            filename = self._get_voc_results_file_template().format(cls)\n            with open(filename, 'wt') as f:\n                for im_ind, index in enumerate(self.image_index):\n                    dets = all_boxes[cls_ind][im_ind]\n                    if dets == []:\n                        continue\n                    # the VOCdevkit expects 1-based indices\n                    for k in xrange(dets.shape[0]):\n                        f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\\n'.\n                                format(index, dets[k, -1],\n                                       dets[k, 0] + 1, dets[k, 1] + 1,\n                                       dets[k, 2] + 1, dets[k, 3] + 1))\n\n    def _do_python_eval(self, output_dir = 'output'):\n        annopath = os.path.join(\n            self._devkit_path,\n            'VOC' + self._year,\n            'Annotations',\n            '{:s}.xml')\n        imagesetfile = os.path.join(\n            self._devkit_path,\n            'VOC' + self._year,\n            'ImageSets',\n            'Main',\n            self._image_set + '.txt')\n        cachedir = os.path.join(self._devkit_path, 'annotations_cache')\n        aps = []\n        # The PASCAL VOC metric changed in 2010\n        use_07_metric = True if int(self._year) < 2010 else False\n        print 'VOC07 metric? 
' + ('Yes' if use_07_metric else 'No')\n        if not os.path.isdir(output_dir):\n            os.mkdir(output_dir)\n        for i, cls in enumerate(self._classes):\n            if cls == '__background__':\n                continue\n            filename = self._get_voc_results_file_template().format(cls)\n            rec, prec, ap = voc_eval(\n                filename, annopath, imagesetfile, cls, cachedir, ovthresh=0.5,\n                use_07_metric=use_07_metric)\n            aps += [ap]\n            print('AP for {} = {:.4f}'.format(cls, ap))\n            with open(os.path.join(output_dir, cls + '_pr.pkl'), 'w') as f:\n                cPickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)\n        print('Mean AP = {:.4f}'.format(np.mean(aps)))\n        print('~~~~~~~~')\n        print('Results:')\n        for ap in aps:\n            print('{:.3f}'.format(ap))\n        print('{:.3f}'.format(np.mean(aps)))\n        print('~~~~~~~~')\n        print('')\n        print('--------------------------------------------------------------')\n        print('Results computed with the **unofficial** Python eval code.')\n        print('Results should be very close to the official MATLAB eval code.')\n        print('Recompute with `./tools/reval.py --matlab ...` for your paper.')\n        print('-- Thanks, The Management')\n        print('--------------------------------------------------------------')\n\n    def _do_matlab_eval(self, output_dir='output'):\n        print '-----------------------------------------------------'\n        print 'Computing results with the official MATLAB eval code.'\n        print '-----------------------------------------------------'\n        path = os.path.join(cfg.ROOT_DIR, 'lib', 'datasets',\n                            'VOCdevkit-matlab-wrapper')\n        cmd = 'cd {} && '.format(path)\n        cmd += '{:s} -nodisplay -nodesktop '.format(cfg.MATLAB)\n        cmd += '-r \"dbstop if error; '\n        cmd += 
'voc_eval(\\'{:s}\\',\\'{:s}\\',\\'{:s}\\',\\'{:s}\\'); quit;\"' \\\n               .format(self._devkit_path, self._get_comp_id(),\n                       self._image_set, output_dir)\n        print('Running:\\n{}'.format(cmd))\n        status = subprocess.call(cmd, shell=True)\n\n    def evaluate_detections(self, all_boxes, output_dir):\n        self._write_voc_results_file(all_boxes)\n        self._do_python_eval(output_dir)\n        if self.config['matlab_eval']:\n            self._do_matlab_eval(output_dir)\n        if self.config['cleanup']:\n            for cls in self._classes:\n                if cls == '__background__':\n                    continue\n                filename = self._get_voc_results_file_template().format(cls)\n                os.remove(filename)\n\n    def competition_mode(self, on):\n        if on:\n            self.config['use_salt'] = False\n            self.config['cleanup'] = False\n        else:\n            self.config['use_salt'] = True\n            self.config['cleanup'] = True\n\nif __name__ == '__main__':\n    from datasets.pascal_voc import pascal_voc\n    d = pascal_voc('trainval', '2007')\n    res = d.roidb\n    from IPython import embed; embed()\n"
  },
  {
    "path": "lib/datasets/tools/mcg_munge.py",
    "content": "import os\nimport sys\n\n\"\"\"Hacky tool to convert file system layout of MCG boxes downloaded from\nhttp://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/\nso that it's consistent with those computed by Jan Hosang (see:\nhttp://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-\n  computing/research/object-recognition-and-scene-understanding/how-\n  good-are-detection-proposals-really/)\n\nNB: Boxes from the MCG website are in (y1, x1, y2, x2) order.\nBoxes from Hosang et al. are in (x1, y1, x2, y2) order.\n\"\"\"\n\ndef munge(src_dir):\n    # stored as: ./MCG-COCO-val2014-boxes/COCO_val2014_000000193401.mat\n    # want:      ./MCG/mat/COCO_val2014_0/COCO_val2014_000000141/COCO_val2014_000000141334.mat\n\n    files = os.listdir(src_dir)\n    for fn in files:\n        base, ext = os.path.splitext(fn)\n        # first 14 chars / first 22 chars / all chars + .mat\n        # COCO_val2014_0/COCO_val2014_000000447/COCO_val2014_000000447991.mat\n        first = base[:14]\n        second = base[:22]\n        dst_dir = os.path.join('MCG', 'mat', first, second)\n        if not os.path.exists(dst_dir):\n            os.makedirs(dst_dir)\n        src = os.path.join(src_dir, fn)\n        dst = os.path.join(dst_dir, fn)\n        print 'MV: {} -> {}'.format(src, dst)\n        os.rename(src, dst)\n\nif __name__ == '__main__':\n    # src_dir should look something like:\n    #  src_dir = 'MCG-COCO-val2014-boxes'\n    src_dir = sys.argv[1]\n    munge(src_dir)\n"
  },
  {
    "path": "lib/datasets/voc_eval.py",
    "content": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Bharath Hariharan\n# --------------------------------------------------------\n\nimport xml.etree.ElementTree as ET\nimport os\nimport cPickle\nimport numpy as np\n\ndef parse_rec(filename):\n    \"\"\" Parse a PASCAL VOC xml file \"\"\"\n    tree = ET.parse(filename)\n    objects = []\n    for obj in tree.findall('object'):\n        obj_struct = {}\n        obj_struct['name'] = obj.find('name').text\n        obj_struct['pose'] = obj.find('pose').text\n        obj_struct['truncated'] = int(obj.find('truncated').text)\n        obj_struct['difficult'] = int(obj.find('difficult').text)\n        bbox = obj.find('bndbox')\n        obj_struct['bbox'] = [int(bbox.find('xmin').text),\n                              int(bbox.find('ymin').text),\n                              int(bbox.find('xmax').text),\n                              int(bbox.find('ymax').text)]\n        objects.append(obj_struct)\n\n    return objects\n\ndef voc_ap(rec, prec, use_07_metric=False):\n    \"\"\" ap = voc_ap(rec, prec, [use_07_metric])\n    Compute VOC AP given precision and recall.\n    If use_07_metric is true, uses the\n    VOC 07 11 point method (default:False).\n    \"\"\"\n    if use_07_metric:\n        # 11 point metric\n        ap = 0.\n        for t in np.arange(0., 1.1, 0.1):\n            if np.sum(rec >= t) == 0:\n                p = 0\n            else:\n                p = np.max(prec[rec >= t])\n            ap = ap + p / 11.\n    else:\n        # correct AP calculation\n        # first append sentinel values at the end\n        mrec = np.concatenate(([0.], rec, [1.]))\n        mpre = np.concatenate(([0.], prec, [0.]))\n\n        # compute the precision envelope\n        for i in range(mpre.size - 1, 0, -1):\n            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])\n\n        # to calculate area under PR curve, look 
for points\n        # where X axis (recall) changes value\n        i = np.where(mrec[1:] != mrec[:-1])[0]\n\n        # and sum (\\Delta recall) * prec\n        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])\n    return ap\n\ndef voc_eval(detpath,\n             annopath,\n             imagesetfile,\n             classname,\n             cachedir,\n             ovthresh=0.5,\n             use_07_metric=False):\n    \"\"\"rec, prec, ap = voc_eval(detpath,\n                                annopath,\n                                imagesetfile,\n                                classname,\n                                cachedir,\n                                [ovthresh],\n                                [use_07_metric])\n\n    Top level function that does the PASCAL VOC evaluation.\n\n    detpath: Path to detections\n        detpath.format(classname) should produce the detection results file.\n    annopath: Path to annotations\n        annopath.format(imagename) should be the xml annotations file.\n    imagesetfile: Text file containing the list of images, one image per line.\n    classname: Category name (duh)\n    cachedir: Directory for caching the annotations\n    [ovthresh]: Overlap threshold (default = 0.5)\n    [use_07_metric]: Whether to use VOC07's 11 point AP computation\n        (default False)\n    \"\"\"\n    # assumes detections are in detpath.format(classname)\n    # assumes annotations are in annopath.format(imagename)\n    # assumes imagesetfile is a text file with each line an image name\n    # cachedir caches the annotations in a pickle file\n\n    # first load gt\n    if not os.path.isdir(cachedir):\n        os.mkdir(cachedir)\n    cachefile = os.path.join(cachedir, 'annots.pkl')\n    # read list of images\n    with open(imagesetfile, 'r') as f:\n        lines = f.readlines()\n    imagenames = [x.strip() for x in lines]\n\n    if not os.path.isfile(cachefile):\n        # load annots\n        recs = {}\n        for i, imagename in enumerate(imagenames):\n            recs[imagename] = 
parse_rec(annopath.format(imagename))\n            if i % 100 == 0:\n                print 'Reading annotation for {:d}/{:d}'.format(\n                    i + 1, len(imagenames))\n        # save\n        print 'Saving cached annotations to {:s}'.format(cachefile)\n        with open(cachefile, 'w') as f:\n            cPickle.dump(recs, f)\n    else:\n        # load\n        with open(cachefile, 'r') as f:\n            recs = cPickle.load(f)\n\n    # extract gt objects for this class\n    class_recs = {}\n    npos = 0\n    for imagename in imagenames:\n        R = [obj for obj in recs[imagename] if obj['name'] == classname]\n        bbox = np.array([x['bbox'] for x in R])\n        difficult = np.array([x['difficult'] for x in R]).astype(np.bool)\n        det = [False] * len(R)\n        npos = npos + sum(~difficult)\n        class_recs[imagename] = {'bbox': bbox,\n                                 'difficult': difficult,\n                                 'det': det}\n\n    # read dets\n    detfile = detpath.format(classname)\n    with open(detfile, 'r') as f:\n        lines = f.readlines()\n\n    splitlines = [x.strip().split(' ') for x in lines]\n    image_ids = [x[0] for x in splitlines]\n    confidence = np.array([float(x[1]) for x in splitlines])\n    BB = np.array([[float(z) for z in x[2:]] for x in splitlines])\n\n    # sort by confidence\n    sorted_ind = np.argsort(-confidence)\n    sorted_scores = np.sort(-confidence)\n    BB = BB[sorted_ind, :]\n    image_ids = [image_ids[x] for x in sorted_ind]\n\n    # go down dets and mark TPs and FPs\n    nd = len(image_ids)\n    tp = np.zeros(nd)\n    fp = np.zeros(nd)\n    for d in range(nd):\n        R = class_recs[image_ids[d]]\n        bb = BB[d, :].astype(float)\n        ovmax = -np.inf\n        BBGT = R['bbox'].astype(float)\n\n        if BBGT.size > 0:\n            # compute overlaps\n            # intersection\n            ixmin = np.maximum(BBGT[:, 0], bb[0])\n            iymin = np.maximum(BBGT[:, 1], bb[1])\n  
          ixmax = np.minimum(BBGT[:, 2], bb[2])\n            iymax = np.minimum(BBGT[:, 3], bb[3])\n            iw = np.maximum(ixmax - ixmin + 1., 0.)\n            ih = np.maximum(iymax - iymin + 1., 0.)\n            inters = iw * ih\n\n            # union\n            uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +\n                   (BBGT[:, 2] - BBGT[:, 0] + 1.) *\n                   (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)\n\n            overlaps = inters / uni\n            ovmax = np.max(overlaps)\n            jmax = np.argmax(overlaps)\n\n        if ovmax > ovthresh:\n            if not R['difficult'][jmax]:\n                if not R['det'][jmax]:\n                    tp[d] = 1.\n                    R['det'][jmax] = 1\n                else:\n                    fp[d] = 1.\n        else:\n            fp[d] = 1.\n\n    # compute precision recall\n    fp = np.cumsum(fp)\n    tp = np.cumsum(tp)\n    rec = tp / float(npos)\n    # avoid divide by zero in case the first detection matches a difficult\n    # ground truth\n    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)\n    ap = voc_ap(rec, prec, use_07_metric)\n\n    return rec, prec, ap\n"
  },
  {
    "path": "lib/fast_rcnn/__init__.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n"
  },
  {
    "path": "lib/fast_rcnn/bbox_transform.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\n\ndef bbox_transform(ex_rois, gt_rois):\n    ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0\n    ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0\n    ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths\n    ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights\n\n    gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0\n    gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0\n    gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths\n    gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights\n\n    targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths\n    targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights\n    targets_dw = np.log(gt_widths / ex_widths)\n    targets_dh = np.log(gt_heights / ex_heights)\n\n    targets = np.vstack(\n        (targets_dx, targets_dy, targets_dw, targets_dh)).transpose()\n    return targets\n\ndef bbox_transform_inv(boxes, deltas):\n    if boxes.shape[0] == 0:\n        return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)\n\n    boxes = boxes.astype(deltas.dtype, copy=False)\n\n    widths = boxes[:, 2] - boxes[:, 0] + 1.0\n    heights = boxes[:, 3] - boxes[:, 1] + 1.0\n    ctr_x = boxes[:, 0] + 0.5 * widths\n    ctr_y = boxes[:, 1] + 0.5 * heights\n\n    dx = deltas[:, 0::4]\n    dy = deltas[:, 1::4]\n    dw = deltas[:, 2::4]\n    dh = deltas[:, 3::4]\n\n    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]\n    pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]\n    pred_w = np.exp(dw) * widths[:, np.newaxis]\n    pred_h = np.exp(dh) * heights[:, np.newaxis]\n\n    pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)\n    # x1\n    pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w\n    # y1\n    pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h\n    # x2\n    pred_boxes[:, 
2::4] = pred_ctr_x + 0.5 * pred_w\n    # y2\n    pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h\n\n    return pred_boxes\n\ndef clip_boxes(boxes, im_shape):\n    \"\"\"\n    Clip boxes to image boundaries.\n    \"\"\"\n\n    # x1 >= 0\n    boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)\n    # y1 >= 0\n    boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)\n    # x2 < im_shape[1]\n    boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)\n    # y2 < im_shape[0]\n    boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)\n    return boxes\n"
  },
  {
    "path": "lib/fast_rcnn/config.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Fast R-CNN config system.\n\nThis file specifies default config options for Fast R-CNN. You should not\nchange values in this file. Instead, you should write a config file (in yaml)\nand use cfg_from_file(yaml_file) to load it and override the default options.\n\nMost tools in $ROOT/tools take a --cfg option to specify an override file.\n    - See tools/{train,test}_net.py for example code that uses cfg_from_file()\n    - See experiments/cfgs/*.yml for example YAML config override files\n\"\"\"\n\nimport os\nimport os.path as osp\nimport numpy as np\n# `pip install easydict` if you don't have it\nfrom easydict import EasyDict as edict\n\n__C = edict()\n# Consumers can get config by:\n#   from fast_rcnn.config import cfg\ncfg = __C\n\n#\n# Training options\n#\n\n__C.TRAIN = edict()\n\n# Scales to use during training (can list multiple scales)\n# Each scale is the pixel size of an image's shortest side\n__C.TRAIN.SCALES = (600,)\n\n# Max pixel size of the longest side of a scaled input image\n__C.TRAIN.MAX_SIZE = 1000\n\n# Images to use per minibatch\n__C.TRAIN.IMS_PER_BATCH = 2\n\n# Minibatch size (number of regions of interest [ROIs])\n__C.TRAIN.BATCH_SIZE = 128\n\n# Fraction of minibatch that is labeled foreground (i.e. 
class > 0)\n__C.TRAIN.FG_FRACTION = 0.25\n\n# Overlap threshold for a ROI to be considered foreground (if >= FG_THRESH)\n__C.TRAIN.FG_THRESH = 0.5\n\n# Overlap threshold for a ROI to be considered background (class = 0 if\n# overlap in [LO, HI))\n__C.TRAIN.BG_THRESH_HI = 0.5\n__C.TRAIN.BG_THRESH_LO = 0.1\n\n# Use horizontally-flipped images during training?\n__C.TRAIN.USE_FLIPPED = True\n\n# Train bounding-box regressors\n__C.TRAIN.BBOX_REG = True\n\n# Overlap required between a ROI and ground-truth box in order for that ROI to\n# be used as a bounding-box regression training example\n__C.TRAIN.BBOX_THRESH = 0.5\n\n# Iterations between snapshots\n__C.TRAIN.SNAPSHOT_ITERS = 10000\n\n# solver.prototxt specifies the snapshot path prefix, this adds an optional\n# infix to yield the path: <prefix>[_<infix>]_iters_XYZ.caffemodel\n__C.TRAIN.SNAPSHOT_INFIX = ''\n\n# Use a prefetch thread in roi_data_layer.layer\n# So far I haven't found this useful; likely more engineering work is required\n__C.TRAIN.USE_PREFETCH = False\n\n# Normalize the targets (subtract empirical mean, divide by empirical stddev)\n__C.TRAIN.BBOX_NORMALIZE_TARGETS = True\n# Deprecated (inside weights)\n__C.TRAIN.BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)\n# Normalize the targets using \"precomputed\" (or made up) means and stdevs\n# (BBOX_NORMALIZE_TARGETS must also be True)\n__C.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED = False\n__C.TRAIN.BBOX_NORMALIZE_MEANS = (0.0, 0.0, 0.0, 0.0)\n__C.TRAIN.BBOX_NORMALIZE_STDS = (0.1, 0.1, 0.2, 0.2)\n\n# Train using these proposals\n__C.TRAIN.PROPOSAL_METHOD = 'selective_search'\n\n# Make minibatches from images that have similar aspect ratios (i.e. 
both\n# tall and thin or both short and wide) in order to avoid wasting computation\n# on zero-padding.\n__C.TRAIN.ASPECT_GROUPING = True\n\n# Use RPN to detect objects\n__C.TRAIN.HAS_RPN = False\n# IOU >= thresh: positive example\n__C.TRAIN.RPN_POSITIVE_OVERLAP = 0.7\n# IOU < thresh: negative example\n__C.TRAIN.RPN_NEGATIVE_OVERLAP = 0.3\n# If an anchor satisfies both the positive and negative conditions, set it to negative\n__C.TRAIN.RPN_CLOBBER_POSITIVES = False\n# Max fraction of foreground examples\n__C.TRAIN.RPN_FG_FRACTION = 0.5\n# Total number of examples\n__C.TRAIN.RPN_BATCHSIZE = 256\n# NMS threshold used on RPN proposals\n__C.TRAIN.RPN_NMS_THRESH = 0.7\n# Number of top scoring boxes to keep before applying NMS to RPN proposals\n__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000\n# Number of top scoring boxes to keep after applying NMS to RPN proposals\n__C.TRAIN.RPN_POST_NMS_TOP_N = 2000\n# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)\n__C.TRAIN.RPN_MIN_SIZE = 16\n# Deprecated (inside weights)\n__C.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)\n# Give the positive RPN examples weight of p * 1 / {num positives}\n# and give negatives a weight of (1 - p)\n# Set to -1.0 to use uniform example weighting\n__C.TRAIN.RPN_POSITIVE_WEIGHT = -1.0\n\n\n#\n# Testing options\n#\n\n__C.TEST = edict()\n\n# Scales to use during testing (can list multiple scales)\n# Each scale is the pixel size of an image's shortest side\n__C.TEST.SCALES = (600,)\n\n# Max pixel size of the longest side of a scaled input image\n__C.TEST.MAX_SIZE = 1000\n\n# Overlap threshold used for non-maximum suppression (suppress boxes with\n# IoU >= this threshold)\n__C.TEST.NMS = 0.3\n\n# Experimental: treat the (K+1) units in the cls_score layer as linear\n# predictors (trained, e.g., with one-vs-rest SVMs).\n__C.TEST.SVM = False\n\n# Test using bounding-box regressors\n__C.TEST.BBOX_REG = True\n\n# Propose boxes\n__C.TEST.HAS_RPN = False\n\n# Test using these 
proposals\n__C.TEST.PROPOSAL_METHOD = 'selective_search'\n\n# NMS threshold used on RPN proposals\n__C.TEST.RPN_NMS_THRESH = 0.7\n# Number of top scoring boxes to keep before applying NMS to RPN proposals\n__C.TEST.RPN_PRE_NMS_TOP_N = 6000\n# Number of top scoring boxes to keep after applying NMS to RPN proposals\n__C.TEST.RPN_POST_NMS_TOP_N = 300\n# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)\n__C.TEST.RPN_MIN_SIZE = 16\n\n\n#\n# MISC\n#\n\n# The mapping from image coordinates to feature map coordinates might cause\n# some boxes that are distinct in image space to become identical in feature\n# coordinates. If DEDUP_BOXES > 0, then DEDUP_BOXES is used as the scale factor\n# for identifying duplicate boxes.\n# 1/16 is correct for {Alex,Caffe}Net, VGG_CNN_M_1024, and VGG16\n__C.DEDUP_BOXES = 1./16.\n\n# Pixel mean values (BGR order) as a (1, 1, 3) array\n# We use the same pixel mean for all networks even though it's not exactly what\n# they were trained with\n__C.PIXEL_MEANS = np.array([[[102.9801, 115.9465, 122.7717]]])\n\n# For reproducibility\n__C.RNG_SEED = 3\n\n# A small number that's used many times\n__C.EPS = 1e-14\n\n# Root directory of project\n__C.ROOT_DIR = osp.abspath(osp.join(osp.dirname(__file__), '..', '..'))\n\n# Data directory\n__C.DATA_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'data'))\n\n# Model directory\n__C.MODELS_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'models', 'pascal_voc'))\n\n# Name of (or path to) the MATLAB executable\n__C.MATLAB = 'matlab'\n\n# Place outputs under an experiments directory\n__C.EXP_DIR = 'default'\n\n# Use GPU implementation of non-maximum suppression\n__C.USE_GPU_NMS = True\n\n# Default GPU device id\n__C.GPU_ID = 0\n\n\ndef get_output_dir(imdb, net=None):\n    \"\"\"Return the directory where experimental artifacts are placed.\n    If the directory does not exist, it is created.\n\n    A canonical path is built using the name from an imdb and a network\n    (if not None).\n    
\"\"\"\n    outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'output', __C.EXP_DIR, imdb.name))\n    if net is not None:\n        outdir = osp.join(outdir, net.name)\n    if not os.path.exists(outdir):\n        os.makedirs(outdir)\n    return outdir\n\ndef _merge_a_into_b(a, b):\n    \"\"\"Merge config dictionary a into config dictionary b, clobbering the\n    options in b whenever they are also specified in a.\n    \"\"\"\n    if type(a) is not edict:\n        return\n\n    for k, v in a.iteritems():\n        # a must specify keys that are in b\n        if not b.has_key(k):\n            raise KeyError('{} is not a valid config key'.format(k))\n\n        # the types must match, too\n        old_type = type(b[k])\n        if old_type is not type(v):\n            if isinstance(b[k], np.ndarray):\n                v = np.array(v, dtype=b[k].dtype)\n            else:\n                raise ValueError(('Type mismatch ({} vs. {}) '\n                                'for config key: {}').format(type(b[k]),\n                                                            type(v), k))\n\n        # recursively merge dicts\n        if type(v) is edict:\n            try:\n                _merge_a_into_b(a[k], b[k])\n            except:\n                print('Error under config key: {}'.format(k))\n                raise\n        else:\n            b[k] = v\n\ndef cfg_from_file(filename):\n    \"\"\"Load a config file and merge it into the default options.\"\"\"\n    import yaml\n    with open(filename, 'r') as f:\n        yaml_cfg = edict(yaml.load(f))\n\n    _merge_a_into_b(yaml_cfg, __C)\n\ndef cfg_from_list(cfg_list):\n    \"\"\"Set config keys via list (e.g., from command line).\"\"\"\n    from ast import literal_eval\n    assert len(cfg_list) % 2 == 0\n    for k, v in zip(cfg_list[0::2], cfg_list[1::2]):\n        key_list = k.split('.')\n        d = __C\n        for subkey in key_list[:-1]:\n            assert d.has_key(subkey)\n            d = d[subkey]\n        subkey = 
key_list[-1]\n        assert d.has_key(subkey)\n        try:\n            value = literal_eval(v)\n        except:\n            # handle the case when v is a string literal\n            value = v\n        assert type(value) == type(d[subkey]), \\\n            'type {} does not match original type {}'.format(\n            type(value), type(d[subkey]))\n        d[subkey] = value\n"
  },
  {
    "path": "lib/fast_rcnn/nms_wrapper.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nfrom fast_rcnn.config import cfg\nfrom nms.gpu_nms import gpu_nms\nfrom nms.cpu_nms import cpu_nms\n\ndef nms(dets, thresh, force_cpu=False):\n    \"\"\"Dispatch to either CPU or GPU NMS implementations.\"\"\"\n\n    if dets.shape[0] == 0:\n        return []\n    if cfg.USE_GPU_NMS and not force_cpu:\n        return gpu_nms(dets, thresh, device_id=cfg.GPU_ID)\n    else:\n        return cpu_nms(dets, thresh)\n"
  },
  {
    "path": "lib/fast_rcnn/test.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Test a Fast R-CNN network on an imdb (image database).\"\"\"\n\nfrom fast_rcnn.config import cfg, get_output_dir\nfrom fast_rcnn.bbox_transform import clip_boxes, bbox_transform_inv\nimport argparse\nfrom utils.timer import Timer\nimport numpy as np\nimport cv2\nimport caffe\nfrom fast_rcnn.nms_wrapper import nms\nimport cPickle\nfrom utils.blob import im_list_to_blob\nimport os\n\ndef _get_image_blob(im):\n    \"\"\"Converts an image into a network input.\n\n    Arguments:\n        im (ndarray): a color image in BGR order\n\n    Returns:\n        blob (ndarray): a data blob holding an image pyramid\n        im_scale_factors (list): list of image scales (relative to im) used\n            in the image pyramid\n    \"\"\"\n    im_orig = im.astype(np.float32, copy=True)\n    im_orig -= cfg.PIXEL_MEANS\n\n    im_shape = im_orig.shape\n    im_size_min = np.min(im_shape[0:2])\n    im_size_max = np.max(im_shape[0:2])\n\n    processed_ims = []\n    im_scale_factors = []\n\n    for target_size in cfg.TEST.SCALES:\n        im_scale = float(target_size) / float(im_size_min)\n        # Prevent the biggest axis from being more than MAX_SIZE\n        if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:\n            im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)\n        im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,\n                        interpolation=cv2.INTER_LINEAR)\n        im_scale_factors.append(im_scale)\n        processed_ims.append(im)\n\n    # Create a blob to hold the input images\n    blob = im_list_to_blob(processed_ims)\n\n    return blob, np.array(im_scale_factors)\n\ndef _get_rois_blob(im_rois, im_scale_factors):\n    \"\"\"Converts RoIs into network 
inputs.\n\n    Arguments:\n        im_rois (ndarray): R x 4 matrix of RoIs in original image coordinates\n        im_scale_factors (list): scale factors as returned by _get_image_blob\n\n    Returns:\n        blob (ndarray): R x 5 matrix of RoIs in the image pyramid\n    \"\"\"\n    rois, levels = _project_im_rois(im_rois, im_scale_factors)\n    rois_blob = np.hstack((levels, rois))\n    return rois_blob.astype(np.float32, copy=False)\n\ndef _project_im_rois(im_rois, scales):\n    \"\"\"Project image RoIs into the image pyramid built by _get_image_blob.\n\n    Arguments:\n        im_rois (ndarray): R x 4 matrix of RoIs in original image coordinates\n        scales (list): scale factors as returned by _get_image_blob\n\n    Returns:\n        rois (ndarray): R x 4 matrix of projected RoI coordinates\n        levels (list): image pyramid levels used by each projected RoI\n    \"\"\"\n    im_rois = im_rois.astype(np.float, copy=False)\n\n    if len(scales) > 1:\n        widths = im_rois[:, 2] - im_rois[:, 0] + 1\n        heights = im_rois[:, 3] - im_rois[:, 1] + 1\n\n        areas = widths * heights\n        scaled_areas = areas[:, np.newaxis] * (scales[np.newaxis, :] ** 2)\n        diff_areas = np.abs(scaled_areas - 224 * 224)\n        levels = diff_areas.argmin(axis=1)[:, np.newaxis]\n    else:\n        levels = np.zeros((im_rois.shape[0], 1), dtype=np.int)\n\n    rois = im_rois * scales[levels]\n\n    return rois, levels\n\ndef _get_blobs(im, rois):\n    \"\"\"Convert an image and RoIs within that image into network inputs.\"\"\"\n    blobs = {'data' : None, 'rois' : None}\n    blobs['data'], im_scale_factors = _get_image_blob(im)\n    if not cfg.TEST.HAS_RPN:\n        blobs['rois'] = _get_rois_blob(rois, im_scale_factors)\n    return blobs, im_scale_factors\n\ndef im_detect(net, im, boxes=None):\n    \"\"\"Detect object classes in an image given object proposals.\n\n    Arguments:\n        net (caffe.Net): Fast R-CNN network to use\n        im (ndarray): color 
image to test (in BGR order)\n        boxes (ndarray): R x 4 array of object proposals or None (for RPN)\n\n    Returns:\n        scores (ndarray): R x K array of object class scores (K includes\n            background as object category 0)\n        boxes (ndarray): R x (4*K) array of predicted bounding boxes\n    \"\"\"\n    blobs, im_scales = _get_blobs(im, boxes)\n\n    # When mapping from image ROIs to feature map ROIs, there's some aliasing\n    # (some distinct image ROIs get mapped to the same feature ROI).\n    # Here, we identify duplicate feature ROIs, so we only compute features\n    # on the unique subset.\n    if cfg.DEDUP_BOXES > 0 and not cfg.TEST.HAS_RPN:\n        v = np.array([1, 1e3, 1e6, 1e9, 1e12])\n        hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)\n        _, index, inv_index = np.unique(hashes, return_index=True,\n                                        return_inverse=True)\n        blobs['rois'] = blobs['rois'][index, :]\n        boxes = boxes[index, :]\n\n    if cfg.TEST.HAS_RPN:\n        im_blob = blobs['data']\n        blobs['im_info'] = np.array(\n            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],\n            dtype=np.float32)\n\n    # reshape network inputs\n    net.blobs['data'].reshape(*(blobs['data'].shape))\n    if cfg.TEST.HAS_RPN:\n        net.blobs['im_info'].reshape(*(blobs['im_info'].shape))\n    else:\n        net.blobs['rois'].reshape(*(blobs['rois'].shape))\n\n    # do forward\n    forward_kwargs = {'data': blobs['data'].astype(np.float32, copy=False)}\n    if cfg.TEST.HAS_RPN:\n        forward_kwargs['im_info'] = blobs['im_info'].astype(np.float32, copy=False)\n    else:\n        forward_kwargs['rois'] = blobs['rois'].astype(np.float32, copy=False)\n    blobs_out = net.forward(**forward_kwargs)\n\n    if cfg.TEST.HAS_RPN:\n        assert len(im_scales) == 1, \"Only single-image batch implemented\"\n        rois = net.blobs['rois'].data.copy()\n        # unscale back to raw image space\n        
boxes = rois[:, 1:5] / im_scales[0]\n\n    if cfg.TEST.SVM:\n        # use the raw scores before softmax under the assumption they\n        # were trained as linear SVMs\n        scores = net.blobs['cls_score'].data\n    else:\n        # use softmax estimated probabilities\n        scores = blobs_out['cls_prob']\n\n    if cfg.TEST.BBOX_REG:\n        # Apply bounding-box regression deltas\n        box_deltas = blobs_out['bbox_pred']\n        pred_boxes = bbox_transform_inv(boxes, box_deltas)\n        pred_boxes = clip_boxes(pred_boxes, im.shape)\n    else:\n        # Simply repeat the boxes, once for each class\n        pred_boxes = np.tile(boxes, (1, scores.shape[1]))\n\n    if cfg.DEDUP_BOXES > 0 and not cfg.TEST.HAS_RPN:\n        # Map scores and predictions back to the original set of boxes\n        scores = scores[inv_index, :]\n        pred_boxes = pred_boxes[inv_index, :]\n\n    return scores, pred_boxes\n\ndef vis_detections(im, class_name, dets, thresh=0.3):\n    \"\"\"Visual debugging of detections.\"\"\"\n    import matplotlib.pyplot as plt\n    im = im[:, :, (2, 1, 0)]\n    for i in xrange(np.minimum(10, dets.shape[0])):\n        bbox = dets[i, :4]\n        score = dets[i, -1]\n        if score > thresh:\n            plt.cla()\n            plt.imshow(im)\n            plt.gca().add_patch(\n                plt.Rectangle((bbox[0], bbox[1]),\n                              bbox[2] - bbox[0],\n                              bbox[3] - bbox[1], fill=False,\n                              edgecolor='g', linewidth=3)\n                )\n            plt.title('{}  {:.3f}'.format(class_name, score))\n            plt.show()\n\ndef apply_nms(all_boxes, thresh):\n    \"\"\"Apply non-maximum suppression to all predicted boxes output by the\n    test_net method.\n    \"\"\"\n    num_classes = len(all_boxes)\n    num_images = len(all_boxes[0])\n    nms_boxes = [[[] for _ in xrange(num_images)]\n                 for _ in xrange(num_classes)]\n    for cls_ind in 
xrange(num_classes):\n        for im_ind in xrange(num_images):\n            dets = all_boxes[cls_ind][im_ind]\n            if dets == []:\n                continue\n            # CPU NMS is much faster than GPU NMS when the number of boxes\n            # is relatively small (e.g., < 10k)\n            # TODO(rbg): autotune NMS dispatch\n            keep = nms(dets, thresh, force_cpu=True)\n            if len(keep) == 0:\n                continue\n            nms_boxes[cls_ind][im_ind] = dets[keep, :].copy()\n    return nms_boxes\n\ndef test_net(net, imdb, max_per_image=100, thresh=0.05, vis=False):\n    \"\"\"Test a Fast R-CNN network on an image database.\"\"\"\n    num_images = len(imdb.image_index)\n    # all detections are collected into:\n    #    all_boxes[cls][image] = N x 5 array of detections in\n    #    (x1, y1, x2, y2, score)\n    all_boxes = [[[] for _ in xrange(num_images)]\n                 for _ in xrange(imdb.num_classes)]\n\n    output_dir = get_output_dir(imdb, net)\n\n    # timers\n    _t = {'im_detect' : Timer(), 'misc' : Timer()}\n\n    if not cfg.TEST.HAS_RPN:\n        roidb = imdb.roidb\n\n    for i in xrange(num_images):\n        # filter out any ground truth boxes\n        if cfg.TEST.HAS_RPN:\n            box_proposals = None\n        else:\n            # The roidb may contain ground-truth rois (for example, if the roidb\n            # comes from the training or val split). We only want to evaluate\n            # detection on the *non*-ground-truth rois. 
We select the rois\n            # that have the gt_classes field set to 0, which means there's no\n            # ground truth.\n            box_proposals = roidb[i]['boxes'][roidb[i]['gt_classes'] == 0]\n\n        im = cv2.imread(imdb.image_path_at(i))\n        _t['im_detect'].tic()\n        scores, boxes = im_detect(net, im, box_proposals)\n        _t['im_detect'].toc()\n\n        _t['misc'].tic()\n        # skip j = 0, because it's the background class\n        for j in xrange(1, imdb.num_classes):\n            inds = np.where(scores[:, j] > thresh)[0]\n            cls_scores = scores[inds, j]\n            cls_boxes = boxes[inds, j*4:(j+1)*4]\n            cls_dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])) \\\n                .astype(np.float32, copy=False)\n            keep = nms(cls_dets, cfg.TEST.NMS)\n            cls_dets = cls_dets[keep, :]\n            if vis:\n                vis_detections(im, imdb.classes[j], cls_dets)\n            all_boxes[j][i] = cls_dets\n\n        # Limit to max_per_image detections *over all classes*\n        if max_per_image > 0:\n            image_scores = np.hstack([all_boxes[j][i][:, -1]\n                                      for j in xrange(1, imdb.num_classes)])\n            if len(image_scores) > max_per_image:\n                image_thresh = np.sort(image_scores)[-max_per_image]\n                for j in xrange(1, imdb.num_classes):\n                    keep = np.where(all_boxes[j][i][:, -1] >= image_thresh)[0]\n                    all_boxes[j][i] = all_boxes[j][i][keep, :]\n        _t['misc'].toc()\n\n        print 'im_detect: {:d}/{:d} {:.3f}s {:.3f}s' \\\n              .format(i + 1, num_images, _t['im_detect'].average_time,\n                      _t['misc'].average_time)\n\n    det_file = os.path.join(output_dir, 'detections.pkl')\n    with open(det_file, 'wb') as f:\n        cPickle.dump(all_boxes, f, cPickle.HIGHEST_PROTOCOL)\n\n    print 'Evaluating detections'\n    imdb.evaluate_detections(all_boxes, 
output_dir)\n"
  },
  {
    "path": "lib/fast_rcnn/train.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Train a Fast R-CNN network.\"\"\"\n\nimport caffe\nfrom fast_rcnn.config import cfg\nimport roi_data_layer.roidb as rdl_roidb\nfrom utils.timer import Timer\nimport numpy as np\nimport os\n\nfrom caffe.proto import caffe_pb2\nimport google.protobuf as pb2\n# text_format is a submodule and must be imported explicitly before use\nimport google.protobuf.text_format\n\nclass SolverWrapper(object):\n    \"\"\"A simple wrapper around Caffe's solver.\n    This wrapper gives us control over the snapshotting process, which we\n    use to unnormalize the learned bounding-box regression weights.\n    \"\"\"\n\n    def __init__(self, solver_prototxt, roidb, output_dir,\n                 pretrained_model=None):\n        \"\"\"Initialize the SolverWrapper.\"\"\"\n        self.output_dir = output_dir\n\n        if (cfg.TRAIN.HAS_RPN and cfg.TRAIN.BBOX_REG and\n            cfg.TRAIN.BBOX_NORMALIZE_TARGETS):\n            # RPN can only use precomputed normalization because there are no\n            # fixed statistics to compute a priori\n            assert cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED\n\n        if cfg.TRAIN.BBOX_REG:\n            print 'Computing bounding-box regression targets...'\n            self.bbox_means, self.bbox_stds = \\\n                    rdl_roidb.add_bbox_regression_targets(roidb)\n            print 'done'\n\n        self.solver = caffe.SGDSolver(solver_prototxt)\n        if pretrained_model is not None:\n            print ('Loading pretrained model '\n                   'weights from {:s}').format(pretrained_model)\n            self.solver.net.copy_from(pretrained_model)\n\n        self.solver_param = caffe_pb2.SolverParameter()\n        with open(solver_prototxt, 'rt') as f:\n            pb2.text_format.Merge(f.read(), self.solver_param)\n\n        
self.solver.net.layers[0].set_roidb(roidb)\n\n    def snapshot(self):\n        \"\"\"Take a snapshot of the network after unnormalizing the learned\n        bounding-box regression weights. This enables easy use at test-time.\n        \"\"\"\n        net = self.solver.net\n\n        scale_bbox_params = (cfg.TRAIN.BBOX_REG and\n                             cfg.TRAIN.BBOX_NORMALIZE_TARGETS and\n                             net.params.has_key('bbox_pred'))\n\n        if scale_bbox_params:\n            # save original values\n            orig_0 = net.params['bbox_pred'][0].data.copy()\n            orig_1 = net.params['bbox_pred'][1].data.copy()\n\n            # scale and shift with bbox reg unnormalization; then save snapshot\n            net.params['bbox_pred'][0].data[...] = \\\n                    (net.params['bbox_pred'][0].data *\n                     self.bbox_stds[:, np.newaxis])\n            net.params['bbox_pred'][1].data[...] = \\\n                    (net.params['bbox_pred'][1].data *\n                     self.bbox_stds + self.bbox_means)\n\n        infix = ('_' + cfg.TRAIN.SNAPSHOT_INFIX\n                 if cfg.TRAIN.SNAPSHOT_INFIX != '' else '')\n        filename = (self.solver_param.snapshot_prefix + infix +\n                    '_iter_{:d}'.format(self.solver.iter) + '.caffemodel')\n        filename = os.path.join(self.output_dir, filename)\n\n        net.save(str(filename))\n        print 'Wrote snapshot to: {:s}'.format(filename)\n\n        if scale_bbox_params:\n            # restore net to original state\n            net.params['bbox_pred'][0].data[...] = orig_0\n            net.params['bbox_pred'][1].data[...] 
= orig_1\n        return filename\n\n    def train_model(self, max_iters):\n        \"\"\"Network training loop.\"\"\"\n        last_snapshot_iter = -1\n        timer = Timer()\n        model_paths = []\n        while self.solver.iter < max_iters:\n            # Make one SGD update\n            timer.tic()\n            self.solver.step(1)\n            timer.toc()\n            if self.solver.iter % (10 * self.solver_param.display) == 0:\n                print 'speed: {:.3f}s / iter'.format(timer.average_time)\n\n            if self.solver.iter % cfg.TRAIN.SNAPSHOT_ITERS == 0:\n                last_snapshot_iter = self.solver.iter\n                model_paths.append(self.snapshot())\n\n        if last_snapshot_iter != self.solver.iter:\n            model_paths.append(self.snapshot())\n        return model_paths\n\ndef get_training_roidb(imdb):\n    \"\"\"Returns a roidb (Region of Interest database) for use in training.\"\"\"\n    if cfg.TRAIN.USE_FLIPPED:\n        print 'Appending horizontally-flipped training examples...'\n        imdb.append_flipped_images()\n        print 'done'\n\n    print 'Preparing training data...'\n    rdl_roidb.prepare_roidb(imdb)\n    print 'done'\n\n    return imdb.roidb\n\ndef filter_roidb(roidb):\n    \"\"\"Remove roidb entries that have no usable RoIs.\"\"\"\n\n    def is_valid(entry):\n        # Valid images have:\n        #   (1) At least one foreground RoI OR\n        #   (2) At least one background RoI\n        overlaps = entry['max_overlaps']\n        # find boxes with sufficient overlap\n        fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]\n        # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)\n        bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &\n                           (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]\n        # image is only valid if such boxes exist\n        valid = len(fg_inds) > 0 or len(bg_inds) > 0\n        return valid\n\n    num = len(roidb)\n    filtered_roidb = 
[entry for entry in roidb if is_valid(entry)]\n    num_after = len(filtered_roidb)\n    print 'Filtered {} roidb entries: {} -> {}'.format(num - num_after,\n                                                       num, num_after)\n    return filtered_roidb\n\ndef train_net(solver_prototxt, roidb, output_dir,\n              pretrained_model=None, max_iters=40000):\n    \"\"\"Train a Fast R-CNN network.\"\"\"\n\n    roidb = filter_roidb(roidb)\n    sw = SolverWrapper(solver_prototxt, roidb, output_dir,\n                       pretrained_model=pretrained_model)\n\n    print 'Solving...'\n    model_paths = sw.train_model(max_iters)\n    print 'done solving'\n    return model_paths\n"
  },
  {
    "path": "lib/nms/.gitignore",
    "content": "*.c\n*.cpp\n*.so\n"
  },
  {
    "path": "lib/nms/__init__.py",
    "content": ""
  },
  {
    "path": "lib/nms/cpu_nms.pyx",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\ncimport numpy as np\n\ncdef inline np.float32_t max(np.float32_t a, np.float32_t b):\n    return a if a >= b else b\n\ncdef inline np.float32_t min(np.float32_t a, np.float32_t b):\n    return a if a <= b else b\n\ndef cpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):\n    cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]\n    cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]\n    cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]\n    cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]\n    cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]\n\n    cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)\n    cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1]\n\n    cdef int ndets = dets.shape[0]\n    cdef np.ndarray[np.int_t, ndim=1] suppressed = \\\n            np.zeros((ndets), dtype=np.int)\n\n    # nominal indices\n    cdef int _i, _j\n    # sorted indices\n    cdef int i, j\n    # temp variables for box i's (the box currently under consideration)\n    cdef np.float32_t ix1, iy1, ix2, iy2, iarea\n    # variables for computing overlap with box j (lower scoring box)\n    cdef np.float32_t xx1, yy1, xx2, yy2\n    cdef np.float32_t w, h\n    cdef np.float32_t inter, ovr\n\n    keep = []\n    for _i in range(ndets):\n        i = order[_i]\n        if suppressed[i] == 1:\n            continue\n        keep.append(i)\n        ix1 = x1[i]\n        iy1 = y1[i]\n        ix2 = x2[i]\n        iy2 = y2[i]\n        iarea = areas[i]\n        for _j in range(_i + 1, ndets):\n            j = order[_j]\n            if suppressed[j] == 1:\n                continue\n            xx1 = max(ix1, x1[j])\n            yy1 = max(iy1, 
y1[j])\n            xx2 = min(ix2, x2[j])\n            yy2 = min(iy2, y2[j])\n            w = max(0.0, xx2 - xx1 + 1)\n            h = max(0.0, yy2 - yy1 + 1)\n            inter = w * h\n            ovr = inter / (iarea + areas[j] - inter)\n            if ovr >= thresh:\n                suppressed[j] = 1\n\n    return keep\n"
  },
  {
    "path": "lib/nms/gpu_nms.hpp",
    "content": "void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,\n          int boxes_dim, float nms_overlap_thresh, int device_id);\n"
  },
  {
    "path": "lib/nms/gpu_nms.pyx",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\ncimport numpy as np\n\nassert sizeof(int) == sizeof(np.int32_t)\n\ncdef extern from \"gpu_nms.hpp\":\n    void _nms(np.int32_t*, int*, np.float32_t*, int, int, float, int)\n\ndef gpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh,\n            np.int32_t device_id=0):\n    cdef int boxes_num = dets.shape[0]\n    cdef int boxes_dim = dets.shape[1]\n    cdef int num_out\n    cdef np.ndarray[np.int32_t, ndim=1] \\\n        keep = np.zeros(boxes_num, dtype=np.int32)\n    cdef np.ndarray[np.float32_t, ndim=1] \\\n        scores = dets[:, 4]\n    cdef np.ndarray[np.int_t, ndim=1] \\\n        order = scores.argsort()[::-1]\n    cdef np.ndarray[np.float32_t, ndim=2] \\\n        sorted_dets = dets[order, :]\n    _nms(&keep[0], &num_out, &sorted_dets[0, 0], boxes_num, boxes_dim, thresh, device_id)\n    keep = keep[:num_out]\n    return list(order[keep])\n"
  },
  {
    "path": "lib/nms/nms_kernel.cu",
    "content": "// ------------------------------------------------------------------\n// Faster R-CNN\n// Copyright (c) 2015 Microsoft\n// Licensed under The MIT License [see fast-rcnn/LICENSE for details]\n// Written by Shaoqing Ren\n// ------------------------------------------------------------------\n\n#include \"gpu_nms.hpp\"\n#include <vector>\n#include <iostream>\n\n#define CUDA_CHECK(condition) \\\n  /* Code block avoids redefinition of cudaError_t error */ \\\n  do { \\\n    cudaError_t error = condition; \\\n    if (error != cudaSuccess) { \\\n      std::cout << cudaGetErrorString(error) << std::endl; \\\n    } \\\n  } while (0)\n\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\nint const threadsPerBlock = sizeof(unsigned long long) * 8;\n\n__device__ inline float devIoU(float const * const a, float const * const b) {\n  float left = max(a[0], b[0]), right = min(a[2], b[2]);\n  float top = max(a[1], b[1]), bottom = min(a[3], b[3]);\n  float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);\n  float interS = width * height;\n  float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);\n  float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);\n  return interS / (Sa + Sb - interS);\n}\n\n__global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,\n                           const float *dev_boxes, unsigned long long *dev_mask) {\n  const int row_start = blockIdx.y;\n  const int col_start = blockIdx.x;\n\n  // if (row_start > col_start) return;\n\n  const int row_size =\n        min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);\n  const int col_size =\n        min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);\n\n  __shared__ float block_boxes[threadsPerBlock * 5];\n  if (threadIdx.x < col_size) {\n    block_boxes[threadIdx.x * 5 + 0] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];\n    block_boxes[threadIdx.x * 5 + 1] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) 
* 5 + 1];\n    block_boxes[threadIdx.x * 5 + 2] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];\n    block_boxes[threadIdx.x * 5 + 3] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];\n    block_boxes[threadIdx.x * 5 + 4] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];\n  }\n  __syncthreads();\n\n  if (threadIdx.x < row_size) {\n    const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;\n    const float *cur_box = dev_boxes + cur_box_idx * 5;\n    int i = 0;\n    unsigned long long t = 0;\n    int start = 0;\n    if (row_start == col_start) {\n      start = threadIdx.x + 1;\n    }\n    for (i = start; i < col_size; i++) {\n      if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {\n        t |= 1ULL << i;\n      }\n    }\n    const int col_blocks = DIVUP(n_boxes, threadsPerBlock);\n    dev_mask[cur_box_idx * col_blocks + col_start] = t;\n  }\n}\n\nvoid _set_device(int device_id) {\n  int current_device;\n  CUDA_CHECK(cudaGetDevice(&current_device));\n  if (current_device == device_id) {\n    return;\n  }\n  // The call to cudaSetDevice must come before any calls to Get, which\n  // may perform initialization using the GPU.\n  CUDA_CHECK(cudaSetDevice(device_id));\n}\n\nvoid _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,\n          int boxes_dim, float nms_overlap_thresh, int device_id) {\n  _set_device(device_id);\n\n  float* boxes_dev = NULL;\n  unsigned long long* mask_dev = NULL;\n\n  const int col_blocks = DIVUP(boxes_num, threadsPerBlock);\n\n  CUDA_CHECK(cudaMalloc(&boxes_dev,\n                        boxes_num * boxes_dim * sizeof(float)));\n  CUDA_CHECK(cudaMemcpy(boxes_dev,\n                        boxes_host,\n                        boxes_num * boxes_dim * sizeof(float),\n                        cudaMemcpyHostToDevice));\n\n  CUDA_CHECK(cudaMalloc(&mask_dev,\n                        boxes_num * col_blocks * sizeof(unsigned 
long long)));\n\n  dim3 blocks(DIVUP(boxes_num, threadsPerBlock),\n              DIVUP(boxes_num, threadsPerBlock));\n  dim3 threads(threadsPerBlock);\n  nms_kernel<<<blocks, threads>>>(boxes_num,\n                                  nms_overlap_thresh,\n                                  boxes_dev,\n                                  mask_dev);\n\n  std::vector<unsigned long long> mask_host(boxes_num * col_blocks);\n  CUDA_CHECK(cudaMemcpy(&mask_host[0],\n                        mask_dev,\n                        sizeof(unsigned long long) * boxes_num * col_blocks,\n                        cudaMemcpyDeviceToHost));\n\n  std::vector<unsigned long long> remv(col_blocks);\n  memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);\n\n  int num_to_keep = 0;\n  for (int i = 0; i < boxes_num; i++) {\n    int nblock = i / threadsPerBlock;\n    int inblock = i % threadsPerBlock;\n\n    if (!(remv[nblock] & (1ULL << inblock))) {\n      keep_out[num_to_keep++] = i;\n      unsigned long long *p = &mask_host[0] + i * col_blocks;\n      for (int j = nblock; j < col_blocks; j++) {\n        remv[j] |= p[j];\n      }\n    }\n  }\n  *num_out = num_to_keep;\n\n  CUDA_CHECK(cudaFree(boxes_dev));\n  CUDA_CHECK(cudaFree(mask_dev));\n}\n"
  },
  {
    "path": "lib/nms/py_cpu_nms.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\n\ndef py_cpu_nms(dets, thresh):\n    \"\"\"Pure Python NMS baseline.\"\"\"\n    x1 = dets[:, 0]\n    y1 = dets[:, 1]\n    x2 = dets[:, 2]\n    y2 = dets[:, 3]\n    scores = dets[:, 4]\n\n    areas = (x2 - x1 + 1) * (y2 - y1 + 1)\n    order = scores.argsort()[::-1]\n\n    keep = []\n    while order.size > 0:\n        i = order[0]\n        keep.append(i)\n        xx1 = np.maximum(x1[i], x1[order[1:]])\n        yy1 = np.maximum(y1[i], y1[order[1:]])\n        xx2 = np.minimum(x2[i], x2[order[1:]])\n        yy2 = np.minimum(y2[i], y2[order[1:]])\n\n        w = np.maximum(0.0, xx2 - xx1 + 1)\n        h = np.maximum(0.0, yy2 - yy1 + 1)\n        inter = w * h\n        ovr = inter / (areas[i] + areas[order[1:]] - inter)\n\n        inds = np.where(ovr <= thresh)[0]\n        order = order[inds + 1]\n\n    return keep\n"
  },
  {
    "path": "lib/pycocotools/UPSTREAM_REV",
    "content": "https://github.com/pdollar/coco/commit/3ac47c77ebd5a1ed4254a98b7fbf2ef4765a3574\n"
  },
  {
    "path": "lib/pycocotools/__init__.py",
    "content": "__author__ = 'tylin'\n"
  },
  {
    "path": "lib/pycocotools/_mask.pyx",
    "content": "# distutils: language = c\n# distutils: sources = ../MatlabAPI/private/maskApi.c\n\n#**************************************************************************\n# Microsoft COCO Toolbox.      version 2.0\n# Data, paper, and tutorials available at:  http://mscoco.org/\n# Code written by Piotr Dollar and Tsung-Yi Lin, 2015.\n# Licensed under the Simplified BSD License [see coco/license.txt]\n#**************************************************************************\n\n__author__ = 'tsungyi'\n\n# import both Python-level and C-level symbols of Numpy\n# the API uses Numpy to interface C and Python\nimport numpy as np\ncimport numpy as np\nfrom libc.stdlib cimport malloc, free\n\n# initialize Numpy; this must be done.\nnp.import_array()\n\n# import numpy C function\n# we use PyArray_ENABLEFLAGS to make the Numpy ndarray responsible for memory management\ncdef extern from \"numpy/arrayobject.h\":\n    void PyArray_ENABLEFLAGS(np.ndarray arr, int flags)\n\n# Declare the prototype of the C functions in MaskApi.h\ncdef extern from \"maskApi.h\":\n    ctypedef unsigned int uint\n    ctypedef unsigned long siz\n    ctypedef unsigned char byte\n    ctypedef double* BB\n    ctypedef struct RLE:\n        siz h,\n        siz w,\n        siz m,\n        uint* cnts,\n    void rlesInit( RLE **R, siz n )\n    void rleEncode( RLE *R, const byte *M, siz h, siz w, siz n )\n    void rleDecode( const RLE *R, byte *mask, siz n )\n    void rleMerge( const RLE *R, RLE *M, siz n, bint intersect )\n    void rleArea( const RLE *R, siz n, uint *a )\n    void rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o )\n    void bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o )\n    void rleToBbox( const RLE *R, BB bb, siz n )\n    void rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n )\n    void rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w )\n    char* rleToString( const RLE *R )\n    void rleFrString( RLE *R, char *s, siz h, siz w )\n\n# python class to wrap 
RLE array in C\n# the class handles the memory allocation and deallocation\ncdef class RLEs:\n    cdef RLE *_R\n    cdef siz _n\n\n    def __cinit__(self, siz n =0):\n        rlesInit(&self._R, n)\n        self._n = n\n\n    # free the RLE array here\n    def __dealloc__(self):\n        if self._R is not NULL:\n            for i in range(self._n):\n                free(self._R[i].cnts)\n            free(self._R)\n    def __getattr__(self, key):\n        if key == 'n':\n            return self._n\n        raise AttributeError(key)\n\n# python class to wrap Mask array in C\n# the class handles the memory allocation and deallocation\ncdef class Masks:\n    cdef byte *_mask\n    cdef siz _h\n    cdef siz _w\n    cdef siz _n\n\n    def __cinit__(self, h, w, n):\n        self._mask = <byte*> malloc(h*w*n* sizeof(byte))\n        self._h = h\n        self._w = w\n        self._n = n\n    # def __dealloc__(self):\n        # the memory management of _mask has been passed to np.ndarray\n        # it doesn't need to be freed here\n\n    # called when passing into np.array() and return an np.ndarray in column-major order\n    def __array__(self):\n        cdef np.npy_intp shape[1]\n        shape[0] = <np.npy_intp> self._h*self._w*self._n\n        # Create a 1D array, and reshape it to fortran/Matlab column-major array\n        ndarray = np.PyArray_SimpleNewFromData(1, shape, np.NPY_UINT8, self._mask).reshape((self._h, self._w, self._n), order='F')\n        # The _mask allocated by Masks is now handled by ndarray\n        PyArray_ENABLEFLAGS(ndarray, np.NPY_OWNDATA)\n        return ndarray\n\n# internal conversion from Python RLEs object to compressed RLE format\ndef _toString(RLEs Rs):\n    cdef siz n = Rs.n\n    cdef bytes py_string\n    cdef char* c_string\n    objs = []\n    for i in range(n):\n        c_string = rleToString( <RLE*> &Rs._R[i] )\n        py_string = c_string\n        objs.append({\n            'size': [Rs._R[i].h, Rs._R[i].w],\n            'counts': 
py_string\n        })\n        free(c_string)\n    return objs\n\n# internal conversion from compressed RLE format to Python RLEs object\ndef _frString(rleObjs):\n    cdef siz n = len(rleObjs)\n    Rs = RLEs(n)\n    cdef bytes py_string\n    cdef char* c_string\n    for i, obj in enumerate(rleObjs):\n        py_string = str(obj['counts'])\n        c_string = py_string\n        rleFrString( <RLE*> &Rs._R[i], <char*> c_string, obj['size'][0], obj['size'][1] )\n    return Rs\n\n# encode mask to RLEs objects\n# list of RLE string can be generated by RLEs member function\ndef encode(np.ndarray[np.uint8_t, ndim=3, mode='fortran'] mask):\n    h, w, n = mask.shape[0], mask.shape[1], mask.shape[2]\n    cdef RLEs Rs = RLEs(n)\n    rleEncode(Rs._R,<byte*>mask.data,h,w,n)\n    objs = _toString(Rs)\n    return objs\n\n# decode mask from compressed list of RLE string or RLEs object\ndef decode(rleObjs):\n    cdef RLEs Rs = _frString(rleObjs)\n    h, w, n = Rs._R[0].h, Rs._R[0].w, Rs._n\n    masks = Masks(h, w, n)\n    rleDecode( <RLE*>Rs._R, masks._mask, n );\n    return np.array(masks)\n\ndef merge(rleObjs, bint intersect=0):\n    cdef RLEs Rs = _frString(rleObjs)\n    cdef RLEs R = RLEs(1)\n    rleMerge(<RLE*>Rs._R, <RLE*> R._R, <siz> Rs._n, intersect)\n    obj = _toString(R)[0]\n    return obj\n\ndef area(rleObjs):\n    cdef RLEs Rs = _frString(rleObjs)\n    cdef uint* _a = <uint*> malloc(Rs._n* sizeof(uint))\n    rleArea(Rs._R, Rs._n, _a)\n    cdef np.npy_intp shape[1]\n    shape[0] = <np.npy_intp> Rs._n\n    a = np.array((Rs._n, ), dtype=np.uint8)\n    a = np.PyArray_SimpleNewFromData(1, shape, np.NPY_UINT32, _a)\n    PyArray_ENABLEFLAGS(a, np.NPY_OWNDATA)\n    return a\n\n# iou computation. 
support function overload (RLEs-RLEs and bbox-bbox).\ndef iou( dt, gt, pyiscrowd ):\n    def _preproc(objs):\n        if len(objs) == 0:\n            return objs\n        if type(objs) == np.ndarray:\n            if len(objs.shape) == 1:\n                objs = objs.reshape((objs[0], 1))\n            # check if it's Nx4 bbox\n            if not len(objs.shape) == 2 or not objs.shape[1] == 4:\n                raise Exception('numpy ndarray input is only for *bounding boxes* and should have Nx4 dimension')\n            objs = objs.astype(np.double)\n        elif type(objs) == list:\n            # check if list is in box format and convert it to np.ndarray\n            isbox = np.all(np.array([(len(obj)==4) and ((type(obj)==list) or (type(obj)==np.ndarray)) for obj in objs]))\n            isrle = np.all(np.array([type(obj) == dict for obj in objs]))\n            if isbox:\n                objs = np.array(objs, dtype=np.double)\n                if len(objs.shape) == 1:\n                    objs = objs.reshape((1,objs.shape[0]))\n            elif isrle:\n                objs = _frString(objs)\n            else:\n                raise Exception('list input can be bounding box (Nx4) or RLEs ([RLE])')\n        else:\n            raise Exception('unrecognized type.  
The following type: RLEs (rle), np.ndarray (box), and list (box) are supported.')\n        return objs\n    def _rleIou(RLEs dt, RLEs gt, np.ndarray[np.uint8_t, ndim=1] iscrowd, siz m, siz n, np.ndarray[np.double_t,  ndim=1] _iou):\n        rleIou( <RLE*> dt._R, <RLE*> gt._R, m, n, <byte*> iscrowd.data, <double*> _iou.data )\n    def _bbIou(np.ndarray[np.double_t, ndim=2] dt, np.ndarray[np.double_t, ndim=2] gt, np.ndarray[np.uint8_t, ndim=1] iscrowd, siz m, siz n, np.ndarray[np.double_t, ndim=1] _iou):\n        bbIou( <BB> dt.data, <BB> gt.data, m, n, <byte*> iscrowd.data, <double*>_iou.data )\n    def _len(obj):\n        cdef siz N = 0\n        if type(obj) == RLEs:\n            N = obj.n\n        elif len(obj)==0:\n            pass\n        elif type(obj) == np.ndarray:\n            N = obj.shape[0]\n        return N\n    # convert iscrowd to numpy array\n    cdef np.ndarray[np.uint8_t, ndim=1] iscrowd = np.array(pyiscrowd, dtype=np.uint8)\n    # simple type checking\n    cdef siz m, n\n    dt = _preproc(dt)\n    gt = _preproc(gt)\n    m = _len(dt)\n    n = _len(gt)\n    if m == 0 or n == 0:\n        return []\n    if not type(dt) == type(gt):\n        raise Exception('The dt and gt should have the same data type, either RLEs, list or np.ndarray')\n\n    # define local variables\n    cdef double* _iou = <double*> 0\n    cdef np.npy_intp shape[1]\n    # check type and assign iou function\n    if type(dt) == RLEs:\n        _iouFun = _rleIou\n    elif type(dt) == np.ndarray:\n        _iouFun = _bbIou\n    else:\n        raise Exception('input data type not allowed.')\n    _iou = <double*> malloc(m*n* sizeof(double))\n    iou = np.zeros((m*n, ), dtype=np.double)\n    shape[0] = <np.npy_intp> m*n\n    iou = np.PyArray_SimpleNewFromData(1, shape, np.NPY_DOUBLE, _iou)\n    PyArray_ENABLEFLAGS(iou, np.NPY_OWNDATA)\n    _iouFun(dt, gt, iscrowd, m, n, iou)\n    return iou.reshape((m,n), order='F')\n\ndef toBbox( rleObjs ):\n    cdef RLEs Rs = _frString(rleObjs)\n    cdef 
siz n = Rs.n\n    cdef BB _bb = <BB> malloc(4*n* sizeof(double))\n    rleToBbox( <const RLE*> Rs._R, _bb, n )\n    cdef np.npy_intp shape[1]\n    shape[0] = <np.npy_intp> 4*n\n    bb = np.array((1,4*n), dtype=np.double)\n    bb = np.PyArray_SimpleNewFromData(1, shape, np.NPY_DOUBLE, _bb).reshape((n, 4))\n    PyArray_ENABLEFLAGS(bb, np.NPY_OWNDATA)\n    return bb\n\ndef frBbox(np.ndarray[np.double_t, ndim=2] bb, siz h, siz w ):\n    cdef siz n = bb.shape[0]\n    Rs = RLEs(n)\n    rleFrBbox( <RLE*> Rs._R, <const BB> bb.data, h, w, n )\n    objs = _toString(Rs)\n    return objs\n\ndef frPoly( poly, siz h, siz w ):\n    cdef np.ndarray[np.double_t, ndim=1] np_poly\n    n = len(poly)\n    Rs = RLEs(n)\n    for i, p in enumerate(poly):\n        np_poly = np.array(p, dtype=np.double, order='F')\n        rleFrPoly( <RLE*>&Rs._R[i], <const double*> np_poly.data, len(np_poly)/2, h, w )\n    objs = _toString(Rs)\n    return objs\n\ndef frUncompressedRLE(ucRles, siz h, siz w):\n    cdef np.ndarray[np.uint32_t, ndim=1] cnts\n    cdef RLE R\n    cdef uint *data\n    n = len(ucRles)\n    objs = []\n    for i in range(n):\n        Rs = RLEs(1)\n        cnts = np.array(ucRles[i]['counts'], dtype=np.uint32)\n        # time for malloc can be saved here but it's fine\n        data = <uint*> malloc(len(cnts)* sizeof(uint))\n        for j in range(len(cnts)):\n            data[j] = <uint> cnts[j]\n        R = RLE(ucRles[i]['size'][0], ucRles[i]['size'][1], len(cnts), <uint*> data)\n        Rs._R[0] = R\n        objs.append(_toString(Rs)[0])\n    return objs\n\ndef frPyObjects(pyobj, siz h, w):\n    if type(pyobj) == np.ndarray:\n        objs = frBbox(pyobj, h, w )\n    elif type(pyobj) == list and len(pyobj[0]) == 4:\n        objs = frBbox(pyobj, h, w )\n    elif type(pyobj) == list and len(pyobj[0]) > 4:\n        objs = frPoly(pyobj, h, w )\n    elif type(pyobj) == list and type(pyobj[0]) == dict:\n        objs = frUncompressedRLE(pyobj, h, w)\n    else:\n        raise Exception('input 
type is not supported.')\n    return objs\n"
  },
  {
    "path": "lib/pycocotools/coco.py",
    "content": "__author__ = 'tylin'\n__version__ = '1.0.1'\n# Interface for accessing the Microsoft COCO dataset.\n\n# Microsoft COCO is a large image dataset designed for object detection,\n# segmentation, and caption generation. pycocotools is a Python API that\n# assists in loading, parsing and visualizing the annotations in COCO.\n# Please visit http://mscoco.org/ for more information on COCO, including\n# for the data, paper, and tutorials. The exact format of the annotations\n# is also described on the COCO website. For example usage of the pycocotools\n# please see pycocotools_demo.ipynb. In addition to this API, please download both\n# the COCO images and annotations in order to run the demo.\n\n# An alternative to using the API is to load the annotations directly\n# into Python dictionary\n# Using the API provides additional utility functions. Note that this API\n# supports both *instance* and *caption* annotations. In the case of\n# captions not all functions are defined (e.g. 
categories are undefined).\n\n# The following API functions are defined:\n#  COCO       - COCO api class that loads a COCO annotation file and prepares data structures.\n#  decodeMask - Decode binary mask M encoded via run-length encoding.\n#  encodeMask - Encode binary mask M using run-length encoding.\n#  getAnnIds  - Get ann ids that satisfy given filter conditions.\n#  getCatIds  - Get cat ids that satisfy given filter conditions.\n#  getImgIds  - Get img ids that satisfy given filter conditions.\n#  loadAnns   - Load anns with the specified ids.\n#  loadCats   - Load cats with the specified ids.\n#  loadImgs   - Load imgs with the specified ids.\n#  segToMask  - Convert polygon segmentation to binary mask.\n#  showAnns   - Display the specified annotations.\n#  loadRes    - Load algorithm results and create API for accessing them.\n#  download   - Download COCO images from mscoco.org server.\n# Throughout the API \"ann\"=annotation, \"cat\"=category, and \"img\"=image.\n# Help on each function can be accessed by: \"help COCO>function\".\n\n# See also COCO>decodeMask,\n# COCO>encodeMask, COCO>getAnnIds, COCO>getCatIds,\n# COCO>getImgIds, COCO>loadAnns, COCO>loadCats,\n# COCO>loadImgs, COCO>segToMask, COCO>showAnns\n\n# Microsoft COCO Toolbox.      
version 2.0\n# Data, paper, and tutorials available at:  http://mscoco.org/\n# Code written by Piotr Dollar and Tsung-Yi Lin, 2014.\n# Licensed under the Simplified BSD License [see bsd.txt]\n\nimport json\nimport datetime\nimport time\nimport matplotlib.pyplot as plt\nfrom matplotlib.collections import PatchCollection\nfrom matplotlib.patches import Polygon\nimport numpy as np\nfrom skimage.draw import polygon\nimport urllib\nimport copy\nimport itertools\nimport mask\nimport os\n\nclass COCO:\n    def __init__(self, annotation_file=None):\n        \"\"\"\n        Constructor of Microsoft COCO helper class for reading and visualizing annotations.\n        :param annotation_file (str): location of annotation file\n        :param image_folder (str): location to the folder that hosts images.\n        :return:\n        \"\"\"\n        # load dataset\n        self.dataset = {}\n        self.anns = []\n        self.imgToAnns = {}\n        self.catToImgs = {}\n        self.imgs = {}\n        self.cats = {}\n        if not annotation_file == None:\n            print 'loading annotations into memory...'\n            tic = time.time()\n            dataset = json.load(open(annotation_file, 'r'))\n            print 'Done (t=%0.2fs)'%(time.time()- tic)\n            self.dataset = dataset\n            self.createIndex()\n\n    def createIndex(self):\n        # create index\n        print 'creating index...'\n        anns = {}\n        imgToAnns = {}\n        catToImgs = {}\n        cats = {}\n        imgs = {}\n        if 'annotations' in self.dataset:\n            imgToAnns = {ann['image_id']: [] for ann in self.dataset['annotations']}\n            anns =      {ann['id']:       [] for ann in self.dataset['annotations']}\n            for ann in self.dataset['annotations']:\n                imgToAnns[ann['image_id']] += [ann]\n                anns[ann['id']] = ann\n\n        if 'images' in self.dataset:\n            imgs      = {im['id']: {} for im in self.dataset['images']}\n   
         for img in self.dataset['images']:\n                imgs[img['id']] = img\n\n        if 'categories' in self.dataset:\n            cats = {cat['id']: [] for cat in self.dataset['categories']}\n            for cat in self.dataset['categories']:\n                cats[cat['id']] = cat\n            catToImgs = {cat['id']: [] for cat in self.dataset['categories']}\n            if 'annotations' in self.dataset:\n                for ann in self.dataset['annotations']:\n                    catToImgs[ann['category_id']] += [ann['image_id']]\n\n        print 'index created!'\n\n        # create class members\n        self.anns = anns\n        self.imgToAnns = imgToAnns\n        self.catToImgs = catToImgs\n        self.imgs = imgs\n        self.cats = cats\n\n    def info(self):\n        \"\"\"\n        Print information about the annotation file.\n        :return:\n        \"\"\"\n        for key, value in self.dataset['info'].items():\n            print '%s: %s'%(key, value)\n\n    def getAnnIds(self, imgIds=[], catIds=[], areaRng=[], iscrowd=None):\n        \"\"\"\n        Get ann ids that satisfy given filter conditions. default skips that filter\n        :param imgIds  (int array)     : get anns for given imgs\n               catIds  (int array)     : get anns for given cats\n               areaRng (float array)   : get anns for given area range (e.g. 
[0 inf])\n               iscrowd (boolean)       : get anns for given crowd label (False or True)\n        :return: ids (int array)       : integer array of ann ids\n        \"\"\"\n        imgIds = imgIds if type(imgIds) == list else [imgIds]\n        catIds = catIds if type(catIds) == list else [catIds]\n\n        if len(imgIds) == len(catIds) == len(areaRng) == 0:\n            anns = self.dataset['annotations']\n        else:\n            if not len(imgIds) == 0:\n                # this can be changed by defaultdict\n                lists = [self.imgToAnns[imgId] for imgId in imgIds if imgId in self.imgToAnns]\n                anns = list(itertools.chain.from_iterable(lists))\n            else:\n                anns = self.dataset['annotations']\n            anns = anns if len(catIds)  == 0 else [ann for ann in anns if ann['category_id'] in catIds]\n            anns = anns if len(areaRng) == 0 else [ann for ann in anns if ann['area'] > areaRng[0] and ann['area'] < areaRng[1]]\n        if not iscrowd == None:\n            ids = [ann['id'] for ann in anns if ann['iscrowd'] == iscrowd]\n        else:\n            ids = [ann['id'] for ann in anns]\n        return ids\n\n    def getCatIds(self, catNms=[], supNms=[], catIds=[]):\n        \"\"\"\n        filtering parameters. 
default skips that filter.\n        :param catNms (str array)  : get cats for given cat names\n        :param supNms (str array)  : get cats for given supercategory names\n        :param catIds (int array)  : get cats for given cat ids\n        :return: ids (int array)   : integer array of cat ids\n        \"\"\"\n        catNms = catNms if type(catNms) == list else [catNms]\n        supNms = supNms if type(supNms) == list else [supNms]\n        catIds = catIds if type(catIds) == list else [catIds]\n\n        if len(catNms) == len(supNms) == len(catIds) == 0:\n            cats = self.dataset['categories']\n        else:\n            cats = self.dataset['categories']\n            cats = cats if len(catNms) == 0 else [cat for cat in cats if cat['name']          in catNms]\n            cats = cats if len(supNms) == 0 else [cat for cat in cats if cat['supercategory'] in supNms]\n            cats = cats if len(catIds) == 0 else [cat for cat in cats if cat['id']            in catIds]\n        ids = [cat['id'] for cat in cats]\n        return ids\n\n    def getImgIds(self, imgIds=[], catIds=[]):\n        '''\n        Get img ids that satisfy given filter conditions.\n        :param imgIds (int array) : get imgs for given ids\n        :param catIds (int array) : get imgs with all given cats\n        :return: ids (int array)  : integer array of img ids\n        '''\n        imgIds = imgIds if type(imgIds) == list else [imgIds]\n        catIds = catIds if type(catIds) == list else [catIds]\n\n        if len(imgIds) == len(catIds) == 0:\n            ids = self.imgs.keys()\n        else:\n            ids = set(imgIds)\n            for i, catId in enumerate(catIds):\n                if i == 0 and len(ids) == 0:\n                    ids = set(self.catToImgs[catId])\n                else:\n                    ids &= set(self.catToImgs[catId])\n        return list(ids)\n\n    def loadAnns(self, ids=[]):\n        \"\"\"\n        Load anns with the specified ids.\n        :param ids 
(int array)       : integer ids specifying anns\n        :return: anns (object array) : loaded ann objects\n        \"\"\"\n        if type(ids) == list:\n            return [self.anns[id] for id in ids]\n        elif type(ids) == int:\n            return [self.anns[ids]]\n\n    def loadCats(self, ids=[]):\n        \"\"\"\n        Load cats with the specified ids.\n        :param ids (int array)       : integer ids specifying cats\n        :return: cats (object array) : loaded cat objects\n        \"\"\"\n        if type(ids) == list:\n            return [self.cats[id] for id in ids]\n        elif type(ids) == int:\n            return [self.cats[ids]]\n\n    def loadImgs(self, ids=[]):\n        \"\"\"\n        Load anns with the specified ids.\n        :param ids (int array)       : integer ids specifying img\n        :return: imgs (object array) : loaded img objects\n        \"\"\"\n        if type(ids) == list:\n            return [self.imgs[id] for id in ids]\n        elif type(ids) == int:\n            return [self.imgs[ids]]\n\n    def showAnns(self, anns):\n        \"\"\"\n        Display the specified annotations.\n        :param anns (array of object): annotations to display\n        :return: None\n        \"\"\"\n        if len(anns) == 0:\n            return 0\n        if 'segmentation' in anns[0]:\n            datasetType = 'instances'\n        elif 'caption' in anns[0]:\n            datasetType = 'captions'\n        if datasetType == 'instances':\n            ax = plt.gca()\n            polygons = []\n            color = []\n            for ann in anns:\n                c = np.random.random((1, 3)).tolist()[0]\n                if type(ann['segmentation']) == list:\n                    # polygon\n                    for seg in ann['segmentation']:\n                        poly = np.array(seg).reshape((len(seg)/2, 2))\n                        polygons.append(Polygon(poly, True,alpha=0.4))\n                        color.append(c)\n                else:\n   
                 # mask\n                    t = self.imgs[ann['image_id']]\n                    if type(ann['segmentation']['counts']) == list:\n                        rle = mask.frPyObjects([ann['segmentation']], t['height'], t['width'])\n                    else:\n                        rle = [ann['segmentation']]\n                    m = mask.decode(rle)\n                    img = np.ones( (m.shape[0], m.shape[1], 3) )\n                    if ann['iscrowd'] == 1:\n                        color_mask = np.array([2.0,166.0,101.0])/255\n                    if ann['iscrowd'] == 0:\n                        color_mask = np.random.random((1, 3)).tolist()[0]\n                    for i in range(3):\n                        img[:,:,i] = color_mask[i]\n                    ax.imshow(np.dstack( (img, m*0.5) ))\n            p = PatchCollection(polygons, facecolors=color, edgecolors=(0,0,0,1), linewidths=3, alpha=0.4)\n            ax.add_collection(p)\n        elif datasetType == 'captions':\n            for ann in anns:\n                print ann['caption']\n\n    def loadRes(self, resFile):\n        \"\"\"\n        Load result file and return a result api object.\n        :param   resFile (str)     : file name of result file\n        :return: res (obj)         : result api object\n        \"\"\"\n        res = COCO()\n        res.dataset['images'] = [img for img in self.dataset['images']]\n        # res.dataset['info'] = copy.deepcopy(self.dataset['info'])\n        # res.dataset['licenses'] = copy.deepcopy(self.dataset['licenses'])\n\n        print 'Loading and preparing results...     
'\n        tic = time.time()\n        anns    = json.load(open(resFile))\n        assert type(anns) == list, 'results is not an array of objects'\n        annsImgIds = [ann['image_id'] for ann in anns]\n        assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \\\n               'Results do not correspond to current coco set'\n        if 'caption' in anns[0]:\n            imgIds = set([img['id'] for img in res.dataset['images']]) & set([ann['image_id'] for ann in anns])\n            res.dataset['images'] = [img for img in res.dataset['images'] if img['id'] in imgIds]\n            for id, ann in enumerate(anns):\n                ann['id'] = id+1\n        elif 'bbox' in anns[0] and not anns[0]['bbox'] == []:\n            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])\n            for id, ann in enumerate(anns):\n                bb = ann['bbox']\n                x1, x2, y1, y2 = [bb[0], bb[0]+bb[2], bb[1], bb[1]+bb[3]]\n                if not 'segmentation' in ann:\n                    ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]\n                ann['area'] = bb[2]*bb[3]\n                ann['id'] = id+1\n                ann['iscrowd'] = 0\n        elif 'segmentation' in anns[0]:\n            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])\n            for id, ann in enumerate(anns):\n                # only the compressed RLE format is currently supported for segmentation results\n                ann['area'] = mask.area([ann['segmentation']])[0]\n                if not 'bbox' in ann:\n                    ann['bbox'] = mask.toBbox([ann['segmentation']])[0]\n                ann['id'] = id+1\n                ann['iscrowd'] = 0\n        print 'DONE (t=%0.2fs)'%(time.time()- tic)\n\n        res.dataset['annotations'] = anns\n        res.createIndex()\n        return res\n\n    def download( self, tarDir = None, imgIds = [] ):\n        '''\n        Download COCO images from mscoco.org server.\n        :param 
tarDir (str): target directory for the downloaded images\n               imgIds (list): images to be downloaded\n        :return:\n        '''\n        if tarDir is None:\n            print 'Please specify target directory'\n            return -1\n        if len(imgIds) == 0:\n            imgs = self.imgs.values()\n        else:\n            imgs = self.loadImgs(imgIds)\n        N = len(imgs)\n        if not os.path.exists(tarDir):\n            os.makedirs(tarDir)\n        for i, img in enumerate(imgs):\n            tic = time.time()\n            fname = os.path.join(tarDir, img['file_name'])\n            if not os.path.exists(fname):\n                urllib.urlretrieve(img['coco_url'], fname)\n            # report a 1-based count so the last message reads N/N\n            print 'downloaded %d/%d images (t=%.1fs)'%(i+1, N, time.time()- tic)\n"
  },
  {
    "path": "lib/pycocotools/cocoeval.py",
    "content": "__author__ = 'tsungyi'\n\nimport numpy as np\nimport datetime\nimport time\nfrom collections import defaultdict\nimport mask\nimport copy\n\nclass COCOeval:\n    # Interface for evaluating detection on the Microsoft COCO dataset.\n    #\n    # The usage for CocoEval is as follows:\n    #  cocoGt=..., cocoDt=...       # load dataset and results\n    #  E = CocoEval(cocoGt,cocoDt); # initialize CocoEval object\n    #  E.params.recThrs = ...;      # set parameters as desired\n    #  E.evaluate();                # run per image evaluation\n    #  E.accumulate();              # accumulate per image results\n    #  E.summarize();               # display summary metrics of results\n    # For example usage see evalDemo.m and http://mscoco.org/.\n    #\n    # The evaluation parameters are as follows (defaults in brackets):\n    #  imgIds     - [all] N img ids to use for evaluation\n    #  catIds     - [all] K cat ids to use for evaluation\n    #  iouThrs    - [.5:.05:.95] T=10 IoU thresholds for evaluation\n    #  recThrs    - [0:.01:1] R=101 recall thresholds for evaluation\n    #  areaRng    - [...] 
A=4 object area ranges for evaluation\n    #  maxDets    - [1 10 100] M=3 thresholds on max detections per image\n    #  useSegm    - [1] if true evaluate against ground-truth segments\n    #  useCats    - [1] if true use category labels for evaluation\n    # Note: if useSegm=0 the evaluation is run on bounding boxes.\n    # Note: if useCats=0 category labels are ignored as in proposal scoring.\n    # Note: multiple areaRngs [Ax2] and maxDets [Mx1] can be specified.\n    #\n    # evaluate(): evaluates detections on every image and every category and\n    # concats the results into the \"evalImgs\" with fields:\n    #  dtIds      - [1xD] id for each of the D detections (dt)\n    #  gtIds      - [1xG] id for each of the G ground truths (gt)\n    #  dtMatches  - [TxD] matching gt id at each IoU or 0\n    #  gtMatches  - [TxG] matching dt id at each IoU or 0\n    #  dtScores   - [1xD] confidence of each dt\n    #  gtIgnore   - [1xG] ignore flag for each gt\n    #  dtIgnore   - [TxD] ignore flag for each dt at each IoU\n    #\n    # accumulate(): accumulates the per-image, per-category evaluation\n    # results in \"evalImgs\" into the dictionary \"eval\" with fields:\n    #  params     - parameters used for evaluation\n    #  date       - date evaluation was performed\n    #  counts     - [T,R,K,A,M] parameter dimensions (see above)\n    #  precision  - [TxRxKxAxM] precision for every evaluation setting\n    #  recall     - [TxKxAxM] max recall for every evaluation setting\n    # Note: precision and recall==-1 for settings with no gt objects.\n    #\n    # See also coco, mask, pycocoDemo, pycocoEvalDemo\n    #\n    # Microsoft COCO Toolbox.      
version 2.0\n    # Data, paper, and tutorials available at:  http://mscoco.org/\n    # Code written by Piotr Dollar and Tsung-Yi Lin, 2015.\n    # Licensed under the Simplified BSD License [see coco/license.txt]\n    def __init__(self, cocoGt=None, cocoDt=None):\n        '''\n        Initialize CocoEval using coco APIs for gt and dt\n        :param cocoGt: coco object with ground truth annotations\n        :param cocoDt: coco object with detection results\n        :return: None\n        '''\n        self.cocoGt   = cocoGt              # ground truth COCO API\n        self.cocoDt   = cocoDt              # detections COCO API\n        self.params   = {}                  # evaluation parameters\n        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results [KxAxI] elements\n        self.eval     = {}                  # accumulated evaluation results\n        self._gts = defaultdict(list)       # gt for evaluation\n        self._dts = defaultdict(list)       # dt for evaluation\n        self.params = Params()              # parameters\n        self._paramsEval = {}               # parameters for evaluation\n        self.stats = []                     # result summarization\n        self.ious = {}                      # ious between all gts and dts\n        if not cocoGt is None:\n            self.params.imgIds = sorted(cocoGt.getImgIds())\n            self.params.catIds = sorted(cocoGt.getCatIds())\n\n\n    def _prepare(self):\n        '''\n        Prepare ._gts and ._dts for evaluation based on params\n        :return: None\n        '''\n        #\n        def _toMask(objs, coco):\n            # modify segmentation by reference\n            for obj in objs:\n                t = coco.imgs[obj['image_id']]\n                if type(obj['segmentation']) == list:\n                    if type(obj['segmentation'][0]) == dict:\n                        print 'debug'\n                    obj['segmentation'] = 
mask.frPyObjects(obj['segmentation'],t['height'],t['width'])\n                    if len(obj['segmentation']) == 1:\n                        obj['segmentation'] = obj['segmentation'][0]\n                    else:\n                        # an object can have multiple polygon regions\n                        # merge them into one RLE mask\n                        obj['segmentation'] = mask.merge(obj['segmentation'])\n                elif type(obj['segmentation']) == dict and type(obj['segmentation']['counts']) == list:\n                    obj['segmentation'] = mask.frPyObjects([obj['segmentation']],t['height'],t['width'])[0]\n                elif type(obj['segmentation']) == dict and \\\n                     (type(obj['segmentation']['counts']) == unicode or type(obj['segmentation']['counts']) == str):\n                    pass\n                else:\n                    raise Exception('segmentation format not supported.')\n        p = self.params\n        if p.useCats:\n            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))\n            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))\n        else:\n            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds))\n            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds))\n\n        if p.useSegm:\n            _toMask(gts, self.cocoGt)\n            _toMask(dts, self.cocoDt)\n        self._gts = defaultdict(list)       # gt for evaluation\n        self._dts = defaultdict(list)       # dt for evaluation\n        for gt in gts:\n            self._gts[gt['image_id'], gt['category_id']].append(gt)\n        for dt in dts:\n            self._dts[dt['image_id'], dt['category_id']].append(dt)\n        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results\n        self.eval     = {}                  # accumulated evaluation results\n\n    def evaluate(self):\n        '''\n        Run per image 
evaluation on given images and store results (a list of dict) in self.evalImgs\n        :return: None\n        '''\n        tic = time.time()\n        print 'Running per image evaluation...      '\n        p = self.params\n        p.imgIds = list(np.unique(p.imgIds))\n        if p.useCats:\n            p.catIds = list(np.unique(p.catIds))\n        p.maxDets = sorted(p.maxDets)\n        self.params=p\n\n        self._prepare()\n        # loop through images, area range, max detection number\n        catIds = p.catIds if p.useCats else [-1]\n\n        computeIoU = self.computeIoU\n        self.ious = {(imgId, catId): computeIoU(imgId, catId) \\\n                        for imgId in p.imgIds\n                        for catId in catIds}\n\n        evaluateImg = self.evaluateImg\n        maxDet = p.maxDets[-1]\n        self.evalImgs = [evaluateImg(imgId, catId, areaRng, maxDet)\n                 for catId in catIds\n                 for areaRng in p.areaRng\n                 for imgId in p.imgIds\n             ]\n        self._paramsEval = copy.deepcopy(self.params)\n        toc = time.time()\n        print 'DONE (t=%0.2fs).'%(toc-tic)\n\n    def computeIoU(self, imgId, catId):\n        p = self.params\n        if p.useCats:\n            gt = self._gts[imgId,catId]\n            dt = self._dts[imgId,catId]\n        else:\n            gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]\n            dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]\n        if len(gt) == 0 and len(dt) ==0:\n            return []\n        dt = sorted(dt, key=lambda x: -x['score'])\n        if len(dt) > p.maxDets[-1]:\n            dt=dt[0:p.maxDets[-1]]\n\n        if p.useSegm:\n            g = [g['segmentation'] for g in gt]\n            d = [d['segmentation'] for d in dt]\n        else:\n            g = [g['bbox'] for g in gt]\n            d = [d['bbox'] for d in dt]\n\n        # compute iou between each dt and gt region\n        iscrowd = [int(o['iscrowd']) for o in gt]\n 
       ious = mask.iou(d,g,iscrowd)\n        return ious\n\n    def evaluateImg(self, imgId, catId, aRng, maxDet):\n        '''\n        perform evaluation for single category and image\n        :return: dict (single image results)\n        '''\n        #\n        p = self.params\n        if p.useCats:\n            gt = self._gts[imgId,catId]\n            dt = self._dts[imgId,catId]\n        else:\n            gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]\n            dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]\n        if len(gt) == 0 and len(dt) ==0:\n            return None\n\n        for g in gt:\n            if 'ignore' not in g:\n                g['ignore'] = 0\n            if g['iscrowd'] == 1 or g['ignore'] or (g['area']<aRng[0] or g['area']>aRng[1]):\n                g['_ignore'] = 1\n            else:\n                g['_ignore'] = 0\n\n        # sort dt highest score first, sort gt ignore last\n        # gt = sorted(gt, key=lambda x: x['_ignore'])\n        gtind = [ind for (ind, g) in sorted(enumerate(gt), key=lambda (ind, g): g['_ignore']) ]\n\n        gt = [gt[ind] for ind in gtind]\n        dt = sorted(dt, key=lambda x: -x['score'])[0:maxDet]\n        iscrowd = [int(o['iscrowd']) for o in gt]\n        # load computed ious\n        N_iou = len(self.ious[imgId, catId])\n        ious = self.ious[imgId, catId][0:maxDet, np.array(gtind)] if N_iou >0 else self.ious[imgId, catId]\n\n        T = len(p.iouThrs)\n        G = len(gt)\n        D = len(dt)\n        gtm  = np.zeros((T,G))\n        dtm  = np.zeros((T,D))\n        gtIg = np.array([g['_ignore'] for g in gt])\n        dtIg = np.zeros((T,D))\n        if not len(ious)==0:\n            for tind, t in enumerate(p.iouThrs):\n                for dind, d in enumerate(dt):\n                    # information about best match so far (m=-1 -> unmatched)\n                    iou = min([t,1-1e-10])\n                    m   = -1\n                    for gind, g in enumerate(gt):\n        
                # if this gt already matched, and not a crowd, continue\n                        if gtm[tind,gind]>0 and not iscrowd[gind]:\n                            continue\n                        # if dt matched to reg gt, and on ignore gt, stop\n                        if m>-1 and gtIg[m]==0 and gtIg[gind]==1:\n                            break\n                        # continue to next gt unless better match made\n                        if ious[dind,gind] < iou:\n                            continue\n                        # match successful and best so far, store appropriately\n                        iou=ious[dind,gind]\n                        m=gind\n                    # if match made store id of match for both dt and gt\n                    if m ==-1:\n                        continue\n                    dtIg[tind,dind] = gtIg[m]\n                    dtm[tind,dind]  = gt[m]['id']\n                    gtm[tind,m]     = d['id']\n        # set unmatched detections outside of area range to ignore\n        a = np.array([d['area']<aRng[0] or d['area']>aRng[1] for d in dt]).reshape((1, len(dt)))\n        dtIg = np.logical_or(dtIg, np.logical_and(dtm==0, np.repeat(a,T,0)))\n        # store results for given image and category\n        return {\n                'image_id':     imgId,\n                'category_id':  catId,\n                'aRng':         aRng,\n                'maxDet':       maxDet,\n                'dtIds':        [d['id'] for d in dt],\n                'gtIds':        [g['id'] for g in gt],\n                'dtMatches':    dtm,\n                'gtMatches':    gtm,\n                'dtScores':     [d['score'] for d in dt],\n                'gtIgnore':     gtIg,\n                'dtIgnore':     dtIg,\n            }\n\n    def accumulate(self, p = None):\n        '''\n        Accumulate per image evaluation results and store the result in self.eval\n        :param p: input params for evaluation\n        :return: None\n        '''\n      
  print 'Accumulating evaluation results...   '\n        tic = time.time()\n        if not self.evalImgs:\n            print 'Please run evaluate() first'\n        # allows input customized parameters\n        if p is None:\n            p = self.params\n        p.catIds = p.catIds if p.useCats == 1 else [-1]\n        T           = len(p.iouThrs)\n        R           = len(p.recThrs)\n        K           = len(p.catIds) if p.useCats else 1\n        A           = len(p.areaRng)\n        M           = len(p.maxDets)\n        precision   = -np.ones((T,R,K,A,M)) # -1 for the precision of absent categories\n        recall      = -np.ones((T,K,A,M))\n\n        # create dictionary for future indexing\n        _pe = self._paramsEval\n        catIds = _pe.catIds if _pe.useCats else [-1]\n        setK = set(catIds)\n        setA = set(map(tuple, _pe.areaRng))\n        setM = set(_pe.maxDets)\n        setI = set(_pe.imgIds)\n        # get inds to evaluate\n        k_list = [n for n, k in enumerate(p.catIds)  if k in setK]\n        m_list = [m for n, m in enumerate(p.maxDets) if m in setM]\n        a_list = [n for n, a in enumerate(map(lambda x: tuple(x), p.areaRng)) if a in setA]\n        i_list = [n for n, i in enumerate(p.imgIds)  if i in setI]\n        # K0 = len(_pe.catIds)\n        I0 = len(_pe.imgIds)\n        A0 = len(_pe.areaRng)\n        # retrieve E at each category, area range, and max number of detections\n        for k, k0 in enumerate(k_list):\n            Nk = k0*A0*I0\n            for a, a0 in enumerate(a_list):\n                Na = a0*I0\n                for m, maxDet in enumerate(m_list):\n                    E = [self.evalImgs[Nk+Na+i] for i in i_list]\n                    E = filter(None, E)\n                    if len(E) == 0:\n                        continue\n                    dtScores = np.concatenate([e['dtScores'][0:maxDet] for e in E])\n\n                    # different sorting method generates slightly different results.\n                    # 
mergesort is used to be consistent as Matlab implementation.\n                    inds = np.argsort(-dtScores, kind='mergesort')\n\n                    dtm  = np.concatenate([e['dtMatches'][:,0:maxDet] for e in E], axis=1)[:,inds]\n                    dtIg = np.concatenate([e['dtIgnore'][:,0:maxDet]  for e in E], axis=1)[:,inds]\n                    gtIg = np.concatenate([e['gtIgnore']  for e in E])\n                    npig = len([ig for ig in gtIg if ig == 0])\n                    if npig == 0:\n                        continue\n                    tps = np.logical_and(               dtm,  np.logical_not(dtIg) )\n                    fps = np.logical_and(np.logical_not(dtm), np.logical_not(dtIg) )\n\n                    tp_sum = np.cumsum(tps, axis=1).astype(dtype=np.float)\n                    fp_sum = np.cumsum(fps, axis=1).astype(dtype=np.float)\n                    for t, (tp, fp) in enumerate(zip(tp_sum, fp_sum)):\n                        tp = np.array(tp)\n                        fp = np.array(fp)\n                        nd = len(tp)\n                        rc = tp / npig\n                        pr = tp / (fp+tp+np.spacing(1))\n                        q  = np.zeros((R,))\n\n                        if nd:\n                            recall[t,k,a,m] = rc[-1]\n                        else:\n                            recall[t,k,a,m] = 0\n\n                        # numpy is slow without cython optimization for accessing elements\n                        # use python array gets significant speed improvement\n                        pr = pr.tolist(); q = q.tolist()\n\n                        for i in range(nd-1, 0, -1):\n                            if pr[i] > pr[i-1]:\n                                pr[i-1] = pr[i]\n\n                        inds = np.searchsorted(rc, p.recThrs)\n                        try:\n                            for ri, pi in enumerate(inds):\n                                q[ri] = pr[pi]\n                        except:\n          
                  pass\n                        precision[t,:,k,a,m] = np.array(q)\n        self.eval = {\n            'params': p,\n            'counts': [T, R, K, A, M],\n            'date': datetime.datetime.now().strftime(\"%Y-%m-%d %H:%M:%S\"),\n            'precision': precision,\n            'recall':   recall,\n        }\n        toc = time.time()\n        print 'DONE (t=%0.2fs).'%( toc-tic )\n\n    def summarize(self):\n        '''\n        Compute and display summary metrics for evaluation results.\n        Note this function can *only* be applied on the default parameter setting\n        '''\n        def _summarize( ap=1, iouThr=None, areaRng='all', maxDets=100 ):\n            p = self.params\n            iStr        = ' {:<18} {} @[ IoU={:<9} | area={:>6} | maxDets={:>3} ] = {}'\n            titleStr    = 'Average Precision' if ap == 1 else 'Average Recall'\n            typeStr     = '(AP)' if ap==1 else '(AR)'\n            iouStr      = '%0.2f:%0.2f'%(p.iouThrs[0], p.iouThrs[-1]) if iouThr is None else '%0.2f'%(iouThr)\n            areaStr     = areaRng\n            maxDetsStr  = '%d'%(maxDets)\n\n            aind = [i for i, aRng in enumerate(['all', 'small', 'medium', 'large']) if aRng == areaRng]\n            mind = [i for i, mDet in enumerate([1, 10, 100]) if mDet == maxDets]\n            if ap == 1:\n                # dimension of precision: [TxRxKxAxM]\n                s = self.eval['precision']\n                # IoU\n                if iouThr is not None:\n                    t = np.where(iouThr == p.iouThrs)[0]\n                    s = s[t]\n                # areaRng\n                s = s[:,:,:,aind,mind]\n            else:\n                # dimension of recall: [TxKxAxM]\n                s = self.eval['recall']\n                s = s[:,:,aind,mind]\n            if len(s[s>-1])==0:\n                mean_s = -1\n            else:\n                mean_s = np.mean(s[s>-1])\n            print iStr.format(titleStr, typeStr, iouStr, areaStr, 
maxDetsStr, '%.3f'%(float(mean_s)))\n            return mean_s\n\n        if not self.eval:\n            raise Exception('Please run accumulate() first')\n        self.stats = np.zeros((12,))\n        self.stats[0] = _summarize(1)\n        self.stats[1] = _summarize(1,iouThr=.5)\n        self.stats[2] = _summarize(1,iouThr=.75)\n        self.stats[3] = _summarize(1,areaRng='small')\n        self.stats[4] = _summarize(1,areaRng='medium')\n        self.stats[5] = _summarize(1,areaRng='large')\n        self.stats[6] = _summarize(0,maxDets=1)\n        self.stats[7] = _summarize(0,maxDets=10)\n        self.stats[8] = _summarize(0,maxDets=100)\n        self.stats[9]  = _summarize(0,areaRng='small')\n        self.stats[10] = _summarize(0,areaRng='medium')\n        self.stats[11] = _summarize(0,areaRng='large')\n\n    def __str__(self):\n        self.summarize()\n\nclass Params:\n    '''\n    Params for coco evaluation api\n    '''\n    def __init__(self):\n        self.imgIds = []\n        self.catIds = []\n        # np.arange causes trouble.  the data point on arange is slightly larger than the true value\n        self.iouThrs = np.linspace(.5, 0.95, np.round((0.95-.5)/.05)+1, endpoint=True)\n        self.recThrs = np.linspace(.0, 1.00, np.round((1.00-.0)/.01)+1, endpoint=True)\n        self.maxDets = [1,10,100]\n        self.areaRng = [ [0**2,1e5**2], [0**2, 32**2], [32**2, 96**2], [96**2, 1e5**2] ]\n        self.useSegm = 0\n        self.useCats = 1"
  },
  {
    "path": "lib/pycocotools/license.txt",
    "content": "Copyright (c) 2014, Piotr Dollar and Tsung-Yi Lin\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met: \n\n1. Redistributions of source code must retain the above copyright notice, this\n   list of conditions and the following disclaimer. \n2. Redistributions in binary form must reproduce the above copyright notice,\n   this list of conditions and the following disclaimer in the documentation\n   and/or other materials provided with the distribution. \n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND\nANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\nWARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR\nANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES\n(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\nLOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND\nON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\nSOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nThe views and conclusions contained in the software and documentation are those\nof the authors and should not be interpreted as representing official policies, \neither expressed or implied, of the FreeBSD Project.\n"
  },
  {
    "path": "lib/pycocotools/mask.py",
    "content": "__author__ = 'tsungyi'\n\nimport pycocotools._mask as _mask\n\n# Interface for manipulating masks stored in RLE format.\n#\n# RLE is a simple yet efficient format for storing binary masks. RLE\n# first divides a vector (or vectorized image) into a series of piecewise\n# constant regions and then for each piece simply stores the length of\n# that piece. For example, given M=[0 0 1 1 1 0 1] the RLE counts would\n# be [2 3 1 1], or for M=[1 1 1 1 1 1 0] the counts would be [0 6 1]\n# (note that the odd counts are always the numbers of zeros). Instead of\n# storing the counts directly, additional compression is achieved with a\n# variable bitrate representation based on a common scheme called LEB128.\n#\n# Compression is greatest given large piecewise constant regions.\n# Specifically, the size of the RLE is proportional to the number of\n# *boundaries* in M (or for an image the number of boundaries in the y\n# direction). Assuming fairly simple shapes, the RLE representation is\n# O(sqrt(n)) where n is number of pixels in the object. Hence space usage\n# is substantially lower, especially for large simple objects (large n).\n#\n# Many common operations on masks can be computed directly using the RLE\n# (without need for decoding). This includes computations such as area,\n# union, intersection, etc. All of these operations are linear in the\n# size of the RLE, in other words they are O(sqrt(n)) where n is the area\n# of the object. 
Computing these operations on the original mask is O(n).\n# Thus, using the RLE can result in substantial computational savings.\n#\n# The following API functions are defined:\n#  encode         - Encode binary masks using RLE.\n#  decode         - Decode binary masks encoded via RLE.\n#  merge          - Compute union or intersection of encoded masks.\n#  iou            - Compute intersection over union between masks.\n#  area           - Compute area of encoded masks.\n#  toBbox         - Get bounding boxes surrounding encoded masks.\n#  frPyObjects    - Convert polygon, bbox, and uncompressed RLE to encoded RLE mask.\n#\n# Usage:\n#  Rs     = encode( masks )\n#  masks  = decode( Rs )\n#  R      = merge( Rs, intersect=false )\n#  o      = iou( dt, gt, iscrowd )\n#  a      = area( Rs )\n#  bbs    = toBbox( Rs )\n#  Rs     = frPyObjects( [pyObjects], h, w )\n#\n# In the API the following formats are used:\n#  Rs      - [dict] Run-length encoding of binary masks\n#  R       - dict Run-length encoding of binary mask\n#  masks   - [hxwxn] Binary mask(s) (must have type np.ndarray(dtype=uint8) in column-major order)\n#  iscrowd - [nx1] list of np.ndarray. 1 indicates corresponding gt image has crowd region to ignore\n#  bbs     - [nx4] Bounding box(es) stored as [x y w h]\n#  poly    - Polygon stored as [[x1 y1 x2 y2...],[x1 y1 ...],...] (2D list)\n#  dt,gt   - May be either bounding boxes or encoded masks\n# Both poly and bbs are 0-indexed (bbox=[0 0 1 1] encloses first pixel).\n#\n# Finally, a note about the intersection over union (iou) computation.\n# The standard iou of a ground truth (gt) and detected (dt) object is\n#  iou(gt,dt) = area(intersect(gt,dt)) / area(union(gt,dt))\n# For \"crowd\" regions, we use a modified criteria. If a gt object is\n# marked as \"iscrowd\", we allow a dt to match any subregion of the gt.\n# Choosing gt' in the crowd gt that best matches the dt can be done using\n# gt'=intersect(dt,gt). 
Since by definition union(gt',dt)=dt, computing\n#  iou(gt,dt,iscrowd) = iou(gt',dt) = area(intersect(gt,dt)) / area(dt)\n# For crowd gt regions we use this modified criteria above for the iou.\n#\n# To compile run \"python setup.py build_ext --inplace\"\n# Please do not contact us for help with compiling.\n#\n# Microsoft COCO Toolbox.      version 2.0\n# Data, paper, and tutorials available at:  http://mscoco.org/\n# Code written by Piotr Dollar and Tsung-Yi Lin, 2015.\n# Licensed under the Simplified BSD License [see coco/license.txt]\n\nencode      = _mask.encode\ndecode      = _mask.decode\niou         = _mask.iou\nmerge       = _mask.merge\narea        = _mask.area\ntoBbox      = _mask.toBbox\nfrPyObjects = _mask.frPyObjects"
  },
  {
    "path": "lib/pycocotools/maskApi.c",
    "content": "/**************************************************************************\n* Microsoft COCO Toolbox.      version 2.0\n* Data, paper, and tutorials available at:  http://mscoco.org/\n* Code written by Piotr Dollar and Tsung-Yi Lin, 2015.\n* Licensed under the Simplified BSD License [see coco/license.txt]\n**************************************************************************/\n#include \"maskApi.h\"\n#include <math.h>\n#include <stdlib.h>\n\nuint umin( uint a, uint b ) { return (a<b) ? a : b; }\nuint umax( uint a, uint b ) { return (a>b) ? a : b; }\n\nvoid rleInit( RLE *R, siz h, siz w, siz m, uint *cnts ) {\n  R->h=h; R->w=w; R->m=m; R->cnts=(m==0)?0:malloc(sizeof(uint)*m);\n  if(cnts) for(siz j=0; j<m; j++) R->cnts[j]=cnts[j];\n}\n\nvoid rleFree( RLE *R ) {\n  free(R->cnts); R->cnts=0;\n}\n\nvoid rlesInit( RLE **R, siz n ) {\n  *R = (RLE*) malloc(sizeof(RLE)*n);\n  for(siz i=0; i<n; i++) rleInit((*R)+i,0,0,0,0);\n}\n\nvoid rlesFree( RLE **R, siz n ) {\n  for(siz i=0; i<n; i++) rleFree((*R)+i); free(*R); *R=0;\n}\n\nvoid rleEncode( RLE *R, const byte *M, siz h, siz w, siz n ) {\n  siz i, j, k, a=w*h; uint c, *cnts; byte p;\n  cnts = malloc(sizeof(uint)*(a+1));\n  for(i=0; i<n; i++) {\n    const byte *T=M+a*i; k=0; p=0; c=0;\n    for(j=0; j<a; j++) { if(T[j]!=p) { cnts[k++]=c; c=0; p=T[j]; } c++; }\n    cnts[k++]=c; rleInit(R+i,h,w,k,cnts);\n  }\n  free(cnts);\n}\n\nvoid rleDecode( const RLE *R, byte *M, siz n ) {\n  for( siz i=0; i<n; i++ ) {\n    byte v=0; for( siz j=0; j<R[i].m; j++ ) {\n      for( siz k=0; k<R[i].cnts[j]; k++ ) *(M++)=v; v=!v; }}\n}\n\nvoid rleMerge( const RLE *R, RLE *M, siz n, bool intersect ) {\n  uint *cnts, c, ca, cb, cc, ct; bool v, va, vb, vp;\n  siz i, a, b, h=R[0].h, w=R[0].w, m=R[0].m; RLE A, B;\n  if(n==0) { rleInit(M,0,0,0,0); return; }\n  if(n==1) { rleInit(M,h,w,m,R[0].cnts); return; }\n  cnts = malloc(sizeof(uint)*(h*w+1));\n  for( a=0; a<m; a++ ) cnts[a]=R[0].cnts[a];\n  for( i=1; i<n; i++ ) {\n    B=R[i]; 
if(B.h!=h||B.w!=w) { h=w=m=0; break; }\n    rleInit(&A,h,w,m,cnts); ca=A.cnts[0]; cb=B.cnts[0];\n    v=va=vb=0; m=0; a=b=1; cc=0; ct=1;\n    while( ct>0 ) {\n      c=umin(ca,cb); cc+=c; ct=0;\n      ca-=c; if(!ca && a<A.m) { ca=A.cnts[a++]; va=!va; } ct+=ca;\n      cb-=c; if(!cb && b<B.m) { cb=B.cnts[b++]; vb=!vb; } ct+=cb;\n      vp=v; if(intersect) v=va&&vb; else v=va||vb;\n      if( v!=vp||ct==0 ) { cnts[m++]=cc; cc=0; }\n    }\n    rleFree(&A);\n  }\n  rleInit(M,h,w,m,cnts); free(cnts);\n}\n\nvoid rleArea( const RLE *R, siz n, uint *a ) {\n  for( siz i=0; i<n; i++ ) {\n    a[i]=0; for( siz j=1; j<R[i].m; j+=2 ) a[i]+=R[i].cnts[j]; }\n}\n\nvoid rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o ) {\n  siz g, d; BB db, gb; bool crowd;\n  db=malloc(sizeof(double)*m*4); rleToBbox(dt,db,m);\n  gb=malloc(sizeof(double)*n*4); rleToBbox(gt,gb,n);\n  bbIou(db,gb,m,n,iscrowd,o); free(db); free(gb);\n  for( g=0; g<n; g++ ) for( d=0; d<m; d++ ) if(o[g*m+d]>0) {\n    crowd=iscrowd!=NULL && iscrowd[g];\n    if(dt[d].h!=gt[g].h || dt[d].w!=gt[g].w) { o[g*m+d]=-1; continue; }\n    siz ka, kb, a, b; uint c, ca, cb, ct, i, u; bool va, vb;\n    ca=dt[d].cnts[0]; ka=dt[d].m; va=vb=0;\n    cb=gt[g].cnts[0]; kb=gt[g].m; a=b=1; i=u=0; ct=1;\n    while( ct>0 ) {\n      c=umin(ca,cb); if(va||vb) { u+=c; if(va&&vb) i+=c; } ct=0;\n      ca-=c; if(!ca && a<ka) { ca=dt[d].cnts[a++]; va=!va; } ct+=ca;\n      cb-=c; if(!cb && b<kb) { cb=gt[g].cnts[b++]; vb=!vb; } ct+=cb;\n    }\n    if(i==0) u=1; else if(crowd) rleArea(dt+d,1,&u);\n    o[g*m+d] = (double)i/(double)u;\n  }\n}\n\nvoid bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o ) {\n  double h, w, i, u, ga, da; siz g, d; bool crowd;\n  for( g=0; g<n; g++ ) {\n    BB G=gt+g*4; ga=G[2]*G[3]; crowd=iscrowd!=NULL && iscrowd[g];\n    for( d=0; d<m; d++ ) {\n      BB D=dt+d*4; da=D[2]*D[3]; o[g*m+d]=0;\n      w=fmin(D[2]+D[0],G[2]+G[0])-fmax(D[0],G[0]); if(w<=0) continue;\n      
h=fmin(D[3]+D[1],G[3]+G[1])-fmax(D[1],G[1]); if(h<=0) continue;\n      i=w*h; u = crowd ? da : da+ga-i; o[g*m+d]=i/u;\n    }\n  }\n}\n\nvoid rleToBbox( const RLE *R, BB bb, siz n ) {\n  for( siz i=0; i<n; i++ ) {\n    uint h, w, x, y, xs, ys, xe, ye, cc, t; siz j, m;\n    h=(uint)R[i].h; w=(uint)R[i].w; m=R[i].m;\n    m=((siz)(m/2))*2; xs=w; ys=h; xe=ye=0; cc=0;\n    if(m==0) { bb[4*i+0]=bb[4*i+1]=bb[4*i+2]=bb[4*i+3]=0; continue; }\n    for( j=0; j<m; j++ ) {\n      cc+=R[i].cnts[j]; t=cc-j%2; y=t%h; x=(t-y)/h;\n      xs=umin(xs,x); xe=umax(xe,x); ys=umin(ys,y); ye=umax(ye,y);\n    }\n    bb[4*i+0]=xs; bb[4*i+2]=xe-xs+1;\n    bb[4*i+1]=ys; bb[4*i+3]=ye-ys+1;\n  }\n}\n\nvoid rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n ) {\n  for( siz i=0; i<n; i++ ) {\n    double xs=bb[4*i+0], xe=xs+bb[4*i+2];\n    double ys=bb[4*i+1], ye=ys+bb[4*i+3];\n    double xy[8] = {xs,ys,xs,ye,xe,ye,xe,ys};\n    rleFrPoly( R+i, xy, 4, h, w );\n  }\n}\n\nint uintCompare(const void *a, const void *b) {\n  uint c=*((uint*)a), d=*((uint*)b); return c>d?1:c<d?-1:0;\n}\n\nvoid rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w ) {\n  // upsample and get discrete points densely along entire boundary\n  siz j, m=0; double scale=5; int *x, *y, *u, *v; uint *a, *b;\n  x=malloc(sizeof(int)*(k+1)); y=malloc(sizeof(int)*(k+1));\n  for(j=0; j<k; j++) x[j]=(int)(scale*xy[j*2+0]+.5); x[k]=x[0];\n  for(j=0; j<k; j++) y[j]=(int)(scale*xy[j*2+1]+.5); y[k]=y[0];\n  for(j=0; j<k; j++) m+=umax(abs(x[j]-x[j+1]),abs(y[j]-y[j+1]))+1;\n  u=malloc(sizeof(int)*m); v=malloc(sizeof(int)*m); m=0;\n  for( j=0; j<k; j++ ) {\n    int xs=x[j], xe=x[j+1], ys=y[j], ye=y[j+1], dx, dy, t;\n    bool flip; double s; dx=abs(xe-xs); dy=abs(ys-ye);\n    flip = (dx>=dy && xs>xe) || (dx<dy && ys>ye);\n    if(flip) { t=xs; xs=xe; xe=t; t=ys; ys=ye; ye=t; }\n    s = dx>=dy ? 
(double)(ye-ys)/dx : (double)(xe-xs)/dy;\n    if(dx>=dy) for( int d=0; d<=dx; d++ ) {\n      t=flip?dx-d:d; u[m]=t+xs; v[m]=(int)(ys+s*t+.5); m++;\n    } else for( int d=0; d<=dy; d++ ) {\n      t=flip?dy-d:d; v[m]=t+ys; u[m]=(int)(xs+s*t+.5); m++;\n    }\n  }\n  // get points along y-boundary and downsample\n  free(x); free(y); k=m; m=0; double xd, yd;\n  x=malloc(sizeof(int)*k); y=malloc(sizeof(int)*k);\n  for( j=1; j<k; j++ ) if(u[j]!=u[j-1]) {\n    xd=(double)(u[j]<u[j-1]?u[j]:u[j]-1); xd=(xd+.5)/scale-.5;\n    if( floor(xd)!=xd || xd<0 || xd>w-1 ) continue;\n    yd=(double)(v[j]<v[j-1]?v[j]:v[j-1]); yd=(yd+.5)/scale-.5;\n    if(yd<0) yd=0; else if(yd>h) yd=h; yd=ceil(yd);\n    x[m]=(int) xd; y[m]=(int) yd; m++;\n  }\n  // compute rle encoding given y-boundary points\n  k=m; a=malloc(sizeof(uint)*(k+1));\n  for( j=0; j<k; j++ ) a[j]=(uint)(x[j]*(int)(h)+y[j]);\n  a[k++]=(uint)(h*w); free(u); free(v); free(x); free(y);\n  qsort(a,k,sizeof(uint),uintCompare); uint p=0;\n  for( j=0; j<k; j++ ) { uint t=a[j]; a[j]-=p; p=t; }\n  b=malloc(sizeof(uint)*k); j=m=0; b[m++]=a[j++];\n  while(j<k) if(a[j]>0) b[m++]=a[j++]; else {\n    j++; if(j<k) b[m-1]+=a[j++]; }\n  rleInit(R,h,w,m,b); free(a); free(b);\n}\n\nchar* rleToString( const RLE *R ) {\n  // Similar to LEB128 but using 6 bits/char and ascii chars 48-111.\n  siz i, m=R->m, p=0; long x; bool more;\n  char *s=malloc(sizeof(char)*m*6);\n  for( i=0; i<m; i++ ) {\n    x=(long) R->cnts[i]; if(i>2) x-=(long) R->cnts[i-2]; more=1;\n    while( more ) {\n      char c=x & 0x1f; x >>= 5; more=(c & 0x10) ? 
x!=-1 : x!=0;\n      if(more) c |= 0x20; c+=48; s[p++]=c;\n    }\n  }\n  s[p]=0; return s;\n}\n\nvoid rleFrString( RLE *R, char *s, siz h, siz w ) {\n  siz m=0, p=0, k; long x; bool more; uint *cnts;\n  while( s[m] ) m++; cnts=malloc(sizeof(uint)*m); m=0;\n  while( s[p] ) {\n    x=0; k=0; more=1;\n    while( more ) {\n      char c=s[p]-48; x |= (c & 0x1f) << 5*k;\n      more = c & 0x20; p++; k++;\n      if(!more && (c & 0x10)) x |= -1 << 5*k;\n    }\n    if(m>2) x+=(long) cnts[m-2]; cnts[m++]=(uint) x;\n  }\n  rleInit(R,h,w,m,cnts); free(cnts);\n}\n"
  },
  {
    "path": "lib/pycocotools/maskApi.h",
    "content": "/**************************************************************************\n* Microsoft COCO Toolbox.      version 2.0\n* Data, paper, and tutorials available at:  http://mscoco.org/\n* Code written by Piotr Dollar and Tsung-Yi Lin, 2015.\n* Licensed under the Simplified BSD License [see coco/license.txt]\n**************************************************************************/\n#pragma once\n#include <stdbool.h>\n\ntypedef unsigned int uint;\ntypedef unsigned long siz;\ntypedef unsigned char byte;\ntypedef double* BB;\ntypedef struct { siz h, w, m; uint *cnts; } RLE;\n\n// Initialize/destroy RLE.\nvoid rleInit( RLE *R, siz h, siz w, siz m, uint *cnts );\nvoid rleFree( RLE *R );\n\n// Initialize/destroy RLE array.\nvoid rlesInit( RLE **R, siz n );\nvoid rlesFree( RLE **R, siz n );\n\n// Encode binary masks using RLE.\nvoid rleEncode( RLE *R, const byte *mask, siz h, siz w, siz n );\n\n// Decode binary masks encoded via RLE.\nvoid rleDecode( const RLE *R, byte *mask, siz n );\n\n// Compute union or intersection of encoded masks.\nvoid rleMerge( const RLE *R, RLE *M, siz n, bool intersect );\n\n// Compute area of encoded masks.\nvoid rleArea( const RLE *R, siz n, uint *a );\n\n// Compute intersection over union between masks.\nvoid rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o );\n\n// Compute intersection over union between bounding boxes.\nvoid bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o );\n\n// Get bounding boxes surrounding encoded masks.\nvoid rleToBbox( const RLE *R, BB bb, siz n );\n\n// Convert bounding boxes to encoded masks.\nvoid rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n );\n\n// Convert polygon to encoded mask.\nvoid rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w );\n\n// Get compressed string representation of encoded mask.\nchar* rleToString( const RLE *R );\n\n// Convert from compressed string representation of encoded mask.\nvoid rleFrString( RLE *R, char *s, siz h, siz w );\n"
  },
  {
    "path": "lib/roi_data_layer/__init__.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n"
  },
  {
    "path": "lib/roi_data_layer/layer.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"The data layer used during training to train a Fast R-CNN network.\n\nRoIDataLayer implements a Caffe Python layer.\n\"\"\"\n\nimport caffe\nfrom fast_rcnn.config import cfg\nfrom roi_data_layer.minibatch import get_minibatch\nimport numpy as np\nimport yaml\nfrom multiprocessing import Process, Queue\n\nclass RoIDataLayer(caffe.Layer):\n    \"\"\"Fast R-CNN data layer used for training.\"\"\"\n\n    def _shuffle_roidb_inds(self):\n        \"\"\"Randomly permute the training roidb.\"\"\"\n        if cfg.TRAIN.ASPECT_GROUPING:\n            widths = np.array([r['width'] for r in self._roidb])\n            heights = np.array([r['height'] for r in self._roidb])\n            horz = (widths >= heights)\n            vert = np.logical_not(horz)\n            horz_inds = np.where(horz)[0]\n            vert_inds = np.where(vert)[0]\n            inds = np.hstack((\n                np.random.permutation(horz_inds),\n                np.random.permutation(vert_inds)))\n            inds = np.reshape(inds, (-1, 2))\n            row_perm = np.random.permutation(np.arange(inds.shape[0]))\n            inds = np.reshape(inds[row_perm, :], (-1,))\n            self._perm = inds\n        else:\n            self._perm = np.random.permutation(np.arange(len(self._roidb)))\n        self._cur = 0\n\n    def _get_next_minibatch_inds(self):\n        \"\"\"Return the roidb indices for the next minibatch.\"\"\"\n        if self._cur + cfg.TRAIN.IMS_PER_BATCH >= len(self._roidb):\n            self._shuffle_roidb_inds()\n\n        db_inds = self._perm[self._cur:self._cur + cfg.TRAIN.IMS_PER_BATCH]\n        self._cur += cfg.TRAIN.IMS_PER_BATCH\n        return db_inds\n\n    def _get_next_minibatch(self):\n        \"\"\"Return 
the blobs to be used for the next minibatch.\n\n        If cfg.TRAIN.USE_PREFETCH is True, then blobs will be computed in a\n        separate process and made available through self._blob_queue.\n        \"\"\"\n        if cfg.TRAIN.USE_PREFETCH:\n            return self._blob_queue.get()\n        else:\n            db_inds = self._get_next_minibatch_inds()\n            minibatch_db = [self._roidb[i] for i in db_inds]\n            return get_minibatch(minibatch_db, self._num_classes)\n\n    def set_roidb(self, roidb):\n        \"\"\"Set the roidb to be used by this layer during training.\"\"\"\n        self._roidb = roidb\n        self._shuffle_roidb_inds()\n        if cfg.TRAIN.USE_PREFETCH:\n            self._blob_queue = Queue(10)\n            self._prefetch_process = BlobFetcher(self._blob_queue,\n                                                 self._roidb,\n                                                 self._num_classes)\n            self._prefetch_process.start()\n            # Terminate the child process when the parent exits\n            def cleanup():\n                print 'Terminating BlobFetcher'\n                self._prefetch_process.terminate()\n                self._prefetch_process.join()\n            import atexit\n            atexit.register(cleanup)\n\n    def setup(self, bottom, top):\n        \"\"\"Setup the RoIDataLayer.\"\"\"\n\n        # parse the layer parameter string, which must be valid YAML\n        layer_params = yaml.load(self.param_str_)\n\n        self._num_classes = layer_params['num_classes']\n\n        self._name_to_top_map = {}\n\n        # data blob: holds a batch of N images, each with 3 channels\n        idx = 0\n        top[idx].reshape(cfg.TRAIN.IMS_PER_BATCH, 3,\n            max(cfg.TRAIN.SCALES), cfg.TRAIN.MAX_SIZE)\n        self._name_to_top_map['data'] = idx\n        idx += 1\n\n        if cfg.TRAIN.HAS_RPN:\n            top[idx].reshape(1, 3)\n            self._name_to_top_map['im_info'] = idx\n            idx += 
1\n\n            top[idx].reshape(1, 4)\n            self._name_to_top_map['gt_boxes'] = idx\n            idx += 1\n        else: # not using RPN\n            # rois blob: holds R regions of interest, each is a 5-tuple\n            # (n, x1, y1, x2, y2) specifying an image batch index n and a\n            # rectangle (x1, y1, x2, y2)\n            top[idx].reshape(1, 5)\n            self._name_to_top_map['rois'] = idx\n            idx += 1\n\n            # labels blob: R categorical labels in [0, ..., K] for K foreground\n            # classes plus background\n            top[idx].reshape(1)\n            self._name_to_top_map['labels'] = idx\n            idx += 1\n\n            if cfg.TRAIN.BBOX_REG:\n                # bbox_targets blob: R bounding-box regression targets with 4\n                # targets per class\n                top[idx].reshape(1, self._num_classes * 4)\n                self._name_to_top_map['bbox_targets'] = idx\n                idx += 1\n\n                # bbox_inside_weights blob: At most 4 targets per roi are active;\n                # this binary vector specifies the subset of active targets\n                top[idx].reshape(1, self._num_classes * 4)\n                self._name_to_top_map['bbox_inside_weights'] = idx\n                idx += 1\n\n                top[idx].reshape(1, self._num_classes * 4)\n                self._name_to_top_map['bbox_outside_weights'] = idx\n                idx += 1\n\n        print 'RoiDataLayer: name_to_top:', self._name_to_top_map\n        assert len(top) == len(self._name_to_top_map)\n\n    def forward(self, bottom, top):\n        \"\"\"Get blobs and copy them into this layer's top blob vector.\"\"\"\n        blobs = self._get_next_minibatch()\n\n        for blob_name, blob in blobs.iteritems():\n            top_ind = self._name_to_top_map[blob_name]\n            # Reshape net's input blobs\n            top[top_ind].reshape(*(blob.shape))\n            # Copy data into net's input blobs\n            
top[top_ind].data[...] = blob.astype(np.float32, copy=False)\n\n    def backward(self, top, propagate_down, bottom):\n        \"\"\"This layer does not propagate gradients.\"\"\"\n        pass\n\n    def reshape(self, bottom, top):\n        \"\"\"Reshaping happens during the call to forward.\"\"\"\n        pass\n\nclass BlobFetcher(Process):\n    \"\"\"Experimental class for prefetching blobs in a separate process.\"\"\"\n    def __init__(self, queue, roidb, num_classes):\n        super(BlobFetcher, self).__init__()\n        self._queue = queue\n        self._roidb = roidb\n        self._num_classes = num_classes\n        self._perm = None\n        self._cur = 0\n        self._shuffle_roidb_inds()\n        # fix the random seed for reproducibility\n        np.random.seed(cfg.RNG_SEED)\n\n    def _shuffle_roidb_inds(self):\n        \"\"\"Randomly permute the training roidb.\"\"\"\n        # TODO(rbg): remove duplicated code\n        self._perm = np.random.permutation(np.arange(len(self._roidb)))\n        self._cur = 0\n\n    def _get_next_minibatch_inds(self):\n        \"\"\"Return the roidb indices for the next minibatch.\"\"\"\n        # TODO(rbg): remove duplicated code\n        if self._cur + cfg.TRAIN.IMS_PER_BATCH >= len(self._roidb):\n            self._shuffle_roidb_inds()\n\n        db_inds = self._perm[self._cur:self._cur + cfg.TRAIN.IMS_PER_BATCH]\n        self._cur += cfg.TRAIN.IMS_PER_BATCH\n        return db_inds\n\n    def run(self):\n        print 'BlobFetcher started'\n        while True:\n            db_inds = self._get_next_minibatch_inds()\n            minibatch_db = [self._roidb[i] for i in db_inds]\n            blobs = get_minibatch(minibatch_db, self._num_classes)\n            self._queue.put(blobs)\n"
  },
  {
    "path": "lib/roi_data_layer/minibatch.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Compute minibatch blobs for training a Fast R-CNN network.\"\"\"\n\nimport numpy as np\nimport numpy.random as npr\nimport cv2\nfrom fast_rcnn.config import cfg\nfrom utils.blob import prep_im_for_blob, im_list_to_blob\n\ndef get_minibatch(roidb, num_classes):\n    \"\"\"Given a roidb, construct a minibatch sampled from it.\"\"\"\n    num_images = len(roidb)\n    # Sample random scales to use for each image in this batch\n    random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),\n                                    size=num_images)\n    assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \\\n        'num_images ({}) must divide BATCH_SIZE ({})'. \\\n        format(num_images, cfg.TRAIN.BATCH_SIZE)\n    rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images\n    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)\n\n    # Get the input image blob, formatted for caffe\n    im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)\n\n    blobs = {'data': im_blob}\n\n    if cfg.TRAIN.HAS_RPN:\n        assert len(im_scales) == 1, \"Single batch only\"\n        assert len(roidb) == 1, \"Single batch only\"\n        # gt boxes: (x1, y1, x2, y2, cls)\n        gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]\n        gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)\n        gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]\n        gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]\n        blobs['gt_boxes'] = gt_boxes\n        blobs['im_info'] = np.array(\n            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],\n            dtype=np.float32)\n    else: # not using RPN\n        # Now, build the region of interest and label blobs\n        rois_blob 
= np.zeros((0, 5), dtype=np.float32)\n        labels_blob = np.zeros((0), dtype=np.float32)\n        bbox_targets_blob = np.zeros((0, 4 * num_classes), dtype=np.float32)\n        bbox_inside_blob = np.zeros(bbox_targets_blob.shape, dtype=np.float32)\n        # all_overlaps = []\n        for im_i in xrange(num_images):\n            labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \\\n                = _sample_rois(roidb[im_i], fg_rois_per_image, rois_per_image,\n                               num_classes)\n\n            # Add to RoIs blob\n            rois = _project_im_rois(im_rois, im_scales[im_i])\n            batch_ind = im_i * np.ones((rois.shape[0], 1))\n            rois_blob_this_image = np.hstack((batch_ind, rois))\n            rois_blob = np.vstack((rois_blob, rois_blob_this_image))\n\n            # Add to labels, bbox targets, and bbox loss blobs\n            labels_blob = np.hstack((labels_blob, labels))\n            bbox_targets_blob = np.vstack((bbox_targets_blob, bbox_targets))\n            bbox_inside_blob = np.vstack((bbox_inside_blob, bbox_inside_weights))\n            # all_overlaps = np.hstack((all_overlaps, overlaps))\n\n        # For debug visualizations\n        # _vis_minibatch(im_blob, rois_blob, labels_blob, all_overlaps)\n\n        blobs['rois'] = rois_blob\n        blobs['labels'] = labels_blob\n\n        if cfg.TRAIN.BBOX_REG:\n            blobs['bbox_targets'] = bbox_targets_blob\n            blobs['bbox_inside_weights'] = bbox_inside_blob\n            blobs['bbox_outside_weights'] = \\\n                np.array(bbox_inside_blob > 0).astype(np.float32)\n\n    return blobs\n\ndef _sample_rois(roidb, fg_rois_per_image, rois_per_image, num_classes):\n    \"\"\"Generate a random sample of RoIs comprising foreground and background\n    examples.\n    \"\"\"\n    # label = class RoI has max overlap with\n    labels = roidb['max_classes']\n    overlaps = roidb['max_overlaps']\n    rois = roidb['boxes']\n\n    # Select foreground 
RoIs as those with >= FG_THRESH overlap\n    fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]\n    # Guard against the case when an image has fewer than fg_rois_per_image\n    # foreground RoIs\n    fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)\n    # Sample foreground regions without replacement\n    if fg_inds.size > 0:\n        fg_inds = npr.choice(\n                fg_inds, size=fg_rois_per_this_image, replace=False)\n\n    # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)\n    bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &\n                       (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]\n    # Compute number of background RoIs to take from this image (guarding\n    # against there being fewer than desired)\n    bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image\n    bg_rois_per_this_image = np.minimum(bg_rois_per_this_image,\n                                        bg_inds.size)\n    # Sample foreground regions without replacement\n    if bg_inds.size > 0:\n        bg_inds = npr.choice(\n                bg_inds, size=bg_rois_per_this_image, replace=False)\n\n    # The indices that we're selecting (both fg and bg)\n    keep_inds = np.append(fg_inds, bg_inds)\n    # Select sampled values from various arrays:\n    labels = labels[keep_inds]\n    # Clamp labels for the background RoIs to 0\n    labels[fg_rois_per_this_image:] = 0\n    overlaps = overlaps[keep_inds]\n    rois = rois[keep_inds]\n\n    bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(\n            roidb['bbox_targets'][keep_inds, :], num_classes)\n\n    return labels, overlaps, rois, bbox_targets, bbox_inside_weights\n\ndef _get_image_blob(roidb, scale_inds):\n    \"\"\"Builds an input blob from the images in the roidb at the specified\n    scales.\n    \"\"\"\n    num_images = len(roidb)\n    processed_ims = []\n    im_scales = []\n    for i in xrange(num_images):\n        im = cv2.imread(roidb[i]['image'])\n     
   if roidb[i]['flipped']:\n            im = im[:, ::-1, :]\n        target_size = cfg.TRAIN.SCALES[scale_inds[i]]\n        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,\n                                        cfg.TRAIN.MAX_SIZE)\n        im_scales.append(im_scale)\n        processed_ims.append(im)\n\n    # Create a blob to hold the input images\n    blob = im_list_to_blob(processed_ims)\n\n    return blob, im_scales\n\ndef _project_im_rois(im_rois, im_scale_factor):\n    \"\"\"Project image RoIs into the rescaled training image.\"\"\"\n    rois = im_rois * im_scale_factor\n    return rois\n\ndef _get_bbox_regression_labels(bbox_target_data, num_classes):\n    \"\"\"Bounding-box regression targets are stored in a compact form in the\n    roidb.\n\n    This function expands those targets into the 4-of-4*K representation used\n    by the network (i.e. only one class has non-zero targets). The loss weights\n    are similarly expanded.\n\n    Returns:\n        bbox_target_data (ndarray): N x 4K blob of regression targets\n        bbox_inside_weights (ndarray): N x 4K blob of loss weights\n    \"\"\"\n    clss = bbox_target_data[:, 0]\n    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)\n    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)\n    inds = np.where(clss > 0)[0]\n    for ind in inds:\n        cls = clss[ind]\n        start = 4 * cls\n        end = start + 4\n        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]\n        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS\n    return bbox_targets, bbox_inside_weights\n\ndef _vis_minibatch(im_blob, rois_blob, labels_blob, overlaps):\n    \"\"\"Visualize a mini-batch for debugging.\"\"\"\n    import matplotlib.pyplot as plt\n    for i in xrange(rois_blob.shape[0]):\n        rois = rois_blob[i, :]\n        im_ind = rois[0]\n        roi = rois[1:]\n        im = im_blob[im_ind, :, :, :].transpose((1, 2, 0)).copy()\n        
im += cfg.PIXEL_MEANS\n        im = im[:, :, (2, 1, 0)]\n        im = im.astype(np.uint8)\n        cls = labels_blob[i]\n        plt.imshow(im)\n        print 'class: ', cls, ' overlap: ', overlaps[i]\n        plt.gca().add_patch(\n            plt.Rectangle((roi[0], roi[1]), roi[2] - roi[0],\n                          roi[3] - roi[1], fill=False,\n                          edgecolor='r', linewidth=3)\n            )\n        plt.show()\n"
  },
  {
    "path": "lib/roi_data_layer/roidb.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Transform a roidb into a trainable roidb by adding a bunch of metadata.\"\"\"\n\nimport numpy as np\nfrom fast_rcnn.config import cfg\nfrom fast_rcnn.bbox_transform import bbox_transform\nfrom utils.cython_bbox import bbox_overlaps\nimport PIL\n\ndef prepare_roidb(imdb):\n    \"\"\"Enrich the imdb's roidb by adding some derived quantities that\n    are useful for training. This function precomputes the maximum\n    overlap, taken over ground-truth boxes, between each ROI and\n    each ground-truth box. The class with maximum overlap is also\n    recorded.\n    \"\"\"\n    sizes = [PIL.Image.open(imdb.image_path_at(i)).size\n             for i in xrange(imdb.num_images)]\n    roidb = imdb.roidb\n    for i in xrange(len(imdb.image_index)):\n        roidb[i]['image'] = imdb.image_path_at(i)\n        roidb[i]['width'] = sizes[i][0]\n        roidb[i]['height'] = sizes[i][1]\n        # need gt_overlaps as a dense array for argmax\n        gt_overlaps = roidb[i]['gt_overlaps'].toarray()\n        # max overlap with gt over classes (columns)\n        max_overlaps = gt_overlaps.max(axis=1)\n        # gt class that had the max overlap\n        max_classes = gt_overlaps.argmax(axis=1)\n        roidb[i]['max_classes'] = max_classes\n        roidb[i]['max_overlaps'] = max_overlaps\n        # sanity checks\n        # max overlap of 0 => class should be zero (background)\n        zero_inds = np.where(max_overlaps == 0)[0]\n        assert all(max_classes[zero_inds] == 0)\n        # max overlap > 0 => class should not be zero (must be a fg class)\n        nonzero_inds = np.where(max_overlaps > 0)[0]\n        assert all(max_classes[nonzero_inds] != 0)\n\ndef add_bbox_regression_targets(roidb):\n    \"\"\"Add 
information needed to train bounding-box regressors.\"\"\"\n    assert len(roidb) > 0\n    assert 'max_classes' in roidb[0], 'Did you call prepare_roidb first?'\n\n    num_images = len(roidb)\n    # Infer number of classes from the number of columns in gt_overlaps\n    num_classes = roidb[0]['gt_overlaps'].shape[1]\n    for im_i in xrange(num_images):\n        rois = roidb[im_i]['boxes']\n        max_overlaps = roidb[im_i]['max_overlaps']\n        max_classes = roidb[im_i]['max_classes']\n        roidb[im_i]['bbox_targets'] = \\\n                _compute_targets(rois, max_overlaps, max_classes)\n\n    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:\n        # Use fixed / precomputed \"means\" and \"stds\" instead of empirical values\n        means = np.tile(\n                np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS), (num_classes, 1))\n        stds = np.tile(\n                np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS), (num_classes, 1))\n    else:\n        # Compute values needed for means and stds\n        # var(x) = E(x^2) - E(x)^2\n        class_counts = np.zeros((num_classes, 1)) + cfg.EPS\n        sums = np.zeros((num_classes, 4))\n        squared_sums = np.zeros((num_classes, 4))\n        for im_i in xrange(num_images):\n            targets = roidb[im_i]['bbox_targets']\n            for cls in xrange(1, num_classes):\n                cls_inds = np.where(targets[:, 0] == cls)[0]\n                if cls_inds.size > 0:\n                    class_counts[cls] += cls_inds.size\n                    sums[cls, :] += targets[cls_inds, 1:].sum(axis=0)\n                    squared_sums[cls, :] += \\\n                            (targets[cls_inds, 1:] ** 2).sum(axis=0)\n\n        means = sums / class_counts\n        stds = np.sqrt(squared_sums / class_counts - means ** 2)\n\n    print 'bbox target means:'\n    print means\n    print means[1:, :].mean(axis=0) # ignore bg class\n    print 'bbox target stdevs:'\n    print stds\n    print stds[1:, :].mean(axis=0) # ignore bg 
class\n\n    # Normalize targets\n    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS:\n        print \"Normalizing targets\"\n        for im_i in xrange(num_images):\n            targets = roidb[im_i]['bbox_targets']\n            for cls in xrange(1, num_classes):\n                cls_inds = np.where(targets[:, 0] == cls)[0]\n                roidb[im_i]['bbox_targets'][cls_inds, 1:] -= means[cls, :]\n                roidb[im_i]['bbox_targets'][cls_inds, 1:] /= stds[cls, :]\n    else:\n        print \"NOT normalizing targets\"\n\n    # These values will be needed for making predictions\n    # (the predictions will need to be unnormalized and uncentered)\n    return means.ravel(), stds.ravel()\n\ndef _compute_targets(rois, overlaps, labels):\n    \"\"\"Compute bounding-box regression targets for an image.\"\"\"\n    # Indices of ground-truth ROIs\n    gt_inds = np.where(overlaps == 1)[0]\n    if len(gt_inds) == 0:\n        # Bail if the image has no ground-truth ROIs\n        return np.zeros((rois.shape[0], 5), dtype=np.float32)\n    # Indices of examples for which we try to make predictions\n    ex_inds = np.where(overlaps >= cfg.TRAIN.BBOX_THRESH)[0]\n\n    # Get IoU overlap between each ex ROI and gt ROI\n    ex_gt_overlaps = bbox_overlaps(\n        np.ascontiguousarray(rois[ex_inds, :], dtype=np.float),\n        np.ascontiguousarray(rois[gt_inds, :], dtype=np.float))\n\n    # Find which gt ROI each ex ROI has max overlap with:\n    # this will be the ex ROI's gt target\n    gt_assignment = ex_gt_overlaps.argmax(axis=1)\n    gt_rois = rois[gt_inds[gt_assignment], :]\n    ex_rois = rois[ex_inds, :]\n\n    targets = np.zeros((rois.shape[0], 5), dtype=np.float32)\n    targets[ex_inds, 0] = labels[ex_inds]\n    targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)\n    return targets\n"
  },
  {
    "path": "lib/rpn/README.md",
    "content": "### `rpn` module overview\n\n##### `generate_anchors.py`\n\nGenerates a regular grid of multi-scale, multi-aspect anchor boxes.\n\n##### `proposal_layer.py`\n\nConverts RPN outputs (per-anchor scores and bbox regression estimates) into object proposals.\n\n##### `anchor_target_layer.py` \n\nGenerates training targets/labels for each anchor. Classification labels are 1 (object), 0 (not object) or -1 (ignore).\nBbox regression targets are specified when the classification label is > 0.\n\n##### `proposal_target_layer.py`\n\nGenerates training targets/labels for each object proposal: classification labels 0 - K (bg or object class 1, ... , K)\nand bbox regression targets in that case that the label is > 0.\n\n##### `generate.py`\n\nGenerate object detection proposals from an imdb using an RPN.\n"
  },
  {
    "path": "lib/rpn/__init__.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick and Sean Bell\n# --------------------------------------------------------\n"
  },
  {
    "path": "lib/rpn/anchor_target_layer.py",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick and Sean Bell\n# --------------------------------------------------------\n\nimport os\nimport caffe\nimport yaml\nfrom fast_rcnn.config import cfg\nimport numpy as np\nimport numpy.random as npr\nfrom generate_anchors import generate_anchors\nfrom utils.cython_bbox import bbox_overlaps\nfrom fast_rcnn.bbox_transform import bbox_transform\n\nDEBUG = False\n\nclass AnchorTargetLayer(caffe.Layer):\n    \"\"\"\n    Assign anchors to ground-truth targets. Produces anchor classification\n    labels and bounding-box regression targets.\n    \"\"\"\n\n    def setup(self, bottom, top):\n        layer_params = yaml.load(self.param_str_)\n        anchor_scales = layer_params.get('scales', (8, 16, 32))\n        self._anchors = generate_anchors(scales=np.array(anchor_scales))\n        self._num_anchors = self._anchors.shape[0]\n        self._feat_stride = layer_params['feat_stride']\n\n        if DEBUG:\n            print 'anchors:'\n            print self._anchors\n            print 'anchor shapes:'\n            print np.hstack((\n                self._anchors[:, 2::4] - self._anchors[:, 0::4],\n                self._anchors[:, 3::4] - self._anchors[:, 1::4],\n            ))\n            self._counts = cfg.EPS\n            self._sums = np.zeros((1, 4))\n            self._squared_sums = np.zeros((1, 4))\n            self._fg_sum = 0\n            self._bg_sum = 0\n            self._count = 0\n\n        # allow boxes to sit over the edge by a small amount\n        self._allowed_border = layer_params.get('allowed_border', 0)\n\n        height, width = bottom[0].data.shape[-2:]\n        if DEBUG:\n            print 'AnchorTargetLayer: height', height, 'width', width\n\n        A = self._num_anchors\n        # labels\n        top[0].reshape(1, 1, A * height, width)\n        
# bbox_targets\n        top[1].reshape(1, A * 4, height, width)\n        # bbox_inside_weights\n        top[2].reshape(1, A * 4, height, width)\n        # bbox_outside_weights\n        top[3].reshape(1, A * 4, height, width)\n\n    def forward(self, bottom, top):\n        # Algorithm:\n        #\n        # for each (H, W) location i\n        #   generate 9 anchor boxes centered on cell i\n        #   apply predicted bbox deltas at cell i to each of the 9 anchors\n        # filter out-of-image anchors\n        # measure GT overlap\n\n        assert bottom[0].data.shape[0] == 1, \\\n            'Only single item batches are supported'\n\n        # map of shape (..., H, W)\n        height, width = bottom[0].data.shape[-2:]\n        # GT boxes (x1, y1, x2, y2, label)\n        gt_boxes = bottom[1].data\n        # im_info\n        im_info = bottom[2].data[0, :]\n\n        if DEBUG:\n            print ''\n            print 'im_size: ({}, {})'.format(im_info[0], im_info[1])\n            print 'scale: {}'.format(im_info[2])\n            print 'height, width: ({}, {})'.format(height, width)\n            print 'rpn: gt_boxes.shape', gt_boxes.shape\n            print 'rpn: gt_boxes', gt_boxes\n\n        # 1. 
Generate proposals from bbox deltas and shifted anchors\n        shift_x = np.arange(0, width) * self._feat_stride\n        shift_y = np.arange(0, height) * self._feat_stride\n        shift_x, shift_y = np.meshgrid(shift_x, shift_y)\n        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),\n                            shift_x.ravel(), shift_y.ravel())).transpose()\n        # add A anchors (1, A, 4) to\n        # cell K shifts (K, 1, 4) to get\n        # shift anchors (K, A, 4)\n        # reshape to (K*A, 4) shifted anchors\n        A = self._num_anchors\n        K = shifts.shape[0]\n        all_anchors = (self._anchors.reshape((1, A, 4)) +\n                       shifts.reshape((1, K, 4)).transpose((1, 0, 2)))\n        all_anchors = all_anchors.reshape((K * A, 4))\n        total_anchors = int(K * A)\n\n        # only keep anchors inside the image\n        inds_inside = np.where(\n            (all_anchors[:, 0] >= -self._allowed_border) &\n            (all_anchors[:, 1] >= -self._allowed_border) &\n            (all_anchors[:, 2] < im_info[1] + self._allowed_border) &  # width\n            (all_anchors[:, 3] < im_info[0] + self._allowed_border)    # height\n        )[0]\n\n        if DEBUG:\n            print 'total_anchors', total_anchors\n            print 'inds_inside', len(inds_inside)\n\n        # keep only inside anchors\n        anchors = all_anchors[inds_inside, :]\n        if DEBUG:\n            print 'anchors.shape', anchors.shape\n\n        # label: 1 is positive, 0 is negative, -1 is dont care\n        labels = np.empty((len(inds_inside), ), dtype=np.float32)\n        labels.fill(-1)\n\n        # overlaps between the anchors and the gt boxes\n        # overlaps (ex, gt)\n        overlaps = bbox_overlaps(\n            np.ascontiguousarray(anchors, dtype=np.float),\n            np.ascontiguousarray(gt_boxes, dtype=np.float))\n        argmax_overlaps = overlaps.argmax(axis=1)\n        max_overlaps = overlaps[np.arange(len(inds_inside)), 
argmax_overlaps]\n        gt_argmax_overlaps = overlaps.argmax(axis=0)\n        gt_max_overlaps = overlaps[gt_argmax_overlaps,\n                                   np.arange(overlaps.shape[1])]\n        gt_argmax_overlaps = np.where(overlaps == gt_max_overlaps)[0]\n\n        if not cfg.TRAIN.RPN_CLOBBER_POSITIVES:\n            # assign bg labels first so that positive labels can clobber them\n            labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0\n\n        # fg label: for each gt, anchor with highest overlap\n        labels[gt_argmax_overlaps] = 1\n\n        # fg label: above threshold IOU\n        labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1\n\n        if cfg.TRAIN.RPN_CLOBBER_POSITIVES:\n            # assign bg labels last so that negative labels can clobber positives\n            labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0\n\n        # subsample positive labels if we have too many\n        num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE)\n        fg_inds = np.where(labels == 1)[0]\n        if len(fg_inds) > num_fg:\n            disable_inds = npr.choice(\n                fg_inds, size=(len(fg_inds) - num_fg), replace=False)\n            labels[disable_inds] = -1\n\n        # subsample negative labels if we have too many\n        num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1)\n        bg_inds = np.where(labels == 0)[0]\n        if len(bg_inds) > num_bg:\n            disable_inds = npr.choice(\n                bg_inds, size=(len(bg_inds) - num_bg), replace=False)\n            labels[disable_inds] = -1\n            #print \"was %s inds, disabling %s, now %s inds\" % (\n                #len(bg_inds), len(disable_inds), np.sum(labels == 0))\n\n        bbox_targets = np.zeros((len(inds_inside), 4), dtype=np.float32)\n        bbox_targets = _compute_targets(anchors, gt_boxes[argmax_overlaps, :])\n\n        bbox_inside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)\n        
bbox_inside_weights[labels == 1, :] = np.array(cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS)\n\n        bbox_outside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)\n        if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:\n            # uniform weighting of examples (given non-uniform sampling)\n            num_examples = np.sum(labels >= 0)\n            positive_weights = np.ones((1, 4)) * 1.0 / num_examples\n            negative_weights = np.ones((1, 4)) * 1.0 / num_examples\n        else:\n            assert ((cfg.TRAIN.RPN_POSITIVE_WEIGHT > 0) &\n                    (cfg.TRAIN.RPN_POSITIVE_WEIGHT < 1))\n            positive_weights = (cfg.TRAIN.RPN_POSITIVE_WEIGHT /\n                                np.sum(labels == 1))\n            negative_weights = ((1.0 - cfg.TRAIN.RPN_POSITIVE_WEIGHT) /\n                                np.sum(labels == 0))\n        bbox_outside_weights[labels == 1, :] = positive_weights\n        bbox_outside_weights[labels == 0, :] = negative_weights\n\n        if DEBUG:\n            self._sums += bbox_targets[labels == 1, :].sum(axis=0)\n            self._squared_sums += (bbox_targets[labels == 1, :] ** 2).sum(axis=0)\n            self._counts += np.sum(labels == 1)\n            means = self._sums / self._counts\n            stds = np.sqrt(self._squared_sums / self._counts - means ** 2)\n            print 'means:'\n            print means\n            print 'stdevs:'\n            print stds\n\n        # map up to original set of anchors\n        labels = _unmap(labels, total_anchors, inds_inside, fill=-1)\n        bbox_targets = _unmap(bbox_targets, total_anchors, inds_inside, fill=0)\n        bbox_inside_weights = _unmap(bbox_inside_weights, total_anchors, inds_inside, fill=0)\n        bbox_outside_weights = _unmap(bbox_outside_weights, total_anchors, inds_inside, fill=0)\n\n        if DEBUG:\n            print 'rpn: max max_overlap', np.max(max_overlaps)\n            print 'rpn: num_positive', np.sum(labels == 1)\n            print 'rpn: 
num_negative', np.sum(labels == 0)\n            self._fg_sum += np.sum(labels == 1)\n            self._bg_sum += np.sum(labels == 0)\n            self._count += 1\n            print 'rpn: num_positive avg', self._fg_sum / self._count\n            print 'rpn: num_negative avg', self._bg_sum / self._count\n\n        # labels\n        labels = labels.reshape((1, height, width, A)).transpose(0, 3, 1, 2)\n        labels = labels.reshape((1, 1, A * height, width))\n        top[0].reshape(*labels.shape)\n        top[0].data[...] = labels\n\n        # bbox_targets\n        bbox_targets = bbox_targets \\\n            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)\n        top[1].reshape(*bbox_targets.shape)\n        top[1].data[...] = bbox_targets\n\n        # bbox_inside_weights\n        bbox_inside_weights = bbox_inside_weights \\\n            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)\n        assert bbox_inside_weights.shape[2] == height\n        assert bbox_inside_weights.shape[3] == width\n        top[2].reshape(*bbox_inside_weights.shape)\n        top[2].data[...] = bbox_inside_weights\n\n        # bbox_outside_weights\n        bbox_outside_weights = bbox_outside_weights \\\n            .reshape((1, height, width, A * 4)).transpose(0, 3, 1, 2)\n        assert bbox_outside_weights.shape[2] == height\n        assert bbox_outside_weights.shape[3] == width\n        top[3].reshape(*bbox_outside_weights.shape)\n        top[3].data[...] 
= bbox_outside_weights\n\n    def backward(self, top, propagate_down, bottom):\n        \"\"\"This layer does not propagate gradients.\"\"\"\n        pass\n\n    def reshape(self, bottom, top):\n        \"\"\"Reshaping happens during the call to forward.\"\"\"\n        pass\n\n\ndef _unmap(data, count, inds, fill=0):\n    \"\"\" Unmap a subset of item (data) back to the original set of items (of\n    size count) \"\"\"\n    if len(data.shape) == 1:\n        ret = np.empty((count, ), dtype=np.float32)\n        ret.fill(fill)\n        ret[inds] = data\n    else:\n        ret = np.empty((count, ) + data.shape[1:], dtype=np.float32)\n        ret.fill(fill)\n        ret[inds, :] = data\n    return ret\n\n\ndef _compute_targets(ex_rois, gt_rois):\n    \"\"\"Compute bounding-box regression targets for an image.\"\"\"\n\n    assert ex_rois.shape[0] == gt_rois.shape[0]\n    assert ex_rois.shape[1] == 4\n    assert gt_rois.shape[1] == 5\n\n    return bbox_transform(ex_rois, gt_rois[:, :4]).astype(np.float32, copy=False)\n"
  },
  {
    "path": "lib/rpn/generate.py",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nfrom fast_rcnn.config import cfg\nfrom utils.blob import im_list_to_blob\nfrom utils.timer import Timer\nimport numpy as np\nimport cv2\n\ndef _vis_proposals(im, dets, thresh=0.5):\n    \"\"\"Draw detected bounding boxes.\"\"\"\n    inds = np.where(dets[:, -1] >= thresh)[0]\n    if len(inds) == 0:\n        return\n\n    class_name = 'obj'\n    im = im[:, :, (2, 1, 0)]\n    fig, ax = plt.subplots(figsize=(12, 12))\n    ax.imshow(im, aspect='equal')\n    for i in inds:\n        bbox = dets[i, :4]\n        score = dets[i, -1]\n\n        ax.add_patch(\n            plt.Rectangle((bbox[0], bbox[1]),\n                          bbox[2] - bbox[0],\n                          bbox[3] - bbox[1], fill=False,\n                          edgecolor='red', linewidth=3.5)\n            )\n        ax.text(bbox[0], bbox[1] - 2,\n                '{:s} {:.3f}'.format(class_name, score),\n                bbox=dict(facecolor='blue', alpha=0.5),\n                fontsize=14, color='white')\n\n    ax.set_title(('{} detections with '\n                  'p({} | box) >= {:.1f}').format(class_name, class_name,\n                                                  thresh),\n                  fontsize=14)\n    plt.axis('off')\n    plt.tight_layout()\n    plt.draw()\n\ndef _get_image_blob(im):\n    \"\"\"Converts an image into a network input.\n\n    Arguments:\n        im (ndarray): a color image in BGR order\n\n    Returns:\n        blob (ndarray): a data blob holding an image pyramid\n        im_scale_factors (list): list of image scales (relative to im) used\n            in the image pyramid\n    \"\"\"\n    im_orig = im.astype(np.float32, copy=True)\n    im_orig -= cfg.PIXEL_MEANS\n\n    im_shape = im_orig.shape\n    im_size_min 
= np.min(im_shape[0:2])\n    im_size_max = np.max(im_shape[0:2])\n\n    processed_ims = []\n\n    assert len(cfg.TEST.SCALES) == 1\n    target_size = cfg.TEST.SCALES[0]\n\n    im_scale = float(target_size) / float(im_size_min)\n    # Prevent the biggest axis from being more than MAX_SIZE\n    if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:\n        im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)\n    im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,\n                    interpolation=cv2.INTER_LINEAR)\n    im_info = np.hstack((im.shape[:2], im_scale))[np.newaxis, :]\n    processed_ims.append(im)\n\n    # Create a blob to hold the input images\n    blob = im_list_to_blob(processed_ims)\n\n    return blob, im_info\n\ndef im_proposals(net, im):\n    \"\"\"Generate RPN proposals on a single image.\"\"\"\n    blobs = {}\n    blobs['data'], blobs['im_info'] = _get_image_blob(im)\n    net.blobs['data'].reshape(*(blobs['data'].shape))\n    net.blobs['im_info'].reshape(*(blobs['im_info'].shape))\n    blobs_out = net.forward(\n            data=blobs['data'].astype(np.float32, copy=False),\n            im_info=blobs['im_info'].astype(np.float32, copy=False))\n\n    scale = blobs['im_info'][0, 2]\n    boxes = blobs_out['rois'][:, 1:].copy() / scale\n    scores = blobs_out['scores'].copy()\n    return boxes, scores\n\ndef imdb_proposals(net, imdb):\n    \"\"\"Generate RPN proposals on all images in an imdb.\"\"\"\n\n    _t = Timer()\n    imdb_boxes = [[] for _ in xrange(imdb.num_images)]\n    for i in xrange(imdb.num_images):\n        im = cv2.imread(imdb.image_path_at(i))\n        _t.tic()\n        imdb_boxes[i], scores = im_proposals(net, im)\n        _t.toc()\n        print 'im_proposals: {:d}/{:d} {:.3f}s' \\\n              .format(i + 1, imdb.num_images, _t.average_time)\n        if 0:\n            dets = np.hstack((imdb_boxes[i], scores))\n            # from IPython import embed; embed()\n            _vis_proposals(im, dets[:3, :], 
thresh=0.9)\n            plt.show()\n\n    return imdb_boxes\n"
  },
  {
    "path": "lib/rpn/generate_anchors.py",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick and Sean Bell\n# --------------------------------------------------------\n\nimport numpy as np\n\n# Verify that we compute the same anchors as Shaoqing's matlab implementation:\n#\n#    >> load output/rpn_cachedir/faster_rcnn_VOC2007_ZF_stage1_rpn/anchors.mat\n#    >> anchors\n#\n#    anchors =\n#\n#       -83   -39   100    56\n#      -175   -87   192   104\n#      -359  -183   376   200\n#       -55   -55    72    72\n#      -119  -119   136   136\n#      -247  -247   264   264\n#       -35   -79    52    96\n#       -79  -167    96   184\n#      -167  -343   184   360\n\n#array([[ -83.,  -39.,  100.,   56.],\n#       [-175.,  -87.,  192.,  104.],\n#       [-359., -183.,  376.,  200.],\n#       [ -55.,  -55.,   72.,   72.],\n#       [-119., -119.,  136.,  136.],\n#       [-247., -247.,  264.,  264.],\n#       [ -35.,  -79.,   52.,   96.],\n#       [ -79., -167.,   96.,  184.],\n#       [-167., -343.,  184.,  360.]])\n\ndef generate_anchors(base_size=16, ratios=[0.5, 1, 2],\n                     scales=2**np.arange(3, 6)):\n    \"\"\"\n    Generate anchor (reference) windows by enumerating aspect ratios X\n    scales wrt a reference (0, 0, 15, 15) window.\n    \"\"\"\n\n    base_anchor = np.array([1, 1, base_size, base_size]) - 1\n    ratio_anchors = _ratio_enum(base_anchor, ratios)\n    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)\n                         for i in xrange(ratio_anchors.shape[0])])\n    return anchors\n\ndef _whctrs(anchor):\n    \"\"\"\n    Return width, height, x center, and y center for an anchor (window).\n    \"\"\"\n\n    w = anchor[2] - anchor[0] + 1\n    h = anchor[3] - anchor[1] + 1\n    x_ctr = anchor[0] + 0.5 * (w - 1)\n    y_ctr = anchor[1] + 0.5 * (h - 1)\n    return w, h, x_ctr, y_ctr\n\ndef _mkanchors(ws, 
hs, x_ctr, y_ctr):\n    \"\"\"\n    Given a vector of widths (ws) and heights (hs) around a center\n    (x_ctr, y_ctr), output a set of anchors (windows).\n    \"\"\"\n\n    ws = ws[:, np.newaxis]\n    hs = hs[:, np.newaxis]\n    anchors = np.hstack((x_ctr - 0.5 * (ws - 1),\n                         y_ctr - 0.5 * (hs - 1),\n                         x_ctr + 0.5 * (ws - 1),\n                         y_ctr + 0.5 * (hs - 1)))\n    return anchors\n\ndef _ratio_enum(anchor, ratios):\n    \"\"\"\n    Enumerate a set of anchors for each aspect ratio wrt an anchor.\n    \"\"\"\n\n    w, h, x_ctr, y_ctr = _whctrs(anchor)\n    size = w * h\n    size_ratios = size / ratios\n    ws = np.round(np.sqrt(size_ratios))\n    hs = np.round(ws * ratios)\n    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)\n    return anchors\n\ndef _scale_enum(anchor, scales):\n    \"\"\"\n    Enumerate a set of anchors for each scale wrt an anchor.\n    \"\"\"\n\n    w, h, x_ctr, y_ctr = _whctrs(anchor)\n    ws = w * scales\n    hs = h * scales\n    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)\n    return anchors\n\nif __name__ == '__main__':\n    import time\n    t = time.time()\n    a = generate_anchors()\n    print time.time() - t\n    print a\n    from IPython import embed; embed()\n"
  },
  {
    "path": "lib/rpn/proposal_layer.py",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick and Sean Bell\n# --------------------------------------------------------\n\nimport caffe\nimport numpy as np\nimport yaml\nfrom fast_rcnn.config import cfg\nfrom generate_anchors import generate_anchors\nfrom fast_rcnn.bbox_transform import bbox_transform_inv, clip_boxes\nfrom fast_rcnn.nms_wrapper import nms\n\nDEBUG = False\n\nclass ProposalLayer(caffe.Layer):\n    \"\"\"\n    Outputs object detection proposals by applying estimated bounding-box\n    transformations to a set of regular boxes (called \"anchors\").\n    \"\"\"\n\n    def setup(self, bottom, top):\n        # parse the layer parameter string, which must be valid YAML\n        layer_params = yaml.load(self.param_str_)\n\n        self._feat_stride = layer_params['feat_stride']\n        anchor_scales = layer_params.get('scales', (8, 16, 32))\n        self._anchors = generate_anchors(scales=np.array(anchor_scales))\n        self._num_anchors = self._anchors.shape[0]\n\n        if DEBUG:\n            print 'feat_stride: {}'.format(self._feat_stride)\n            print 'anchors:'\n            print self._anchors\n\n        # rois blob: holds R regions of interest, each is a 5-tuple\n        # (n, x1, y1, x2, y2) specifying an image batch index n and a\n        # rectangle (x1, y1, x2, y2)\n        top[0].reshape(1, 5)\n\n        # scores blob: holds scores for R regions of interest\n        if len(top) > 1:\n            top[1].reshape(1, 1, 1, 1)\n\n    def forward(self, bottom, top):\n        # Algorithm:\n        #\n        # for each (H, W) location i\n        #   generate A anchor boxes centered on cell i\n        #   apply predicted bbox deltas at cell i to each of the A anchors\n        # clip predicted boxes to image\n        # remove predicted boxes with either height or width < threshold\n 
       # sort all (proposal, score) pairs by score from highest to lowest\n        # take top pre_nms_topN proposals before NMS\n        # apply NMS with threshold 0.7 to remaining proposals\n        # take after_nms_topN proposals after NMS\n        # return the top proposals (-> RoIs top, scores top)\n\n        assert bottom[0].data.shape[0] == 1, \\\n            'Only single item batches are supported'\n\n        cfg_key = str(self.phase) # either 'TRAIN' or 'TEST'\n        pre_nms_topN  = cfg[cfg_key].RPN_PRE_NMS_TOP_N\n        post_nms_topN = cfg[cfg_key].RPN_POST_NMS_TOP_N\n        nms_thresh    = cfg[cfg_key].RPN_NMS_THRESH\n        min_size      = cfg[cfg_key].RPN_MIN_SIZE\n\n        # the first set of _num_anchors channels are bg probs\n        # the second set are the fg probs, which we want\n        scores = bottom[0].data[:, self._num_anchors:, :, :]\n        bbox_deltas = bottom[1].data\n        im_info = bottom[2].data[0, :]\n\n        if DEBUG:\n            print 'im_size: ({}, {})'.format(im_info[0], im_info[1])\n            print 'scale: {}'.format(im_info[2])\n\n        # 1. 
Generate proposals from bbox deltas and shifted anchors\n        height, width = scores.shape[-2:]\n\n        if DEBUG:\n            print 'score map size: {}'.format(scores.shape)\n\n        # Enumerate all shifts\n        shift_x = np.arange(0, width) * self._feat_stride\n        shift_y = np.arange(0, height) * self._feat_stride\n        shift_x, shift_y = np.meshgrid(shift_x, shift_y)\n        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),\n                            shift_x.ravel(), shift_y.ravel())).transpose()\n\n        # Enumerate all shifted anchors:\n        #\n        # add A anchors (1, A, 4) to\n        # cell K shifts (K, 1, 4) to get\n        # shift anchors (K, A, 4)\n        # reshape to (K*A, 4) shifted anchors\n        A = self._num_anchors\n        K = shifts.shape[0]\n        anchors = self._anchors.reshape((1, A, 4)) + \\\n                  shifts.reshape((1, K, 4)).transpose((1, 0, 2))\n        anchors = anchors.reshape((K * A, 4))\n\n        # Transpose and reshape predicted bbox transformations to get them\n        # into the same order as the anchors:\n        #\n        # bbox deltas will be (1, 4 * A, H, W) format\n        # transpose to (1, H, W, 4 * A)\n        # reshape to (1 * H * W * A, 4) where rows are ordered by (h, w, a)\n        # in slowest to fastest order\n        bbox_deltas = bbox_deltas.transpose((0, 2, 3, 1)).reshape((-1, 4))\n\n        # Same story for the scores:\n        #\n        # scores are (1, A, H, W) format\n        # transpose to (1, H, W, A)\n        # reshape to (1 * H * W * A, 1) where rows are ordered by (h, w, a)\n        scores = scores.transpose((0, 2, 3, 1)).reshape((-1, 1))\n\n        # Convert anchors into proposals via bbox transformations\n        proposals = bbox_transform_inv(anchors, bbox_deltas)\n\n        # 2. clip predicted boxes to image\n        proposals = clip_boxes(proposals, im_info[:2])\n\n        # 3. 
remove predicted boxes with either height or width < threshold\n        # (NOTE: convert min_size to input image scale stored in im_info[2])\n        keep = _filter_boxes(proposals, min_size * im_info[2])\n        proposals = proposals[keep, :]\n        scores = scores[keep]\n\n        # 4. sort all (proposal, score) pairs by score from highest to lowest\n        # 5. take top pre_nms_topN (e.g. 6000)\n        order = scores.ravel().argsort()[::-1]\n        if pre_nms_topN > 0:\n            order = order[:pre_nms_topN]\n        proposals = proposals[order, :]\n        scores = scores[order]\n\n        # 6. apply nms (e.g. threshold = 0.7)\n        # 7. take after_nms_topN (e.g. 300)\n        # 8. return the top proposals (-> RoIs top)\n        keep = nms(np.hstack((proposals, scores)), nms_thresh)\n        if post_nms_topN > 0:\n            keep = keep[:post_nms_topN]\n        proposals = proposals[keep, :]\n        scores = scores[keep]\n\n        # Output rois blob\n        # Our RPN implementation only supports a single input image, so all\n        # batch inds are 0\n        batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)\n        blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))\n        top[0].reshape(*(blob.shape))\n        top[0].data[...] = blob\n\n        # [Optional] output scores blob\n        if len(top) > 1:\n            top[1].reshape(*(scores.shape))\n            top[1].data[...] = scores\n\n    def backward(self, top, propagate_down, bottom):\n        \"\"\"This layer does not propagate gradients.\"\"\"\n        pass\n\n    def reshape(self, bottom, top):\n        \"\"\"Reshaping happens during the call to forward.\"\"\"\n        pass\n\ndef _filter_boxes(boxes, min_size):\n    \"\"\"Remove all boxes with any side smaller than min_size.\"\"\"\n    ws = boxes[:, 2] - boxes[:, 0] + 1\n    hs = boxes[:, 3] - boxes[:, 1] + 1\n    keep = np.where((ws >= min_size) & (hs >= min_size))[0]\n    return keep\n"
  },
  {
    "path": "lib/rpn/proposal_target_layer.py",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick and Sean Bell\n# --------------------------------------------------------\n\nimport caffe\nimport yaml\nimport numpy as np\nimport numpy.random as npr\nfrom fast_rcnn.config import cfg\nfrom fast_rcnn.bbox_transform import bbox_transform\nfrom utils.cython_bbox import bbox_overlaps\n\nDEBUG = False\n\nclass ProposalTargetLayer(caffe.Layer):\n    \"\"\"\n    Assign object detection proposals to ground-truth targets. Produces proposal\n    classification labels and bounding-box regression targets.\n    \"\"\"\n\n    def setup(self, bottom, top):\n        layer_params = yaml.load(self.param_str_)\n        self._num_classes = layer_params['num_classes']\n\n        # sampled rois (0, x1, y1, x2, y2)\n        top[0].reshape(1, 5)\n        # labels\n        top[1].reshape(1, 1)\n        # bbox_targets\n        top[2].reshape(1, self._num_classes * 4)\n        # bbox_inside_weights\n        top[3].reshape(1, self._num_classes * 4)\n        # bbox_outside_weights\n        top[4].reshape(1, self._num_classes * 4)\n\n    def forward(self, bottom, top):\n        # Proposal ROIs (0, x1, y1, x2, y2) coming from RPN\n        # (i.e., rpn.proposal_layer.ProposalLayer), or any other source\n        all_rois = bottom[0].data\n        # GT boxes (x1, y1, x2, y2, label)\n        # TODO(rbg): it's annoying that sometimes I have extra info before\n        # and other times after box coordinates -- normalize to one format\n        gt_boxes = bottom[1].data\n\n        # Include ground-truth boxes in the set of candidate rois\n        zeros = np.zeros((gt_boxes.shape[0], 1), dtype=gt_boxes.dtype)\n        all_rois = np.vstack(\n            (all_rois, np.hstack((zeros, gt_boxes[:, :-1])))\n        )\n\n        # Sanity check: single batch only\n        assert np.all(all_rois[:, 0] == 
0), \\\n                'Only single item batches are supported'\n\n        num_images = 1\n        rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images\n        # Cast to int: this count is later used as a sample size and a slice index\n        fg_rois_per_image = int(np.round(cfg.TRAIN.FG_FRACTION * rois_per_image))\n\n        # Sample rois with classification labels and bounding box regression\n        # targets\n        labels, rois, bbox_targets, bbox_inside_weights = _sample_rois(\n            all_rois, gt_boxes, fg_rois_per_image,\n            rois_per_image, self._num_classes)\n\n        if DEBUG:\n            print 'num fg: {}'.format((labels > 0).sum())\n            print 'num bg: {}'.format((labels == 0).sum())\n            self._count += 1\n            self._fg_num += (labels > 0).sum()\n            self._bg_num += (labels == 0).sum()\n            print 'num fg avg: {}'.format(self._fg_num / self._count)\n            print 'num bg avg: {}'.format(self._bg_num / self._count)\n            print 'ratio: {:.3f}'.format(float(self._fg_num) / float(self._bg_num))\n\n        # sampled rois\n        top[0].reshape(*rois.shape)\n        top[0].data[...] = rois\n\n        # classification labels\n        top[1].reshape(*labels.shape)\n        top[1].data[...] = labels\n\n        # bbox_targets\n        top[2].reshape(*bbox_targets.shape)\n        top[2].data[...] = bbox_targets\n\n        # bbox_inside_weights\n        top[3].reshape(*bbox_inside_weights.shape)\n        top[3].data[...] = bbox_inside_weights\n\n        # bbox_outside_weights\n        top[4].reshape(*bbox_inside_weights.shape)\n        top[4].data[...] 
= np.array(bbox_inside_weights > 0).astype(np.float32)\n\n    def backward(self, top, propagate_down, bottom):\n        \"\"\"This layer does not propagate gradients.\"\"\"\n        pass\n\n    def reshape(self, bottom, top):\n        \"\"\"Reshaping happens during the call to forward.\"\"\"\n        pass\n\n\ndef _get_bbox_regression_labels(bbox_target_data, num_classes):\n    \"\"\"Bounding-box regression targets (bbox_target_data) are stored in a\n    compact form N x (class, tx, ty, tw, th)\n\n    This function expands those targets into the 4-of-4*K representation used\n    by the network (i.e. only one class has non-zero targets).\n\n    Returns:\n        bbox_target (ndarray): N x 4K blob of regression targets\n        bbox_inside_weights (ndarray): N x 4K blob of loss weights\n    \"\"\"\n\n    clss = bbox_target_data[:, 0]\n    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)\n    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)\n    inds = np.where(clss > 0)[0]\n    for ind in inds:\n        cls = clss[ind]\n        start = 4 * cls\n        end = start + 4\n        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]\n        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS\n    return bbox_targets, bbox_inside_weights\n\n\ndef _compute_targets(ex_rois, gt_rois, labels):\n    \"\"\"Compute bounding-box regression targets for an image.\"\"\"\n\n    assert ex_rois.shape[0] == gt_rois.shape[0]\n    assert ex_rois.shape[1] == 4\n    assert gt_rois.shape[1] == 4\n\n    targets = bbox_transform(ex_rois, gt_rois)\n    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:\n        # Optionally normalize targets by a precomputed mean and stdev\n        targets = ((targets - np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS))\n                / np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS))\n    return np.hstack(\n            (labels[:, np.newaxis], targets)).astype(np.float32, copy=False)\n\ndef _sample_rois(all_rois, 
gt_boxes, fg_rois_per_image, rois_per_image, num_classes):\n    \"\"\"Generate a random sample of RoIs comprising foreground and background\n    examples.\n    \"\"\"\n    # overlaps: (rois x gt_boxes)\n    overlaps = bbox_overlaps(\n        np.ascontiguousarray(all_rois[:, 1:5], dtype=np.float),\n        np.ascontiguousarray(gt_boxes[:, :4], dtype=np.float))\n    gt_assignment = overlaps.argmax(axis=1)\n    max_overlaps = overlaps.max(axis=1)\n    labels = gt_boxes[gt_assignment, 4]\n\n    # Select foreground RoIs as those with >= FG_THRESH overlap\n    fg_inds = np.where(max_overlaps >= cfg.TRAIN.FG_THRESH)[0]\n    # Guard against the case when an image has fewer than fg_rois_per_image\n    # foreground RoIs\n    fg_rois_per_this_image = min(fg_rois_per_image, fg_inds.size)\n    # Sample foreground regions without replacement\n    if fg_inds.size > 0:\n        fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)\n\n    # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)\n    bg_inds = np.where((max_overlaps < cfg.TRAIN.BG_THRESH_HI) &\n                       (max_overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]\n    # Compute number of background RoIs to take from this image (guarding\n    # against there being fewer than desired)\n    bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image\n    bg_rois_per_this_image = min(bg_rois_per_this_image, bg_inds.size)\n    # Sample background regions without replacement\n    if bg_inds.size > 0:\n        bg_inds = npr.choice(bg_inds, size=bg_rois_per_this_image, replace=False)\n\n    # The indices that we're selecting (both fg and bg)\n    keep_inds = np.append(fg_inds, bg_inds)\n    # Select sampled values from various arrays:\n    labels = labels[keep_inds]\n    # Clamp labels for the background RoIs to 0\n    labels[fg_rois_per_this_image:] = 0\n    rois = all_rois[keep_inds]\n\n    bbox_target_data = _compute_targets(\n        rois[:, 1:5], gt_boxes[gt_assignment[keep_inds], 
:4], labels)\n\n    bbox_targets, bbox_inside_weights = \\\n        _get_bbox_regression_labels(bbox_target_data, num_classes)\n\n    return labels, rois, bbox_targets, bbox_inside_weights\n"
  },
  {
    "path": "lib/setup.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport os\nfrom os.path import join as pjoin\nfrom setuptools import setup\nfrom distutils.extension import Extension\nfrom Cython.Distutils import build_ext\nimport subprocess\nimport numpy as np\n\ndef find_in_path(name, path):\n    \"Find a file in a search path\"\n    # Adapted from\n    # http://code.activestate.com/recipes/52224-find-a-file-given-a-search-path/\n    for dir in path.split(os.pathsep):\n        binpath = pjoin(dir, name)\n        if os.path.exists(binpath):\n            return os.path.abspath(binpath)\n    return None\n\n\ndef locate_cuda():\n    \"\"\"Locate the CUDA environment on the system\n\n    Returns a dict with keys 'home', 'nvcc', 'include', and 'lib64'\n    and values giving the absolute path to each directory.\n\n    Starts by looking for the CUDAHOME env variable. If not found, everything\n    is based on finding 'nvcc' in the PATH.\n    \"\"\"\n\n    # first check if the CUDAHOME env variable is in use\n    if 'CUDAHOME' in os.environ:\n        home = os.environ['CUDAHOME']\n        nvcc = pjoin(home, 'bin', 'nvcc')\n    else:\n        # otherwise, search the PATH for NVCC\n        default_path = pjoin(os.sep, 'usr', 'local', 'cuda', 'bin')\n        nvcc = find_in_path('nvcc', os.environ['PATH'] + os.pathsep + default_path)\n        if nvcc is None:\n            raise EnvironmentError('The nvcc binary could not be '\n                'located in your $PATH. 
Either add it to your path, or set $CUDAHOME')\n        home = os.path.dirname(os.path.dirname(nvcc))\n\n    cudaconfig = {'home':home, 'nvcc':nvcc,\n                  'include': pjoin(home, 'include'),\n                  'lib64': pjoin(home, 'lib64')}\n    for k, v in cudaconfig.iteritems():\n        if not os.path.exists(v):\n            raise EnvironmentError('The CUDA %s path could not be located in %s' % (k, v))\n\n    return cudaconfig\nCUDA = locate_cuda()\n\n\n# Obtain the numpy include directory.  This logic works across numpy versions.\ntry:\n    numpy_include = np.get_include()\nexcept AttributeError:\n    numpy_include = np.get_numpy_include()\n\ndef customize_compiler_for_nvcc(self):\n    \"\"\"Inject deep into distutils to customize how the dispatch\n    to gcc/nvcc works.\n\n    If you subclass UnixCCompiler, it's not trivial to get your subclass\n    injected in, and still have the right customizations (i.e.\n    distutils.sysconfig.customize_compiler) run on it. So instead of going\n    the OO route, I have this. Note, it's kind of like a weird functional\n    subclassing going on.\"\"\"\n\n    # tell the compiler it can process .cu files\n    self.src_extensions.append('.cu')\n\n    # save references to the default compiler_so and _compile methods\n    default_compiler_so = self.compiler_so\n    super = self._compile\n\n    # now redefine the _compile method. 
This gets executed for each\n    # object but distutils doesn't have the ability to change compilers\n    # based on source extension: we add it.\n    def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts):\n        if os.path.splitext(src)[1] == '.cu':\n            # use the CUDA compiler (nvcc) for .cu files\n            self.set_executable('compiler_so', CUDA['nvcc'])\n            # use only a subset of the extra_postargs, which are 1-1 translated\n            # from the extra_compile_args in the Extension class\n            postargs = extra_postargs['nvcc']\n        else:\n            postargs = extra_postargs['gcc']\n\n        super(obj, src, ext, cc_args, postargs, pp_opts)\n        # reset the default compiler_so, which we might have changed for cuda\n        self.compiler_so = default_compiler_so\n\n    # inject our redefined _compile method into the class\n    self._compile = _compile\n\n\n# run the customize_compiler\nclass custom_build_ext(build_ext):\n    def build_extensions(self):\n        customize_compiler_for_nvcc(self.compiler)\n        build_ext.build_extensions(self)\n\n\next_modules = [\n    Extension(\n        \"utils.cython_bbox\",\n        [\"utils/bbox.pyx\"],\n        extra_compile_args={'gcc': [\"-Wno-cpp\", \"-Wno-unused-function\"]},\n        include_dirs = [numpy_include]\n    ),\n    Extension(\n        \"nms.cpu_nms\",\n        [\"nms/cpu_nms.pyx\"],\n        extra_compile_args={'gcc': [\"-Wno-cpp\", \"-Wno-unused-function\"]},\n        include_dirs = [numpy_include]\n    ),\n    Extension('nms.gpu_nms',\n        ['nms/nms_kernel.cu', 'nms/gpu_nms.pyx'],\n        library_dirs=[CUDA['lib64']],\n        libraries=['cudart'],\n        language='c++',\n        runtime_library_dirs=[CUDA['lib64']],\n        # this syntax is specific to this build system:\n        # we're only going to use certain compiler args with nvcc and not with\n        # gcc; the implementation of this trick is in\n        # customize_compiler_for_nvcc() above\n        extra_compile_args={'gcc': 
[\"-Wno-unused-function\"],\n                            'nvcc': ['-arch=sm_35',\n                                     '--ptxas-options=-v',\n                                     '-c',\n                                     '--compiler-options',\n                                     \"'-fPIC'\"]},\n        include_dirs = [numpy_include, CUDA['include']]\n    ),\n    Extension(\n        'pycocotools._mask',\n        sources=['pycocotools/maskApi.c', 'pycocotools/_mask.pyx'],\n        include_dirs = [numpy_include, 'pycocotools'],\n        extra_compile_args={\n            'gcc': ['-Wno-cpp', '-Wno-unused-function', '-std=c99']},\n    ),\n]\n\nsetup(\n    name='fast_rcnn',\n    ext_modules=ext_modules,\n    # inject our custom trigger\n    cmdclass={'build_ext': custom_build_ext},\n)\n"
  },
  {
    "path": "lib/transform/__init__.py",
    "content": ""
  },
  {
    "path": "lib/transform/torch_image_transform_layer.py",
    "content": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE for details]\n# --------------------------------------------------------\n\n\"\"\" Transform images for compatibility with models trained with\nhttps://github.com/facebook/fb.resnet.torch.\n\nUsage in model prototxt:\n\nlayer {\n  name: 'data_xform'\n  type: 'Python'\n  bottom: 'data_caffe'\n  top: 'data'\n  python_param {\n    module: 'transform.torch_image_transform_layer'\n    layer: 'TorchImageTransformLayer'\n  }\n}\n\"\"\"\n\nimport caffe\nfrom fast_rcnn.config import cfg\nimport numpy as np\n\nclass TorchImageTransformLayer(caffe.Layer):\n    def setup(self, bottom, top):\n        # (1, 3, 1, 1) shaped arrays\n        self.PIXEL_MEANS = \\\n            np.array([[[[0.48462227599918]],\n                       [[0.45624044862054]],\n                       [[0.40588363755159]]]])\n        self.PIXEL_STDS = \\\n            np.array([[[[0.22889466674951]],\n                       [[0.22446679341259]],\n                       [[0.22495548344775]]]])\n        # The default (\"old\") pixel means that were already subtracted\n        channel_swap = (0, 3, 1, 2)\n        self.OLD_PIXEL_MEANS = \\\n            cfg.PIXEL_MEANS[np.newaxis, :, :, :].transpose(channel_swap)\n\n        top[0].reshape(*(bottom[0].shape))\n\n    def forward(self, bottom, top):\n        ims = bottom[0].data\n        # Invert the channel means that were already subtracted\n        ims += self.OLD_PIXEL_MEANS\n        # 1. Permute BGR to RGB and normalize to [0, 1]\n        ims = ims[:, [2, 1, 0], :, :] / 255.0\n        # 2. Remove channel means\n        ims -= self.PIXEL_MEANS\n        # 3. Standardize channels\n        ims /= self.PIXEL_STDS\n        top[0].reshape(*(ims.shape))\n        top[0].data[...] 
= ims\n\n    def backward(self, top, propagate_down, bottom):\n        \"\"\"This layer does not propagate gradients.\"\"\"\n        pass\n\n    def reshape(self, bottom, top):\n        \"\"\"Reshaping happens during the call to forward.\"\"\"\n        pass\n"
  },
  {
    "path": "lib/utils/.gitignore",
    "content": "*.c\n*.so\n"
  },
  {
    "path": "lib/utils/__init__.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n"
  },
  {
    "path": "lib/utils/bbox.pyx",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Sergey Karayev\n# --------------------------------------------------------\n\ncimport cython\nimport numpy as np\ncimport numpy as np\n\nDTYPE = np.float\nctypedef np.float_t DTYPE_t\n\ndef bbox_overlaps(\n        np.ndarray[DTYPE_t, ndim=2] boxes,\n        np.ndarray[DTYPE_t, ndim=2] query_boxes):\n    \"\"\"\n    Parameters\n    ----------\n    boxes: (N, 4) ndarray of float\n    query_boxes: (K, 4) ndarray of float\n    Returns\n    -------\n    overlaps: (N, K) ndarray of overlap between boxes and query_boxes\n    \"\"\"\n    cdef unsigned int N = boxes.shape[0]\n    cdef unsigned int K = query_boxes.shape[0]\n    cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)\n    cdef DTYPE_t iw, ih, box_area\n    cdef DTYPE_t ua\n    cdef unsigned int k, n\n    for k in range(K):\n        box_area = (\n            (query_boxes[k, 2] - query_boxes[k, 0] + 1) *\n            (query_boxes[k, 3] - query_boxes[k, 1] + 1)\n        )\n        for n in range(N):\n            iw = (\n                min(boxes[n, 2], query_boxes[k, 2]) -\n                max(boxes[n, 0], query_boxes[k, 0]) + 1\n            )\n            if iw > 0:\n                ih = (\n                    min(boxes[n, 3], query_boxes[k, 3]) -\n                    max(boxes[n, 1], query_boxes[k, 1]) + 1\n                )\n                if ih > 0:\n                    ua = float(\n                        (boxes[n, 2] - boxes[n, 0] + 1) *\n                        (boxes[n, 3] - boxes[n, 1] + 1) +\n                        box_area - iw * ih\n                    )\n                    overlaps[n, k] = iw * ih / ua\n    return overlaps\n"
  },
  {
    "path": "lib/utils/blob.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Blob helper functions.\"\"\"\n\nimport numpy as np\nimport cv2\n\ndef im_list_to_blob(ims):\n    \"\"\"Convert a list of images into a network input.\n\n    Assumes images are already prepared (means subtracted, BGR order, ...).\n    \"\"\"\n    max_shape = np.array([im.shape for im in ims]).max(axis=0)\n    num_images = len(ims)\n    blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),\n                    dtype=np.float32)\n    for i in xrange(num_images):\n        im = ims[i]\n        blob[i, 0:im.shape[0], 0:im.shape[1], :] = im\n    # Move channels (axis 3) to axis 1\n    # Axis order will become: (batch elem, channel, height, width)\n    channel_swap = (0, 3, 1, 2)\n    blob = blob.transpose(channel_swap)\n    return blob\n\ndef prep_im_for_blob(im, pixel_means, target_size, max_size):\n    \"\"\"Mean subtract and scale an image for use in a blob.\"\"\"\n    im = im.astype(np.float32, copy=False)\n    im -= pixel_means\n    im_shape = im.shape\n    im_size_min = np.min(im_shape[0:2])\n    im_size_max = np.max(im_shape[0:2])\n    im_scale = float(target_size) / float(im_size_min)\n    # Prevent the biggest axis from being more than MAX_SIZE\n    if np.round(im_scale * im_size_max) > max_size:\n        im_scale = float(max_size) / float(im_size_max)\n    im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,\n                    interpolation=cv2.INTER_LINEAR)\n\n    return im, im_scale\n"
  },
  {
    "path": "lib/utils/timer.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport time\n\nclass Timer(object):\n    \"\"\"A simple timer.\"\"\"\n    def __init__(self):\n        self.total_time = 0.\n        self.calls = 0\n        self.start_time = 0.\n        self.diff = 0.\n        self.average_time = 0.\n\n    def tic(self):\n        # using time.time instead of time.clock because time.clock\n        # does not normalize for multithreading\n        self.start_time = time.time()\n\n    def toc(self, average=True):\n        self.diff = time.time() - self.start_time\n        self.total_time += self.diff\n        self.calls += 1\n        self.average_time = self.total_time / self.calls\n        if average:\n            return self.average_time\n        else:\n            return self.diff\n"
  },
  {
    "path": "models/README.md",
    "content": "## Model Zoo\n\n### COCO Faster R-CNN VGG-16 trained using end-to-end\n\nModel URL: https://dl.dropboxusercontent.com/s/cotx0y81zvbbhnt/coco_vgg16_faster_rcnn_final.caffemodel?dl=0\n\nTraining command:\n```\ntools/train_net.py \\\n    --gpu 0 \\\n    --solver ./models/coco/VGG16/faster_rcnn_end2end/solver.prototxt \\\n    --weights data/imagenet_models/VGG16.v2.caffemodel \\\n    --imdb coco_2014_train+coco_2014_valminusminival \\\n    --iters 490000 \\\n    --cfg ./experiments/cfgs/faster_rcnn_end2end.yml\n```\n\n`py-faster-rcnn` commit: 68eec95\n\ntest-dev2015 results\n```\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.242\n Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.453\n Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.235\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.077\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.264\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.371\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.238\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.340\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.346\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.385\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.544\n```\n\ntest-standard2015 results\n```\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.242\n Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.453\n Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.234\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.072\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | 
maxDets=100 ] = 0.264\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.369\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.238\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.341\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.347\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.115\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.389\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.544\n```\n"
  },
  {
    "path": "models/coco/VGG16/fast_rcnn/solver.prototxt",
    "content": "train_net: \"models/coco/VGG16/fast_rcnn/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 200000\ndisplay: 20\naverage_loss: 100\n# iter_size: 1\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_fast_rcnn\"\n#debug_info: true\n"
  },
  {
    "path": "models/coco/VGG16/fast_rcnn/test.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"rois\"\ninput_shape {\n  dim: 1 # to be changed on-the-fly to num ROIs\n  dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  
bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  
convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    
pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/coco/VGG16/fast_rcnn/train.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  
top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: 
\"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  
}\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/coco/VGG16/faster_rcnn_end2end/solver.prototxt",
    "content": "train_net: \"models/coco/VGG16/faster_rcnn_end2end/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 350000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_faster_rcnn\"\niter_size: 2\n"
  },
  {
    "path": "models/coco/VGG16/faster_rcnn_end2end/test.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    
kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  
}\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  
convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 24   # 2(bg/fg) * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 48   # 4 * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 24 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\n#========= RCNN 
============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/coco/VGG16/faster_rcnn_end2end/train.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n 
 bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: 
\"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: 
\"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 24   # 2(bg/fg) * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 48   # 4 * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RoI 
Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\n\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 24 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rpn_rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\nlayer {\n  name: 'roi-data'\n  type: 'Python'\n  bottom: 'rpn_rois'\n  bottom: 'gt_boxes'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'rpn.proposal_target_layer'\n    layer: 'ProposalTargetLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  
inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/fast_rcnn/solver.prototxt",
    "content": "train_net: \"models/coco/VGG_CNN_M_1024/fast_rcnn/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 200000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_fast_rcnn\"\n#debug_info: true\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/fast_rcnn/test.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"rois\"\ninput_shape {\n  dim: 1 # to be changed on-the-fly to num ROIs\n  dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: 
\"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      
type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/fast_rcnn/train.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: 
\"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  
}\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/faster_rcnn_end2end/solver.prototxt",
    "content": "train_net: \"models/coco/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 350000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_faster_rcnn\"\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/faster_rcnn_end2end/test.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  
type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\n#layer {\n#  name: \"rpn_conv/3x3\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/3x3\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 192\n#    kernel_size: 3 pad: 1 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.01 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn_conv/5x5\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/5x5\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 64\n#    kernel_size: 5 pad: 2 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.0036 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  
}\n#}\n#layer {\n#  name: \"rpn/output\"\n#  type: \"Concat\"\n#  bottom: \"rpn_conv/3x3\"\n#  bottom: \"rpn_conv/5x5\"\n#  top: \"rpn/output\"\n#}\n#layer {\n#  name: \"rpn_relu/output\"\n#  type: \"ReLU\"\n#  bottom: \"rpn/output\"\n#  top: \"rpn/output\"\n#}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 24  # 2(bg/fg) * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 48   # 4 * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 24 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  
type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/coco/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: 
\"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 24   # 2(bg/fg) * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 48   # 4 * 12(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   
type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\n\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 24 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rpn_rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16 \\n'scales': !!python/tuple [4, 8, 16, 32]\"\n  }\n}\n\nlayer {\n  name: 'roi-data'\n  type: 'Python'\n  bottom: 'rpn_rois'\n  bottom: 'gt_boxes'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 
'rpn.proposal_target_layer'\n    layer: 'ProposalTargetLayer'\n    param_str: \"'num_classes': 81\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 81\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 324\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: 
\"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/fast_rcnn/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/VGG16/fast_rcnn/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\n# iter_size: 1\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_fast_rcnn\"\n#debug_info: true\n"
  },
  {
    "path": "models/pascal_voc/VGG16/fast_rcnn/test.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"rois\"\ninput_shape {\n  dim: 1 # to be changed on-the-fly to num ROIs\n  dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  
bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  
convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    
pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/fast_rcnn/train.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  
top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: 
\"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  
}\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  convolution_param {\n    num_output: 64\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  convolution_param {\n    num_output: 64\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  convolution_param {\n    num_output: 128\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  convolution_param {\n    num_output: 128\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  convolution_param {\n    num_output: 
256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  
name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: 
\"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  inner_product_param {\n    num_output: 21\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  inner_product_param {\n    num_output: 84\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/rpn_test.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  convolution_param {\n    num_output: 64\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  convolution_param {\n    num_output: 64\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  convolution_param {\n    num_output: 128\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  convolution_param {\n    num_output: 128\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  convolution_param {\n    num_output: 
256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2 stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  
name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  top: 'scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: 
\"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: 
\"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: 
\"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    
kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_train.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  
bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: 
\"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: 
\"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n# Dummy layers so that initial 
parameters are saved into the output net\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 25088 }\n    data_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: 
\"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: 
\"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 4096\n  
}\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n  
  num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage2_rpn_train.pt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 
2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: 
\"conv4_2\"\n  top: \"conv4_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: 
\"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n# Dummy layers so that initial 
parameters are saved into the output net\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 25088 }\n    data_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_end2end/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 50000\ndisplay: 20\naverage_loss: 100\n# iter_size: 1\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg16_faster_rcnn\"\niter_size: 2\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    
kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  
}\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  
convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: 
\"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt",
    "content": "name: \"VGG_ILSVRC_16_layers\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n 
 bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: 
\"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: \"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5_3\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 512\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: 
\"rpn/output\"\n}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: 
\"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\n\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rpn_rois'\n#  top: 'rpn_scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#layer {\n#  name: 'debug-data'\n#  type: 'Python'\n#  bottom: 'data'\n#  bottom: 'rpn_rois'\n#  bottom: 'rpn_scores'\n#  python_param {\n#    module: 'rpn.debug_layer'\n#    layer: 'RPNDebugLayer'\n#  }\n#}\n\nlayer {\n  name: 'roi-data'\n  type: 'Python'\n  bottom: 'rpn_rois'\n  bottom: 'gt_boxes'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'rpn.proposal_target_layer'\n    layer: 'ProposalTargetLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5_3\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 7\n    pooled_h: 7\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n   
 num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/fast_rcnn/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/fast_rcnn/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_fast_rcnn\"\n#debug_info: true\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/fast_rcnn/test.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"rois\"\ninput_shape {\n  dim: 1 # to be changed on-the-fly to num ROIs\n  dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: 
\"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: 
\"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/fast_rcnn/train.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: 
\"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: 
\"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/faster_rcnn_test.pt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  
top: \"conv5\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    
spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  inner_product_param {\n    num_output: 21\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  inner_product_param {\n    num_output: 84\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/rpn_test.pt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  
top: \"conv5\"\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  top: 'scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7 stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 5 stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  
type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer 
{\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    
kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage1_rpn_train.pt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7 stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 5 stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param { 
lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  
bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 18432 }\n    data_filler { type: \"gaussian\" std: 0.01 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7 stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 5 stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: 
\"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: 
\"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1 }\n  param { lr_mult: 2 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 
decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/stage2_rpn_train.pt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7 stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    pad: 1 kernel_size: 5 stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3 stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  type: 
\"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 512\n    pad: 1 kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: 
\"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 18432 }\n    data_filler { type: \"gaussian\" std: 0.01 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_end2end/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt\"\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 50000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"vgg_cnn_m_1024_faster_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_end2end/test.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: \"conv4\"\n  
type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\n#layer {\n#  name: \"rpn_conv/3x3\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/3x3\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 192\n#    kernel_size: 3 pad: 1 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.01 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn_conv/5x5\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/5x5\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 64\n#    kernel_size: 5 pad: 2 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.0036 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  
}\n#}\n#layer {\n#  name: \"rpn/output\"\n#  type: \"Concat\"\n#  bottom: \"rpn_conv/3x3\"\n#  bottom: \"rpn_conv/5x5\"\n#  top: \"rpn/output\"\n#}\n#layer {\n#  name: \"rpn_relu/output\"\n#  type: \"ReLU\"\n#  bottom: \"rpn/output\"\n#  top: \"rpn/output\"\n#}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 decay_mult: 1.0 }\n  param { lr_mult: 2.0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n 
 bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n}\n"
  },
  {
    "path": "models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt",
    "content": "name: \"VGG_CNN_M_1024\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  param {\n    lr_mult: 0\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 96\n    kernel_size: 7\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu1\"\n  type: \"ReLU\"\n  bottom: \"conv1\"\n  top: \"conv1\"\n}\nlayer {\n  name: \"norm1\"\n  type: \"LRN\"\n  bottom: \"conv1\"\n  top: \"norm1\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"norm1\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 5\n    stride: 2\n  }\n}\nlayer {\n  name: \"relu2\"\n  type: \"ReLU\"\n  bottom: \"conv2\"\n  top: \"conv2\"\n}\nlayer {\n  name: \"norm2\"\n  type: \"LRN\"\n  bottom: \"conv2\"\n  top: \"norm2\"\n  lrn_param {\n    local_size: 5\n    alpha: 0.0005\n    beta: 0.75\n    k: 2\n  }\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"norm2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 3\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu3\"\n  type: \"ReLU\"\n  bottom: \"conv3\"\n  top: \"conv3\"\n}\nlayer {\n  name: 
\"conv4\"\n  type: \"Convolution\"\n  bottom: \"conv3\"\n  top: \"conv4\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu4\"\n  type: \"ReLU\"\n  bottom: \"conv4\"\n  top: \"conv4\"\n}\nlayer {\n  name: \"conv5\"\n  type: \"Convolution\"\n  bottom: \"conv4\"\n  top: \"conv5\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n  }\n}\nlayer {\n  name: \"relu5\"\n  type: \"ReLU\"\n  bottom: \"conv5\"\n  top: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\n\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: 
\"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\n\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\n\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rpn_rois'\n#  top: 'rpn_scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#layer {\n#  name: 'debug-data'\n#  type: 'Python'\n#  bottom: 'data'\n#  bottom: 'rpn_rois'\n#  bottom: 'rpn_scores'\n#  python_param {\n#    module: 'rpn.debug_layer'\n#    layer: 'RPNDebugLayer'\n#  }\n#}\n\nlayer {\n  name: 'roi-data'\n  type: 'Python'\n  bottom: 'rpn_rois'\n  bottom: 'gt_boxes'\n  top: 'rois'\n  
top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'rpn.proposal_target_layer'\n    layer: 'ProposalTargetLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"pool5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"pool5\"\n  top: \"fc6\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 1024\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param {\n    lr_mult: 1\n  }\n  param {\n    lr_mult: 2\n  }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      
type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"loss_cls\"\n  loss_weight: 1\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"loss_bbox\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/fast_rcnn/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/ZF/fast_rcnn/train.prototxt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_fast_rcnn\"\n#debug_info: true\n#iter_size: 2\n"
  },
  {
    "path": "models/pascal_voc/ZF/fast_rcnn/test.prototxt",
    "content": "name: \"ZF\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"rois\"\ninput_shape {\n  dim: 1 # to be changed on-the-fly to num ROIs\n  dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: 
\"conv4\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  inner_product_param {\n    num_output: 21\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  inner_product_param {\n    num_output: 84\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/fast_rcnn/train.prototxt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 
384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: 
\"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"bbox_loss\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt",
    "content": "name: \"ZF\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 
1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RPN ============\n\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN 
============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  inner_product_param {\n    num_output: 21\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  inner_product_param {\n    num_output: 84\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/rpn_test.pt",
    "content": "name: \"ZF\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\n# ------------------------ layer 1 -----------------------------\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\n\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tconvolution_param {\n\t\tnum_output: 
384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#-----------------------layer +-------------------------\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#-----------------------output------------------------\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  top: 'scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    
layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 
384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: 
\"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"bbox_loss\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    
weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer 
{\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: 
\"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: \"rpn_bbox_inside_weights\"\n  bottom: \"rpn_bbox_outside_weights\"\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 9216 }\n    data_filler { type: \"gaussian\" std: 0.01 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  
name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_solver30k40k.pt",
    "content": "train_net: \"models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 30000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_fast_rcnn\"\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'data'\n  type: 'Python'\n  top: 'data'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param 
{ lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n 
 dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: \"bbox_inside_weights\"\n  bottom: \"bbox_outside_weights\"\n  top: \"bbox_loss\"\n  loss_weight: 1\n}\n\n#========= RPN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 
}\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"silence_rpn_cls_score\"\n  type: \"Silence\"\n  bottom: \"rpn_cls_score\"\n}\nlayer {\n  name: \"silence_rpn_bbox_pred\"\n  type: \"Silence\"\n  bottom: \"rpn_bbox_pred\"\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_solver60k80k.pt",
    "content": "train_net: \"models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 60000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_rpn\"\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 
384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv1\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn_conv1\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu1\"\n  type: \"ReLU\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_conv1\"\n}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn_conv1\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { 
type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: \"rpn_bbox_inside_weights\"\n  bottom: \"rpn_bbox_outside_weights\"\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RCNN ============\n# Dummy layers so that initial parameters are saved into the output net\n\nlayer {\n  name: \"dummy_roi_pool_conv5\"\n  type: \"DummyData\"\n  top: \"dummy_roi_pool_conv5\"\n  dummy_data_param {\n    shape { dim: 1 dim: 9216 }\n    data_filler { type: \"gaussian\" std: 0.01 }\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"dummy_roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 0 decay_mult: 0 }\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 0 decay_mult: 0 
}\n  param { lr_mult: 0 decay_mult: 0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"silence_fc7\"\n  type: \"Silence\"\n  bottom: \"fc7\"\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_end2end/solver.prototxt",
    "content": "train_net: \"models/pascal_voc/ZF/faster_rcnn_end2end/train.prototxt\"\n\nbase_lr: 0.001\nlr_policy: \"step\"\ngamma: 0.1\nstepsize: 50000\ndisplay: 20\naverage_loss: 100\nmomentum: 0.9\nweight_decay: 0.0005\n\n#base_lr: 0.001\n#lr_policy: \"exp\"\n#gamma: 0.999539589  # (0.00001/0.001)^(1/10000)\n#display: 1\n#average_loss: 100\n#momentum: 0.9\n#weight_decay: 0.0005\n\n# We disable standard caffe solver snapshotting and implement our own snapshot\n# function\nsnapshot: 0\n# We still use the snapshot prefix, though\nsnapshot_prefix: \"zf_faster_rcnn\"\niter_size: 2\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_end2end/test.prototxt",
    "content": "name: \"ZF\"\n\ninput: \"data\"\ninput_shape {\n  dim: 1\n  dim: 3\n  dim: 224\n  dim: 224\n}\n\ninput: \"im_info\"\ninput_shape {\n  dim: 1\n  dim: 3\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 
1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n#layer {\n#  name: \"rpn_conv/3x3\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/3x3\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 192\n#    kernel_size: 3 pad: 1 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.01 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn_conv/5x5\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/5x5\"\n#  param { lr_mult: 1.0 decay_mult: 1.0 }\n#  param { lr_mult: 2.0 decay_mult: 0 }\n#  convolution_param {\n#    num_output: 64\n#    kernel_size: 5 pad: 2 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.0036 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn/output\"\n#  type: \"Concat\"\n#  bottom: \"rpn_conv/3x3\"\n#  bottom: \"rpn_conv/5x5\"\n#  top: \"rpn/output\"\n#}\n#layer {\n#  name: \"rpn_relu/output\"\n#  type: \"ReLU\"\n#  bottom: \"rpn/output\"\n#  top: \"rpn/output\"\n#}\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  convolution_param 
{\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rois'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  
name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  inner_product_param {\n    num_output: 21\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  inner_product_param {\n    num_output: 84\n  }\n}\nlayer {\n  name: \"cls_prob\"\n  type: \"Softmax\"\n  bottom: \"cls_score\"\n  top: \"cls_prob\"\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\n"
  },
  {
    "path": "models/pascal_voc/ZF/faster_rcnn_end2end/train.prototxt",
    "content": "name: \"ZF\"\nlayer {\n  name: 'input-data'\n  type: 'Python'\n  top: 'data'\n  top: 'im_info'\n  top: 'gt_boxes'\n  python_param {\n    module: 'roi_data_layer.layer'\n    layer: 'RoIDataLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= conv1-conv5 ============\n\nlayer {\n\tname: \"conv1\"\n\ttype: \"Convolution\"\n\tbottom: \"data\"\n\ttop: \"conv1\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 96\n\t\tkernel_size: 7\n\t\tpad: 3\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu1\"\n\ttype: \"ReLU\"\n\tbottom: \"conv1\"\n\ttop: \"conv1\"\n}\nlayer {\n\tname: \"norm1\"\n\ttype: \"LRN\"\n\tbottom: \"conv1\"\n\ttop: \"norm1\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool1\"\n\ttype: \"Pooling\"\n\tbottom: \"norm1\"\n\ttop: \"pool1\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv2\"\n\ttype: \"Convolution\"\n\tbottom: \"pool1\"\n\ttop: \"conv2\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 5\n\t\tpad: 2\n\t\tstride: 2\n\t}\n}\nlayer {\n\tname: \"relu2\"\n\ttype: \"ReLU\"\n\tbottom: \"conv2\"\n\ttop: \"conv2\"\n}\nlayer {\n\tname: \"norm2\"\n\ttype: \"LRN\"\n\tbottom: \"conv2\"\n\ttop: \"norm2\"\n\tlrn_param {\n\t\tlocal_size: 3\n\t\talpha: 0.00005\n\t\tbeta: 0.75\n\t\tnorm_region: WITHIN_CHANNEL\n    engine: CAFFE\n\t}\n}\nlayer {\n\tname: \"pool2\"\n\ttype: \"Pooling\"\n\tbottom: \"norm2\"\n\ttop: \"pool2\"\n\tpooling_param {\n\t\tkernel_size: 3\n\t\tstride: 2\n\t\tpad: 1\n\t\tpool: MAX\n\t}\n}\nlayer {\n\tname: \"conv3\"\n\ttype: \"Convolution\"\n\tbottom: \"pool2\"\n\ttop: \"conv3\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer 
{\n\tname: \"relu3\"\n\ttype: \"ReLU\"\n\tbottom: \"conv3\"\n\ttop: \"conv3\"\n}\nlayer {\n\tname: \"conv4\"\n\ttype: \"Convolution\"\n\tbottom: \"conv3\"\n\ttop: \"conv4\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 384\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu4\"\n\ttype: \"ReLU\"\n\tbottom: \"conv4\"\n\ttop: \"conv4\"\n}\nlayer {\n\tname: \"conv5\"\n\ttype: \"Convolution\"\n\tbottom: \"conv4\"\n\ttop: \"conv5\"\n\tparam { lr_mult: 1.0 }\n\tparam { lr_mult: 2.0 }\n\tconvolution_param {\n\t\tnum_output: 256\n\t\tkernel_size: 3\n\t\tpad: 1\n\t\tstride: 1\n\t}\n}\nlayer {\n\tname: \"relu5\"\n\ttype: \"ReLU\"\n\tbottom: \"conv5\"\n\ttop: \"conv5\"\n}\n\n#========= RPN ============\n\nlayer {\n  name: \"rpn_conv/3x3\"\n  type: \"Convolution\"\n  bottom: \"conv5\"\n  top: \"rpn/output\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 256\n    kernel_size: 3 pad: 1 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_relu/3x3\"\n  type: \"ReLU\"\n  bottom: \"rpn/output\"\n  top: \"rpn/output\"\n}\n\n#layer {\n#  name: \"rpn_conv/3x3\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/3x3\"\n#  param { lr_mult: 1.0 }\n#  param { lr_mult: 2.0 }\n#  convolution_param {\n#    num_output: 192\n#    kernel_size: 3 pad: 1 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.01 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn_conv/5x5\"\n#  type: \"Convolution\"\n#  bottom: \"conv5\"\n#  top: \"rpn_conv/5x5\"\n#  param { lr_mult: 1.0 }\n#  param { lr_mult: 2.0 }\n#  convolution_param {\n#    num_output: 64\n#    kernel_size: 5 pad: 2 stride: 1\n#    weight_filler { type: \"gaussian\" std: 0.0036 }\n#    bias_filler { type: \"constant\" value: 0 }\n#  }\n#}\n#layer {\n#  name: \"rpn/output\"\n#  type: 
\"Concat\"\n#  bottom: \"rpn_conv/3x3\"\n#  bottom: \"rpn_conv/5x5\"\n#  top: \"rpn/output\"\n#}\n#layer {\n#  name: \"rpn_relu/output\"\n#  type: \"ReLU\"\n#  bottom: \"rpn/output\"\n#  top: \"rpn/output\"\n#}\n\nlayer {\n  name: \"rpn_cls_score\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 18   # 2(bg/fg) * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n  name: \"rpn_bbox_pred\"\n  type: \"Convolution\"\n  bottom: \"rpn/output\"\n  top: \"rpn_bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  convolution_param {\n    num_output: 36   # 4 * 9(anchors)\n    kernel_size: 1 pad: 0 stride: 1\n    weight_filler { type: \"gaussian\" std: 0.01 }\n    bias_filler { type: \"constant\" value: 0 }\n  }\n}\nlayer {\n   bottom: \"rpn_cls_score\"\n   top: \"rpn_cls_score_reshape\"\n   name: \"rpn_cls_score_reshape\"\n   type: \"Reshape\"\n   reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'rpn-data'\n  type: 'Python'\n  bottom: 'rpn_cls_score'\n  bottom: 'gt_boxes'\n  bottom: 'im_info'\n  bottom: 'data'\n  top: 'rpn_labels'\n  top: 'rpn_bbox_targets'\n  top: 'rpn_bbox_inside_weights'\n  top: 'rpn_bbox_outside_weights'\n  python_param {\n    module: 'rpn.anchor_target_layer'\n    layer: 'AnchorTargetLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\nlayer {\n  name: \"rpn_loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"rpn_cls_score_reshape\"\n  bottom: \"rpn_labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"rpn_cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"rpn_loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"rpn_bbox_pred\"\n  bottom: \"rpn_bbox_targets\"\n  bottom: 'rpn_bbox_inside_weights'\n  bottom: 
'rpn_bbox_outside_weights'\n  top: \"rpn_loss_bbox\"\n  loss_weight: 1\n  smooth_l1_loss_param { sigma: 3.0 }\n}\n\n#========= RoI Proposal ============\n\nlayer {\n  name: \"rpn_cls_prob\"\n  type: \"Softmax\"\n  bottom: \"rpn_cls_score_reshape\"\n  top: \"rpn_cls_prob\"\n}\nlayer {\n  name: 'rpn_cls_prob_reshape'\n  type: 'Reshape'\n  bottom: 'rpn_cls_prob'\n  top: 'rpn_cls_prob_reshape'\n  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }\n}\nlayer {\n  name: 'proposal'\n  type: 'Python'\n  bottom: 'rpn_cls_prob_reshape'\n  bottom: 'rpn_bbox_pred'\n  bottom: 'im_info'\n  top: 'rpn_rois'\n#  top: 'rpn_scores'\n  python_param {\n    module: 'rpn.proposal_layer'\n    layer: 'ProposalLayer'\n    param_str: \"'feat_stride': 16\"\n  }\n}\n#layer {\n#  name: 'debug-data'\n#  type: 'Python'\n#  bottom: 'data'\n#  bottom: 'rpn_rois'\n#  bottom: 'rpn_scores'\n#  python_param {\n#    module: 'rpn.debug_layer'\n#    layer: 'RPNDebugLayer'\n#  }\n#}\nlayer {\n  name: 'roi-data'\n  type: 'Python'\n  bottom: 'rpn_rois'\n  bottom: 'gt_boxes'\n  top: 'rois'\n  top: 'labels'\n  top: 'bbox_targets'\n  top: 'bbox_inside_weights'\n  top: 'bbox_outside_weights'\n  python_param {\n    module: 'rpn.proposal_target_layer'\n    layer: 'ProposalTargetLayer'\n    param_str: \"'num_classes': 21\"\n  }\n}\n\n#========= RCNN ============\n\nlayer {\n  name: \"roi_pool_conv5\"\n  type: \"ROIPooling\"\n  bottom: \"conv5\"\n  bottom: \"rois\"\n  top: \"roi_pool_conv5\"\n  roi_pooling_param {\n    pooled_w: 6\n    pooled_h: 6\n    spatial_scale: 0.0625 # 1/16\n  }\n}\nlayer {\n  name: \"fc6\"\n  type: \"InnerProduct\"\n  bottom: \"roi_pool_conv5\"\n  top: \"fc6\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu6\"\n  type: \"ReLU\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n}\nlayer {\n  name: \"drop6\"\n  type: \"Dropout\"\n  bottom: \"fc6\"\n  top: \"fc6\"\n  dropout_param {\n    dropout_ratio: 0.5\n    
scale_train: false\n  }\n}\nlayer {\n  name: \"fc7\"\n  type: \"InnerProduct\"\n  bottom: \"fc6\"\n  top: \"fc7\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 4096\n  }\n}\nlayer {\n  name: \"relu7\"\n  type: \"ReLU\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n}\nlayer {\n  name: \"drop7\"\n  type: \"Dropout\"\n  bottom: \"fc7\"\n  top: \"fc7\"\n  dropout_param {\n    dropout_ratio: 0.5\n    scale_train: false\n  }\n}\nlayer {\n  name: \"cls_score\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"cls_score\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 21\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"bbox_pred\"\n  type: \"InnerProduct\"\n  bottom: \"fc7\"\n  top: \"bbox_pred\"\n  param { lr_mult: 1.0 }\n  param { lr_mult: 2.0 }\n  inner_product_param {\n    num_output: 84\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.001\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"loss_cls\"\n  type: \"SoftmaxWithLoss\"\n  bottom: \"cls_score\"\n  bottom: \"labels\"\n  propagate_down: 1\n  propagate_down: 0\n  top: \"cls_loss\"\n  loss_weight: 1\n  loss_param {\n    ignore_label: -1\n    normalize: true\n  }\n}\nlayer {\n  name: \"loss_bbox\"\n  type: \"SmoothL1Loss\"\n  bottom: \"bbox_pred\"\n  bottom: \"bbox_targets\"\n  bottom: 'bbox_inside_weights'\n  bottom: 'bbox_outside_weights'\n  top: \"bbox_loss\"\n  loss_weight: 1\n}\n"
  },
  {
    "path": "tools/README.md",
    "content": "Tools for training, testing, and compressing Fast R-CNN networks.\n"
  },
  {
    "path": "tools/_init_paths.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Set up paths for Fast R-CNN.\"\"\"\n\nimport os.path as osp\nimport sys\n\ndef add_path(path):\n    if path not in sys.path:\n        sys.path.insert(0, path)\n\nthis_dir = osp.dirname(__file__)\n\n# Add caffe to PYTHONPATH\ncaffe_path = osp.join(this_dir, '..', 'caffe-fast-rcnn', 'python')\nadd_path(caffe_path)\n\n# Add lib to PYTHONPATH\nlib_path = osp.join(this_dir, '..', 'lib')\nadd_path(lib_path)\n"
  },
  {
    "path": "tools/compress_net.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Compress a Fast R-CNN network using truncated SVD.\"\"\"\n\nimport _init_paths\nimport caffe\nimport argparse\nimport numpy as np\nimport os, sys\n\ndef parse_args():\n    \"\"\"Parse input arguments.\"\"\"\n    parser = argparse.ArgumentParser(description='Compress a Fast R-CNN network')\n    parser.add_argument('--def', dest='prototxt',\n                        help='prototxt file defining the uncompressed network',\n                        default=None, type=str)\n    parser.add_argument('--def-svd', dest='prototxt_svd',\n                        help='prototxt file defining the SVD compressed network',\n                        default=None, type=str)\n    parser.add_argument('--net', dest='caffemodel',\n                        help='model to compress',\n                        default=None, type=str)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\ndef compress_weights(W, l):\n    \"\"\"Compress the weight matrix W of an inner product (fully connected) layer\n    using truncated SVD.\n\n    Parameters:\n    W: N x M weights matrix\n    l: number of singular values to retain\n\n    Returns:\n    Ul, L: matrices such that W \\approx Ul*L\n    \"\"\"\n\n    # numpy doesn't seem to have a fast truncated SVD algorithm...\n    # this could be faster\n    U, s, V = np.linalg.svd(W, full_matrices=False)\n\n    Ul = U[:, :l]\n    sl = s[:l]\n    Vl = V[:l, :]\n\n    L = np.dot(np.diag(sl), Vl)\n    return Ul, L\n\ndef main():\n    args = parse_args()\n\n    # prototxt = 'models/VGG16/test.prototxt'\n    # caffemodel = 'snapshots/vgg16_fast_rcnn_iter_40000.caffemodel'\n    net = 
caffe.Net(args.prototxt, args.caffemodel, caffe.TEST)\n\n    # prototxt_svd = 'models/VGG16/svd/test_fc6_fc7.prototxt'\n    # caffemodel = 'snapshots/vgg16_fast_rcnn_iter_40000.caffemodel'\n    net_svd = caffe.Net(args.prototxt_svd, args.caffemodel, caffe.TEST)\n\n    print('Uncompressed network {} : {}'.format(args.prototxt, args.caffemodel))\n    print('Compressed network prototxt {}'.format(args.prototxt_svd))\n\n    out = os.path.splitext(os.path.basename(args.caffemodel))[0] + '_svd'\n    out_dir = os.path.dirname(args.caffemodel)\n\n    # Compress fc6\n    if net_svd.params.has_key('fc6_L'):\n        l_fc6 = net_svd.params['fc6_L'][0].data.shape[0]\n        print('  fc6_L bottleneck size: {}'.format(l_fc6))\n\n        # uncompressed weights and biases\n        W_fc6 = net.params['fc6'][0].data\n        B_fc6 = net.params['fc6'][1].data\n\n        print('  compressing fc6...')\n        Ul_fc6, L_fc6 = compress_weights(W_fc6, l_fc6)\n\n        assert(len(net_svd.params['fc6_L']) == 1)\n\n        # install compressed matrix factors (and original biases)\n        net_svd.params['fc6_L'][0].data[...] = L_fc6\n\n        net_svd.params['fc6_U'][0].data[...] = Ul_fc6\n        net_svd.params['fc6_U'][1].data[...] = B_fc6\n\n        out += '_fc6_{}'.format(l_fc6)\n\n    # Compress fc7\n    if net_svd.params.has_key('fc7_L'):\n        l_fc7 = net_svd.params['fc7_L'][0].data.shape[0]\n        print '  fc7_L bottleneck size: {}'.format(l_fc7)\n\n        W_fc7 = net.params['fc7'][0].data\n        B_fc7 = net.params['fc7'][1].data\n\n        print('  compressing fc7...')\n        Ul_fc7, L_fc7 = compress_weights(W_fc7, l_fc7)\n\n        assert(len(net_svd.params['fc7_L']) == 1)\n\n        net_svd.params['fc7_L'][0].data[...] = L_fc7\n\n        net_svd.params['fc7_U'][0].data[...] = Ul_fc7\n        net_svd.params['fc7_U'][1].data[...] 
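= B_fc7\n\n        out += '_fc7_{}'.format(l_fc7)\n\n    # os.path.join keeps the output next to the input model even when the\n    # model path has no directory component (out_dir == '')\n    filename = os.path.join(out_dir, '{}.caffemodel'.format(out))\n    net_svd.save(filename)\n    print('Wrote svd model to: {:s}'.format(filename))\n\nif __name__ == '__main__':\n    main()\n"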
  },
  {
    "path": "tools/demo.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"\nDemo script showing detections in sample images.\n\nSee README.md for installation instructions before running.\n\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.config import cfg\nfrom fast_rcnn.test import im_detect\nfrom fast_rcnn.nms_wrapper import nms\nfrom utils.timer import Timer\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport scipy.io as sio\nimport caffe, os, sys, cv2\nimport argparse\n\nCLASSES = ('__background__',\n           'aeroplane', 'bicycle', 'bird', 'boat',\n           'bottle', 'bus', 'car', 'cat', 'chair',\n           'cow', 'diningtable', 'dog', 'horse',\n           'motorbike', 'person', 'pottedplant',\n           'sheep', 'sofa', 'train', 'tvmonitor')\n\nNETS = {'vgg16': ('VGG16',\n                  'VGG16_faster_rcnn_final.caffemodel'),\n        'zf': ('ZF',\n                  'ZF_faster_rcnn_final.caffemodel')}\n\n\ndef vis_detections(im, class_name, dets, thresh=0.5):\n    \"\"\"Draw detected bounding boxes.\"\"\"\n    inds = np.where(dets[:, -1] >= thresh)[0]\n    if len(inds) == 0:\n        return\n\n    im = im[:, :, (2, 1, 0)]\n    fig, ax = plt.subplots(figsize=(12, 12))\n    ax.imshow(im, aspect='equal')\n    for i in inds:\n        bbox = dets[i, :4]\n        score = dets[i, -1]\n\n        ax.add_patch(\n            plt.Rectangle((bbox[0], bbox[1]),\n                          bbox[2] - bbox[0],\n                          bbox[3] - bbox[1], fill=False,\n                          edgecolor='red', linewidth=3.5)\n            )\n        ax.text(bbox[0], bbox[1] - 2,\n                '{:s} {:.3f}'.format(class_name, score),\n                bbox=dict(facecolor='blue', alpha=0.5),\n                fontsize=14, 
color='white')\n\n    ax.set_title(('{} detections with '\n                  'p({} | box) >= {:.1f}').format(class_name, class_name,\n                                                  thresh),\n                  fontsize=14)\n    plt.axis('off')\n    plt.tight_layout()\n    plt.draw()\n\ndef demo(net, image_name):\n    \"\"\"Detect object classes in an image using pre-computed object proposals.\"\"\"\n\n    # Load the demo image\n    im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)\n    im = cv2.imread(im_file)\n\n    # Detect all object classes and regress object bounds\n    timer = Timer()\n    timer.tic()\n    scores, boxes = im_detect(net, im)\n    timer.toc()\n    print ('Detection took {:.3f}s for '\n           '{:d} object proposals').format(timer.total_time, boxes.shape[0])\n\n    # Visualize detections for each class\n    CONF_THRESH = 0.8\n    NMS_THRESH = 0.3\n    for cls_ind, cls in enumerate(CLASSES[1:]):\n        cls_ind += 1 # because we skipped background\n        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]\n        cls_scores = scores[:, cls_ind]\n        dets = np.hstack((cls_boxes,\n                          cls_scores[:, np.newaxis])).astype(np.float32)\n        keep = nms(dets, NMS_THRESH)\n        dets = dets[keep, :]\n        vis_detections(im, cls, dets, thresh=CONF_THRESH)\n\ndef parse_args():\n    \"\"\"Parse input arguments.\"\"\"\n    parser = argparse.ArgumentParser(description='Faster R-CNN demo')\n    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',\n                        default=0, type=int)\n    parser.add_argument('--cpu', dest='cpu_mode',\n                        help='Use CPU mode (overrides --gpu)',\n                        action='store_true')\n    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16]',\n                        choices=NETS.keys(), default='vgg16')\n\n    args = parser.parse_args()\n\n    return args\n\nif __name__ == '__main__':\n    
cfg.TEST.HAS_RPN = True  # Use RPN for proposals\n\n    args = parse_args()\n\n    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],\n                            'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')\n    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',\n                              NETS[args.demo_net][1])\n\n    if not os.path.isfile(caffemodel):\n        raise IOError(('{:s} not found.\\nDid you run ./data/scripts/'\n                       'fetch_faster_rcnn_models.sh?').format(caffemodel))\n\n    if args.cpu_mode:\n        caffe.set_mode_cpu()\n    else:\n        caffe.set_mode_gpu()\n        caffe.set_device(args.gpu_id)\n        cfg.GPU_ID = args.gpu_id\n    net = caffe.Net(prototxt, caffemodel, caffe.TEST)\n\n    print '\\n\\nLoaded network {:s}'.format(caffemodel)\n\n    # Warmup on a dummy image\n    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)\n    for i in xrange(2):\n        _, _ = im_detect(net, im)\n\n    im_names = ['000456.jpg', '000542.jpg', '001150.jpg',\n                '001763.jpg', '004545.jpg']\n    for im_name in im_names:\n        print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n        print 'Demo for data/demo/{}'.format(im_name)\n        demo(net, im_name)\n\n    plt.show()\n"
  },
  {
    "path": "tools/eval_recall.py",
    "content": "#!/usr/bin/env python\n\nimport _init_paths\nfrom fast_rcnn.config import cfg, cfg_from_file, cfg_from_list\nfrom datasets.factory import get_imdb\nimport argparse\nimport time, os, sys\nimport numpy as np\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Test a Fast R-CNN network')\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to test',\n                        default='voc_2007_test', type=str)\n    parser.add_argument('--method', dest='method',\n                        help='proposal method',\n                        default='selective_search', type=str)\n    parser.add_argument('--rpn-file', dest='rpn_file',\n                        default=None, type=str)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\nif __name__ == '__main__':\n    args = parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    imdb = get_imdb(args.imdb_name)\n    imdb.set_proposal_method(args.method)\n    if args.rpn_file is not None:\n        imdb.config['rpn_file'] = args.rpn_file\n\n    candidate_boxes = None\n    if 0:\n        import scipy.io as sio\n        filename = 'debug/stage1_rpn_voc_2007_test.mat'\n        raw_data = sio.loadmat(filename)['aboxes'].ravel()\n        candidate_boxes = raw_data\n\n    ar, gt_overlaps, recalls, thresholds = \\\n        imdb.evaluate_recall(candidate_boxes=candidate_boxes)\n    print 'Method: {}'.format(args.method)\n    print 'AverageRec: {:.3f}'.format(ar)\n\n    def recall_at(t):\n        ind = np.where(thresholds > t - 1e-5)[0][0]\n        assert np.isclose(thresholds[ind], t)\n        return recalls[ind]\n\n    print 'Recall@0.5: {:.3f}'.format(recall_at(0.5))\n    print 'Recall@0.6: {:.3f}'.format(recall_at(0.6))\n    print 'Recall@0.7: {:.3f}'.format(recall_at(0.7))\n    print 'Recall@0.8: 
{:.3f}'.format(recall_at(0.8))\n    print 'Recall@0.9: {:.3f}'.format(recall_at(0.9))\n    # print again for easy spreadsheet copying\n    print '{:.3f}'.format(ar)\n    print '{:.3f}'.format(recall_at(0.5))\n    print '{:.3f}'.format(recall_at(0.6))\n    print '{:.3f}'.format(recall_at(0.7))\n    print '{:.3f}'.format(recall_at(0.8))\n    print '{:.3f}'.format(recall_at(0.9))\n"
  },
  {
    "path": "tools/reval.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Reval = re-eval. Re-evaluate saved detections.\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.test import apply_nms\nfrom fast_rcnn.config import cfg\nfrom datasets.factory import get_imdb\nimport cPickle\nimport os, sys, argparse\nimport numpy as np\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Re-evaluate results')\n    parser.add_argument('output_dir', nargs=1, help='results directory',\n                        type=str)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to re-evaluate',\n                        default='voc_2007_test', type=str)\n    parser.add_argument('--matlab', dest='matlab_eval',\n                        help='use matlab for evaluation',\n                        action='store_true')\n    parser.add_argument('--comp', dest='comp_mode', help='competition mode',\n                        action='store_true')\n    parser.add_argument('--nms', dest='apply_nms', help='apply nms',\n                        action='store_true')\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\ndef from_dets(imdb_name, output_dir, args):\n    imdb = get_imdb(imdb_name)\n    imdb.competition_mode(args.comp_mode)\n    imdb.config['matlab_eval'] = args.matlab_eval\n    with open(os.path.join(output_dir, 'detections.pkl'), 'rb') as f:\n        dets = cPickle.load(f)\n\n    if args.apply_nms:\n        print 'Applying NMS to all detections'\n        nms_dets = apply_nms(dets, cfg.TEST.NMS)\n    else:\n        nms_dets = dets\n\n    print 'Evaluating detections'\n    
imdb.evaluate_detections(nms_dets, output_dir)\n\nif __name__ == '__main__':\n    args = parse_args()\n\n    output_dir = os.path.abspath(args.output_dir[0])\n    imdb_name = args.imdb_name\n    from_dets(imdb_name, output_dir, args)\n"
  },
  {
    "path": "tools/rpn_generate.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast/er/ R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Generate RPN proposals.\"\"\"\n\nimport _init_paths\nimport numpy as np\nfrom fast_rcnn.config import cfg, cfg_from_file, cfg_from_list, get_output_dir\nfrom datasets.factory import get_imdb\nfrom rpn.generate import imdb_proposals\nimport cPickle\nimport caffe\nimport argparse\nimport pprint\nimport time, os, sys\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Test a Fast R-CNN network')\n    parser.add_argument('--gpu', dest='gpu_id', help='GPU id to use',\n                        default=0, type=int)\n    parser.add_argument('--def', dest='prototxt',\n                        help='prototxt file defining the network',\n                        default=None, type=str)\n    parser.add_argument('--net', dest='caffemodel',\n                        help='model to test',\n                        default=None, type=str)\n    parser.add_argument('--cfg', dest='cfg_file',\n                        help='optional config file', default=None, type=str)\n    parser.add_argument('--wait', dest='wait',\n                        help='wait until net file exists',\n                        default=True, type=bool)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to test',\n                        default='voc_2007_test', type=str)\n    parser.add_argument('--set', dest='set_cfgs',\n                        help='set config keys', default=None,\n                        nargs=argparse.REMAINDER)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\nif __name__ == '__main__':\n    args = 
parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    if args.cfg_file is not None:\n        cfg_from_file(args.cfg_file)\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs)\n\n    cfg.GPU_ID = args.gpu_id\n\n    # RPN test settings\n    cfg.TEST.RPN_PRE_NMS_TOP_N = -1\n    cfg.TEST.RPN_POST_NMS_TOP_N = 2000\n\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    while not os.path.exists(args.caffemodel) and args.wait:\n        print('Waiting for {} to exist...'.format(args.caffemodel))\n        time.sleep(10)\n\n    caffe.set_mode_gpu()\n    caffe.set_device(args.gpu_id)\n    net = caffe.Net(args.prototxt, args.caffemodel, caffe.TEST)\n    net.name = os.path.splitext(os.path.basename(args.caffemodel))[0]\n\n    imdb = get_imdb(args.imdb_name)\n    imdb_boxes = imdb_proposals(net, imdb)\n\n    output_dir = get_output_dir(imdb, net)\n    rpn_file = os.path.join(output_dir, net.name + '_rpn_proposals.pkl')\n    with open(rpn_file, 'wb') as f:\n        cPickle.dump(imdb_boxes, f, cPickle.HIGHEST_PROTOCOL)\n    print 'Wrote RPN proposals to {}'.format(rpn_file)\n"
  },
  {
    "path": "tools/test_net.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Test a Fast R-CNN network on an image database.\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.test import test_net\nfrom fast_rcnn.config import cfg, cfg_from_file, cfg_from_list\nfrom datasets.factory import get_imdb\nimport caffe\nimport argparse\nimport pprint\nimport time, os, sys\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Test a Fast R-CNN network')\n    parser.add_argument('--gpu', dest='gpu_id', help='GPU id to use',\n                        default=0, type=int)\n    parser.add_argument('--def', dest='prototxt',\n                        help='prototxt file defining the network',\n                        default=None, type=str)\n    parser.add_argument('--net', dest='caffemodel',\n                        help='model to test',\n                        default=None, type=str)\n    parser.add_argument('--cfg', dest='cfg_file',\n                        help='optional config file', default=None, type=str)\n    parser.add_argument('--wait', dest='wait',\n                        help='wait until net file exists',\n                        default=True, type=bool)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to test',\n                        default='voc_2007_test', type=str)\n    parser.add_argument('--comp', dest='comp_mode', help='competition mode',\n                        action='store_true')\n    parser.add_argument('--set', dest='set_cfgs',\n                        help='set config keys', default=None,\n                        nargs=argparse.REMAINDER)\n    parser.add_argument('--vis', dest='vis', help='visualize detections',\n              
          action='store_true')\n    parser.add_argument('--num_dets', dest='max_per_image',\n                        help='max number of detections per image',\n                        default=100, type=int)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\nif __name__ == '__main__':\n    args = parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    if args.cfg_file is not None:\n        cfg_from_file(args.cfg_file)\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs)\n\n    cfg.GPU_ID = args.gpu_id\n\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    while not os.path.exists(args.caffemodel) and args.wait:\n        print('Waiting for {} to exist...'.format(args.caffemodel))\n        time.sleep(10)\n\n    caffe.set_mode_gpu()\n    caffe.set_device(args.gpu_id)\n    net = caffe.Net(args.prototxt, args.caffemodel, caffe.TEST)\n    net.name = os.path.splitext(os.path.basename(args.caffemodel))[0]\n\n    imdb = get_imdb(args.imdb_name)\n    imdb.competition_mode(args.comp_mode)\n    if not cfg.TEST.HAS_RPN:\n        imdb.set_proposal_method(cfg.TEST.PROPOSAL_METHOD)\n\n    test_net(net, imdb, max_per_image=args.max_per_image, vis=args.vis)\n"
  },
  {
    "path": "tools/train_faster_rcnn_alt_opt.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Train a Faster R-CNN network using alternating optimization.\nThis tool implements the alternating optimization algorithm described in our\nNIPS 2015 paper (\"Faster R-CNN: Towards Real-time Object Detection with Region\nProposal Networks.\" Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun.)\n\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.train import get_training_roidb, train_net\nfrom fast_rcnn.config import cfg, cfg_from_file, cfg_from_list, get_output_dir\nfrom datasets.factory import get_imdb\nfrom rpn.generate import imdb_proposals\nimport argparse\nimport pprint\nimport numpy as np\nimport sys, os\nimport multiprocessing as mp\nimport cPickle\nimport shutil\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Train a Faster R-CNN network')\n    parser.add_argument('--gpu', dest='gpu_id',\n                        help='GPU device id to use [0]',\n                        default=0, type=int)\n    parser.add_argument('--net_name', dest='net_name',\n                        help='network name (e.g., \"ZF\")',\n                        default=None, type=str)\n    parser.add_argument('--weights', dest='pretrained_model',\n                        help='initialize with pretrained model weights',\n                        default=None, type=str)\n    parser.add_argument('--cfg', dest='cfg_file',\n                        help='optional config file',\n                        default=None, type=str)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to train on',\n                        default='voc_2007_trainval', type=str)\n    parser.add_argument('--set', 
dest='set_cfgs',\n                        help='set config keys', default=None,\n                        nargs=argparse.REMAINDER)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\ndef get_roidb(imdb_name, rpn_file=None):\n    imdb = get_imdb(imdb_name)\n    print 'Loaded dataset `{:s}` for training'.format(imdb.name)\n    imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)\n    print 'Set proposal method: {:s}'.format(cfg.TRAIN.PROPOSAL_METHOD)\n    if rpn_file is not None:\n        imdb.config['rpn_file'] = rpn_file\n    roidb = get_training_roidb(imdb)\n    return roidb, imdb\n\ndef get_solvers(net_name):\n    # Faster R-CNN Alternating Optimization\n    n = 'faster_rcnn_alt_opt'\n    # Solver for each training stage\n    solvers = [[net_name, n, 'stage1_rpn_solver60k80k.pt'],\n               [net_name, n, 'stage1_fast_rcnn_solver30k40k.pt'],\n               [net_name, n, 'stage2_rpn_solver60k80k.pt'],\n               [net_name, n, 'stage2_fast_rcnn_solver30k40k.pt']]\n    solvers = [os.path.join(cfg.MODELS_DIR, *s) for s in solvers]\n    # Iterations for each training stage\n    max_iters = [80000, 40000, 80000, 40000]\n    # max_iters = [100, 100, 100, 100]\n    # Test prototxt for the RPN\n    rpn_test_prototxt = os.path.join(\n        cfg.MODELS_DIR, net_name, n, 'rpn_test.pt')\n    return solvers, max_iters, rpn_test_prototxt\n\n# ------------------------------------------------------------------------------\n# Pycaffe doesn't reliably free GPU memory when instantiated nets are discarded\n# (e.g. \"del net\" in Python code). 
To work around this issue, each training\n# stage is executed in a separate process using multiprocessing.Process.\n# ------------------------------------------------------------------------------\n\ndef _init_caffe(cfg):\n    \"\"\"Initialize pycaffe in a training process.\n    \"\"\"\n\n    import caffe\n    # fix the random seeds (numpy and caffe) for reproducibility\n    np.random.seed(cfg.RNG_SEED)\n    caffe.set_random_seed(cfg.RNG_SEED)\n    # set up caffe\n    caffe.set_mode_gpu()\n    caffe.set_device(cfg.GPU_ID)\n\ndef train_rpn(queue=None, imdb_name=None, init_model=None, solver=None,\n              max_iters=None, cfg=None):\n    \"\"\"Train a Region Proposal Network in a separate training process.\n    \"\"\"\n\n    # Not using any proposals, just ground-truth boxes\n    cfg.TRAIN.HAS_RPN = True\n    cfg.TRAIN.BBOX_REG = False  # applies only to Fast R-CNN bbox regression\n    cfg.TRAIN.PROPOSAL_METHOD = 'gt'\n    cfg.TRAIN.IMS_PER_BATCH = 1\n    print 'Init model: {}'.format(init_model)\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    import caffe\n    _init_caffe(cfg)\n\n    roidb, imdb = get_roidb(imdb_name)\n    print 'roidb len: {}'.format(len(roidb))\n    output_dir = get_output_dir(imdb)\n    print 'Output will be saved to `{:s}`'.format(output_dir)\n\n    model_paths = train_net(solver, roidb, output_dir,\n                            pretrained_model=init_model,\n                            max_iters=max_iters)\n    # Cleanup all but the final model\n    for i in model_paths[:-1]:\n        os.remove(i)\n    rpn_model_path = model_paths[-1]\n    # Send final model path through the multiprocessing queue\n    queue.put({'model_path': rpn_model_path})\n\ndef rpn_generate(queue=None, imdb_name=None, rpn_model_path=None, cfg=None,\n                 rpn_test_prototxt=None):\n    \"\"\"Use a trained RPN to generate proposals.\n    \"\"\"\n\n    cfg.TEST.RPN_PRE_NMS_TOP_N = -1     # no pre NMS filtering\n    cfg.TEST.RPN_POST_NMS_TOP_N = 2000  
# limit top boxes after NMS\n    print 'RPN model: {}'.format(rpn_model_path)\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    import caffe\n    _init_caffe(cfg)\n\n    # NOTE: the MATLAB implementation computes proposals on flipped images, too.\n    # We compute them on the image once and then flip the already computed\n    # proposals. This might cause a minor loss in mAP (less proposal jittering).\n    imdb = get_imdb(imdb_name)\n    print 'Loaded dataset `{:s}` for proposal generation'.format(imdb.name)\n\n    # Load RPN and configure output directory\n    rpn_net = caffe.Net(rpn_test_prototxt, rpn_model_path, caffe.TEST)\n    output_dir = get_output_dir(imdb)\n    print 'Output will be saved to `{:s}`'.format(output_dir)\n    # Generate proposals on the imdb\n    rpn_proposals = imdb_proposals(rpn_net, imdb)\n    # Write proposals to disk and send the proposal file path through the\n    # multiprocessing queue\n    rpn_net_name = os.path.splitext(os.path.basename(rpn_model_path))[0]\n    rpn_proposals_path = os.path.join(\n        output_dir, rpn_net_name + '_proposals.pkl')\n    with open(rpn_proposals_path, 'wb') as f:\n        cPickle.dump(rpn_proposals, f, cPickle.HIGHEST_PROTOCOL)\n    print 'Wrote RPN proposals to {}'.format(rpn_proposals_path)\n    queue.put({'proposal_path': rpn_proposals_path})\n\ndef train_fast_rcnn(queue=None, imdb_name=None, init_model=None, solver=None,\n                    max_iters=None, cfg=None, rpn_file=None):\n    \"\"\"Train a Fast R-CNN using proposals generated by an RPN.\n    \"\"\"\n\n    cfg.TRAIN.HAS_RPN = False           # not generating proposals on-the-fly\n    cfg.TRAIN.PROPOSAL_METHOD = 'rpn'   # use pre-computed RPN proposals instead\n    cfg.TRAIN.IMS_PER_BATCH = 2\n    print 'Init model: {}'.format(init_model)\n    print 'RPN proposals: {}'.format(rpn_file)\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    import caffe\n    _init_caffe(cfg)\n\n    roidb, imdb = get_roidb(imdb_name, 
rpn_file=rpn_file)\n    output_dir = get_output_dir(imdb)\n    print 'Output will be saved to `{:s}`'.format(output_dir)\n    # Train Fast R-CNN\n    model_paths = train_net(solver, roidb, output_dir,\n                            pretrained_model=init_model,\n                            max_iters=max_iters)\n    # Clean up all but the final model\n    for i in model_paths[:-1]:\n        os.remove(i)\n    fast_rcnn_model_path = model_paths[-1]\n    # Send Fast R-CNN model path over the multiprocessing queue\n    queue.put({'model_path': fast_rcnn_model_path})\n\nif __name__ == '__main__':\n    args = parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    if args.cfg_file is not None:\n        cfg_from_file(args.cfg_file)\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs)\n    cfg.GPU_ID = args.gpu_id\n\n    # --------------------------------------------------------------------------\n    # Pycaffe doesn't reliably free GPU memory when instantiated nets are\n    # discarded (e.g. \"del net\" in Python code). To work around this issue, each\n    # training stage is executed in a separate process using\n    # multiprocessing.Process.\n    # --------------------------------------------------------------------------\n\n    # queue for communicating results between processes\n    mp_queue = mp.Queue()\n    # solvers, iters, etc. 
for each training stage\n    solvers, max_iters, rpn_test_prototxt = get_solvers(args.net_name)\n\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 1 RPN, init from ImageNet model'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    cfg.TRAIN.SNAPSHOT_INFIX = 'stage1'\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            init_model=args.pretrained_model,\n            solver=solvers[0],\n            max_iters=max_iters[0],\n            cfg=cfg)\n    p = mp.Process(target=train_rpn, kwargs=mp_kwargs)\n    p.start()\n    rpn_stage1_out = mp_queue.get()\n    p.join()\n\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 1 RPN, generate proposals'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            rpn_model_path=str(rpn_stage1_out['model_path']),\n            cfg=cfg,\n            rpn_test_prototxt=rpn_test_prototxt)\n    p = mp.Process(target=rpn_generate, kwargs=mp_kwargs)\n    p.start()\n    rpn_stage1_out['proposal_path'] = mp_queue.get()['proposal_path']\n    p.join()\n\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 1 Fast R-CNN using RPN proposals, init from ImageNet model'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    cfg.TRAIN.SNAPSHOT_INFIX = 'stage1'\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            init_model=args.pretrained_model,\n            solver=solvers[1],\n            max_iters=max_iters[1],\n            cfg=cfg,\n            rpn_file=rpn_stage1_out['proposal_path'])\n    p = mp.Process(target=train_fast_rcnn, kwargs=mp_kwargs)\n    p.start()\n    fast_rcnn_stage1_out = mp_queue.get()\n    p.join()\n\n    print 
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 2 RPN, init from stage 1 Fast R-CNN model'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    cfg.TRAIN.SNAPSHOT_INFIX = 'stage2'\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            init_model=str(fast_rcnn_stage1_out['model_path']),\n            solver=solvers[2],\n            max_iters=max_iters[2],\n            cfg=cfg)\n    p = mp.Process(target=train_rpn, kwargs=mp_kwargs)\n    p.start()\n    rpn_stage2_out = mp_queue.get()\n    p.join()\n\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 2 RPN, generate proposals'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            rpn_model_path=str(rpn_stage2_out['model_path']),\n            cfg=cfg,\n            rpn_test_prototxt=rpn_test_prototxt)\n    p = mp.Process(target=rpn_generate, kwargs=mp_kwargs)\n    p.start()\n    rpn_stage2_out['proposal_path'] = mp_queue.get()['proposal_path']\n    p.join()\n\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n    print 'Stage 2 Fast R-CNN, init from stage 2 RPN model'\n    print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'\n\n    cfg.TRAIN.SNAPSHOT_INFIX = 'stage2'\n    mp_kwargs = dict(\n            queue=mp_queue,\n            imdb_name=args.imdb_name,\n            init_model=str(rpn_stage2_out['model_path']),\n            solver=solvers[3],\n            max_iters=max_iters[3],\n            cfg=cfg,\n            rpn_file=rpn_stage2_out['proposal_path'])\n    p = mp.Process(target=train_fast_rcnn, kwargs=mp_kwargs)\n    p.start()\n    fast_rcnn_stage2_out = mp_queue.get()\n    p.join()\n\n    # Create final model (just a copy of the last stage)\n    final_path = os.path.join(\n            
os.path.dirname(fast_rcnn_stage2_out['model_path']),\n            args.net_name + '_faster_rcnn_final.caffemodel')\n    print 'cp {} -> {}'.format(\n            fast_rcnn_stage2_out['model_path'], final_path)\n    shutil.copy(fast_rcnn_stage2_out['model_path'], final_path)\n    print 'Final model: {}'.format(final_path)\n"
  },
  {
    "path": "tools/train_net.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"Train a Fast R-CNN network on a region of interest database.\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.train import get_training_roidb, train_net\nfrom fast_rcnn.config import cfg, cfg_from_file, cfg_from_list, get_output_dir\nfrom datasets.factory import get_imdb\nimport datasets.imdb\nimport caffe\nimport argparse\nimport pprint\nimport numpy as np\nimport sys\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Train a Fast R-CNN network')\n    parser.add_argument('--gpu', dest='gpu_id',\n                        help='GPU device id to use [0]',\n                        default=0, type=int)\n    parser.add_argument('--solver', dest='solver',\n                        help='solver prototxt',\n                        default=None, type=str)\n    parser.add_argument('--iters', dest='max_iters',\n                        help='number of iterations to train',\n                        default=40000, type=int)\n    parser.add_argument('--weights', dest='pretrained_model',\n                        help='initialize with pretrained model weights',\n                        default=None, type=str)\n    parser.add_argument('--cfg', dest='cfg_file',\n                        help='optional config file',\n                        default=None, type=str)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to train on',\n                        default='voc_2007_trainval', type=str)\n    parser.add_argument('--rand', dest='randomize',\n                        help='randomize (do not use a fixed seed)',\n                        action='store_true')\n    
parser.add_argument('--set', dest='set_cfgs',\n                        help='set config keys', default=None,\n                        nargs=argparse.REMAINDER)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\ndef combined_roidb(imdb_names):\n    def get_roidb(imdb_name):\n        imdb = get_imdb(imdb_name)\n        print 'Loaded dataset `{:s}` for training'.format(imdb.name)\n        imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)\n        print 'Set proposal method: {:s}'.format(cfg.TRAIN.PROPOSAL_METHOD)\n        roidb = get_training_roidb(imdb)\n        return roidb\n\n    roidbs = [get_roidb(s) for s in imdb_names.split('+')]\n    roidb = roidbs[0]\n    if len(roidbs) > 1:\n        for r in roidbs[1:]:\n            roidb.extend(r)\n        imdb = datasets.imdb.imdb(imdb_names)\n    else:\n        imdb = get_imdb(imdb_names)\n    return imdb, roidb\n\nif __name__ == '__main__':\n    args = parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    if args.cfg_file is not None:\n        cfg_from_file(args.cfg_file)\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs)\n\n    cfg.GPU_ID = args.gpu_id\n\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    if not args.randomize:\n        # fix the random seeds (numpy and caffe) for reproducibility\n        np.random.seed(cfg.RNG_SEED)\n        caffe.set_random_seed(cfg.RNG_SEED)\n\n    # set up caffe\n    caffe.set_mode_gpu()\n    caffe.set_device(args.gpu_id)\n\n    imdb, roidb = combined_roidb(args.imdb_name)\n    print '{:d} roidb entries'.format(len(roidb))\n\n    output_dir = get_output_dir(imdb)\n    print 'Output will be saved to `{:s}`'.format(output_dir)\n\n    train_net(args.solver, roidb, output_dir,\n              pretrained_model=args.pretrained_model,\n              max_iters=args.max_iters)\n"
  },
  {
    "path": "tools/train_svms.py",
    "content": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\n\"\"\"\nTrain post-hoc SVMs using the algorithm and hyper-parameters from\ntraditional R-CNN.\n\"\"\"\n\nimport _init_paths\nfrom fast_rcnn.config import cfg, cfg_from_file\nfrom datasets.factory import get_imdb\nfrom fast_rcnn.test import im_detect\nfrom utils.timer import Timer\nimport caffe\nimport argparse\nimport pprint\nimport numpy as np\nimport numpy.random as npr\nimport cv2\nfrom sklearn import svm\nimport os, sys\n\nclass SVMTrainer(object):\n    \"\"\"\n    Trains post-hoc detection SVMs for all classes using the algorithm\n    and hyper-parameters of traditional R-CNN.\n    \"\"\"\n\n    def __init__(self, net, imdb):\n        self.imdb = imdb\n        self.net = net\n        self.layer = 'fc7'\n        self.hard_thresh = -1.0001\n        self.neg_iou_thresh = 0.3\n\n        dim = net.params['cls_score'][0].data.shape[1]\n        scale = self._get_feature_scale()\n        print('Feature dim: {}'.format(dim))\n        print('Feature scale: {:.3f}'.format(scale))\n        self.trainers = [SVMClassTrainer(cls, dim, feature_scale=scale)\n                         for cls in imdb.classes]\n\n    def _get_feature_scale(self, num_images=100):\n        TARGET_NORM = 20.0 # Magic value from traditional R-CNN\n        _t = Timer()\n        roidb = self.imdb.roidb\n        total_norm = 0.0\n        count = 0.0\n        inds = npr.choice(xrange(self.imdb.num_images), size=num_images,\n                          replace=False)\n        for i_, i in enumerate(inds):\n            im = cv2.imread(self.imdb.image_path_at(i))\n            if roidb[i]['flipped']:\n                im = im[:, ::-1, :]\n            _t.tic()\n            scores, boxes = im_detect(self.net, im, 
roidb[i]['boxes'])\n            _t.toc()\n            feat = self.net.blobs[self.layer].data\n            total_norm += np.sqrt((feat ** 2).sum(axis=1)).sum()\n            count += feat.shape[0]\n            print('{}/{}: avg feature norm: {:.3f}'.format(i_ + 1, num_images,\n                                                           total_norm / count))\n\n        return TARGET_NORM * 1.0 / (total_norm / count)\n\n    def _get_pos_counts(self):\n        counts = np.zeros((len(self.imdb.classes)), dtype=np.int)\n        roidb = self.imdb.roidb\n        for i in xrange(len(roidb)):\n            for j in xrange(1, self.imdb.num_classes):\n                I = np.where(roidb[i]['gt_classes'] == j)[0]\n                counts[j] += len(I)\n\n        for j in xrange(1, self.imdb.num_classes):\n            print('class {:s} has {:d} positives'.\n                  format(self.imdb.classes[j], counts[j]))\n\n        return counts\n\n    def get_pos_examples(self):\n        counts = self._get_pos_counts()\n        for i in xrange(len(counts)):\n            self.trainers[i].alloc_pos(counts[i])\n\n        _t = Timer()\n        roidb = self.imdb.roidb\n        num_images = len(roidb)\n        # num_images = 100\n        for i in xrange(num_images):\n            im = cv2.imread(self.imdb.image_path_at(i))\n            if roidb[i]['flipped']:\n                im = im[:, ::-1, :]\n            gt_inds = np.where(roidb[i]['gt_classes'] > 0)[0]\n            gt_boxes = roidb[i]['boxes'][gt_inds]\n            _t.tic()\n            scores, boxes = im_detect(self.net, im, gt_boxes)\n            _t.toc()\n            feat = self.net.blobs[self.layer].data\n            for j in xrange(1, self.imdb.num_classes):\n                cls_inds = np.where(roidb[i]['gt_classes'][gt_inds] == j)[0]\n                if len(cls_inds) > 0:\n                    cls_feat = feat[cls_inds, :]\n                    self.trainers[j].append_pos(cls_feat)\n\n            print 'get_pos_examples: {:d}/{:d} {:.3f}s' 
\\\n                  .format(i + 1, len(roidb), _t.average_time)\n\n    def initialize_net(self):\n        # Start all SVM parameters at zero\n        self.net.params['cls_score'][0].data[...] = 0\n        self.net.params['cls_score'][1].data[...] = 0\n\n        # Initialize SVMs in a smart way. Not doing this because it's such\n        # a good initialization that we might not learn something close to\n        # the SVM solution.\n#        # subtract background weights and biases for the foreground classes\n#        w_bg = self.net.params['cls_score'][0].data[0, :]\n#        b_bg = self.net.params['cls_score'][1].data[0]\n#        self.net.params['cls_score'][0].data[1:, :] -= w_bg\n#        self.net.params['cls_score'][1].data[1:] -= b_bg\n#        # set the background weights and biases to 0 (where they shall remain)\n#        self.net.params['cls_score'][0].data[0, :] = 0\n#        self.net.params['cls_score'][1].data[0] = 0\n\n    def update_net(self, cls_ind, w, b):\n        self.net.params['cls_score'][0].data[cls_ind, :] = w\n        self.net.params['cls_score'][1].data[cls_ind] = b\n\n    def train_with_hard_negatives(self):\n        _t = Timer()\n        roidb = self.imdb.roidb\n        num_images = len(roidb)\n        # num_images = 100\n        for i in xrange(num_images):\n            im = cv2.imread(self.imdb.image_path_at(i))\n            if roidb[i]['flipped']:\n                im = im[:, ::-1, :]\n            _t.tic()\n            scores, boxes = im_detect(self.net, im, roidb[i]['boxes'])\n            _t.toc()\n            feat = self.net.blobs[self.layer].data\n            for j in xrange(1, self.imdb.num_classes):\n                hard_inds = \\\n                    np.where((scores[:, j] > self.hard_thresh) &\n                             (roidb[i]['gt_overlaps'][:, j].toarray().ravel() <\n                              self.neg_iou_thresh))[0]\n                if len(hard_inds) > 0:\n                    hard_feat = feat[hard_inds, :].copy()\n    
                new_w_b = \\\n                        self.trainers[j].append_neg_and_retrain(feat=hard_feat)\n                    if new_w_b is not None:\n                        self.update_net(j, new_w_b[0], new_w_b[1])\n\n            print(('train_with_hard_negatives: '\n                   '{:d}/{:d} {:.3f}s').format(i + 1, len(roidb),\n                                               _t.average_time))\n\n    def train(self):\n        # Initialize SVMs using\n        #   a. w_i = fc8_w_i - fc8_w_0\n        #   b. b_i = fc8_b_i - fc8_b_0\n        #   c. Install SVMs into net\n        self.initialize_net()\n\n        # Pass over roidb to count num positives for each class\n        #   a. Pre-allocate arrays for positive feature vectors\n        # Pass over roidb, computing features for positives only\n        self.get_pos_examples()\n\n        # Pass over roidb\n        #   a. Compute cls_score with forward pass\n        #   b. For each class\n        #       i. Select hard negatives\n        #       ii. Add them to cache\n        #   c. For each class\n        #       i. If SVM retrain criteria met, update SVM\n        #       ii. 
Install new SVM into net\n        self.train_with_hard_negatives()\n\n        # One final SVM retraining for each class\n        # Install SVMs into net\n        for j in xrange(1, self.imdb.num_classes):\n            new_w_b = self.trainers[j].append_neg_and_retrain(force=True)\n            self.update_net(j, new_w_b[0], new_w_b[1])\n\nclass SVMClassTrainer(object):\n    \"\"\"Manages post-hoc SVM training for a single object class.\"\"\"\n\n    def __init__(self, cls, dim, feature_scale=1.0,\n                 C=0.001, B=10.0, pos_weight=2.0):\n        self.pos = np.zeros((0, dim), dtype=np.float32)\n        self.neg = np.zeros((0, dim), dtype=np.float32)\n        self.B = B\n        self.C = C\n        self.cls = cls\n        self.pos_weight = pos_weight\n        self.dim = dim\n        self.feature_scale = feature_scale\n        self.svm = svm.LinearSVC(C=C, class_weight={1: 2, -1: 1},\n                                 intercept_scaling=B, verbose=1,\n                                 penalty='l2', loss='l1',\n                                 random_state=cfg.RNG_SEED, dual=True)\n        self.pos_cur = 0\n        self.num_neg_added = 0\n        self.retrain_limit = 2000\n        self.evict_thresh = -1.1\n        self.loss_history = []\n\n    def alloc_pos(self, count):\n        self.pos_cur = 0\n        self.pos = np.zeros((count, self.dim), dtype=np.float32)\n\n    def append_pos(self, feat):\n        num = feat.shape[0]\n        self.pos[self.pos_cur:self.pos_cur + num, :] = feat\n        self.pos_cur += num\n\n    def train(self):\n        print('>>> Updating {} detector <<<'.format(self.cls))\n        num_pos = self.pos.shape[0]\n        num_neg = self.neg.shape[0]\n        print('Cache holds {} pos examples and {} neg examples'.\n              format(num_pos, num_neg))\n        X = np.vstack((self.pos, self.neg)) * self.feature_scale\n        y = np.hstack((np.ones(num_pos),\n                       -np.ones(num_neg)))\n        self.svm.fit(X, y)\n        w 
= self.svm.coef_\n        b = self.svm.intercept_[0]\n        scores = self.svm.decision_function(X)\n        pos_scores = scores[:num_pos]\n        neg_scores = scores[num_pos:]\n\n        pos_loss = (self.C * self.pos_weight *\n                    np.maximum(0, 1 - pos_scores).sum())\n        neg_loss = self.C * np.maximum(0, 1 + neg_scores).sum()\n        reg_loss = 0.5 * np.dot(w.ravel(), w.ravel()) + 0.5 * b ** 2\n        tot_loss = pos_loss + neg_loss + reg_loss\n        self.loss_history.append((tot_loss, pos_loss, neg_loss, reg_loss))\n\n        for i, losses in enumerate(self.loss_history):\n            print(('    {:d}: obj val: {:.3f} = {:.3f} '\n                   '(pos) + {:.3f} (neg) + {:.3f} (reg)').format(i, *losses))\n\n        # Sanity check\n        scores_ret = (\n                X * 1.0 / self.feature_scale).dot(w.T * self.feature_scale) + b\n        assert np.allclose(scores, scores_ret[:, 0], atol=1e-5), \\\n                \"Scores from returned model don't match decision function\"\n\n        return ((w * self.feature_scale, b), pos_scores, neg_scores)\n\n    def append_neg_and_retrain(self, feat=None, force=False):\n        if feat is not None:\n            num = feat.shape[0]\n            self.neg = np.vstack((self.neg, feat))\n            self.num_neg_added += num\n        if self.num_neg_added > self.retrain_limit or force:\n            self.num_neg_added = 0\n            new_w_b, pos_scores, neg_scores = self.train()\n            # scores = np.dot(self.neg, new_w_b[0].T) + new_w_b[1]\n            # easy_inds = np.where(neg_scores < self.evict_thresh)[0]\n            not_easy_inds = np.where(neg_scores >= self.evict_thresh)[0]\n            if len(not_easy_inds) > 0:\n                self.neg = self.neg[not_easy_inds, :]\n                # self.neg = np.delete(self.neg, easy_inds)\n            print('    Pruning easy negatives')\n            print('    Cache holds {} pos examples and {} neg examples'.\n                  
format(self.pos.shape[0], self.neg.shape[0]))\n            print('    {} pos support vectors'.format((pos_scores <= 1).sum()))\n            print('    {} neg support vectors'.format((neg_scores >= -1).sum()))\n            return new_w_b\n        else:\n            return None\n\ndef parse_args():\n    \"\"\"\n    Parse input arguments\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Train SVMs (old skool)')\n    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',\n                        default=0, type=int)\n    parser.add_argument('--def', dest='prototxt',\n                        help='prototxt file defining the network',\n                        default=None, type=str)\n    parser.add_argument('--net', dest='caffemodel',\n                        help='model to test',\n                        default=None, type=str)\n    parser.add_argument('--cfg', dest='cfg_file',\n                        help='optional config file', default=None, type=str)\n    parser.add_argument('--imdb', dest='imdb_name',\n                        help='dataset to train on',\n                        default='voc_2007_trainval', type=str)\n\n    if len(sys.argv) == 1:\n        parser.print_help()\n        sys.exit(1)\n\n    args = parser.parse_args()\n    return args\n\nif __name__ == '__main__':\n    # Must turn this off to prevent issues when digging into the net blobs to\n    # pull out features (tricky!)\n    cfg.DEDUP_BOXES = 0\n\n    # Must turn this on because we use the test im_detect() method to harvest\n    # hard negatives\n    cfg.TEST.SVM = True\n\n    args = parse_args()\n\n    print('Called with args:')\n    print(args)\n\n    if args.cfg_file is not None:\n        cfg_from_file(args.cfg_file)\n\n    print('Using config:')\n    pprint.pprint(cfg)\n\n    # fix the random seed for reproducibility\n    np.random.seed(cfg.RNG_SEED)\n\n    # set up caffe\n    caffe.set_mode_gpu()\n    if args.gpu_id is not None:\n        
caffe.set_device(args.gpu_id)\n    net = caffe.Net(args.prototxt, args.caffemodel, caffe.TEST)\n    net.name = os.path.splitext(os.path.basename(args.caffemodel))[0]\n    out = os.path.splitext(os.path.basename(args.caffemodel))[0] + '_svm'\n    out_dir = os.path.dirname(args.caffemodel)\n\n    imdb = get_imdb(args.imdb_name)\n    print 'Loaded dataset `{:s}` for training'.format(imdb.name)\n\n    # enhance roidb to contain flipped examples\n    if cfg.TRAIN.USE_FLIPPED:\n        print 'Appending horizontally-flipped training examples...'\n        imdb.append_flipped_images()\n        print 'done'\n\n    SVMTrainer(net, imdb).train()\n\n    filename = '{}/{}.caffemodel'.format(out_dir, out)\n    net.save(filename)\n    print 'Wrote svm model to: {:s}'.format(filename)\n"
  }
]