Repository: ivalab/grasp_multiObject_multiGrasp
Branch: master
Commit: 806ad3d71c2f
Files: 86
Total size: 947.1 KB
Directory structure:
gitextract_6dytcin7/
├── README.md
├── data/
│ └── scripts/
│ ├── dataPreprocessingTest_fasterrcnn_split.m
│ ├── dataPreprocessing_fasterrcnn.m
│ └── fetch_faster_rcnn_models.sh
├── experiments/
│ ├── cfgs/
│ │ ├── res101-lg.yml
│ │ ├── res101.yml
│ │ ├── res50.yml
│ │ └── vgg16.yml
│ ├── logs/
│ │ └── .gitignore
│ └── scripts/
│ ├── convert_vgg16.sh
│ ├── test_faster_rcnn.sh
│ ├── train_faster_rcnn.sh
│ └── train_faster_rcnn.sh~
├── lib/
│ ├── Makefile
│ ├── datasets/
│ │ ├── VOCdevkit-matlab-wrapper/
│ │ │ ├── get_voc_opts.m
│ │ │ ├── voc_eval.m
│ │ │ └── xVOCap.m
│ │ ├── __init__.py
│ │ ├── coco.py
│ │ ├── ds_utils.py
│ │ ├── factory.py
│ │ ├── factory.py~
│ │ ├── graspRGB.py
│ │ ├── graspRGB.py~
│ │ ├── imdb.py
│ │ ├── pascal_voc.py
│ │ ├── tools/
│ │ │ └── mcg_munge.py
│ │ └── voc_eval.py
│ ├── layer_utils/
│ │ ├── __init__.py
│ │ ├── anchor_target_layer.py
│ │ ├── generate_anchors.py
│ │ ├── proposal_layer.py
│ │ ├── proposal_target_layer.py
│ │ ├── proposal_top_layer.py
│ │ └── snippets.py
│ ├── model/
│ │ ├── __init__.py
│ │ ├── bbox_transform.py
│ │ ├── config.py
│ │ ├── config.py~
│ │ ├── nms_wrapper.py
│ │ ├── test.py
│ │ ├── test.py~
│ │ ├── train_val.py
│ │ └── train_val.py~
│ ├── nets/
│ │ ├── __init__.py
│ │ ├── network.py
│ │ ├── resnet_v1.py
│ │ ├── resnet_v1.py~
│ │ └── vgg16.py
│ ├── nms/
│ │ ├── .gitignore
│ │ ├── __init__.py
│ │ ├── cpu_nms.c
│ │ ├── cpu_nms.pyx
│ │ ├── gpu_nms.cpp
│ │ ├── gpu_nms.hpp
│ │ ├── gpu_nms.pyx
│ │ ├── nms_kernel.cu
│ │ └── py_cpu_nms.py
│ ├── roi_data_layer/
│ │ ├── __init__.py
│ │ ├── layer.py
│ │ ├── minibatch.py
│ │ ├── minibatch.py~
│ │ └── roidb.py
│ ├── setup.py
│ ├── setup.py~
│ └── utils/
│ ├── .gitignore
│ ├── __init__.py
│ ├── bbox.pyx
│ ├── blob.py
│ ├── boxes_grid.py
│ ├── nms.py
│ ├── nms.pyx
│ └── timer.py
└── tools/
├── _init_paths.py
├── demo.py~
├── demo_graspRGD.py
├── demo_graspRGD.py~
├── demo_graspRGD_socket.py
├── demo_graspRGD_socket.py~
├── demo_graspRGD_socket_drawer.py~
├── demo_graspRGD_socket_save_to_rgbd.py~
├── demo_graspRGD_vis_mask.py
├── demo_graspRGD_vis_select.py
├── eval_graspRGD.py~
├── mask_gen.py
└── trainval_net.py
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# grasp_multiObject_multiGrasp
This is the implementation of our RA-L work 'Real-world Multi-object, Multi-grasp Detection'. The detector takes an RGB-D image as input and predicts multiple grasp candidates for a single object or multiple objects, in a single shot. The original arXiv paper can be found [here](https://arxiv.org/pdf/1802.00520.pdf). The final version will be updated after the publication process.
<p align="center">
<img src="https://github.com/ivalab/grasp_multiObject_multiGrasp/blob/master/fig/multi_grasp_multi_object.png" alt="drawing" width="300"/>
</p>
If you find it helpful for your research, please consider citing:
```
@article{chu2018deep,
  title   = {Real-World Multiobject, Multigrasp Detection},
  author  = {F. Chu and R. Xu and P. A. Vela},
  journal = {IEEE Robotics and Automation Letters},
  year    = {2018},
  volume  = {3},
  number  = {4},
  pages   = {3355-3362},
  doi     = {10.1109/LRA.2018.2852777},
  issn    = {2377-3766},
  month   = {Oct}
}
```
If you have any questions, please contact me at fujenchu[at]gatech[dot]edu
### Demo
1. Clone this repository
```
git clone https://github.com/ivalab/grasp_multiObject_multiGrasp.git
cd grasp_multiObject_multiGrasp
```
2. Build Cython modules
```
cd lib
make clean
make
cd ..
```
3. Install [Python COCO API](https://github.com/cocodataset/cocoapi)
```
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
```
4. Download pretrained models
- Download the trained grasp detection model from [Dropbox](https://www.dropbox.com/s/ldapcpanzqdu7tc/models.zip?dl=0)
- Put it under `output/res50/train/default/`
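After unzipping, the checkpoint files should sit directly in that folder. A sketch of the expected layout, assuming the snapshot naming used by `experiments/scripts/train_faster_rcnn.sh` (the iteration count of the released model may differ):
```
output/res50/train/default/
├── res50_faster_rcnn_iter_240000.ckpt.index
├── res50_faster_rcnn_iter_240000.ckpt.meta
└── res50_faster_rcnn_iter_240000.ckpt.data-00000-of-00001
```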
5. Run demo
```
./tools/demo_graspRGD.py --net res50 --dataset grasp
```
You should see images with detected grasps pop up.
### Train
1. Generate data
1-1. Download the [Cornell Dataset](http://pr.cs.cornell.edu/grasping/rect_data/data.php)
1-2. Run `dataPreprocessingTest_fasterrcnn_split.m` (modify the paths to match your directory structure)
1-3. Follow the 'Format Your Dataset' section [here](https://github.com/zeyuanxy/fast-rcnn/tree/master/help/train) to check that your data follows the VOC format
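For reference, the preprocessing script writes a devkit-style tree; a minimal sketch, with paths taken from `dataPreprocessingTest_fasterrcnn_split.m` and `lib/datasets/factory.py` (the devkit root is whatever you configure):
```
<devkit_root>/data/
├── Images/        # <imgname>_preprocessed_<i>.png crops
├── Annotations/   # one .txt per image, lines of: cls x_min y_min x_max y_max
└── ImageSets/
    ├── train.txt
    └── test.txt
```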
2. Train
```
./experiments/scripts/train_faster_rcnn.sh 0 graspRGB res50
```
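The three arguments map to `GPU_ID`, `DATASET`, and `NET` in `experiments/scripts/train_faster_rcnn.sh`: `0` selects the GPU, `graspRGB` picks the dataset case (240k iterations), and `res50` the backbone. The script expects ImageNet weights at `data/imagenet_weights/res50.ckpt` and runs testing when training finishes.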
### ROS version?
Yes! Please find it [HERE](https://github.com/ivaROS/ros_deep_grasp)
### Acknowledgment
This repo borrows tons of code from
- [tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn) by endernewton
### Resources
- [multi-object grasp dataset](https://github.com/ivalab/grasp_multiObject)
- [grasp annotation tool](https://github.com/ivalab/grasp_annotation_tool)
================================================
FILE: data/scripts/dataPreprocessingTest_fasterrcnn_split.m
================================================
%% script to test dataPreprocessing
%% created by Fu-Jen Chu on 09/15/2016
close all
clear
%parpool(4)
addpath('/media/fujenchu/home3/data/grasps/')
% generate train/test split lists (random 80/20 split over the Cornell image numbers)
list = [100:949 1000:1034];
list_idx = randperm(length(list));
train_list_idx = list_idx(length(list)/5+1:end);
test_list_idx = list_idx(1:length(list)/5);
train_list = list(train_list_idx);
test_list = list(test_list_idx);
for folder = 1:10
display(['processing folder ' int2str(folder)])
imgDataDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder) '_rgd'];
txtDataDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder)];
%imgDataOutDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder) '_Cropped320_rgd'];
imgDataOutDir = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/Images';
annotationDataOutDir = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/Annotations';
imgSetTrain = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/ImageSets/train.txt';
imgSetTest = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/ImageSets/test.txt';
imgFiles = dir([imgDataDir '/*.png']);
txtFiles = dir([txtDataDir '/*pos.txt']);
logfileID = fopen('log.txt','a');
mainfileID = fopen(['/home/fujenchu/projects/deepLearning/deepGraspExtensiveOffline/data/grasps/scripts/trainttt' sprintf('%02d',folder) '.txt'],'a');
for idx = 1:length(imgFiles)
%% display progress
tic
display(['processing folder: ' sprintf('%02d',folder) ', imgFiles: ' int2str(idx)])
%% reading data
imgName = imgFiles(idx).name;
[pathstr,imgname] = fileparts(imgName);
filenum = str2num(imgname(4:7));
if(any(test_list == filenum))
file_writeID = fopen(imgSetTest,'a');
fprintf(file_writeID, '%s\n', [imgDataDir(1:end-3) 'Cropped320_rgd/' imgname '_preprocessed_1.png' ] );
fclose(file_writeID);
continue;
end
txtName = txtFiles(idx).name;
[pathstr,txtname] = fileparts(txtName);
img = imread([imgDataDir '/' imgname '.png']);
fileID = fopen([txtDataDir '/' txtname '.txt'],'r');
sizeA = [2 100];
bbsIn_all = fscanf(fileID, '%f %f', sizeA);
fclose(fileID);
%% data pre-processing
[imagesOut bbsOut] = dataPreprocessing_fasterrcnn(img, bbsIn_all, 227, 5, 5);
% for each augmented image
for i = 1:1:size(imagesOut,2)
% for each bbs
file_writeID = fopen([annotationDataOutDir '/' imgname '_preprocessed_' int2str(i) '.txt'],'w');
printCount = 0;
for ibbs = 1:1:size(bbsOut{i},2)
A = bbsOut{i}{ibbs};
xy_ctr = sum(A,2)/4; x_ctr = xy_ctr(1); y_ctr = xy_ctr(2);
width = sqrt(sum((A(:,1) - A(:,2)).^2)); height = sqrt(sum((A(:,2) - A(:,3)).^2));
if(A(1,1) > A(1,2))
theta = atan((A(2,2)-A(2,1))/(A(1,1)-A(1,2)));
else
theta = atan((A(2,1)-A(2,2))/(A(1,2)-A(1,1))); % note y is facing down
end
% process to fasterrcnn
x_min = x_ctr - width/2; x_max = x_ctr + width/2;
y_min = y_ctr - height/2; y_max = y_ctr + height/2;
%if(x_min < 0 || y_min < 0 || x_max > 227 || y_max > 227) display('yoooooooo'); end
if((x_min < 0 && x_max < 0) || (y_min > 227 && y_max > 227) || (x_min > 227 && x_max > 227) || (y_min < 0 && y_max < 0)) display('xxxxxxxxx'); break; end
cls = round((theta/pi*180+90)/10) + 1; % discretize theta in [-90, 90] deg into 19 classes (10-deg bins); e.g. theta = 0 -> cls = 10
% write as left-top right-bottom corners, Xmin Ymin Xmax Ymax, e.g. 261 109 511 705 (x horizontal, y vertical)
fprintf(file_writeID, '%d %f %f %f %f\n', cls, x_min, y_min, x_max, y_max );
printCount = printCount+1;
end
if(printCount == 0) fprintf(logfileID, '%s\n', [imgname '_preprocessed_' int2str(i) ]);end
fclose(file_writeID);
imwrite(imagesOut{i}, [imgDataOutDir '/' imgname '_preprocessed_' int2str(i) '.png']);
% write filename to imageSet
file_writeID = fopen(imgSetTrain,'a');
fprintf(file_writeID, '%s\n', [imgname '_preprocessed_' int2str(i) ] );
fclose(file_writeID);
end
toc
end
fclose(mainfileID);
end
================================================
FILE: data/scripts/dataPreprocessing_fasterrcnn.m
================================================
function [imagesOut, bbsOut] = dataPreprocessing_fasterrcnn( imageIn, bbsIn_all, cropSize, translationShiftNumber, rotateAngleNumber)
% dataPreprocessing_fasterrcnn performs
% 1) cropping
% 2) padding
% 3) rotation
% 4) shifting
%
% For an input image with bounding boxes given as 4 corner points each,
% it outputs a set of augmented images with correspondingly transformed bbs.
%
% Inputs:
% imageIn: input image (480 by 640 by 3)
% bbsIn_all: bounding boxes (2 by 4N, 4 corner points per box)
% cropSize: output image size
% translationShiftNumber: number of random translation shifts
% rotateAngleNumber: number of random rotation angles
%
% Outputs:
% imagesOut: output images (rotateAngleNumber*translationShiftNumber^2 images)
% bbsOut: output bbs according to shift and rotation
%
%% created by Fu-Jen Chu on 09/15/2016
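%
% Example usage (a sketch mirroring the call in dataPreprocessingTest_fasterrcnn_split.m;
% the file names below are hypothetical):
%   img = imread('pcd0100r.png');
%   fileID = fopen('pcd0100pos.txt','r');
%   bbs = fscanf(fileID, '%f %f', [2 100]); fclose(fileID);
%   [imagesOut, bbsOut] = dataPreprocessing_fasterrcnn(img, bbs, 227, 5, 5);
%   % yields 5*5*5 = 125 augmented 227x227 images with transformed boxes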
debug_dev = 0;
debug = 0;
%% show image and bbs
if(debug_dev)
figure(1); imshow(imageIn); hold on;
x = bbsIn_all(1, [1:3]);
y = bbsIn_all(2, [1:3]);
plot(x,y); hold off;
end
%% crop image and padding image
% crop a square region from the image center (imcrop with w = h = 351 returns a 352 by 352 patch)
imgCrop = imcrop(imageIn, [145 65 351 351]);
% pad 75 px on each side (replicate border) to 502 by 502 so rotated crops stay inside
imgPadding = padarray(imgCrop, [75 75], 'replicate', 'both');
count = 1;
for i_rotate = 1:rotateAngleNumber*translationShiftNumber*translationShiftNumber
% random rotation angle
theta = randi(360)-1;
%theta = 0;
% random translationShift
dx = randi(101)-51;
%dx = 0;
%% rotation and shifting
% random translationShift
dy = randi(101)-51;
%dy = 0;
imgRotate = imrotate(imgPadding, theta);
if(debug_dev)figure(2); imshow(imgRotate);end
imgCropRotate = imcrop(imgRotate, [size(imgRotate,1)/2-160-dx size(imgRotate,1)/2-160-dy 320 320]);
if(debug_dev)figure(3); imshow(imgCropRotate);end
imgResize = imresize(imgCropRotate, [cropSize cropSize]);
if(debug)figure(4); imshow(imgResize); hold on;end
%% modify bbs
[m, n] = size(bbsIn_all);
bbsNum = n/4;
countbbs = 1;
for idx = 1:bbsNum
bbsIn = bbsIn_all(:,idx*4-3:idx*4);
if(sum(sum(isnan(bbsIn)))) continue; end
bbsInShift = bbsIn - repmat([320; 240], 1, 4); % corners relative to the original image center
R = [cos(theta/180*pi) -sin(theta/180*pi); sin(theta/180*pi) cos(theta/180*pi)];
bbsRotated = (bbsInShift'*R)'; % rotate corners to follow the rotated image
bbsInShiftBack = (bbsRotated + repmat([160; 160], 1, 4) + repmat([dx; dy], 1, 4))*cropSize/320; % map back into the shifted 320x320 crop, scaled to cropSize
if(debug)
figure(4)
x = bbsInShiftBack(1, [1:4 1]);
y = bbsInShiftBack(2, [1:4 1]);
plot(x,y); hold on; pause(0.01);
end
bbsOut{count}{countbbs} = bbsInShiftBack;
countbbs = countbbs + 1;
end
imagesOut{count} = imgResize;
count = count +1;
end
end
================================================
FILE: data/scripts/fetch_faster_rcnn_models.sh
================================================
#!/bin/bash
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/../" && pwd )"
cd $DIR
NET=res101
FILE=voc_0712_80k-110k.tgz
# replace it with gs11655.sp.cs.cmu.edu if ladoga.graphics.cs.cmu.edu does not work
URL=http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/$NET/$FILE
CHECKSUM=cb32e9df553153d311cc5095b2f8c340
if [ -f $FILE ]; then
echo "File already exists. Checking md5..."
os=`uname -s`
if [ "$os" = "Linux" ]; then
checksum=`md5sum $FILE | awk '{ print $1 }'`
elif [ "$os" = "Darwin" ]; then
checksum=`cat $FILE | md5`
fi
if [ "$checksum" = "$CHECKSUM" ]; then
echo "Checksum is correct. No need to download."
exit 0
else
echo "Checksum is incorrect. Need to download again."
fi
fi
echo "Downloading Resnet 101 Faster R-CNN models Pret-trained on VOC 07+12 (340M)..."
wget $URL -O $FILE
echo "Unzipping..."
tar zxvf $FILE
echo "Done. Please run this command again to verify that checksum = $CHECKSUM."
================================================
FILE: experiments/cfgs/res101-lg.yml
================================================
EXP_DIR: res101-lg
TRAIN:
HAS_RPN: True
IMS_PER_BATCH: 1
BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
RPN_POSITIVE_OVERLAP: 0.7
RPN_BATCHSIZE: 256
PROPOSAL_METHOD: gt
BG_THRESH_LO: 0.0
DISPLAY: 20
BATCH_SIZE: 256
WEIGHT_DECAY: 0.0001
DOUBLE_BIAS: False
SNAPSHOT_PREFIX: res101_faster_rcnn
SCALES: [800]
MAX_SIZE: 1333
TEST:
HAS_RPN: True
SCALES: [800]
MAX_SIZE: 1333
RPN_POST_NMS_TOP_N: 1000
POOLING_MODE: crop
ANCHOR_SCALES: [2,4,8,16,32]
================================================
FILE: experiments/cfgs/res101.yml
================================================
EXP_DIR: res101
TRAIN:
HAS_RPN: True
IMS_PER_BATCH: 1
BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
RPN_POSITIVE_OVERLAP: 0.7
RPN_BATCHSIZE: 256
PROPOSAL_METHOD: gt
BG_THRESH_LO: 0.0
DISPLAY: 20
BATCH_SIZE: 256
WEIGHT_DECAY: 0.0001
DOUBLE_BIAS: False
SNAPSHOT_PREFIX: res101_faster_rcnn
TEST:
HAS_RPN: True
POOLING_MODE: crop
================================================
FILE: experiments/cfgs/res50.yml
================================================
EXP_DIR: res50
TRAIN:
HAS_RPN: True
IMS_PER_BATCH: 1
BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
RPN_POSITIVE_OVERLAP: 0.7
RPN_BATCHSIZE: 256
PROPOSAL_METHOD: gt
BG_THRESH_LO: 0.0
DISPLAY: 20
BATCH_SIZE: 256
WEIGHT_DECAY: 0.0001
DOUBLE_BIAS: False
SNAPSHOT_PREFIX: res50_faster_rcnn
TEST:
HAS_RPN: True
POOLING_MODE: crop
================================================
FILE: experiments/cfgs/vgg16.yml
================================================
EXP_DIR: vgg16
TRAIN:
HAS_RPN: True
IMS_PER_BATCH: 1
BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
RPN_POSITIVE_OVERLAP: 0.7
RPN_BATCHSIZE: 256
PROPOSAL_METHOD: gt
BG_THRESH_LO: 0.0
DISPLAY: 20
BATCH_SIZE: 256
SNAPSHOT_PREFIX: vgg16_faster_rcnn
TEST:
HAS_RPN: True
POOLING_MODE: crop
================================================
FILE: experiments/logs/.gitignore
================================================
*.txt.*
================================================
FILE: experiments/scripts/convert_vgg16.sh
================================================
#!/bin/bash
set -x
set -e
export PYTHONUNBUFFERED="True"
GPU_ID=$1
DATASET=$2
NET=vgg16
array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:2:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
case ${DATASET} in
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=50000
ITERS=70000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc_0712)
TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=80000
ITERS=110000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
coco)
TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
TEST_IMDB="coco_2014_minival"
STEPSIZE=350000
ITERS=490000
ANCHORS="[4,8,16,32]"
RATIOS="[0.5,1,2]"
;;
*)
echo "No dataset given"
exit
;;
esac
set +x
NET_FINAL=${NET}_faster_rcnn_iter_${ITERS}
set -x
if [ ! -f ${NET_FINAL}.index ]; then
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/convert_from_depre.py \
--snapshot ${NET_FINAL} \
--imdb ${TRAIN_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--tag ${EXTRA_ARGS_SLUG} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
else
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/convert_from_depre.py \
--snapshot ${NET_FINAL} \
--imdb ${TRAIN_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
fi
fi
================================================
FILE: experiments/scripts/test_faster_rcnn.sh
================================================
#!/bin/bash
set -x
set -e
export PYTHONUNBUFFERED="True"
GPU_ID=$1
DATASET=$2
NET=$3
array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:3:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
case ${DATASET} in
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
ITERS=70000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc_0712)
TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
TEST_IMDB="voc_2007_test"
ITERS=110000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
coco)
TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
TEST_IMDB="coco_2014_minival"
ITERS=490000
ANCHORS="[4,8,16,32]"
RATIOS="[0.5,1,2]"
;;
*)
echo "No dataset given"
exit
;;
esac
LOG="experiments/logs/test_${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"
set +x
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
else
NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
fi
set -x
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/test_net.py \
--imdb ${TEST_IMDB} \
--model ${NET_FINAL} \
--cfg experiments/cfgs/${NET}.yml \
--tag ${EXTRA_ARGS_SLUG} \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} ${EXTRA_ARGS}
else
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/test_net.py \
--imdb ${TEST_IMDB} \
--model ${NET_FINAL} \
--cfg experiments/cfgs/${NET}.yml \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} ${EXTRA_ARGS}
fi
================================================
FILE: experiments/scripts/train_faster_rcnn.sh
================================================
#!/bin/bash
set -x
set -e
export PYTHONUNBUFFERED="True"
GPU_ID=$1
DATASET=$2
NET=$3
array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:3:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
case ${DATASET} in
graspRGB)
TRAIN_IMDB="graspRGB_train"
TEST_IMDB="graspRGB_test"
STEPSIZE=50000
ITERS=240000
#ANCHORS="[2,4,8,16,32]"
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=50000
ITERS=70000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc_0712)
TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=80000
ITERS=110000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
coco)
TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
TEST_IMDB="coco_2014_minival"
STEPSIZE=350000
ITERS=490000
ANCHORS="[4,8,16,32]"
RATIOS="[0.5,1,2]"
;;
*)
echo "No dataset given"
exit
;;
esac
LOG="experiments/logs/${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}_${NET}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"
set +x
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
else
NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
fi
set -x
if [ ! -f ${NET_FINAL}.index ]; then
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
--weight data/imagenet_weights/${NET}.ckpt \
--imdb ${TRAIN_IMDB} \
--imdbval ${TEST_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--tag ${EXTRA_ARGS_SLUG} \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
else
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
--weight data/imagenet_weights/${NET}.ckpt \
--imdb ${TRAIN_IMDB} \
--imdbval ${TEST_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
fi
fi
./experiments/scripts/test_faster_rcnn.sh $@
================================================
FILE: experiments/scripts/train_faster_rcnn.sh~
================================================
#!/bin/bash
set -x
set -e
export PYTHONUNBUFFERED="True"
GPU_ID=$1
DATASET=$2
NET=$3
array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:3:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
case ${DATASET} in
graspRGB)
TRAIN_IMDB="graspRGB_train"
TEST_IMDB="graspRGB_test"
STEPSIZE=50000
ITERS=160000
#ANCHORS="[2,4,8,16,32]"
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=50000
ITERS=70000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
pascal_voc_0712)
TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
TEST_IMDB="voc_2007_test"
STEPSIZE=80000
ITERS=110000
ANCHORS="[8,16,32]"
RATIOS="[0.5,1,2]"
;;
coco)
TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
TEST_IMDB="coco_2014_minival"
STEPSIZE=350000
ITERS=490000
ANCHORS="[4,8,16,32]"
RATIOS="[0.5,1,2]"
;;
*)
echo "No dataset given"
exit
;;
esac
LOG="experiments/logs/${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}_${NET}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"
set +x
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
else
NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
fi
set -x
if [ ! -f ${NET_FINAL}.index ]; then
if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
--weight data/imagenet_weights/${NET}.ckpt \
--imdb ${TRAIN_IMDB} \
--imdbval ${TEST_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--tag ${EXTRA_ARGS_SLUG} \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
else
CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
--weight data/imagenet_weights/${NET}.ckpt \
--imdb ${TRAIN_IMDB} \
--imdbval ${TEST_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/${NET}.yml \
--net ${NET} \
--set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
fi
fi
./experiments/scripts/test_faster_rcnn.sh $@
================================================
FILE: lib/Makefile
================================================
all:
python setup.py build_ext --inplace
rm -rf build
clean:
rm -rf */*.pyc
rm -rf */*.so
================================================
FILE: lib/datasets/VOCdevkit-matlab-wrapper/get_voc_opts.m
================================================
function VOCopts = get_voc_opts(path)
tmp = pwd;
cd(path);
try
addpath('VOCcode');
VOCinit;
catch
rmpath('VOCcode');
cd(tmp);
error(sprintf('VOCcode directory not found under %s', path));
end
rmpath('VOCcode');
cd(tmp);
================================================
FILE: lib/datasets/VOCdevkit-matlab-wrapper/voc_eval.m
================================================
function res = voc_eval(path, comp_id, test_set, output_dir)
VOCopts = get_voc_opts(path);
VOCopts.testset = test_set;
for i = 1:length(VOCopts.classes)
cls = VOCopts.classes{i};
res(i) = voc_eval_cls(cls, VOCopts, comp_id, output_dir);
end
fprintf('\n~~~~~~~~~~~~~~~~~~~~\n');
fprintf('Results:\n');
aps = [res(:).ap]';
fprintf('%.1f\n', aps * 100);
fprintf('%.1f\n', mean(aps) * 100);
fprintf('~~~~~~~~~~~~~~~~~~~~\n');
function res = voc_eval_cls(cls, VOCopts, comp_id, output_dir)
test_set = VOCopts.testset;
year = VOCopts.dataset(4:end);
addpath(fullfile(VOCopts.datadir, 'VOCcode'));
res_fn = sprintf(VOCopts.detrespath, comp_id, cls);
recall = [];
prec = [];
ap = 0;
ap_auc = 0;
do_eval = (str2num(year) <= 2007) | ~strcmp(test_set, 'test');
if do_eval
% Bug in VOCevaldet requires that tic has been called first
tic;
[recall, prec, ap] = VOCevaldet(VOCopts, comp_id, cls, true);
ap_auc = xVOCap(recall, prec);
% force plot limits
ylim([0 1]);
xlim([0 1]);
print(gcf, '-djpeg', '-r0', ...
[output_dir '/' cls '_pr.jpg']);
end
fprintf('!!! %s : %.4f %.4f\n', cls, ap, ap_auc);
res.recall = recall;
res.prec = prec;
res.ap = ap;
res.ap_auc = ap_auc;
save([output_dir '/' cls '_pr.mat'], ...
'res', 'recall', 'prec', 'ap', 'ap_auc');
rmpath(fullfile(VOCopts.datadir, 'VOCcode'));
================================================
FILE: lib/datasets/VOCdevkit-matlab-wrapper/xVOCap.m
================================================
function ap = xVOCap(rec,prec)
% From the PASCAL VOC 2011 devkit
mrec=[0 ; rec ; 1];
mpre=[0 ; prec ; 0];
for i=numel(mpre)-1:-1:1
mpre(i)=max(mpre(i),mpre(i+1));
end
i=find(mrec(2:end)~=mrec(1:end-1))+1;
ap=sum((mrec(i)-mrec(i-1)).*mpre(i));
================================================
FILE: lib/datasets/__init__.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
================================================
FILE: lib/datasets/coco.py
================================================
# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datasets.imdb import imdb
import datasets.ds_utils as ds_utils
from model.config import cfg
import os.path as osp
import sys
import os
import numpy as np
import scipy.sparse
import scipy.io as sio
import pickle
import json
import uuid
# COCO API
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
from pycocotools import mask as COCOmask
class coco(imdb):
def __init__(self, image_set, year):
imdb.__init__(self, 'coco_' + year + '_' + image_set)
# COCO specific config options
self.config = {'use_salt': True,
'cleanup': True}
# name, paths
self._year = year
self._image_set = image_set
self._data_path = osp.join(cfg.DATA_DIR, 'coco')
# load COCO API, classes, class <-> id mappings
self._COCO = COCO(self._get_ann_file())
cats = self._COCO.loadCats(self._COCO.getCatIds())
self._classes = tuple(['__background__'] + [c['name'] for c in cats])
self._class_to_ind = dict(list(zip(self.classes, list(range(self.num_classes)))))
self._class_to_coco_cat_id = dict(list(zip([c['name'] for c in cats],
self._COCO.getCatIds())))
self._image_index = self._load_image_set_index()
# Default to roidb handler
self.set_proposal_method('gt')
self.competition_mode(False)
# Some image sets are "views" (i.e. subsets) into others.
# For example, minival2014 is a random 5000 image subset of val2014.
# This mapping tells us where the view's images and proposals come from.
self._view_map = {
'minival2014': 'val2014', # 5k val2014 subset
'valminusminival2014': 'val2014', # val2014 \setminus minival2014
'test-dev2015': 'test2015',
}
coco_name = image_set + year # e.g., "val2014"
self._data_name = (self._view_map[coco_name]
if coco_name in self._view_map
else coco_name)
# Dataset splits that have ground-truth annotations (test splits
# do not have gt annotations)
self._gt_splits = ('train', 'val', 'minival')
def _get_ann_file(self):
prefix = 'instances' if self._image_set.find('test') == -1 \
else 'image_info'
return osp.join(self._data_path, 'annotations',
prefix + '_' + self._image_set + self._year + '.json')
def _load_image_set_index(self):
"""
Load image ids.
"""
image_ids = self._COCO.getImgIds()
return image_ids
def _get_widths(self):
anns = self._COCO.loadImgs(self._image_index)
widths = [ann['width'] for ann in anns]
return widths
def image_path_at(self, i):
"""
Return the absolute path to image i in the image sequence.
"""
return self.image_path_from_index(self._image_index[i])
def image_path_from_index(self, index):
"""
Construct an image path from the image's "index" identifier.
"""
# Example image path for index=119993:
# images/train2014/COCO_train2014_000000119993.jpg
file_name = ('COCO_' + self._data_name + '_' +
str(index).zfill(12) + '.jpg')
image_path = osp.join(self._data_path, 'images',
self._data_name, file_name)
assert osp.exists(image_path), \
'Path does not exist: {}'.format(image_path)
return image_path
def gt_roidb(self):
"""
Return the database of ground-truth regions of interest.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = osp.join(self.cache_path, self.name + '_gt_roidb.pkl')
if osp.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = pickle.load(fid)
print('{} gt roidb loaded from {}'.format(self.name, cache_file))
return roidb
gt_roidb = [self._load_coco_annotation(index)
for index in self._image_index]
with open(cache_file, 'wb') as fid:
pickle.dump(gt_roidb, fid, pickle.HIGHEST_PROTOCOL)
print('wrote gt roidb to {}'.format(cache_file))
return gt_roidb
def _load_coco_annotation(self, index):
"""
Loads COCO bounding-box instance annotations. Crowd instances are
handled by marking their overlaps (with all categories) to -1. This
overlap value means that crowd "instances" are excluded from training.
"""
im_ann = self._COCO.loadImgs(index)[0]
width = im_ann['width']
height = im_ann['height']
annIds = self._COCO.getAnnIds(imgIds=index, iscrowd=None)
objs = self._COCO.loadAnns(annIds)
# Sanitize bboxes -- some are invalid
valid_objs = []
for obj in objs:
x1 = np.max((0, obj['bbox'][0]))
y1 = np.max((0, obj['bbox'][1]))
x2 = np.min((width - 1, x1 + np.max((0, obj['bbox'][2] - 1))))
y2 = np.min((height - 1, y1 + np.max((0, obj['bbox'][3] - 1))))
if obj['area'] > 0 and x2 >= x1 and y2 >= y1:
obj['clean_bbox'] = [x1, y1, x2, y2]
valid_objs.append(obj)
objs = valid_objs
num_objs = len(objs)
boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Lookup table to map from COCO category ids to our internal class
# indices
coco_cat_id_to_class_ind = dict([(self._class_to_coco_cat_id[cls],
self._class_to_ind[cls])
for cls in self._classes[1:]])
for ix, obj in enumerate(objs):
cls = coco_cat_id_to_class_ind[obj['category_id']]
boxes[ix, :] = obj['clean_bbox']
gt_classes[ix] = cls
seg_areas[ix] = obj['area']
if obj['iscrowd']:
# Set overlap to -1 for all classes for crowd objects
# so they will be excluded during training
overlaps[ix, :] = -1.0
else:
overlaps[ix, cls] = 1.0
ds_utils.validate_boxes(boxes, width=width, height=height)
overlaps = scipy.sparse.csr_matrix(overlaps)
return {'width': width,
'height': height,
'boxes': boxes,
'gt_classes': gt_classes,
'gt_overlaps': overlaps,
'flipped': False,
'seg_areas': seg_areas}
def _get_widths(self):
return [r['width'] for r in self.roidb]
def append_flipped_images(self):
num_images = self.num_images
widths = self._get_widths()
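# Mirror each box horizontally: x' = width - 1 - x, swapping x1/x2 so x1 <= x2 holds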
for i in range(num_images):
boxes = self.roidb[i]['boxes'].copy()
oldx1 = boxes[:, 0].copy()
oldx2 = boxes[:, 2].copy()
boxes[:, 0] = widths[i] - oldx2 - 1
boxes[:, 2] = widths[i] - oldx1 - 1
assert (boxes[:, 2] >= boxes[:, 0]).all()
entry = {'width': widths[i],
'height': self.roidb[i]['height'],
'boxes': boxes,
'gt_classes': self.roidb[i]['gt_classes'],
'gt_overlaps': self.roidb[i]['gt_overlaps'],
'flipped': True,
'seg_areas': self.roidb[i]['seg_areas']}
self.roidb.append(entry)
self._image_index = self._image_index * 2
def _get_box_file(self, index):
# first 14 chars / first 22 chars / all chars + .mat
# COCO_val2014_0/COCO_val2014_000000447/COCO_val2014_000000447991.mat
file_name = ('COCO_' + self._data_name +
'_' + str(index).zfill(12) + '.mat')
return osp.join(file_name[:14], file_name[:22], file_name)
def _print_detection_eval_metrics(self, coco_eval):
IoU_lo_thresh = 0.5
IoU_hi_thresh = 0.95
def _get_thr_ind(coco_eval, thr):
ind = np.where((coco_eval.params.iouThrs > thr - 1e-5) &
(coco_eval.params.iouThrs < thr + 1e-5))[0][0]
iou_thr = coco_eval.params.iouThrs[ind]
assert np.isclose(iou_thr, thr)
return ind
ind_lo = _get_thr_ind(coco_eval, IoU_lo_thresh)
ind_hi = _get_thr_ind(coco_eval, IoU_hi_thresh)
# precision has dims (iou, recall, cls, area range, max dets)
# area range index 0: all area ranges
# max dets index 2: 100 per image
precision = \
coco_eval.eval['precision'][ind_lo:(ind_hi + 1), :, :, 0, 2]
ap_default = np.mean(precision[precision > -1])
print(('~~~~ Mean and per-category AP @ IoU=[{:.2f},{:.2f}] '
'~~~~').format(IoU_lo_thresh, IoU_hi_thresh))
print('{:.1f}'.format(100 * ap_default))
for cls_ind, cls in enumerate(self.classes):
if cls == '__background__':
continue
# minus 1 because of __background__
precision = coco_eval.eval['precision'][ind_lo:(ind_hi + 1), :, cls_ind - 1, 0, 2]
ap = np.mean(precision[precision > -1])
print('{:.1f}'.format(100 * ap))
print('~~~~ Summary metrics ~~~~')
coco_eval.summarize()
def _do_detection_eval(self, res_file, output_dir):
ann_type = 'bbox'
coco_dt = self._COCO.loadRes(res_file)
coco_eval = COCOeval(self._COCO, coco_dt)
coco_eval.params.useSegm = (ann_type == 'segm')
coco_eval.evaluate()
coco_eval.accumulate()
self._print_detection_eval_metrics(coco_eval)
eval_file = osp.join(output_dir, 'detection_results.pkl')
with open(eval_file, 'wb') as fid:
pickle.dump(coco_eval, fid, pickle.HIGHEST_PROTOCOL)
print('Wrote COCO eval results to: {}'.format(eval_file))
def _coco_results_one_category(self, boxes, cat_id):
results = []
for im_ind, index in enumerate(self.image_index):
dets = boxes[im_ind].astype(np.float)
if len(dets) == 0:
continue
scores = dets[:, -1]
xs = dets[:, 0]
ys = dets[:, 1]
ws = dets[:, 2] - xs + 1
hs = dets[:, 3] - ys + 1
results.extend(
[{'image_id': index,
'category_id': cat_id,
'bbox': [xs[k], ys[k], ws[k], hs[k]],
'score': scores[k]} for k in range(dets.shape[0])])
return results
def _write_coco_results_file(self, all_boxes, res_file):
# [{"image_id": 42,
# "category_id": 18,
# "bbox": [258.15,41.29,348.26,243.78],
# "score": 0.236}, ...]
results = []
for cls_ind, cls in enumerate(self.classes):
if cls == '__background__':
continue
print('Collecting {} results ({:d}/{:d})'.format(cls, cls_ind,
self.num_classes - 1))
coco_cat_id = self._class_to_coco_cat_id[cls]
results.extend(self._coco_results_one_category(all_boxes[cls_ind],
coco_cat_id))
print('Writing results json to {}'.format(res_file))
with open(res_file, 'w') as fid:
json.dump(results, fid)
def evaluate_detections(self, all_boxes, output_dir):
res_file = osp.join(output_dir, ('detections_' +
self._image_set +
self._year +
'_results'))
if self.config['use_salt']:
res_file += '_{}'.format(str(uuid.uuid4()))
res_file += '.json'
self._write_coco_results_file(all_boxes, res_file)
# Only do evaluation on non-test sets
if self._image_set.find('test') == -1:
self._do_detection_eval(res_file, output_dir)
# Optionally cleanup results json file
if self.config['cleanup']:
os.remove(res_file)
def competition_mode(self, on):
if on:
self.config['use_salt'] = False
self.config['cleanup'] = False
else:
self.config['use_salt'] = True
self.config['cleanup'] = True
================================================
FILE: lib/datasets/ds_utils.py
================================================
# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
def unique_boxes(boxes, scale=1.0):
"""Return indices of unique boxes."""
v = np.array([1, 1e3, 1e6, 1e9])
hashes = np.round(boxes * scale).dot(v)
_, index = np.unique(hashes, return_index=True)
return np.sort(index)
def xywh_to_xyxy(boxes):
"""Convert [x y w h] box format to [x1 y1 x2 y2] format."""
return np.hstack((boxes[:, 0:2], boxes[:, 0:2] + boxes[:, 2:4] - 1))
def xyxy_to_xywh(boxes):
"""Convert [x1 y1 x2 y2] box format to [x y w h] format."""
return np.hstack((boxes[:, 0:2], boxes[:, 2:4] - boxes[:, 0:2] + 1))
def validate_boxes(boxes, width=0, height=0):
"""Check that a set of boxes are valid."""
x1 = boxes[:, 0]
y1 = boxes[:, 1]
x2 = boxes[:, 2]
y2 = boxes[:, 3]
assert (x1 >= 0).all()
assert (y1 >= 0).all()
assert (x2 >= x1).all()
assert (y2 >= y1).all()
assert (x2 < width).all()
assert (y2 < height).all()
def filter_small_boxes(boxes, min_size):
w = boxes[:, 2] - boxes[:, 0]
h = boxes[:, 3] - boxes[:, 1]
keep = np.where((w >= min_size) & (h >= min_size))[0]
return keep
================================================
FILE: lib/datasets/factory.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
"""Factory method for easily getting imdbs by name."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datasets.graspRGB import graspRGB
__sets = {}
from datasets.pascal_voc import pascal_voc
from datasets.coco import coco
import numpy as np
# Set up voc_<year>_<split>
for year in ['2007', '2012']:
for split in ['train', 'val', 'trainval', 'test']:
name = 'voc_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
# Set up coco_2014_<split>
for year in ['2014']:
for split in ['train', 'val', 'minival', 'valminusminival', 'trainval']:
name = 'coco_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: coco(split, year))
# Set up coco_2015_<split>
for year in ['2015']:
for split in ['test', 'test-dev']:
name = 'coco_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: coco(split, year))
# Set up graspRGB_<split> using selective search "fast" mode # added by FC
graspRGB_devkit_path = '/media/fujenchu/home3/fasterrcnn_grasp/rgb_multibbs_5_5_5_object_tf'
for split in ['train', 'test']:
name = '{}_{}'.format('graspRGB', split)
__sets[name] = (lambda split=split: graspRGB(split, graspRGB_devkit_path))
def get_imdb(name):
"""Get an imdb (image database) by name."""
if name not in __sets:
raise KeyError('Unknown dataset: {}'.format(name))
return __sets[name]()
def list_imdbs():
"""List all registered imdbs."""
return list(__sets.keys())
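# Example usage (a sketch; the dataset keys mirror the --imdb arguments used by
# experiments/scripts/train_faster_rcnn.sh):
#   from datasets.factory import get_imdb
#   imdb = get_imdb('graspRGB_train')  # constructs graspRGB('train', graspRGB_devkit_path)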
================================================
FILE: lib/datasets/factory.py~
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
"""Factory method for easily getting imdbs by name."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datasets.graspRGB import graspRGB
__sets = {}
from datasets.pascal_voc import pascal_voc
from datasets.coco import coco
import numpy as np
# Set up voc_<year>_<split>
for year in ['2007', '2012']:
for split in ['train', 'val', 'trainval', 'test']:
name = 'voc_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
# Set up coco_2014_<split>
for year in ['2014']:
for split in ['train', 'val', 'minival', 'valminusminival', 'trainval']:
name = 'coco_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: coco(split, year))
# Set up coco_2015_<split>
for year in ['2015']:
for split in ['test', 'test-dev']:
name = 'coco_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: coco(split, year))
# Set up graspRGB_<split> using selective search "fast" mode # added by FC
graspRGB_devkit_path = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_object_tf'
for split in ['train', 'test']:
name = '{}_{}'.format('graspRGB', split)
__sets[name] = (lambda split=split: graspRGB(split, graspRGB_devkit_path))
def get_imdb(name):
"""Get an imdb (image database) by name."""
if name not in __sets:
raise KeyError('Unknown dataset: {}'.format(name))
return __sets[name]()
def list_imdbs():
"""List all registered imdbs."""
return list(__sets.keys())
================================================
FILE: lib/datasets/graspRGB.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
import datasets
import datasets.graspRGB
import os
from datasets.imdb import imdb
import datasets.ds_utils as ds_utils
import xml.etree.ElementTree as ET
import numpy as np
import scipy.sparse
import scipy.io as sio
import utils.cython_bbox
import cPickle
import subprocess
import uuid
from voc_eval import voc_eval
class graspRGB(imdb):
def __init__(self, image_set, devkit_path):
imdb.__init__(self, image_set)
self._image_set = image_set
self._devkit_path = devkit_path
self._data_path = os.path.join(self._devkit_path, 'data')
self._classes = ('__background__', # always index 0
'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
'angle_16', 'angle_17', 'angle_18', 'angle_19')
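# 19 orientation bins of ~10 degrees each over [-90, 90]; matches
# cls = round((theta/pi*180+90)/10) + 1 in the MATLAB preprocessing script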
self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
self._image_ext = ['.jpg', '.png']
self._image_index = self._load_image_set_index()
# Default to roidb handler
self._roidb_handler = self.selective_search_roidb
self._salt = str(uuid.uuid4())
self._comp_id = 'comp4'
# Specific config options
self.config = {'cleanup' : True,
'use_salt' : True,
'use_diff' : False,
'matlab_eval' : False,
'rpn_file' : None,
'min_size' : 2}
assert os.path.exists(self._devkit_path), \
'Devkit path does not exist: {}'.format(self._devkit_path)
assert os.path.exists(self._data_path), \
'Path does not exist: {}'.format(self._data_path)
def image_path_at(self, i):
"""
Return the absolute path to image i in the image sequence.
"""
return self.image_path_from_index(self._image_index[i])
def image_path_from_index(self, index):
"""
Construct an image path from the image's "index" identifier.
"""
for ext in self._image_ext:
image_path = os.path.join(self._data_path, 'Images',
index + ext)
if os.path.exists(image_path):
break
assert os.path.exists(image_path), \
'Path does not exist: {}'.format(image_path)
return image_path
def _load_image_set_index(self):
"""
Load the indexes listed in this dataset's image set file.
"""
# Example path to image set file:
# self._data_path + /ImageSets/val.txt
image_set_file = os.path.join(self._data_path, 'ImageSets',
self._image_set + '.txt')
assert os.path.exists(image_set_file), \
'Path does not exist: {}'.format(image_set_file)
with open(image_set_file) as f:
image_index = [x.strip() for x in f.readlines()]
# image_index holds full file names like 'I00001', not bare numbers
return image_index
def gt_roidb(self):
"""
Return the database of ground-truth regions of interest.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} gt roidb loaded from {}'.format(self.name, cache_file)
return roidb
gt_roidb = [self._load_graspRGB_annotation(index)
for index in self.image_index]
with open(cache_file, 'wb') as fid:
cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote gt roidb to {}'.format(cache_file)
return gt_roidb
def selective_search_roidb(self):
"""
Return the database of selective search regions of interest.
Ground-truth ROIs are also included.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path,
self.name + '_selective_search_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} ss roidb loaded from {}'.format(self.name, cache_file)
return roidb
if self._image_set != 'test':
gt_roidb = self.gt_roidb()
ss_roidb = self._load_selective_search_roidb(gt_roidb)
roidb = imdb.merge_roidbs(gt_roidb, ss_roidb)
else:
roidb = self._load_selective_search_roidb(None)
print len(roidb)
with open(cache_file, 'wb') as fid:
cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote ss roidb to {}'.format(cache_file)
return roidb
def _load_selective_search_roidb(self, gt_roidb):
filename = os.path.abspath(os.path.join(self._devkit_path,
self.name + '.mat'))
assert os.path.exists(filename), \
'Selective search data not found at: {}'.format(filename)
print filename
raw_data = sio.loadmat(filename)['all_boxes'].ravel()
box_list = []
for i in xrange(raw_data.shape[0]):
box_list.append(raw_data[i][:, (1, 0, 3, 2)] - 1)
return self.create_roidb_from_box_list(box_list, gt_roidb)
def selective_search_IJCV_roidb(self):
"""
Return the database of selective search regions of interest.
Ground-truth ROIs are also included.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path,
'{:s}_selective_search_IJCV_top_{:d}_roidb.pkl'.
format(self.name, self.config['top_k']))
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} ss roidb loaded from {}'.format(self.name, cache_file)
return roidb
gt_roidb = self.gt_roidb()
ss_roidb = self._load_selective_search_IJCV_roidb(gt_roidb)
roidb = datasets.imdb.merge_roidbs(gt_roidb, ss_roidb)
with open(cache_file, 'wb') as fid:
cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote ss roidb to {}'.format(cache_file)
return roidb
def _load_selective_search_IJCV_roidb(self, gt_roidb):
IJCV_path = os.path.abspath(os.path.join(self.cache_path, '..',
'selective_search_IJCV_data',
self.name))
assert os.path.exists(IJCV_path), \
'Selective search IJCV data not found at: {}'.format(IJCV_path)
top_k = self.config['top_k']
box_list = []
for i in xrange(self.num_images):
filename = os.path.join(IJCV_path, self.image_index[i] + '.mat')
raw_data = sio.loadmat(filename)
box_list.append((raw_data['boxes'][:top_k, :]-1).astype(np.uint16))
return self.create_roidb_from_box_list(box_list, gt_roidb)
def _load_graspRGB_annotation(self, index):
"""
Load image and bounding boxes info from txt files of graspRGB.
"""
filename = os.path.join(self._data_path, 'Annotations', index + '.txt')
print 'Loading: {}'.format(filename)
with open(filename) as f:
data = f.readlines()
num_objs = len(data)
print len(data)
if len(data) == 0:
print 'yooooooooo'
import sys
sys.exit()
boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, aline in enumerate(data):
# Make pixel indexes 0-based
tokens = aline.strip().split()
if len(tokens) != 5:
continue
cls = int(tokens[0]) # this file uses 0 as the background
x1 = float(tokens[1])
y1 = float(tokens[2])
x2 = float(tokens[3])
y2 = float(tokens[4])
# without this clamp, boxes at the image boundary can have negative coordinates, which wrap to ~655xx when later stored as uint16
if (x1<0 and x2<0) or (y1<0 and y2<0):
print 'yooooooooo'
import sys
sys.exit()
if x1 < 0:
x1 = 0
if x2 < 0:
x2 = 0
if y1 < 0:
y1 = 0
if y2 < 0:
y2 = 0
gt_classes[ix] = cls
boxes[ix, :] = [x1, y1, x2, y2]
overlaps[ix, cls] = 1.0
seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
overlaps = scipy.sparse.csr_matrix(overlaps)
return {'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}
def _write_graspRGB_results_file(self, all_boxes):
use_salt = self.config['use_salt']
comp_id = 'comp4'
if use_salt:
comp_id += '-{}'.format(os.getpid())
# VOCdevkit/results/comp4-44503_det_test_aeroplane.txt
path = os.path.join(self._devkit_path, 'results', self.name, comp_id + '_')
for cls_ind, cls in enumerate(self.classes):
if cls == '__background__':
continue
print 'Writing {} results file'.format(cls)
filename = path + 'det_' + self._image_set + '_' + cls + '.txt'
with open(filename, 'wt') as f:
for im_ind, index in enumerate(self.image_index):
dets = all_boxes[cls_ind][im_ind]
if len(dets) == 0:
continue
# the VOCdevkit expects 1-based indices
for k in xrange(dets.shape[0]):
f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
format(index, dets[k, -1],
dets[k, 0] + 1, dets[k, 1] + 1,
dets[k, 2] + 1, dets[k, 3] + 1))
return comp_id
def _do_matlab_eval(self, comp_id, output_dir='output'):
rm_results = self.config['cleanup']
path = os.path.join(os.path.dirname(__file__),
'VOCdevkit-matlab-wrapper')
cmd = 'cd {} && '.format(path)
cmd += '{:s} -nodisplay -nodesktop '.format(datasets.MATLAB)
cmd += '-r "dbstop if error; '
cmd += 'setenv(\'LC_ALL\',\'C\'); voc_eval(\'{:s}\',\'{:s}\',\'{:s}\',\'{:s}\',{:d}); quit;"' \
.format(self._devkit_path, comp_id,
self._image_set, output_dir, int(rm_results))
print('Running:\n{}'.format(cmd))
status = subprocess.call(cmd, shell=True)
def evaluate_detections(self, all_boxes, output_dir):
comp_id = self._write_graspRGB_results_file(all_boxes)
self._do_matlab_eval(comp_id, output_dir)
def competition_mode(self, on):
if on:
self.config['use_salt'] = False
self.config['cleanup'] = False
else:
self.config['use_salt'] = True
self.config['cleanup'] = True
if __name__ == '__main__':
d = datasets.graspRGB('train', '')
res = d.roidb
from IPython import embed; embed()
================================================
FILE: lib/datasets/graspRGB.py~
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
import datasets
import datasets.graspRGB
import os
from datasets.imdb import imdb
import datasets.ds_utils as ds_utils
import xml.etree.ElementTree as ET
import numpy as np
import scipy.sparse
import scipy.io as sio
import utils.cython_bbox
import cPickle
import subprocess
import uuid
from voc_eval import voc_eval
class graspRGB(imdb):
def __init__(self, image_set, devkit_path):
imdb.__init__(self, image_set)
self._image_set = image_set
self._devkit_path = devkit_path
self._data_path = os.path.join(self._devkit_path, 'data')
self._classes = ('__background__', # always index 0
'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
'angle_16', 'angle_17', 'angle_18', 'angle_19')
self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
self._image_ext = ['.jpg', '.png']
self._image_index = self._load_image_set_index()
# Default to roidb handler
self._roidb_handler = self.selective_search_roidb
self._salt = str(uuid.uuid4())
self._comp_id = 'comp4'
# Specific config options
self.config = {'cleanup' : True,
'use_salt' : True,
'use_diff' : False,
'matlab_eval' : False,
'rpn_file' : None,
'min_size' : 2}
assert os.path.exists(self._devkit_path), \
'Devkit path does not exist: {}'.format(self._devkit_path)
assert os.path.exists(self._data_path), \
'Path does not exist: {}'.format(self._data_path)
def image_path_at(self, i):
"""
Return the absolute path to image i in the image sequence.
"""
return self.image_path_from_index(self._image_index[i])
def image_path_from_index(self, index):
"""
Construct an image path from the image's "index" identifier.
"""
for ext in self._image_ext:
image_path = os.path.join(self._data_path, 'Images',
index + ext)
if os.path.exists(image_path):
break
assert os.path.exists(image_path), \
'Path does not exist: {}'.format(image_path)
return image_path
def _load_image_set_index(self):
"""
Load the indexes listed in this dataset's image set file.
"""
# Example path to image set file:
# self._data_path + /ImageSets/val.txt
image_set_file = os.path.join(self._data_path, 'ImageSets',
self._image_set + '.txt')
assert os.path.exists(image_set_file), \
'Path does not exist: {}'.format(image_set_file)
with open(image_set_file) as f:
image_index = [x.strip() for x in f.readlines()]
# if you output image_index, it's ['I00001', 'I00002', ..] all the file names, not just numbers
return image_index
def gt_roidb(self):
"""
Return the database of ground-truth regions of interest.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} gt roidb loaded from {}'.format(self.name, cache_file)
return roidb
gt_roidb = [self._load_graspRGB_annotation(index)
for index in self.image_index]
with open(cache_file, 'wb') as fid:
cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote gt roidb to {}'.format(cache_file)
return gt_roidb
def selective_search_roidb(self):
"""
Return the database of selective search regions of interest.
Ground-truth ROIs are also included.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path,
self.name + '_selective_search_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} ss roidb loaded from {}'.format(self.name, cache_file)
return roidb
if self._image_set != 'test':
gt_roidb = self.gt_roidb()
ss_roidb = self._load_selective_search_roidb(gt_roidb)
roidb = imdb.merge_roidbs(gt_roidb, ss_roidb)
else:
roidb = self._load_selective_search_roidb(None)
print len(roidb)
with open(cache_file, 'wb') as fid:
cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote ss roidb to {}'.format(cache_file)
return roidb
def _load_selective_search_roidb(self, gt_roidb):
filename = os.path.abspath(os.path.join(self._devkit_path,
self.name + '.mat'))
assert os.path.exists(filename), \
'Selective search data not found at: {}'.format(filename)
print filename
raw_data = sio.loadmat(filename)['all_boxes'].ravel()
box_list = []
for i in xrange(raw_data.shape[0]):
box_list.append(raw_data[i][:, (1, 0, 3, 2)] - 1)
return self.create_roidb_from_box_list(box_list, gt_roidb)
def selective_search_IJCV_roidb(self):
"""
eturn the database of selective search regions of interest.
Ground-truth ROIs are also included.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path,
'{:s}_selective_search_IJCV_top_{:d}_roidb.pkl'.
format(self.name, self.config['top_k']))
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} ss roidb loaded from {}'.format(self.name, cache_file)
return roidb
gt_roidb = self.gt_roidb()
ss_roidb = self._load_selective_search_IJCV_roidb(gt_roidb)
roidb = datasets.imdb.merge_roidbs(gt_roidb, ss_roidb)
with open(cache_file, 'wb') as fid:
cPickle.dump(roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote ss roidb to {}'.format(cache_file)
return roidb
def _load_selective_search_IJCV_roidb(self, gt_roidb):
IJCV_path = os.path.abspath(os.path.join(self.cache_path, '..',
'selective_search_IJCV_data',
self.name))
assert os.path.exists(IJCV_path), \
'Selective search IJCV data not found at: {}'.format(IJCV_path)
top_k = self.config['top_k']
box_list = []
for i in xrange(self.num_images):
filename = os.path.join(IJCV_path, self.image_index[i] + '.mat')
raw_data = sio.loadmat(filename)
box_list.append((raw_data['boxes'][:top_k, :]-1).astype(np.uint16))
return self.create_roidb_from_box_list(box_list, gt_roidb)
def _load_graspRGB_annotation(self, index):
"""
Load image and bounding boxes info from txt files of graspRGB.
"""
filename = os.path.join(self._data_path, 'Annotations', index + '.txt')
print 'Loading: {}'.format(filename)
with open(filename) as f:
data = f.readlines()
num_objs = len(data)
print len(data)
if len(data) == 0:
print 'yooooooooo'
import sys
sys.exit()
boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, aline in enumerate(data):
# Parse one box per line: "cls x1 y1 x2 y2"
tokens = aline.strip().split()
if len(tokens) != 5:
continue
cls = int(float(tokens[0])) # this file uses 0 as the background; cast to int for array indexing
x1 = float(tokens[1])
y1 = float(tokens[2])
x2 = float(tokens[3])
y2 = float(tokens[4])
# Clamp negative coordinates: boxes near the image boundary can go negative, and a negative value stored into the uint16 boxes array wraps around to ~655xx when read back
if (x1 < 0 and x2 < 0) or (y1 < 0 and y2 < 0):
print 'Box entirely outside the image in {}'.format(filename)
import sys
sys.exit()
if x1 < 0:
x1 = 0
if x2 < 0:
x2 = 0
if y1 < 0:
y1 = 0
if y2 < 0:
y2 = 0
gt_classes[ix] = cls
boxes[ix, :] = [x1, y1, x2, y2]
overlaps[ix, cls] = 1.0
seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
overlaps = scipy.sparse.csr_matrix(overlaps)
return {'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}
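# A hypothetical annotation line for illustration (format assumed from the
# parser above: class index followed by corner coordinates):
#   3 112.0 87.5 160.0 131.5
# would yield the boxes row [112, 87, 160, 131] (coordinates truncated when
# stored as uint16), gt_classes entry 3, and a one-hot row in the sparse
# gt_overlaps matrix.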
def _write_graspRGB_results_file(self, all_boxes):
use_salt = self.config['use_salt']
comp_id = 'comp4'
if use_salt:
comp_id += '-{}'.format(os.getpid())
# VOCdevkit/results/comp4-44503_det_test_aeroplane.txt
path = os.path.join(self._devkit_path, 'results', self.name, comp_id + '_')
for cls_ind, cls in enumerate(self.classes):
if cls == '__background__':
continue
print 'Writing {} results file'.format(cls)
filename = path + 'det_' + self._image_set + '_' + cls + '.txt'
with open(filename, 'wt') as f:
for im_ind, index in enumerate(self.image_index):
dets = all_boxes[cls_ind][im_ind]
if len(dets) == 0:
continue
# the VOCdevkit expects 1-based indices
for k in xrange(dets.shape[0]):
f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
format(index, dets[k, -1],
dets[k, 0] + 1, dets[k, 1] + 1,
dets[k, 2] + 1, dets[k, 3] + 1))
return comp_id
def _do_matlab_eval(self, comp_id, output_dir='output'):
rm_results = self.config['cleanup']
path = os.path.join(os.path.dirname(__file__),
'VOCdevkit-matlab-wrapper')
cmd = 'cd {} && '.format(path)
cmd += '{:s} -nodisplay -nodesktop '.format(datasets.MATLAB)
cmd += '-r "dbstop if error; '
cmd += 'setenv(\'LC_ALL\',\'C\'); voc_eval(\'{:s}\',\'{:s}\',\'{:s}\',\'{:s}\',{:d}); quit;"' \
.format(self._devkit_path, comp_id,
self._image_set, output_dir, int(rm_results))
print('Running:\n{}'.format(cmd))
status = subprocess.call(cmd, shell=True)
def evaluate_detections(self, all_boxes, output_dir):
comp_id = self._write_graspRGB_results_file(all_boxes)
self._do_matlab_eval(comp_id, output_dir)
def competition_mode(self, on):
if on:
self.config['use_salt'] = False
self.config['cleanup'] = False
else:
self.config['use_salt'] = True
self.config['cleanup'] = True
if __name__ == '__main__':
d = datasets.graspRGB('train', '')
res = d.roidb
from IPython import embed; embed()
================================================
FILE: lib/datasets/imdb.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import os.path as osp
import PIL
from utils.cython_bbox import bbox_overlaps
import numpy as np
import scipy.sparse
from model.config import cfg
class imdb(object):
"""Image database."""
def __init__(self, name, classes=None):
self._name = name
self._num_classes = 0
if not classes:
self._classes = []
else:
self._classes = classes
self._image_index = []
self._obj_proposer = 'gt'
self._roidb = None
self._roidb_handler = self.default_roidb
# Use this dict for storing dataset specific config options
self.config = {}
@property
def name(self):
return self._name
@property
def num_classes(self):
return len(self._classes)
@property
def classes(self):
return self._classes
@property
def image_index(self):
return self._image_index
@property
def roidb_handler(self):
return self._roidb_handler
@roidb_handler.setter
def roidb_handler(self, val):
self._roidb_handler = val
def set_proposal_method(self, method):
method = eval('self.' + method + '_roidb')
self.roidb_handler = method
@property
def roidb(self):
# A roidb is a list of dictionaries, each with the following keys:
# boxes
# gt_overlaps
# gt_classes
# flipped
if self._roidb is not None:
return self._roidb
self._roidb = self.roidb_handler()
return self._roidb
@property
def cache_path(self):
cache_path = osp.abspath(osp.join(cfg.DATA_DIR, 'cache'))
if not os.path.exists(cache_path):
os.makedirs(cache_path)
return cache_path
@property
def num_images(self):
return len(self.image_index)
def image_path_at(self, i):
raise NotImplementedError
def default_roidb(self):
raise NotImplementedError
def evaluate_detections(self, all_boxes, output_dir=None):
"""
all_boxes is a list of length number-of-classes.
Each list element is a list of length number-of-images.
Each of those list elements is either an empty list []
or a numpy array of detections.
all_boxes[class][image] = [] or np.array of shape #dets x 5
"""
raise NotImplementedError
def _get_widths(self):
return [PIL.Image.open(self.image_path_at(i)).size[0]
for i in range(self.num_images)]
def append_flipped_images(self):
num_images = self.num_images
widths = self._get_widths()
for i in range(num_images):
boxes = self.roidb[i]['boxes'].copy()
oldx1 = boxes[:, 0].copy()
oldx2 = boxes[:, 2].copy()
boxes[:, 0] = widths[i] - oldx2 - 1
boxes[:, 2] = widths[i] - oldx1 - 1
assert (boxes[:, 2] >= boxes[:, 0]).all()
entry = {'boxes': boxes,
'gt_overlaps': self.roidb[i]['gt_overlaps'],
'gt_classes': self.roidb[i]['gt_classes'],
'flipped': True}
self.roidb.append(entry)
self._image_index = self._image_index * 2
def evaluate_recall(self, candidate_boxes=None, thresholds=None,
area='all', limit=None):
"""Evaluate detection proposal recall metrics.
Returns:
results: dictionary of results with keys
'ar': average recall
'recalls': vector recalls at each IoU overlap threshold
'thresholds': vector of IoU overlap thresholds
'gt_overlaps': vector of all ground-truth overlaps
"""
# Record max overlap value for each gt box
# Return vector of overlap values
areas = {'all': 0, 'small': 1, 'medium': 2, 'large': 3,
'96-128': 4, '128-256': 5, '256-512': 6, '512-inf': 7}
area_ranges = [[0 ** 2, 1e5 ** 2], # all
[0 ** 2, 32 ** 2], # small
[32 ** 2, 96 ** 2], # medium
[96 ** 2, 1e5 ** 2], # large
[96 ** 2, 128 ** 2], # 96-128
[128 ** 2, 256 ** 2], # 128-256
[256 ** 2, 512 ** 2], # 256-512
[512 ** 2, 1e5 ** 2], # 512-inf
]
assert area in areas, 'unknown area range: {}'.format(area)
area_range = area_ranges[areas[area]]
gt_overlaps = np.zeros(0)
num_pos = 0
for i in range(self.num_images):
# Checking for max_overlaps == 1 avoids including crowd annotations
# (...pretty hacky :/)
max_gt_overlaps = self.roidb[i]['gt_overlaps'].toarray().max(axis=1)
gt_inds = np.where((self.roidb[i]['gt_classes'] > 0) &
(max_gt_overlaps == 1))[0]
gt_boxes = self.roidb[i]['boxes'][gt_inds, :]
gt_areas = self.roidb[i]['seg_areas'][gt_inds]
valid_gt_inds = np.where((gt_areas >= area_range[0]) &
(gt_areas <= area_range[1]))[0]
gt_boxes = gt_boxes[valid_gt_inds, :]
num_pos += len(valid_gt_inds)
if candidate_boxes is None:
# If candidate_boxes is not supplied, the default is to use the
# non-ground-truth boxes from this roidb
non_gt_inds = np.where(self.roidb[i]['gt_classes'] == 0)[0]
boxes = self.roidb[i]['boxes'][non_gt_inds, :]
else:
boxes = candidate_boxes[i]
if boxes.shape[0] == 0:
continue
if limit is not None and boxes.shape[0] > limit:
boxes = boxes[:limit, :]
overlaps = bbox_overlaps(boxes.astype(np.float),
gt_boxes.astype(np.float))
_gt_overlaps = np.zeros((gt_boxes.shape[0]))
for j in range(gt_boxes.shape[0]):
# find which proposal box maximally covers each gt box
argmax_overlaps = overlaps.argmax(axis=0)
# and get the iou amount of coverage for each gt box
max_overlaps = overlaps.max(axis=0)
# find which gt box is 'best' covered (i.e. 'best' = most iou)
gt_ind = max_overlaps.argmax()
gt_ovr = max_overlaps.max()
assert (gt_ovr >= 0)
# find the proposal box that covers the best covered gt box
box_ind = argmax_overlaps[gt_ind]
# record the iou coverage of this gt box
_gt_overlaps[j] = overlaps[box_ind, gt_ind]
assert (_gt_overlaps[j] == gt_ovr)
# mark the proposal box and the gt box as used
overlaps[box_ind, :] = -1
overlaps[:, gt_ind] = -1
# append recorded iou coverage level
gt_overlaps = np.hstack((gt_overlaps, _gt_overlaps))
gt_overlaps = np.sort(gt_overlaps)
if thresholds is None:
step = 0.05
thresholds = np.arange(0.5, 0.95 + 1e-5, step)
recalls = np.zeros_like(thresholds)
# compute recall for each iou threshold
for i, t in enumerate(thresholds):
recalls[i] = (gt_overlaps >= t).sum() / float(num_pos)
# ar = 2 * np.trapz(recalls, thresholds)
ar = recalls.mean()
return {'ar': ar, 'recalls': recalls, 'thresholds': thresholds,
'gt_overlaps': gt_overlaps}
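# Note on evaluate_recall above: with the default 0.05 step the IoU
# thresholds are 0.50, 0.55, ..., 0.95 (ten values), and 'ar' is simply the
# mean recall over them.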
def create_roidb_from_box_list(self, box_list, gt_roidb):
assert len(box_list) == self.num_images, \
'Number of boxes must match number of ground-truth images'
roidb = []
for i in range(self.num_images):
boxes = box_list[i]
num_boxes = boxes.shape[0]
overlaps = np.zeros((num_boxes, self.num_classes), dtype=np.float32)
if gt_roidb is not None and gt_roidb[i]['boxes'].size > 0:
gt_boxes = gt_roidb[i]['boxes']
gt_classes = gt_roidb[i]['gt_classes']
gt_overlaps = bbox_overlaps(boxes.astype(np.float),
gt_boxes.astype(np.float))
argmaxes = gt_overlaps.argmax(axis=1)
maxes = gt_overlaps.max(axis=1)
I = np.where(maxes > 0)[0]
overlaps[I, gt_classes[argmaxes[I]]] = maxes[I]
overlaps = scipy.sparse.csr_matrix(overlaps)
roidb.append({
'boxes': boxes,
'gt_classes': np.zeros((num_boxes,), dtype=np.int32),
'gt_overlaps': overlaps,
'flipped': False,
'seg_areas': np.zeros((num_boxes,), dtype=np.float32),
})
return roidb
@staticmethod
def merge_roidbs(a, b):
assert len(a) == len(b)
for i in range(len(a)):
a[i]['boxes'] = np.vstack((a[i]['boxes'], b[i]['boxes']))
a[i]['gt_classes'] = np.hstack((a[i]['gt_classes'],
b[i]['gt_classes']))
a[i]['gt_overlaps'] = scipy.sparse.vstack([a[i]['gt_overlaps'],
b[i]['gt_overlaps']])
a[i]['seg_areas'] = np.hstack((a[i]['seg_areas'],
b[i]['seg_areas']))
return a
def competition_mode(self, on):
"""Turn competition mode on or off."""
pass
================================================
FILE: lib/datasets/pascal_voc.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
from datasets.imdb import imdb
import datasets.ds_utils as ds_utils
import xml.etree.ElementTree as ET
import numpy as np
import scipy.sparse
import scipy.io as sio
import utils.cython_bbox
import pickle
import subprocess
import uuid
from .voc_eval import voc_eval
from model.config import cfg
class pascal_voc(imdb):
def __init__(self, image_set, year, devkit_path=None):
imdb.__init__(self, 'voc_' + year + '_' + image_set)
self._year = year
self._image_set = image_set
self._devkit_path = self._get_default_path() if devkit_path is None \
else devkit_path
self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
self._classes = ('__background__', # always index 0
'aeroplane', 'bicycle', 'bird', 'boat',
'bottle', 'bus', 'car', 'cat', 'chair',
'cow', 'diningtable', 'dog', 'horse',
'motorbike', 'person', 'pottedplant',
'sheep', 'sofa', 'train', 'tvmonitor')
self._class_to_ind = dict(list(zip(self.classes, list(range(self.num_classes)))))
self._image_ext = '.jpg'
self._image_index = self._load_image_set_index()
# Default to roidb handler
self._roidb_handler = self.gt_roidb
self._salt = str(uuid.uuid4())
self._comp_id = 'comp4'
# PASCAL specific config options
self.config = {'cleanup': True,
'use_salt': True,
'use_diff': False,
'matlab_eval': False,
'rpn_file': None}
assert os.path.exists(self._devkit_path), \
'VOCdevkit path does not exist: {}'.format(self._devkit_path)
assert os.path.exists(self._data_path), \
'Path does not exist: {}'.format(self._data_path)
def image_path_at(self, i):
"""
Return the absolute path to image i in the image sequence.
"""
return self.image_path_from_index(self._image_index[i])
def image_path_from_index(self, index):
"""
Construct an image path from the image's "index" identifier.
"""
image_path = os.path.join(self._data_path, 'JPEGImages',
index + self._image_ext)
assert os.path.exists(image_path), \
'Path does not exist: {}'.format(image_path)
return image_path
def _load_image_set_index(self):
"""
Load the indexes listed in this dataset's image set file.
"""
# Example path to image set file:
# self._devkit_path + /VOCdevkit2007/VOC2007/ImageSets/Main/val.txt
image_set_file = os.path.join(self._data_path, 'ImageSets', 'Main',
self._image_set + '.txt')
assert os.path.exists(image_set_file), \
'Path does not exist: {}'.format(image_set_file)
with open(image_set_file) as f:
image_index = [x.strip() for x in f.readlines()]
return image_index
def _get_default_path(self):
"""
Return the default path where PASCAL VOC is expected to be installed.
"""
return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)
def gt_roidb(self):
"""
Return the database of ground-truth regions of interest.
This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
try:
roidb = pickle.load(fid)
except:
roidb = pickle.load(fid, encoding='bytes')
print('{} gt roidb loaded from {}'.format(self.name, cache_file))
return roidb
gt_roidb = [self._load_pascal_annotation(index)
for index in self.image_index]
with open(cache_file, 'wb') as fid:
pickle.dump(gt_roidb, fid, pickle.HIGHEST_PROTOCOL)
print('wrote gt roidb to {}'.format(cache_file))
return gt_roidb
def rpn_roidb(self):
if int(self._year) == 2007 or self._image_set != 'test':
gt_roidb = self.gt_roidb()
rpn_roidb = self._load_rpn_roidb(gt_roidb)
roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb)
else:
roidb = self._load_rpn_roidb(None)
return roidb
def _load_rpn_roidb(self, gt_roidb):
filename = self.config['rpn_file']
print('loading {}'.format(filename))
assert os.path.exists(filename), \
'rpn data not found at: {}'.format(filename)
with open(filename, 'rb') as f:
box_list = pickle.load(f)
return self.create_roidb_from_box_list(box_list, gt_roidb)
def _load_pascal_annotation(self, index):
"""
Load image and bounding boxes info from XML file in the PASCAL VOC
format.
"""
filename = os.path.join(self._data_path, 'Annotations', index + '.xml')
tree = ET.parse(filename)
objs = tree.findall('object')
if not self.config['use_diff']:
# Exclude the samples labeled as difficult
non_diff_objs = [
obj for obj in objs if int(obj.find('difficult').text) == 0]
# if len(non_diff_objs) != len(objs):
# print 'Removed {} difficult objects'.format(
# len(objs) - len(non_diff_objs))
objs = non_diff_objs
num_objs = len(objs)
boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, obj in enumerate(objs):
bbox = obj.find('bndbox')
# Make pixel indexes 0-based
x1 = float(bbox.find('xmin').text) - 1
y1 = float(bbox.find('ymin').text) - 1
x2 = float(bbox.find('xmax').text) - 1
y2 = float(bbox.find('ymax').text) - 1
cls = self._class_to_ind[obj.find('name').text.lower().strip()]
boxes[ix, :] = [x1, y1, x2, y2]
gt_classes[ix] = cls
overlaps[ix, cls] = 1.0
seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
overlaps = scipy.sparse.csr_matrix(overlaps)
return {'boxes': boxes,
'gt_classes': gt_classes,
'gt_overlaps': overlaps,
'flipped': False,
'seg_areas': seg_areas}
def _get_comp_id(self):
comp_id = (self._comp_id + '_' + self._salt if self.config['use_salt']
else self._comp_id)
return comp_id
def _get_voc_results_file_template(self):
# VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt
filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'
path = os.path.join(
self._devkit_path,
'results',
'VOC' + self._year,
'Main',
filename)
return path
def _write_voc_results_file(self, all_boxes):
for cls_ind, cls in enumerate(self.classes):
if cls == '__background__':
continue
print('Writing {} VOC results file'.format(cls))
filename = self._get_voc_results_file_template().format(cls)
with open(filename, 'wt') as f:
for im_ind, index in enumerate(self.image_index):
dets = all_boxes[cls_ind][im_ind]
if len(dets) == 0:
continue
# the VOCdevkit expects 1-based indices
for k in range(dets.shape[0]):
f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
format(index, dets[k, -1],
dets[k, 0] + 1, dets[k, 1] + 1,
dets[k, 2] + 1, dets[k, 3] + 1))
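# Each line written above is "<image_id> <score> <x1> <y1> <x2> <y2>" with
# 1-based coordinates; a hypothetical line:
#   000012 0.912 48.0 240.0 195.0 371.0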
def _do_python_eval(self, output_dir='output'):
annopath = os.path.join(
self._devkit_path,
'VOC' + self._year,
'Annotations',
'{:s}.xml')
imagesetfile = os.path.join(
self._devkit_path,
'VOC' + self._year,
'ImageSets',
'Main',
self._image_set + '.txt')
cachedir = os.path.join(self._devkit_path, 'annotations_cache')
aps = []
# The PASCAL VOC metric changed in 2010
use_07_metric = True if int(self._year) < 2010 else False
print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
if not os.path.isdir(output_dir):
os.mkdir(output_dir)
for i, cls in enumerate(self._classes):
if cls == '__background__':
continue
filename = self._get_voc_results_file_template().format(cls)
rec, prec, ap = voc_eval(
filename, annopath, imagesetfile, cls, cachedir, ovthresh=0.5,
use_07_metric=use_07_metric)
aps += [ap]
print(('AP for {} = {:.4f}'.format(cls, ap)))
with open(os.path.join(output_dir, cls + '_pr.pkl'), 'wb') as f:
pickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
print(('Mean AP = {:.4f}'.format(np.mean(aps))))
print('~~~~~~~~')
print('Results:')
for ap in aps:
print(('{:.3f}'.format(ap)))
print(('{:.3f}'.format(np.mean(aps))))
print('~~~~~~~~')
print('')
print('--------------------------------------------------------------')
print('Results computed with the **unofficial** Python eval code.')
print('Results should be very close to the official MATLAB eval code.')
print('Recompute with `./tools/reval.py --matlab ...` for your paper.')
print('-- Thanks, The Management')
print('--------------------------------------------------------------')
def _do_matlab_eval(self, output_dir='output'):
print('-----------------------------------------------------')
print('Computing results with the official MATLAB eval code.')
print('-----------------------------------------------------')
path = os.path.join(cfg.ROOT_DIR, 'lib', 'datasets',
'VOCdevkit-matlab-wrapper')
cmd = 'cd {} && '.format(path)
cmd += '{:s} -nodisplay -nodesktop '.format(cfg.MATLAB)
cmd += '-r "dbstop if error; '
cmd += 'voc_eval(\'{:s}\',\'{:s}\',\'{:s}\',\'{:s}\'); quit;"' \
.format(self._devkit_path, self._get_comp_id(),
self._image_set, output_dir)
print(('Running:\n{}'.format(cmd)))
status = subprocess.call(cmd, shell=True)
def evaluate_detections(self, all_boxes, output_dir):
self._write_voc_results_file(all_boxes)
self._do_python_eval(output_dir)
if self.config['matlab_eval']:
self._do_matlab_eval(output_dir)
if self.config['cleanup']:
for cls in self._classes:
if cls == '__background__':
continue
filename = self._get_voc_results_file_template().format(cls)
os.remove(filename)
def competition_mode(self, on):
if on:
self.config['use_salt'] = False
self.config['cleanup'] = False
else:
self.config['use_salt'] = True
self.config['cleanup'] = True
if __name__ == '__main__':
from datasets.pascal_voc import pascal_voc
d = pascal_voc('trainval', '2007')
res = d.roidb
from IPython import embed;
embed()
================================================
FILE: lib/datasets/tools/mcg_munge.py
================================================
import os
import sys
"""Hacky tool to convert file system layout of MCG boxes downloaded from
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/
so that it's consistent with those computed by Jan Hosang (see:
http://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-
computing/research/object-recognition-and-scene-understanding/how-
good-are-detection-proposals-really/)
NB: Boxes from the MCG website are in (y1, x1, y2, x2) order.
Boxes from Hosang et al. are in (x1, y1, x2, y2) order.
"""
def munge(src_dir):
# stored as: ./MCG-COCO-val2014-boxes/COCO_val2014_000000193401.mat
# want: ./MCG/mat/COCO_val2014_0/COCO_val2014_000000141/COCO_val2014_000000141334.mat
files = os.listdir(src_dir)
for fn in files:
base, ext = os.path.splitext(fn)
# first 14 chars / first 22 chars / all chars + .mat
# COCO_val2014_0/COCO_val2014_000000447/COCO_val2014_000000447991.mat
first = base[:14]
second = base[:22]
dst_dir = os.path.join('MCG', 'mat', first, second)
if not os.path.exists(dst_dir):
os.makedirs(dst_dir)
src = os.path.join(src_dir, fn)
dst = os.path.join(dst_dir, fn)
print 'MV: {} -> {}'.format(src, dst)
os.rename(src, dst)
if __name__ == '__main__':
# src_dir should look something like:
# src_dir = 'MCG-COCO-val2014-boxes'
src_dir = sys.argv[1]
munge(src_dir)
================================================
FILE: lib/datasets/voc_eval.py
================================================
# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Bharath Hariharan
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import xml.etree.ElementTree as ET
import os
import pickle
import numpy as np
def parse_rec(filename):
""" Parse a PASCAL VOC xml file """
tree = ET.parse(filename)
objects = []
for obj in tree.findall('object'):
obj_struct = {}
obj_struct['name'] = obj.find('name').text
obj_struct['pose'] = obj.find('pose').text
obj_struct['truncated'] = int(obj.find('truncated').text)
obj_struct['difficult'] = int(obj.find('difficult').text)
bbox = obj.find('bndbox')
obj_struct['bbox'] = [int(bbox.find('xmin').text),
int(bbox.find('ymin').text),
int(bbox.find('xmax').text),
int(bbox.find('ymax').text)]
objects.append(obj_struct)
return objects
def voc_ap(rec, prec, use_07_metric=False):
""" ap = voc_ap(rec, prec, [use_07_metric])
Compute VOC AP given precision and recall.
If use_07_metric is true, uses the
VOC 07 11 point method (default:False).
"""
if use_07_metric:
# 11 point metric
ap = 0.
for t in np.arange(0., 1.1, 0.1):
if np.sum(rec >= t) == 0:
p = 0
else:
p = np.max(prec[rec >= t])
ap = ap + p / 11.
else:
# correct AP calculation
# first append sentinel values at the end
mrec = np.concatenate(([0.], rec, [1.]))
mpre = np.concatenate(([0.], prec, [0.]))
# compute the precision envelope
for i in range(mpre.size - 1, 0, -1):
mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
# to calculate area under PR curve, look for points
# where X axis (recall) changes value
i = np.where(mrec[1:] != mrec[:-1])[0]
# and sum (\Delta recall) * prec
ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
return ap
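# A quick sanity check (not from the original file): with two detections
# giving recall/precision points rec = [0.5, 1.0], prec = [1.0, 0.5], the
# interpolated area under the PR curve is 0.5 * 1.0 + 0.5 * 0.5 = 0.75:
#   voc_ap(np.array([0.5, 1.0]), np.array([1.0, 0.5]))  # -> 0.75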
def voc_eval(detpath,
annopath,
imagesetfile,
classname,
cachedir,
ovthresh=0.5,
use_07_metric=False):
"""rec, prec, ap = voc_eval(detpath,
annopath,
imagesetfile,
classname,
[ovthresh],
[use_07_metric])
Top level function that does the PASCAL VOC evaluation.
detpath: Path to detections
detpath.format(classname) should produce the detection results file.
annopath: Path to annotations
annopath.format(imagename) should be the xml annotations file.
imagesetfile: Text file containing the list of images, one image per line.
classname: Category name (duh)
cachedir: Directory for caching the annotations
[ovthresh]: Overlap threshold (default = 0.5)
[use_07_metric]: Whether to use VOC07's 11 point AP computation
(default False)
"""
# assumes detections are in detpath.format(classname)
# assumes annotations are in annopath.format(imagename)
# assumes imagesetfile is a text file with each line an image name
# cachedir caches the annotations in a pickle file
# first load gt
if not os.path.isdir(cachedir):
os.mkdir(cachedir)
cachefile = os.path.join(cachedir, 'annots.pkl')
# read list of images
with open(imagesetfile, 'r') as f:
lines = f.readlines()
imagenames = [x.strip() for x in lines]
if not os.path.isfile(cachefile):
# load annots
recs = {}
for i, imagename in enumerate(imagenames):
recs[imagename] = parse_rec(annopath.format(imagename))
if i % 100 == 0:
print('Reading annotation for {:d}/{:d}'.format(
i + 1, len(imagenames)))
# save
print('Saving cached annotations to {:s}'.format(cachefile))
with open(cachefile, 'wb') as f:
pickle.dump(recs, f)
else:
# load
with open(cachefile, 'rb') as f:
try:
recs = pickle.load(f)
except:
recs = pickle.load(f, encoding='bytes')
# extract gt objects for this class
class_recs = {}
npos = 0
for imagename in imagenames:
R = [obj for obj in recs[imagename] if obj['name'] == classname]
bbox = np.array([x['bbox'] for x in R])
difficult = np.array([x['difficult'] for x in R]).astype(np.bool)
det = [False] * len(R)
npos = npos + sum(~difficult)
class_recs[imagename] = {'bbox': bbox,
'difficult': difficult,
'det': det}
# read dets
detfile = detpath.format(classname)
with open(detfile, 'r') as f:
lines = f.readlines()
splitlines = [x.strip().split(' ') for x in lines]
image_ids = [x[0] for x in splitlines]
confidence = np.array([float(x[1]) for x in splitlines])
BB = np.array([[float(z) for z in x[2:]] for x in splitlines])
nd = len(image_ids)
tp = np.zeros(nd)
fp = np.zeros(nd)
if BB.shape[0] > 0:
# sort by confidence
sorted_ind = np.argsort(-confidence)
sorted_scores = np.sort(-confidence)
BB = BB[sorted_ind, :]
image_ids = [image_ids[x] for x in sorted_ind]
# go down dets and mark TPs and FPs
for d in range(nd):
R = class_recs[image_ids[d]]
bb = BB[d, :].astype(float)
ovmax = -np.inf
BBGT = R['bbox'].astype(float)
if BBGT.size > 0:
# compute overlaps
# intersection
ixmin = np.maximum(BBGT[:, 0], bb[0])
iymin = np.maximum(BBGT[:, 1], bb[1])
ixmax = np.minimum(BBGT[:, 2], bb[2])
iymax = np.minimum(BBGT[:, 3], bb[3])
iw = np.maximum(ixmax - ixmin + 1., 0.)
ih = np.maximum(iymax - iymin + 1., 0.)
inters = iw * ih
# union
uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
(BBGT[:, 2] - BBGT[:, 0] + 1.) *
(BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)
overlaps = inters / uni
ovmax = np.max(overlaps)
jmax = np.argmax(overlaps)
if ovmax > ovthresh:
if not R['difficult'][jmax]:
if not R['det'][jmax]:
tp[d] = 1.
R['det'][jmax] = 1
else:
fp[d] = 1.
else:
fp[d] = 1.
# compute precision recall
fp = np.cumsum(fp)
tp = np.cumsum(tp)
rec = tp / float(npos)
# avoid divide by zero in case the first detection matches a difficult
# ground truth
prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
ap = voc_ap(rec, prec, use_07_metric)
return rec, prec, ap
================================================
FILE: lib/layer_utils/__init__.py
================================================
================================================
FILE: lib/layer_utils/anchor_target_layer.py
================================================
# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
from model.config import cfg
import numpy as np
import numpy.random as npr
from utils.cython_bbox import bbox_overlaps
from model.bbox_transform import bbox_transform
def anchor_target_layer(rpn_cls_score, gt_boxes, im_info, _feat_stride, all_anchors, num_anchors):
"""Same as the anchor target layer in original Fast/er RCNN """
A = num_anchors
total_anchors = all_anchors.shape[0]
K = total_anchors // num_anchors
im_info = im_info[0]
# allow boxes to sit over the edge by a small amount
_allowed_border = 0
# map of shape (..., H, W)
height, width = rpn_cls_score.shape[1:3]
# only keep anchors inside the image
inds_inside = np.where(
(all_anchors[:, 0] >= -_allowed_border) &
(all_anchors[:, 1] >= -_allowed_border) &
(all_anchors[:, 2] < im_info[1] + _allowed_border) & # width
(all_anchors[:, 3] < im_info[0] + _allowed_border) # height
)[0]
# keep only inside anchors
anchors = all_anchors[inds_inside, :]
# label: 1 is positive, 0 is negative, -1 is don't care
labels = np.empty((len(inds_inside),), dtype=np.float32)
labels.fill(-1)
# overlaps between the anchors and the gt boxes
# overlaps (ex, gt)
overlaps = bbox_overlaps(
np.ascontiguousarray(anchors, dtype=np.float),
np.ascontiguousarray(gt_boxes, dtype=np.float))
argmax_overlaps = overlaps.argmax(axis=1)
max_overlaps = overlaps[np.arange(len(inds_inside)), argmax_overlaps]
gt_argmax_overlaps = overlaps.argmax(axis=0)
gt_max_overlaps = overlaps[gt_argmax_overlaps,
np.arange(overlaps.shape[1])]
gt_argmax_overlaps = np.where(overlaps == gt_max_overlaps)[0]
if not cfg.TRAIN.RPN_CLOBBER_POSITIVES:
# assign bg labels first so that positive labels can clobber them
# first set the negatives
labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0
# fg label: for each gt, anchor with highest overlap
labels[gt_argmax_overlaps] = 1
# fg label: above threshold IOU
labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1
if cfg.TRAIN.RPN_CLOBBER_POSITIVES:
# assign bg labels last so that negative labels can clobber positives
labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0
# subsample positive labels if we have too many
num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE)
fg_inds = np.where(labels == 1)[0]
if len(fg_inds) > num_fg:
disable_inds = npr.choice(
fg_inds, size=(len(fg_inds) - num_fg), replace=False)
labels[disable_inds] = -1
# subsample negative labels if we have too many
num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1)
bg_inds = np.where(labels == 0)[0]
if len(bg_inds) > num_bg:
disable_inds = npr.choice(
bg_inds, size=(len(bg_inds) - num_bg), replace=False)
labels[disable_inds] = -1
bbox_targets = np.zeros((len(inds_inside), 4), dtype=np.float32)
bbox_targets = _compute_targets(anchors, gt_boxes[argmax_overlaps, :])
bbox_inside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
# only the positive ones have regression targets
bbox_inside_weights[labels == 1, :] = np.array(cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS)
bbox_outside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:
# uniform weighting of examples (given non-uniform sampling)
num_examples = np.sum(labels >= 0)
positive_weights = np.ones((1, 4)) * 1.0 / num_examples
negative_weights = np.ones((1, 4)) * 1.0 / num_examples
else:
assert ((cfg.TRAIN.RPN_POSITIVE_WEIGHT > 0) &
(cfg.TRAIN.RPN_POSITIVE_WEIGHT < 1))
positive_weights = (cfg.TRAIN.RPN_POSITIVE_WEIGHT /
np.sum(labels == 1))
negative_weights = ((1.0 - cfg.TRAIN.RPN_POSITIVE_WEIGHT) /
np.sum(labels == 0))
bbox_outside_weights[labels == 1, :] = positive_weights
bbox_outside_weights[labels == 0, :] = negative_weights
# map up to original set of anchors
labels = _unmap(labels, total_anchors, inds_inside, fill=-1)
bbox_targets = _unmap(bbox_targets, total_anchors, inds_inside, fill=0)
bbox_inside_weights = _unmap(bbox_inside_weights, total_anchors, inds_inside, fill=0)
bbox_outside_weights = _unmap(bbox_outside_weights, total_anchors, inds_inside, fill=0)
# labels
labels = labels.reshape((1, height, width, A)).transpose(0, 3, 1, 2)
labels = labels.reshape((1, 1, A * height, width))
rpn_labels = labels
# bbox_targets
bbox_targets = bbox_targets \
.reshape((1, height, width, A * 4))
rpn_bbox_targets = bbox_targets
# bbox_inside_weights
bbox_inside_weights = bbox_inside_weights \
.reshape((1, height, width, A * 4))
rpn_bbox_inside_weights = bbox_inside_weights
# bbox_outside_weights
bbox_outside_weights = bbox_outside_weights \
.reshape((1, height, width, A * 4))
rpn_bbox_outside_weights = bbox_outside_weights
return rpn_labels, rpn_bbox_targets, rpn_bbox_inside_weights, rpn_bbox_outside_weights
def _unmap(data, count, inds, fill=0):
""" Unmap a subset of item (data) back to the original set of items (of
size count) """
if len(data.shape) == 1:
ret = np.empty((count,), dtype=np.float32)
ret.fill(fill)
ret[inds] = data
else:
ret = np.empty((count,) + data.shape[1:], dtype=np.float32)
ret.fill(fill)
ret[inds, :] = data
return ret
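# e.g. _unmap(np.array([7., 9.]), 4, np.array([1, 3]), fill=-1) returns
# array([-1., 7., -1., 9.]): the two values are scattered back to their
# original indices and everything else takes the fill value.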
def _compute_targets(ex_rois, gt_rois):
"""Compute bounding-box regression targets for an image."""
assert ex_rois.shape[0] == gt_rois.shape[0]
assert ex_rois.shape[1] == 4
assert gt_rois.shape[1] == 5
return bbox_transform(ex_rois, gt_rois[:, :4]).astype(np.float32, copy=False)
================================================
FILE: lib/layer_utils/generate_anchors.py
================================================
# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Sean Bell
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
# Verify that we compute the same anchors as Shaoqing's matlab implementation:
#
# >> load output/rpn_cachedir/faster_rcnn_VOC2007_ZF_stage1_rpn/anchors.mat
# >> anchors
#
# anchors =
#
# -83 -39 100 56
# -175 -87 192 104
# -359 -183 376 200
# -55 -55 72 72
# -119 -119 136 136
# -247 -247 264 264
# -35 -79 52 96
# -79 -167 96 184
# -167 -343 184 360
# array([[ -83., -39., 100., 56.],
# [-175., -87., 192., 104.],
# [-359., -183., 376., 200.],
# [ -55., -55., 72., 72.],
# [-119., -119., 136., 136.],
# [-247., -247., 264., 264.],
# [ -35., -79., 52., 96.],
# [ -79., -167., 96., 184.],
# [-167., -343., 184., 360.]])
def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
scales=2 ** np.arange(3, 6)):
"""
Generate anchor (reference) windows by enumerating aspect ratios X
scales wrt a reference (0, 0, 15, 15) window.
"""
base_anchor = np.array([1, 1, base_size, base_size]) - 1
ratio_anchors = _ratio_enum(base_anchor, ratios)
anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
for i in range(ratio_anchors.shape[0])])
return anchors
def _whctrs(anchor):
"""
Return width, height, x center, and y center for an anchor (window).
"""
w = anchor[2] - anchor[0] + 1
h = anchor[3] - anchor[1] + 1
x_ctr = anchor[0] + 0.5 * (w - 1)
y_ctr = anchor[1] + 0.5 * (h - 1)
return w, h, x_ctr, y_ctr
def _mkanchors(ws, hs, x_ctr, y_ctr):
"""
Given a vector of widths (ws) and heights (hs) around a center
(x_ctr, y_ctr), output a set of anchors (windows).
"""
ws = ws[:, np.newaxis]
hs = hs[:, np.newaxis]
anchors = np.hstack((x_ctr - 0.5 * (ws - 1),
y_ctr - 0.5 * (hs - 1),
x_ctr + 0.5 * (ws - 1),
y_ctr + 0.5 * (hs - 1)))
return anchors
def _ratio_enum(anchor, ratios):
"""
Enumerate a set of anchors for each aspect ratio wrt an anchor.
"""
w, h, x_ctr, y_ctr = _whctrs(anchor)
size = w * h
size_ratios = size / ratios
ws = np.round(np.sqrt(size_ratios))
hs = np.round(ws * ratios)
anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
return anchors
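# Worked example for the 16x16 base anchor and ratio 0.5: size = 256,
# size / 0.5 = 512, ws = round(sqrt(512)) = 23, hs = round(23 * 0.5) = 12,
# which _mkanchors turns into (-3.5, 2, 18.5, 13) around center (7.5, 7.5).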
def _scale_enum(anchor, scales):
"""
Enumerate a set of anchors for each scale wrt an anchor.
"""
w, h, x_ctr, y_ctr = _whctrs(anchor)
ws = w * scales
hs = h * scales
anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
return anchors
if __name__ == '__main__':
import time
t = time.time()
a = generate_anchors()
print(time.time() - t)
print(a)
from IPython import embed;
embed()
================================================
FILE: lib/layer_utils/proposal_layer.py
================================================
# --------------------------------------------------------
# Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from model.config import cfg
from model.bbox_transform import bbox_transform_inv, clip_boxes
from model.nms_wrapper import nms
def proposal_layer(rpn_cls_prob, rpn_bbox_pred, im_info, cfg_key, _feat_stride, anchors, num_anchors):
"""A simplified version compared to fast/er RCNN
For details please see the technical report
"""
if type(cfg_key) == bytes:
cfg_key = cfg_key.decode('utf-8')
pre_nms_topN = cfg[cfg_key].RPN_PRE_NMS_TOP_N
post_nms_topN = cfg[cfg_key].RPN_POST_NMS_TOP_N
nms_thresh = cfg[cfg_key].RPN_NMS_THRESH
im_info = im_info[0]
# Get the scores and bounding boxes
scores = rpn_cls_prob[:, :, :, num_anchors:]
rpn_bbox_pred = rpn_bbox_pred.reshape((-1, 4))
scores = scores.reshape((-1, 1))
proposals = bbox_transform_inv(anchors, rpn_bbox_pred)
proposals = clip_boxes(proposals, im_info[:2])
# Pick the top region proposals
order = scores.ravel().argsort()[::-1]
if pre_nms_topN > 0:
order = order[:pre_nms_topN]
proposals = proposals[order, :]
scores = scores[order]
# Non-maximal suppression
keep = nms(np.hstack((proposals, scores)), nms_thresh)
# Pick the top region proposals after NMS
if post_nms_topN > 0:
keep = keep[:post_nms_topN]
proposals = proposals[keep, :]
scores = scores[keep]
# Only support single image as input
batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
return blob, scores
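# The returned blob has shape (N, 5): a zero batch index followed by
# (x1, y1, x2, y2) for each surviving proposal; scores has shape (N, 1).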
================================================
FILE: lib/layer_utils/proposal_target_layer.py
================================================
# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick, Sean Bell and Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import numpy.random as npr
from model.config import cfg
from model.bbox_transform import bbox_transform
from utils.cython_bbox import bbox_overlaps
def proposal_target_layer(rpn_rois, rpn_scores, gt_boxes, _num_classes):
"""
Assign object detection proposals to ground-truth targets. Produces proposal
classification labels and bounding-box regression targets.
"""
# Proposal ROIs (0, x1, y1, x2, y2) coming from RPN
# (i.e., rpn.proposal_layer.ProposalLayer), or any other source
all_rois = rpn_rois
all_scores = rpn_scores
# Include ground-truth boxes in the set of candidate rois
if cfg.TRAIN.USE_GT:
zeros = np.zeros((gt_boxes.shape[0], 1), dtype=gt_boxes.dtype)
all_rois = np.vstack(
(all_rois, np.hstack((zeros, gt_boxes[:, :-1])))
)
# not sure if this appending is wise, but it is unused anyway
all_scores = np.vstack((all_scores, zeros))
num_images = 1
rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images
fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
# Sample rois with classification labels and bounding box regression
# targets
labels, rois, roi_scores, bbox_targets, bbox_inside_weights = _sample_rois(
all_rois, all_scores, gt_boxes, fg_rois_per_image,
rois_per_image, _num_classes)
rois = rois.reshape(-1, 5)
roi_scores = roi_scores.reshape(-1)
labels = labels.reshape(-1, 1)
bbox_targets = bbox_targets.reshape(-1, _num_classes * 4)
bbox_inside_weights = bbox_inside_weights.reshape(-1, _num_classes * 4)
bbox_outside_weights = np.array(bbox_inside_weights > 0).astype(np.float32)
return rois, roi_scores, labels, bbox_targets, bbox_inside_weights, bbox_outside_weights
def _get_bbox_regression_labels(bbox_target_data, num_classes):
"""Bounding-box regression targets (bbox_target_data) are stored in a
compact form N x (class, tx, ty, tw, th)
This function expands those targets into the 4-of-4*K representation used
by the network (i.e. only one class has non-zero targets).
Returns:
bbox_target (ndarray): N x 4K blob of regression targets
bbox_inside_weights (ndarray): N x 4K blob of loss weights
"""
clss = bbox_target_data[:, 0]
bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
inds = np.where(clss > 0)[0]
for ind in inds:
cls = clss[ind]
start = int(4 * cls)
end = start + 4
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
return bbox_targets, bbox_inside_weights
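# Illustration (hypothetical values): with num_classes = 3, a row of
# bbox_target_data like [2, tx, ty, tw, th] expands to the 12-wide row
# [0, 0, 0, 0, 0, 0, 0, 0, tx, ty, tw, th], with the matching inside
# weights (cfg.TRAIN.BBOX_INSIDE_WEIGHTS) placed in the same four slots.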
def _compute_targets(ex_rois, gt_rois, labels):
"""Compute bounding-box regression targets for an image."""
assert ex_rois.shape[0] == gt_rois.shape[0]
assert ex_rois.shape[1] == 4
assert gt_rois.shape[1] == 4
targets = bbox_transform(ex_rois, gt_rois)
if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:
# Optionally normalize targets by a precomputed mean and stdev
targets = ((targets - np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS))
/ np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS))
return np.hstack(
(labels[:, np.newaxis], targets)).astype(np.float32, copy=False)
def _sample_rois(all_rois, all_scores, gt_boxes, fg_rois_per_image, rois_per_image, num_classes):
"""Generate a random sample of RoIs comprising foreground and background
examples.
"""
# overlaps: (rois x gt_boxes)
overlaps = bbox_overlaps(
np.ascontiguousarray(all_rois[:, 1:5], dtype=np.float),
np.ascontiguousarray(gt_boxes[:, :4], dtype=np.float))
gt_assignment = overlaps.argmax(axis=1)
max_overlaps = overlaps.max(axis=1)
labels = gt_boxes[gt_assignment, 4]
# Select foreground RoIs as those with >= FG_THRESH overlap
fg_inds = np.where(max_overlaps >= cfg.TRAIN.FG_THRESH)[0]
# Guard against the case when an image has fewer than fg_rois_per_image
# Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
bg_inds = np.where((max_overlaps < cfg.TRAIN.BG_THRESH_HI) &
(max_overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
# Small modification to the original version where we ensure a fixed number of regions are sampled
if fg_inds.size > 0 and bg_inds.size > 0:
fg_rois_per_image = min(fg_rois_per_image, fg_inds.size)
fg_inds = npr.choice(fg_inds, size=int(fg_rois_per_image), replace=False)
bg_rois_per_image = rois_per_image - fg_rois_per_image
to_replace = bg_inds.size < bg_rois_per_image
bg_inds = npr.choice(bg_inds, size=int(bg_rois_per_image), replace=to_replace)
elif fg_inds.size > 0:
to_replace = fg_inds.size < rois_per_image
fg_inds = npr.choice(fg_inds, size=int(rois_per_image), replace=to_replace)
fg_rois_per_image = rois_per_image
elif bg_inds.size > 0:
to_replace = bg_inds.size < rois_per_image
bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
fg_rois_per_image = 0
else:
import pdb
pdb.set_trace()
# The indices that we're selecting (both fg and bg)
keep_inds = np.append(fg_inds, bg_inds)
# Select sampled values from various arrays:
labels = labels[keep_inds]
# Clamp labels for the background RoIs to 0
labels[int(fg_rois_per_image):] = 0
rois = all_rois[keep_inds]
roi_scores = all_scores[keep_inds]
bbox_target_data = _compute_targets(
rois[:, 1:5], gt_boxes[gt_assignment[keep_inds], :4], labels)
bbox_targets, bbox_inside_weights = \
_get_bbox_regression_labels(bbox_target_data, num_classes)
return labels, rois, roi_scores, bbox_targets, bbox_inside_weights
================================================
FILE: lib/layer_utils/proposal_top_layer.py
================================================
# --------------------------------------------------------
# Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from model.config import cfg
from model.bbox_transform import bbox_transform_inv, clip_boxes
import numpy.random as npr
def proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, im_info, _feat_stride, anchors, num_anchors):
"""A layer that just selects the top region proposals
without using non-maximal suppression.
For details please see the technical report.
"""
rpn_top_n = cfg.TEST.RPN_TOP_N
im_info = im_info[0]
scores = rpn_cls_prob[:, :, :, num_anchors:]
rpn_bbox_pred = rpn_bbox_pred.reshape((-1, 4))
scores = scores.reshape((-1, 1))
length = scores.shape[0]
if length < rpn_top_n:
# Random selection, maybe unnecessary and loses good proposals
# But such a case rarely happens
top_inds = npr.choice(length, size=rpn_top_n, replace=True)
else:
top_inds = scores.argsort(0)[::-1]
top_inds = top_inds[:rpn_top_n]
top_inds = top_inds.reshape(rpn_top_n, )
# Do the selection here
anchors = anchors[top_inds, :]
rpn_bbox_pred = rpn_bbox_pred[top_inds, :]
scores = scores[top_inds]
# Convert anchors into proposals via bbox transformations
proposals = bbox_transform_inv(anchors, rpn_bbox_pred)
# Clip predicted boxes to image
proposals = clip_boxes(proposals, im_info[:2])
# Output rois blob
# Our RPN implementation only supports a single input image, so all
# batch inds are 0
batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
return blob, scores
================================================
FILE: lib/layer_utils/snippets.py
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import numpy.random as npr
from model.config import cfg
from layer_utils.generate_anchors import generate_anchors
from model.bbox_transform import bbox_transform_inv, clip_boxes
from utils.cython_bbox import bbox_overlaps
def generate_anchors_pre(height, width, feat_stride, anchor_scales=(8,16,32), anchor_ratios=(0.5,1,2)):
""" A wrapper function to generate anchors given different scales
Also return the number of anchors in variable 'length'
"""
anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))
A = anchors.shape[0]
shift_x = np.arange(0, width) * feat_stride
shift_y = np.arange(0, height) * feat_stride
shift_x, shift_y = np.meshgrid(shift_x, shift_y)
shifts = np.vstack((shift_x.ravel(), shift_y.ravel(), shift_x.ravel(), shift_y.ravel())).transpose()
K = shifts.shape[0]
# width changes faster, so here it is H, W, C
anchors = anchors.reshape((1, A, 4)) + shifts.reshape((1, K, 4)).transpose((1, 0, 2))
anchors = anchors.reshape((K * A, 4)).astype(np.float32, copy=False)
length = np.int32(anchors.shape[0])
return anchors, length
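if __name__ == '__main__':
    # A minimal sketch (not part of the original file): for a 38x50 feature
    # map at stride 16 with the default 3 scales x 3 ratios (A = 9), we
    # expect 38 * 50 * 9 = 17100 anchors of 4 coordinates each.
    anchors, length = generate_anchors_pre(38, 50, 16)
    print(anchors.shape)  # (17100, 4)
    print(length)         # 17100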
================================================
FILE: lib/model/__init__.py
================================================
from . import config
================================================
FILE: lib/model/bbox_transform.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
def bbox_transform(ex_rois, gt_rois):
ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights
gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights
targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
targets_dw = np.log(gt_widths / ex_widths)
targets_dh = np.log(gt_heights / ex_heights)
targets = np.vstack(
(targets_dx, targets_dy, targets_dw, targets_dh)).transpose()
return targets
def bbox_transform_inv(boxes, deltas):
if boxes.shape[0] == 0:
return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)
boxes = boxes.astype(deltas.dtype, copy=False)
widths = boxes[:, 2] - boxes[:, 0] + 1.0
heights = boxes[:, 3] - boxes[:, 1] + 1.0
ctr_x = boxes[:, 0] + 0.5 * widths
ctr_y = boxes[:, 1] + 0.5 * heights
dx = deltas[:, 0::4]
dy = deltas[:, 1::4]
dw = deltas[:, 2::4]
dh = deltas[:, 3::4]
pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
pred_w = np.exp(dw) * widths[:, np.newaxis]
pred_h = np.exp(dh) * heights[:, np.newaxis]
pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
# x1
pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
# y1
pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
# x2
pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
# y2
pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h
return pred_boxes
def clip_boxes(boxes, im_shape):
"""
Clip boxes to image boundaries.
"""
# x1 >= 0
boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)
# y1 >= 0
boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)
# x2 < im_shape[1]
boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)
# y2 < im_shape[0]
boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)
return boxes
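if __name__ == '__main__':
    # A small sketch (not part of the original file). Note the asymmetry in
    # this parameterization: widths use the +1 "inclusive pixel" convention
    # when encoding, but decoding adds the half-width back without removing
    # the 1, so x1/y1 round-trip exactly while x2/y2 come back +1.
    ex = np.array([[0., 0., 15., 15.]])
    gt = np.array([[2., 4., 20., 18.]])
    deltas = bbox_transform(ex, gt)
    print(bbox_transform_inv(ex, deltas))  # [[ 2.  4. 21. 19.]]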
================================================
FILE: lib/model/config.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import os.path as osp
import numpy as np
# `pip install easydict` if you don't have it
from easydict import EasyDict as edict
__C = edict()
# Consumers can get config by:
# from fast_rcnn_config import cfg
cfg = __C
#
# Training options
#
__C.TRAIN = edict()
# Initial learning rate
__C.TRAIN.LEARNING_RATE = 0.0001
# Momentum
__C.TRAIN.MOMENTUM = 0.9
# Weight decay, for regularization
__C.TRAIN.WEIGHT_DECAY = 0.0005
# Factor for reducing the learning rate
__C.TRAIN.GAMMA = 0.1
# Step size for reducing the learning rate; currently only one step is supported
__C.TRAIN.STEPSIZE = 30000
# Iteration intervals for showing the loss during training, on command line interface
__C.TRAIN.DISPLAY = 10
# Whether to double the learning rate for bias
__C.TRAIN.DOUBLE_BIAS = True
# Whether to initialize the weights with truncated normal distribution
__C.TRAIN.TRUNCATED = False
# Whether to have weight decay on bias as well
__C.TRAIN.BIAS_DECAY = False
# Whether to add ground truth boxes to the pool when sampling regions
__C.TRAIN.USE_GT = False
# Whether to use aspect-ratio grouping of training images, introduced merely for saving
# GPU memory
__C.TRAIN.ASPECT_GROUPING = False
# The number of snapshots kept, older ones are deleted to save space
__C.TRAIN.SNAPSHOT_KEPT = 3
# The time interval for saving tensorflow summaries
__C.TRAIN.SUMMARY_INTERVAL = 180
# Scale to use during training (can list multiple scales)
# The scale is the pixel size of an image's shortest side
__C.TRAIN.SCALES = (600,)
# Max pixel size of the longest side of a scaled input image
__C.TRAIN.MAX_SIZE = 1000
# Images to use per minibatch
__C.TRAIN.IMS_PER_BATCH = 1
# Minibatch size (number of regions of interest [ROIs])
__C.TRAIN.BATCH_SIZE = 128
# Fraction of minibatch that is labeled foreground (i.e. class > 0)
__C.TRAIN.FG_FRACTION = 0.25
# Overlap threshold for a ROI to be considered foreground (if >= FG_THRESH)
__C.TRAIN.FG_THRESH = 0.5
# Overlap threshold for a ROI to be considered background (class = 0 if
# overlap in [LO, HI))
__C.TRAIN.BG_THRESH_HI = 0.5
__C.TRAIN.BG_THRESH_LO = 0.1
# Use horizontally-flipped images during training?
#__C.TRAIN.USE_FLIPPED = True
__C.TRAIN.USE_FLIPPED = False
# Train bounding-box regressors
__C.TRAIN.BBOX_REG = True
# Overlap required between a ROI and ground-truth box in order for that ROI to
# be used as a bounding-box regression training example
__C.TRAIN.BBOX_THRESH = 0.5
# Iterations between snapshots
__C.TRAIN.SNAPSHOT_ITERS = 3000
# solver.prototxt specifies the snapshot path prefix, this adds an optional
# infix to yield the path: <prefix>[_<infix>]_iters_XYZ.caffemodel
__C.TRAIN.SNAPSHOT_PREFIX = 'res101_faster_rcnn'
# __C.TRAIN.SNAPSHOT_INFIX = ''
# Use a prefetch thread in roi_data_layer.layer
# So far I haven't found this useful; likely more engineering work is required
# __C.TRAIN.USE_PREFETCH = False
# Normalize the targets (subtract empirical mean, divide by empirical stddev)
__C.TRAIN.BBOX_NORMALIZE_TARGETS = True
# Deprecated (inside weights)
__C.TRAIN.BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)
# Normalize the targets using "precomputed" (or made up) means and stdevs
# (BBOX_NORMALIZE_TARGETS must also be True)
__C.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED = True
__C.TRAIN.BBOX_NORMALIZE_MEANS = (0.0, 0.0, 0.0, 0.0)
__C.TRAIN.BBOX_NORMALIZE_STDS = (0.1, 0.1, 0.2, 0.2)
# Train using these proposals
__C.TRAIN.PROPOSAL_METHOD = 'gt'
# Make minibatches from images that have similar aspect ratios (i.e. both
# tall and thin or both short and wide) in order to avoid wasting computation
# on zero-padding.
# Use RPN to detect objects
__C.TRAIN.HAS_RPN = True
# IOU >= thresh: positive example
__C.TRAIN.RPN_POSITIVE_OVERLAP = 0.7
# IOU < thresh: negative example
__C.TRAIN.RPN_NEGATIVE_OVERLAP = 0.3
# If an anchor satisfies both the positive and negative conditions, set it to negative
__C.TRAIN.RPN_CLOBBER_POSITIVES = False
# Max number of foreground examples
__C.TRAIN.RPN_FG_FRACTION = 0.5
# Total number of examples
__C.TRAIN.RPN_BATCHSIZE = 256
# NMS threshold used on RPN proposals
__C.TRAIN.RPN_NMS_THRESH = 0.7
# Number of top scoring boxes to keep before applying NMS to RPN proposals
__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000
# Number of top scoring boxes to keep after applying NMS to RPN proposals
__C.TRAIN.RPN_POST_NMS_TOP_N = 2000
# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)
# __C.TRAIN.RPN_MIN_SIZE = 16
# Deprecated (outside weights)
__C.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)
# Give the positive RPN examples weight of p * 1 / {num positives}
# and give negatives a weight of (1 - p)
# Set to -1.0 to use uniform example weighting
__C.TRAIN.RPN_POSITIVE_WEIGHT = -1.0
# Whether to use all ground truth bounding boxes for training,
# For COCO, setting USE_ALL_GT to False will exclude boxes that are flagged as ''iscrowd''
__C.TRAIN.USE_ALL_GT = True
#
# Testing options
#
__C.TEST = edict()
# Scale to use during testing (can NOT list multiple scales)
# The scale is the pixel size of an image's shortest side
__C.TEST.SCALES = (600,)
# Max pixel size of the longest side of a scaled input image
__C.TEST.MAX_SIZE = 1000
# Overlap threshold used for non-maximum suppression (suppress boxes with
# IoU >= this threshold)
__C.TEST.NMS = 0.3
# Experimental: treat the (K+1) units in the cls_score layer as linear
# predictors (trained, eg, with one-vs-rest SVMs).
__C.TEST.SVM = False
# Test using bounding-box regressors
__C.TEST.BBOX_REG = True
# Propose boxes
__C.TEST.HAS_RPN = False
# Test using these proposals
__C.TEST.PROPOSAL_METHOD = 'gt'
## NMS threshold used on RPN proposals
__C.TEST.RPN_NMS_THRESH = 0.7
## Number of top scoring boxes to keep before applying NMS to RPN proposals
__C.TEST.RPN_PRE_NMS_TOP_N = 6000
## Number of top scoring boxes to keep after applying NMS to RPN proposals
__C.TEST.RPN_POST_NMS_TOP_N = 300
# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)
# __C.TEST.RPN_MIN_SIZE = 16
# Testing mode; defaults to 'nms'. 'top' is slower but better
# See report for details
__C.TEST.MODE = 'nms'
# Only useful when TEST.MODE is 'top', specifies the number of top proposals to select
__C.TEST.RPN_TOP_N = 5000
#
# ResNet options
#
__C.RESNET = edict()
# Option to set if max-pooling is appended after crop_and_resize.
# If true, the region will be resized to a square of 2xPOOLING_SIZE,
# then 2x2 max-pooling is applied; otherwise the region will be directly
# resized to a square of POOLING_SIZE
__C.RESNET.MAX_POOL = False
# Number of fixed blocks during finetuning, by default the first of all 4 blocks is fixed
# Range: 0 (none) to 3 (all)
__C.RESNET.FIXED_BLOCKS = 1
# Whether to tune the batch normalization parameters during training
__C.RESNET.BN_TRAIN = False
#
# MISC
#
# The mapping from image coordinates to feature map coordinates might cause
# some boxes that are distinct in image space to become identical in feature
# coordinates. If DEDUP_BOXES > 0, then DEDUP_BOXES is used as the scale factor
# for identifying duplicate boxes.
# 1/16 is correct for {Alex,Caffe}Net, VGG_CNN_M_1024, and VGG16
__C.DEDUP_BOXES = 1. / 16.
# Pixel mean values (BGR order) as a (1, 1, 3) array
# We use the same pixel mean for all networks even though it's not exactly what
# they were trained with
__C.PIXEL_MEANS = np.array([[[102.9801, 115.9465, 122.7717]]])
# For reproducibility
__C.RNG_SEED = 3
# A small number that's used many times
__C.EPS = 1e-14
# Root directory of project
__C.ROOT_DIR = osp.abspath(osp.join(osp.dirname(__file__), '..', '..'))
# Data directory
__C.DATA_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'data'))
# Name (or path to) the matlab executable
__C.MATLAB = 'matlab'
# Place outputs under an experiments directory
__C.EXP_DIR = 'default'
# Use GPU implementation of non-maximum suppression
__C.USE_GPU_NMS = True
# Default GPU device id
__C.GPU_ID = 0
# Default pooling mode, only 'crop' is available
__C.POOLING_MODE = 'crop'
# Size of the pooled region after RoI pooling
__C.POOLING_SIZE = 7
# Anchor scales for RPN
__C.ANCHOR_SCALES = [8,16,32]
# Anchor ratios for RPN
__C.ANCHOR_RATIOS = [0.5,1,2]
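# With the defaults above, the RPN places 3 scales x 3 ratios = 9 anchors at
# each feature-map location; at the feature stride of 16 used by these
# backbones, scales 8/16/32 correspond to base boxes of 128/256/512 pixels.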
def get_output_dir(imdb, weights_filename):
"""Return the directory where experimental artifacts are placed.
If the directory does not exist, it is created.
A canonical path is built using the name from an imdb and a network
(if not None).
"""
outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'output', __C.EXP_DIR, imdb.name))
if weights_filename is None:
weights_filename = 'default'
outdir = osp.join(outdir, weights_filename)
if not os.path.exists(outdir):
os.makedirs(outdir)
return outdir
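# For example, with the defaults (EXP_DIR = 'default', weights_filename None)
# this resolves to <ROOT_DIR>/output/default/<imdb.name>/default.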
def get_output_tb_dir(imdb, weights_filename):
"""Return the directory where tensorflow summaries are placed.
If the directory does not exist, it is created.
A canonical path is built using the name from an imdb and a network
(if not None).
"""
outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'tensorboard', __C.EXP_DIR, imdb.name))
if weights_filename is None:
weights_filename = 'default'
outdir = osp.join(outdir, weights_filename)
if not os.path.exists(outdir):
os.makedirs(outdir)
return outdir
def _merge_a_into_b(a, b):
"""Merge config dictionary a into config dictionary b, clobbering the
options in b whenever they are also specified in a.
"""
if type(a) is not edict:
return
for k, v in a.items():
# a must specify keys that are in b
if k not in b:
raise KeyError('{} is not a valid config key'.format(k))
# the types must match, too
old_type = type(b[k])
if old_type is not type(v):
if isinstance(b[k], np.ndarray):
v = np.array(v, dtype=b[k].dtype)
else:
raise ValueError(('Type mismatch ({} vs. {}) '
'for config key: {}').format(type(b[k]),
type(v), k))
# recursively merge dicts
if type(v) is edict:
try:
_merge_a_into_b(a[k], b[k])
except:
print(('Error under config key: {}'.format(k)))
raise
else:
b[k] = v
def cfg_from_file(filename):
"""Load a config file and merge it into the default options."""
import yaml
with open(filename, 'r') as f:
yaml_cfg = edict(yaml.safe_load(f))
_merge_a_into_b(yaml_cfg, __C)
def cfg_from_list(cfg_list):
"""Set config keys via list (e.g., from command line)."""
from ast import literal_eval
assert len(cfg_list) % 2 == 0
for k, v in zip(cfg_list[0::2], cfg_list[1::2]):
key_list = k.split('.')
d = __C
for subkey in key_list[:-1]:
assert subkey in d
d = d[subkey]
subkey = key_list[-1]
assert subkey in d
try:
value = literal_eval(v)
except:
# handle the case when v is a string literal
value = v
assert type(value) == type(d[subkey]), \
'type {} does not match original type {}'.format(
type(value), type(d[subkey]))
d[subkey] = value
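# Illustrative usage of the two override helpers above (a sketch, not part of
# this module). The YAML file is one of the configs shipped under
# experiments/cfgs/; every key must already exist in __C with a matching type:
#
#   from model.config import cfg, cfg_from_file, cfg_from_list
#   cfg_from_file('experiments/cfgs/res50.yml')
#   cfg_from_list(['TRAIN.STEPSIZE', '20000', 'TRAIN.SNAPSHOT_ITERS', '5000'])
#   assert cfg.TRAIN.STEPSIZE == 20000  # values are parsed by ast.literal_eval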
================================================
FILE: lib/model/config.py~
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import os.path as osp
import numpy as np
# `pip install easydict` if you don't have it
from easydict import EasyDict as edict
__C = edict()
# Consumers can get config by:
# from model.config import cfg
cfg = __C
#
# Training options
#
__C.TRAIN = edict()
# Initial learning rate
__C.TRAIN.LEARNING_RATE = 0.001
# Momentum
__C.TRAIN.MOMENTUM = 0.9
# Weight decay, for regularization
__C.TRAIN.WEIGHT_DECAY = 0.0005
# Factor for reducing the learning rate
__C.TRAIN.GAMMA = 0.1
# Step size for reducing the learning rate; currently only one step is supported
__C.TRAIN.STEPSIZE = 30000
# Iteration interval for displaying the loss during training on the command line
__C.TRAIN.DISPLAY = 10
# Whether to double the learning rate for bias
__C.TRAIN.DOUBLE_BIAS = True
# Whether to initialize the weights with truncated normal distribution
__C.TRAIN.TRUNCATED = False
# Whether to have weight decay on bias as well
__C.TRAIN.BIAS_DECAY = False
# Whether to add ground truth boxes to the pool when sampling regions
__C.TRAIN.USE_GT = False
# Whether to use aspect-ratio grouping of training images, introduced merely for saving
# GPU memory
__C.TRAIN.ASPECT_GROUPING = False
# The number of snapshots kept, older ones are deleted to save space
__C.TRAIN.SNAPSHOT_KEPT = 3
# The time interval, in seconds, between saving tensorflow summaries
__C.TRAIN.SUMMARY_INTERVAL = 180
# Scale to use during training (can list multiple scales)
# The scale is the pixel size of an image's shortest side
__C.TRAIN.SCALES = (600,)
# Max pixel size of the longest side of a scaled input image
__C.TRAIN.MAX_SIZE = 1000
# Images to use per minibatch
__C.TRAIN.IMS_PER_BATCH = 1
# Minibatch size (number of regions of interest [ROIs])
__C.TRAIN.BATCH_SIZE = 128
# Fraction of minibatch that is labeled foreground (i.e. class > 0)
__C.TRAIN.FG_FRACTION = 0.25
# Overlap threshold for a ROI to be considered foreground (if >= FG_THRESH)
__C.TRAIN.FG_THRESH = 0.5
# Overlap threshold for a ROI to be considered background (class = 0 if
# overlap in [LO, HI))
__C.TRAIN.BG_THRESH_HI = 0.5
__C.TRAIN.BG_THRESH_LO = 0.1
# Use horizontally-flipped images during training?
#__C.TRAIN.USE_FLIPPED = True
__C.TRAIN.USE_FLIPPED = False
# Train bounding-box regressors
__C.TRAIN.BBOX_REG = True
# Overlap required between a ROI and ground-truth box in order for that ROI to
# be used as a bounding-box regression training example
__C.TRAIN.BBOX_THRESH = 0.5
# Iterations between snapshots
__C.TRAIN.SNAPSHOT_ITERS = 3000
# solver.prototxt specifies the snapshot path prefix; this adds an optional
# infix to yield the path: <prefix>[_<infix>]_iters_XYZ.caffemodel
__C.TRAIN.SNAPSHOT_PREFIX = 'res101_faster_rcnn'
# __C.TRAIN.SNAPSHOT_INFIX = ''
# Use a prefetch thread in roi_data_layer.layer
# So far I haven't found this useful; likely more engineering work is required
# __C.TRAIN.USE_PREFETCH = False
# Normalize the targets (subtract empirical mean, divide by empirical stddev)
__C.TRAIN.BBOX_NORMALIZE_TARGETS = True
# Deprecated (inside weights)
__C.TRAIN.BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)
# Normalize the targets using "precomputed" (or made up) means and stdevs
# (BBOX_NORMALIZE_TARGETS must also be True)
__C.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED = True
__C.TRAIN.BBOX_NORMALIZE_MEANS = (0.0, 0.0, 0.0, 0.0)
__C.TRAIN.BBOX_NORMALIZE_STDS = (0.1, 0.1, 0.2, 0.2)
# Train using these proposals
__C.TRAIN.PROPOSAL_METHOD = 'gt'
# Use RPN to detect objects
__C.TRAIN.HAS_RPN = True
# IOU >= thresh: positive example
__C.TRAIN.RPN_POSITIVE_OVERLAP = 0.7
# IOU < thresh: negative example
__C.TRAIN.RPN_NEGATIVE_OVERLAP = 0.3
# If an anchor satisfies both the positive and negative conditions, set it to negative
__C.TRAIN.RPN_CLOBBER_POSITIVES = False
# Max fraction of foreground examples in the RPN minibatch
__C.TRAIN.RPN_FG_FRACTION = 0.5
# Total number of examples
__C.TRAIN.RPN_BATCHSIZE = 256
# NMS threshold used on RPN proposals
__C.TRAIN.RPN_NMS_THRESH = 0.7
# Number of top scoring boxes to keep before applying NMS to RPN proposals
__C.TRAIN.RPN_PRE_NMS_TOP_N = 12000
# Number of top scoring boxes to keep after applying NMS to RPN proposals
__C.TRAIN.RPN_POST_NMS_TOP_N = 2000
# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)
# __C.TRAIN.RPN_MIN_SIZE = 16
# Deprecated (inside weights)
__C.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)
# Give the positive RPN examples weight of p * 1 / {num positives}
# and give negatives a weight of (1 - p)
# Set to -1.0 to use uniform example weighting
__C.TRAIN.RPN_POSITIVE_WEIGHT = -1.0
# Whether to use all ground-truth bounding boxes for training.
# For COCO, setting USE_ALL_GT to False will exclude boxes that are flagged as 'iscrowd'
__C.TRAIN.USE_ALL_GT = True
#
# Testing options
#
__C.TEST = edict()
# Scale to use during testing (can NOT list multiple scales)
# The scale is the pixel size of an image's shortest side
__C.TEST.SCALES = (600,)
# Max pixel size of the longest side of a scaled input image
__C.TEST.MAX_SIZE = 1000
# Overlap threshold used for non-maximum suppression (suppress boxes with
# IoU >= this threshold)
__C.TEST.NMS = 0.3
# Experimental: treat the (K+1) units in the cls_score layer as linear
# predictors (trained, e.g., with one-vs-rest SVMs).
__C.TEST.SVM = False
# Test using bounding-box regressors
__C.TEST.BBOX_REG = True
# Propose boxes
__C.TEST.HAS_RPN = False
# Test using these proposals
__C.TEST.PROPOSAL_METHOD = 'gt'
## NMS threshold used on RPN proposals
__C.TEST.RPN_NMS_THRESH = 0.7
## Number of top scoring boxes to keep before applying NMS to RPN proposals
__C.TEST.RPN_PRE_NMS_TOP_N = 6000
## Number of top scoring boxes to keep after applying NMS to RPN proposals
__C.TEST.RPN_POST_NMS_TOP_N = 300
# Proposal height and width both need to be greater than RPN_MIN_SIZE (at orig image scale)
# __C.TEST.RPN_MIN_SIZE = 16
# Testing mode, defaults to 'nms'; 'top' is slower but better
# See report for details
__C.TEST.MODE = 'nms'
# Only useful when TEST.MODE is 'top', specifies the number of top proposals to select
__C.TEST.RPN_TOP_N = 5000
#
# ResNet options
#
__C.RESNET = edict()
# Option to set if max-pooling is appended after crop_and_resize.
# If true, the region will be resized to a square of 2xPOOLING_SIZE,
# then 2x2 max-pooling is applied; otherwise the region will be directly
# resized to a square of POOLING_SIZE
__C.RESNET.MAX_POOL = False
# Number of fixed blocks during finetuning; by default the first of the 4 blocks is fixed
# Range: 0 (none) to 3 (all)
__C.RESNET.FIXED_BLOCKS = 1
# Whether to tune the batch normalization parameters during training
__C.RESNET.BN_TRAIN = False
#
# MISC
#
# The mapping from image coordinates to feature map coordinates might cause
# some boxes that are distinct in image space to become identical in feature
# coordinates. If DEDUP_BOXES > 0, then DEDUP_BOXES is used as the scale factor
# for identifying duplicate boxes.
# 1/16 is correct for {Alex,Caffe}Net, VGG_CNN_M_1024, and VGG16
__C.DEDUP_BOXES = 1. / 16.
# Pixel mean values (BGR order) as a (1, 1, 3) array
# We use the same pixel mean for all networks even though it's not exactly what
# they were trained with
__C.PIXEL_MEANS = np.array([[[102.9801, 115.9465, 122.7717]]])
# For reproducibility
__C.RNG_SEED = 3
# A small number that's used many times
__C.EPS = 1e-14
# Root directory of project
__C.ROOT_DIR = osp.abspath(osp.join(osp.dirname(__file__), '..', '..'))
# Data directory
__C.DATA_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'data'))
# Name (or path to) the matlab executable
__C.MATLAB = 'matlab'
# Place outputs under an experiments directory
__C.EXP_DIR = 'default'
# Use GPU implementation of non-maximum suppression
__C.USE_GPU_NMS = True
# Default GPU device id
__C.GPU_ID = 0
# Default pooling mode, only 'crop' is available
__C.POOLING_MODE = 'crop'
# Size of the pooled region after RoI pooling
__C.POOLING_SIZE = 7
# Anchor scales for RPN
__C.ANCHOR_SCALES = [8,16,32]
# Anchor ratios for RPN
__C.ANCHOR_RATIOS = [0.5,1,2]
def get_output_dir(imdb, weights_filename):
"""Return the directory where experimental artifacts are placed.
If the directory does not exist, it is created.
A canonical path is built using the name from an imdb and a network
(if not None).
"""
outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'output', __C.EXP_DIR, imdb.name))
if weights_filename is None:
weights_filename = 'default'
outdir = osp.join(outdir, weights_filename)
if not os.path.exists(outdir):
os.makedirs(outdir)
return outdir
def get_output_tb_dir(imdb, weights_filename):
"""Return the directory where tensorflow summaries are placed.
If the directory does not exist, it is created.
A canonical path is built using the name from an imdb and a network
(if not None).
"""
outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'tensorboard', __C.EXP_DIR, imdb.name))
if weights_filename is None:
weights_filename = 'default'
outdir = osp.join(outdir, weights_filename)
if not os.path.exists(outdir):
os.makedirs(outdir)
return outdir
def _merge_a_into_b(a, b):
"""Merge config dictionary a into config dictionary b, clobbering the
options in b whenever they are also specified in a.
"""
if type(a) is not edict:
return
for k, v in a.items():
# a must specify keys that are in b
if k not in b:
raise KeyError('{} is not a valid config key'.format(k))
# the types must match, too
old_type = type(b[k])
if old_type is not type(v):
if isinstance(b[k], np.ndarray):
v = np.array(v, dtype=b[k].dtype)
else:
raise ValueError(('Type mismatch ({} vs. {}) '
'for config key: {}').format(type(b[k]),
type(v), k))
# recursively merge dicts
if type(v) is edict:
try:
_merge_a_into_b(a[k], b[k])
except:
print(('Error under config key: {}'.format(k)))
raise
else:
b[k] = v
def cfg_from_file(filename):
"""Load a config file and merge it into the default options."""
import yaml
with open(filename, 'r') as f:
yaml_cfg = edict(yaml.safe_load(f))
_merge_a_into_b(yaml_cfg, __C)
def cfg_from_list(cfg_list):
"""Set config keys via list (e.g., from command line)."""
from ast import literal_eval
assert len(cfg_list) % 2 == 0
for k, v in zip(cfg_list[0::2], cfg_list[1::2]):
key_list = k.split('.')
d = __C
for subkey in key_list[:-1]:
assert subkey in d
d = d[subkey]
subkey = key_list[-1]
assert subkey in d
try:
value = literal_eval(v)
except:
# handle the case when v is a string literal
value = v
assert type(value) == type(d[subkey]), \
'type {} does not match original type {}'.format(
type(value), type(d[subkey]))
d[subkey] = value
================================================
FILE: lib/model/nms_wrapper.py
================================================
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from model.config import cfg
from nms.gpu_nms import gpu_nms
from nms.cpu_nms import cpu_nms
def nms(dets, thresh, force_cpu=False):
"""Dispatch to either CPU or GPU NMS implementations."""
if dets.shape[0] == 0:
return []
if cfg.USE_GPU_NMS and not force_cpu:
return gpu_nms(dets, thresh, device_id=cfg.GPU_ID)
else:
return cpu_nms(dets, thresh)
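# For reference, a pure-NumPy sketch of the greedy NMS that the compiled
# kernels above implement (the repository also ships a pure-Python fallback in
# lib/nms/py_cpu_nms.py). `dets` is an (N, 5) array of [x1, y1, x2, y2, score].
import numpy as np

def nms_numpy(dets, thresh):
  x1, y1, x2, y2, scores = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3], dets[:, 4]
  areas = (x2 - x1 + 1) * (y2 - y1 + 1)
  order = scores.argsort()[::-1]  # indices sorted by descending score
  keep = []
  while order.size > 0:
    i = order[0]
    keep.append(i)
    # intersection of the top-scoring box with every remaining box
    w = np.maximum(0.0, np.minimum(x2[i], x2[order[1:]]) - np.maximum(x1[i], x1[order[1:]]) + 1)
    h = np.maximum(0.0, np.minimum(y2[i], y2[order[1:]]) - np.maximum(y1[i], y1[order[1:]]) + 1)
    inter = w * h
    ovr = inter / (areas[i] + areas[order[1:]] - inter)
    # keep only boxes whose IoU with the selected box is at most thresh
    order = order[np.where(ovr <= thresh)[0] + 1]
  return keep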
================================================
FILE: lib/model/test.py
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import cv2
import numpy as np
try:
import cPickle as pickle
except ImportError:
import pickle
import os
import math
from utils.timer import Timer
from utils.cython_nms import nms, nms_new
from utils.boxes_grid import get_boxes_grid
from utils.blob import im_list_to_blob
from model.config import cfg, get_output_dir
from model.bbox_transform import clip_boxes, bbox_transform_inv
def _get_image_blob(im):
"""Converts an image into a network input.
Arguments:
im (ndarray): a color image in BGR order
Returns:
blob (ndarray): a data blob holding an image pyramid
im_scale_factors (list): list of image scales (relative to im) used
in the image pyramid
"""
im_orig = im.astype(np.float32, copy=True)
im_orig -= cfg.PIXEL_MEANS
im_shape = im_orig.shape
im_size_min = np.min(im_shape[0:2])
im_size_max = np.max(im_shape[0:2])
processed_ims = []
im_scale_factors = []
for target_size in cfg.TEST.SCALES:
im_scale = float(target_size) / float(im_size_min)
# Prevent the biggest axis from being more than MAX_SIZE
if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:
im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)
im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
interpolation=cv2.INTER_LINEAR)
im_scale_factors.append(im_scale)
processed_ims.append(im)
# Create a blob to hold the input images
blob = im_list_to_blob(processed_ims)
return blob, np.array(im_scale_factors)
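# Worked example of the resizing above: a 480x640 input with
# TEST.SCALES = (600,) and TEST.MAX_SIZE = 1000 gives
# im_scale = 600/480 = 1.25; since 640 * 1.25 = 800 <= 1000, the blob holds a
# single 600x800 image and im_scale_factors == [1.25].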
def _get_blobs(im):
"""Convert an image and RoIs within that image into network inputs."""
blobs = {}
blobs['data'], im_scale_factors = _get_image_blob(im)
return blobs, im_scale_factors
def _clip_boxes(boxes, im_shape):
"""Clip boxes to image boundaries."""
# x1 >= 0
boxes[:, 0::4] = np.maximum(boxes[:, 0::4], 0)
# y1 >= 0
boxes[:, 1::4] = np.maximum(boxes[:, 1::4], 0)
# x2 < im_shape[1]
boxes[:, 2::4] = np.minimum(boxes[:, 2::4], im_shape[1] - 1)
# y2 < im_shape[0]
boxes[:, 3::4] = np.minimum(boxes[:, 3::4], im_shape[0] - 1)
return boxes
def _rescale_boxes(boxes, inds, scales):
"""Rescale boxes according to image rescaling."""
for i in range(boxes.shape[0]):
boxes[i,:] = boxes[i,:] / scales[int(inds[i])]
return boxes
def im_detect(sess, net, im):
blobs, im_scales = _get_blobs(im)
assert len(im_scales) == 1, "Only single-image batch implemented"
im_blob = blobs['data']
# im_info is [blob height, blob width, image scale]; the scale is 1.0 when
# the image was not resized
blobs['im_info'] = np.array([[im_blob.shape[1], im_blob.shape[2], im_scales[0]]], dtype=np.float32)
_, scores, bbox_pred, rois = net.test_image(sess, blobs['data'], blobs['im_info'])
boxes = rois[:, 1:5] / im_scales[0]
# print(scores.shape, bbox_pred.shape, rois.shape, boxes.shape)
scores = np.reshape(scores, [scores.shape[0], -1])
bbox_pred = np.reshape(bbox_pred, [bbox_pred.shape[0], -1])
if cfg.TEST.BBOX_REG:
# Apply bounding-box regression deltas
box_deltas = bbox_pred
pred_boxes = bbox_transform_inv(boxes, box_deltas)
pred_boxes = _clip_boxes(pred_boxes, im.shape)
else:
# Simply repeat the boxes, once for each class
pred_boxes = np.tile(boxes, (1, scores.shape[1]))
return scores, pred_boxes
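# Shape note: with num_classes = K+1 (background included), `scores` is
# (R, K+1) and `pred_boxes` is (R, 4*(K+1)); columns [4j:4(j+1)] hold the box
# refined for class j, which is how test_net below slices per-class detections.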
def apply_nms(all_boxes, thresh):
"""Apply non-maximum suppression to all predicted boxes output by the
test_net method.
"""
num_classes = len(all_boxes)
num_images = len(all_boxes[0])
nms_boxes = [[[] for _ in range(num_images)] for _ in range(num_classes)]
for cls_ind in range(num_classes):
for im_ind in range(num_images):
dets = all_boxes[cls_ind][im_ind]
if len(dets) == 0:
continue
x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = dets[:, 2]
y2 = dets[:, 3]
scores = dets[:, 4]
inds = np.where((x2 > x1) & (y2 > y1) & (scores > cfg.TEST.DET_THRESHOLD))[0]
dets = dets[inds,:]
if len(dets) == 0:
continue
keep = nms(dets, thresh)
if len(keep) == 0:
continue
nms_boxes[cls_ind][im_ind] = dets[keep, :].copy()
return nms_boxes
def test_net(sess, net, imdb, weights_filename, max_per_image=100, thresh=0.05):
"""Test a Fast R-CNN network on an image database."""
np.random.seed(cfg.RNG_SEED)
num_images = len(imdb.image_index)
# all detections are collected into:
# all_boxes[cls][image] = N x 5 array of detections in
# (x1, y1, x2, y2, score)
all_boxes = [[[] for _ in range(num_images)]
for _ in range(imdb.num_classes)]
output_dir = get_output_dir(imdb, weights_filename)
# timers
_t = {'im_detect' : Timer(), 'misc' : Timer()}
for i in range(num_images):
im = cv2.imread(imdb.image_path_at(i))
_t['im_detect'].tic()
scores, boxes = im_detect(sess, net, im)
_t['im_detect'].toc()
_t['misc'].tic()
# skip j = 0, because it's the background class
for j in range(1, imdb.num_classes):
inds = np.where(scores[:, j] > thresh)[0]
cls_scores = scores[inds, j]
cls_boxes = boxes[inds, j*4:(j+1)*4]
cls_dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])) \
.astype(np.float32, copy=False)
keep = nms(cls_dets, cfg.TEST.NMS)
cls_dets = cls_dets[keep, :]
all_boxes[j][i] = cls_dets
# Limit to max_per_image detections *over all classes*
if max_per_image > 0:
image_scores = np.hstack([all_boxes[j][i][:, -1]
for j in range(1, imdb.num_classes)])
if len(image_scores) > max_per_image:
image_thresh = np.sort(image_scores)[-max_per_image]
for j in range(1, imdb.num_classes):
keep = np.where(all_boxes[j][i][:, -1] >= image_thresh)[0]
all_boxes[j][i] = all_boxes[j][i][keep, :]
_t['misc'].toc()
print('im_detect: {:d}/{:d} {:.3f}s {:.3f}s' \
.format(i + 1, num_images, _t['im_detect'].average_time,
_t['misc'].average_time))
det_file = os.path.join(output_dir, 'detections.pkl')
with open(det_file, 'wb') as f:
pickle.dump(all_boxes, f, pickle.HIGHEST_PROTOCOL)
print('Evaluating detections')
imdb.evaluate_detections(all_boxes, output_dir)
================================================
FILE: lib/model/test.py~
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import cv2
import numpy as np
try:
import cPickle as pickle
except ImportError:
import pickle
import os
import math
from utils.timer import Timer
from utils.cython_nms import nms, nms_new
from utils.boxes_grid import get_boxes_grid
from utils.blob import im_list_to_blob
from model.config import cfg, get_output_dir
from model.bbox_transform import clip_boxes, bbox_transform_inv
def _get_image_blob(im):
"""Converts an image into a network input.
Arguments:
im (ndarray): a color image in BGR order
Returns:
blob (ndarray): a data blob holding an image pyramid
im_scale_factors (list): list of image scales (relative to im) used
in the image pyramid
"""
im_orig = im.astype(np.float32, copy=True)
im_orig -= cfg.PIXEL_MEANS
im_shape = im_orig.shape
im_size_min = np.min(im_shape[0:2])
im_size_max = np.max(im_shape[0:2])
processed_ims = []
im_scale_factors = []
for target_size in cfg.TEST.SCALES:
im_scale = float(target_size) / float(im_size_min)
# Prevent the biggest axis from being more than MAX_SIZE
if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:
im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)
im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
interpolation=cv2.INTER_LINEAR)
im_scale_factors.append(im_scale)
processed_ims.append(im)
# Create a blob to hold the input images
blob = im_list_to_blob(processed_ims)
return blob, np.array(im_scale_factors)
def _get_blobs(im):
"""Convert an image and RoIs within that image into network inputs."""
blobs = {}
blobs['data'], im_scale_factors = _get_image_blob(im)
return blobs, im_scale_factors
def _clip_boxes(boxes, im_shape):
"""Clip boxes to image boundaries."""
# x1 >= 0
boxes[:, 0::4] = np.maximum(boxes[:, 0::4], 0)
# y1 >= 0
boxes[:, 1::4] = np.maximum(boxes[:, 1::4], 0)
# x2 < im_shape[1]
boxes[:, 2::4] = np.minimum(boxes[:, 2::4], im_shape[1] - 1)
# y2 < im_shape[0]
boxes[:, 3::4] = np.minimum(boxes[:, 3::4], im_shape[0] - 1)
return boxes
def _rescale_boxes(boxes, inds, scales):
"""Rescale boxes according to image rescaling."""
for i in range(boxes.shape[0]):
boxes[i,:] = boxes[i,:] / scales[int(inds[i])]
return boxes
def im_detect(sess, net, im):
blobs, im_scales = _get_blobs(im)
assert len(im_scales) == 1, "Only single-image batch implemented"
print(im_scales)
im_blob = blobs['data']
# im_info is [blob height, blob width, image scale]; the scale is 1.0 when
# the image was not resized
blobs['im_info'] = np.array([[im_blob.shape[1], im_blob.shape[2], im_scales[0]]], dtype=np.float32)
_, scores, bbox_pred, rois = net.test_image(sess, blobs['data'], blobs['im_info'])
boxes = rois[:, 1:5] / im_scales[0]
# print(scores.shape, bbox_pred.shape, rois.shape, boxes.shape)
scores = np.reshape(scores, [scores.shape[0], -1])
bbox_pred = np.reshape(bbox_pred, [bbox_pred.shape[0], -1])
if cfg.TEST.BBOX_REG:
# Apply bounding-box regression deltas
box_deltas = bbox_pred
pred_boxes = bbox_transform_inv(boxes, box_deltas)
pred_boxes = _clip_boxes(pred_boxes, im.shape)
else:
# Simply repeat the boxes, once for each class
pred_boxes = np.tile(boxes, (1, scores.shape[1]))
return scores, pred_boxes
def apply_nms(all_boxes, thresh):
"""Apply non-maximum suppression to all predicted boxes output by the
test_net method.
"""
num_classes = len(all_boxes)
num_images = len(all_boxes[0])
nms_boxes = [[[] for _ in range(num_images)] for _ in range(num_classes)]
for cls_ind in range(num_classes):
for im_ind in range(num_images):
dets = all_boxes[cls_ind][im_ind]
if len(dets) == 0:
continue
x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = dets[:, 2]
y2 = dets[:, 3]
scores = dets[:, 4]
inds = np.where((x2 > x1) & (y2 > y1) & (scores > cfg.TEST.DET_THRESHOLD))[0]
dets = dets[inds,:]
if len(dets) == 0:
continue
keep = nms(dets, thresh)
if len(keep) == 0:
continue
nms_boxes[cls_ind][im_ind] = dets[keep, :].copy()
return nms_boxes
def test_net(sess, net, imdb, weights_filename, max_per_image=100, thresh=0.05):
"""Test a Fast R-CNN network on an image database."""
np.random.seed(cfg.RNG_SEED)
num_images = len(imdb.image_index)
# all detections are collected into:
# all_boxes[cls][image] = N x 5 array of detections in
# (x1, y1, x2, y2, score)
all_boxes = [[[] for _ in range(num_images)]
for _ in range(imdb.num_classes)]
output_dir = get_output_dir(imdb, weights_filename)
# timers
_t = {'im_detect' : Timer(), 'misc' : Timer()}
for i in range(num_images):
im = cv2.imread(imdb.image_path_at(i))
_t['im_detect'].tic()
scores, boxes = im_detect(sess, net, im)
_t['im_detect'].toc()
_t['misc'].tic()
# skip j = 0, because it's the background class
for j in range(1, imdb.num_classes):
inds = np.where(scores[:, j] > thresh)[0]
cls_scores = scores[inds, j]
cls_boxes = boxes[inds, j*4:(j+1)*4]
cls_dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])) \
.astype(np.float32, copy=False)
keep = nms(cls_dets, cfg.TEST.NMS)
cls_dets = cls_dets[keep, :]
all_boxes[j][i] = cls_dets
# Limit to max_per_image detections *over all classes*
if max_per_image > 0:
image_scores = np.hstack([all_boxes[j][i][:, -1]
for j in range(1, imdb.num_classes)])
if len(image_scores) > max_per_image:
image_thresh = np.sort(image_scores)[-max_per_image]
for j in range(1, imdb.num_classes):
keep = np.where(all_boxes[j][i][:, -1] >= image_thresh)[0]
all_boxes[j][i] = all_boxes[j][i][keep, :]
_t['misc'].toc()
print('im_detect: {:d}/{:d} {:.3f}s {:.3f}s' \
.format(i + 1, num_images, _t['im_detect'].average_time,
_t['misc'].average_time))
det_file = os.path.join(output_dir, 'detections.pkl')
with open(det_file, 'wb') as f:
pickle.dump(all_boxes, f, pickle.HIGHEST_PROTOCOL)
print('Evaluating detections')
imdb.evaluate_detections(all_boxes, output_dir)
================================================
FILE: lib/model/train_val.py
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen and Zheqi He
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from model.config import cfg
import roi_data_layer.roidb as rdl_roidb
from roi_data_layer.layer import RoIDataLayer
from utils.timer import Timer
try:
import cPickle as pickle
except ImportError:
import pickle
import numpy as np
import os
import sys
import glob
import time
import tensorflow as tf
from tensorflow.python import pywrap_tensorflow
class SolverWrapper(object):
"""
A wrapper class for the training process
"""
def __init__(self, sess, network, imdb, roidb, valroidb, output_dir, tbdir, pretrained_model=None):
self.net = network
self.imdb = imdb
self.roidb = roidb
self.valroidb = valroidb
self.output_dir = output_dir
self.tbdir = tbdir
# Simply put '_val' at the end to save the summaries from the validation set
self.tbvaldir = tbdir + '_val'
if not os.path.exists(self.tbvaldir):
os.makedirs(self.tbvaldir)
self.pretrained_model = pretrained_model
def snapshot(self, sess, iter):
net = self.net
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
# Store the model snapshot
filename = cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_{:d}'.format(iter) + '.ckpt'
filename = os.path.join(self.output_dir, filename)
self.saver.save(sess, filename)
print('Wrote snapshot to: {:s}'.format(filename))
# Also store some meta information, random state, etc.
nfilename = cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_{:d}'.format(iter) + '.pkl'
nfilename = os.path.join(self.output_dir, nfilename)
# current state of numpy random
st0 = np.random.get_state()
# current position in the database
cur = self.data_layer._cur
# current shuffled indices of the database
perm = self.data_layer._perm
# current position in the validation database
cur_val = self.data_layer_val._cur
# current shuffled indices of the validation database
perm_val = self.data_layer_val._perm
# Dump the meta info
with open(nfilename, 'wb') as fid:
pickle.dump(st0, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(cur, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(perm, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(cur_val, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(perm_val, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(iter, fid, pickle.HIGHEST_PROTOCOL)
return filename, nfilename
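# Reading the meta file back, as train_model below does when resuming; the
# fields must be unpickled in exactly the order they were dumped:
#
#   with open(nfilename, 'rb') as fid:
#     st0 = pickle.load(fid)                 # numpy RNG state
#     cur = pickle.load(fid)                 # training-db cursor
#     perm = pickle.load(fid)                # training-db shuffled indices
#     cur_val = pickle.load(fid)             # validation-db cursor
#     perm_val = pickle.load(fid)            # validation-db shuffled indices
#     last_snapshot_iter = pickle.load(fid)  # iteration of this snapshot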
def get_variables_in_checkpoint_file(self, file_name):
try:
reader = pywrap_tensorflow.NewCheckpointReader(file_name)
var_to_shape_map = reader.get_variable_to_shape_map()
return var_to_shape_map
except Exception as e: # pylint: disable=broad-except
print(str(e))
if "corrupted compressed block contents" in str(e):
print("It's likely that your checkpoint file has been compressed "
"with SNAPPY.")
def train_model(self, sess, max_iters):
# Build data layers for both training and validation set
self.data_layer = RoIDataLayer(self.roidb, self.imdb.num_classes)
self.data_layer_val = RoIDataLayer(self.valroidb, self.imdb.num_classes, random=True)
# Determine different scales for anchors, see paper
with sess.graph.as_default():
# Set the random seed for tensorflow
tf.set_random_seed(cfg.RNG_SEED)
# Build the main computation graph
layers = self.net.create_architecture(sess, 'TRAIN', self.imdb.num_classes, tag='default',
anchor_scales=cfg.ANCHOR_SCALES,
anchor_ratios=cfg.ANCHOR_RATIOS)
# Define the loss
loss = layers['total_loss']
# Set learning rate and momentum
lr = tf.Variable(cfg.TRAIN.LEARNING_RATE, trainable=False)
momentum = cfg.TRAIN.MOMENTUM
self.optimizer = tf.train.MomentumOptimizer(lr, momentum)
# Compute the gradients wrt the loss
gvs = self.optimizer.compute_gradients(loss)
# Double the gradient of the bias if set
if cfg.TRAIN.DOUBLE_BIAS:
final_gvs = []
with tf.variable_scope('Gradient_Mult') as scope:
for grad, var in gvs:
scale = 1.
if cfg.TRAIN.DOUBLE_BIAS and '/biases:' in var.name:
scale *= 2.
if not np.allclose(scale, 1.0):
grad = tf.multiply(grad, scale)
final_gvs.append((grad, var))
train_op = self.optimizer.apply_gradients(final_gvs)
else:
train_op = self.optimizer.apply_gradients(gvs)
# We will handle the snapshots ourselves
self.saver = tf.train.Saver(max_to_keep=100000)
# Write the train and validation information to tensorboard
self.writer = tf.summary.FileWriter(self.tbdir, sess.graph)
self.valwriter = tf.summary.FileWriter(self.tbvaldir)
# Find previous snapshots if there is any to restore from
sfiles = os.path.join(self.output_dir, cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_*.ckpt.meta')
sfiles = glob.glob(sfiles)
sfiles.sort(key=os.path.getmtime)
# Exclude the extra snapshot taken right before the learning rate is reduced
redstr = '_iter_{:d}.'.format(cfg.TRAIN.STEPSIZE+1)
sfiles = [ss.replace('.meta', '') for ss in sfiles]
sfiles = [ss for ss in sfiles if redstr not in ss]
nfiles = os.path.join(self.output_dir, cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_*.pkl')
nfiles = glob.glob(nfiles)
nfiles.sort(key=os.path.getmtime)
nfiles = [nn for nn in nfiles if redstr not in nn]
lsf = len(sfiles)
assert len(nfiles) == lsf
np_paths = nfiles
ss_paths = sfiles
if lsf == 0:
# Fresh train directly from ImageNet weights
print('Loading initial model weights from {:s}'.format(self.pretrained_model))
variables = tf.global_variables()
# Initialize all variables first
sess.run(tf.variables_initializer(variables, name='init'))
var_keep_dic = self.get_variables_in_checkpoint_file(self.pretrained_model)
# Get the variables to restore, ignoring the variables to fix
variables_to_restore = self.net.get_variables_to_restore(variables, var_keep_dic)
restorer = tf.train.Saver(variables_to_restore)
restorer.restore(sess, self.pretrained_model)
print('Loaded.')
# Fix the excluded variables by loading them with conversion: the RGB
# weights are changed to BGR, and for VGG16 the convolutional fc6 and fc7
# weights are converted to fully connected weights
self.net.fix_variables(sess, self.pretrained_model)
print('Fixed.')
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE))
last_snapshot_iter = 0
else:
# Get the most recent snapshot and restore
ss_paths = [ss_paths[-1]]
np_paths = [np_paths[-1]]
print('Restoring model snapshots from {:s}'.format(sfiles[-1]))
self.saver.restore(sess, str(sfiles[-1]))
print('Restored.')
# Need to restore the other hyperparameters/states for training. (TODO xinlei)
# I have tried my best to save and recover the random states exactly;
# however, the TensorFlow internal state is currently not available
with open(str(nfiles[-1]), 'rb') as fid:
st0 = pickle.load(fid)
cur = pickle.load(fid)
perm = pickle.load(fid)
cur_val = pickle.load(fid)
perm_val = pickle.load(fid)
last_snapshot_iter = pickle.load(fid)
np.random.set_state(st0)
self.data_layer._cur = cur
self.data_layer._perm = perm
self.data_layer_val._cur = cur_val
self.data_layer_val._perm = perm_val
# Set the learning rate, only reduce once
if last_snapshot_iter > cfg.TRAIN.STEPSIZE:
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE * cfg.TRAIN.GAMMA))
else:
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE))
# Sanity check: total number of trainable parameters
a = np.sum([np.prod(v.get_shape().as_list()) for v in tf.trainable_variables()])
print(a)
timer = Timer()
iter = last_snapshot_iter + 1
last_summary_time = time.time()
while iter < max_iters + 1:
# Learning rate
if iter == cfg.TRAIN.STEPSIZE + 1:
# Add snapshot here before reducing the learning rate
self.snapshot(sess, iter)
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE * cfg.TRAIN.GAMMA))
timer.tic()
# Get training data, one batch at a time
blobs = self.data_layer.forward()
now = time.time()
if now - last_summary_time > cfg.TRAIN.SUMMARY_INTERVAL:
# Compute the graph with summary
rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss, summary = \
self.net.train_step_with_summary(sess, blobs, train_op)
self.writer.add_summary(summary, float(iter))
# Also check the summary on the validation set
blobs_val = self.data_layer_val.forward()
summary_val = self.net.get_summary(sess, blobs_val)
self.valwriter.add_summary(summary_val, float(iter))
last_summary_time = now
else:
# Compute the graph without summary
rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss = \
self.net.train_step(sess, blobs, train_op)
timer.toc()
# Display training information
if iter % (cfg.TRAIN.DISPLAY) == 0:
print('iter: %d / %d, total loss: %.6f\n >>> rpn_loss_cls: %.6f\n '
'>>> rpn_loss_box: %.6f\n >>> loss_cls: %.6f\n >>> loss_box: %.6f\n >>> lr: %f' % \
(iter, max_iters, total_loss, rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, lr.eval()))
print('speed: {:.3f}s / iter'.format(timer.average_time))
if iter % cfg.TRAIN.SNAPSHOT_ITERS == 0:
last_snapshot_iter = iter
snapshot_path, np_path = self.snapshot(sess, iter)
np_paths.append(np_path)
ss_paths.append(snapshot_path)
# Remove the old snapshots if there are too many
if len(np_paths) > cfg.TRAIN.SNAPSHOT_KEPT:
to_remove = len(np_paths) - cfg.TRAIN.SNAPSHOT_KEPT
for c in range(to_remove):
nfile = np_paths[0]
os.remove(str(nfile))
np_paths.remove(nfile)
if len(ss_paths) > cfg.TRAIN.SNAPSHOT_KEPT:
to_remove = len(ss_paths) - cfg.TRAIN.SNAPSHOT_KEPT
for c in range(to_remove):
sfile = ss_paths[0]
# To make the code compatible with earlier versions of TensorFlow,
# where the naming convention for checkpoints is different
if os.path.exists(str(sfile)):
os.remove(str(sfile))
else:
os.remove(str(sfile + '.data-00000-of-00001'))
os.remove(str(sfile + '.index'))
sfile_meta = sfile + '.meta'
os.remove(str(sfile_meta))
ss_paths.remove(sfile)
iter += 1
if last_snapshot_iter != iter - 1:
self.snapshot(sess, iter - 1)
self.writer.close()
self.valwriter.close()
def get_training_roidb(imdb):
"""Returns a roidb (Region of Interest database) for use in training."""
if cfg.TRAIN.USE_FLIPPED:
print('Appending horizontally-flipped training examples...')
imdb.append_flipped_images()
print('done')
print('Preparing training data...')
rdl_roidb.prepare_roidb(imdb)
print('done')
return imdb.roidb
def filter_roidb(roidb):
"""Remove roidb entries that have no usable RoIs."""
def is_valid(entry):
# Valid images have:
# (1) At least one foreground RoI OR
# (2) At least one background RoI
overlaps = entry['max_overlaps']
# find boxes with sufficient overlap
fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
# Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
(overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
# image is only valid if such boxes exist
valid = len(fg_inds) > 0 or len(bg_inds) > 0
return valid
num = len(roidb)
filtered_roidb = [entry for entry in roidb if is_valid(entry)]
num_after = len(filtered_roidb)
print('Filtered {} roidb entries: {} -> {}'.format(num - num_after,
num, num_after))
return filtered_roidb
def train_net(network, imdb, roidb, valroidb, output_dir, tb_dir,
pretrained_model=None,
max_iters=40000):
"""Train a Fast R-CNN network."""
roidb = filter_roidb(roidb)
valroidb = filter_roidb(valroidb)
tfconfig = tf.ConfigProto(allow_soft_placement=True)
tfconfig.gpu_options.allow_growth = True
with tf.Session(config=tfconfig) as sess:
sw = SolverWrapper(sess, network, imdb, roidb, valroidb, output_dir, tb_dir,
pretrained_model=pretrained_model)
print('Solving...')
sw.train_model(sess, max_iters)
print('done solving')
================================================
FILE: lib/model/train_val.py~
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen and Zheqi He
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from model.config import cfg
import roi_data_layer.roidb as rdl_roidb
from roi_data_layer.layer import RoIDataLayer
from utils.timer import Timer
try:
import cPickle as pickle
except ImportError:
import pickle
import numpy as np
import os
import sys
import glob
import time
import tensorflow as tf
from tensorflow.python import pywrap_tensorflow
class SolverWrapper(object):
"""
A wrapper class for the training process
"""
def __init__(self, sess, network, imdb, roidb, valroidb, output_dir, tbdir, pretrained_model=None):
self.net = network
self.imdb = imdb
self.roidb = roidb
self.valroidb = valroidb
self.output_dir = output_dir
self.tbdir = tbdir
# Simply put '_val' at the end to save the summaries from the validation set
self.tbvaldir = tbdir + '_val'
if not os.path.exists(self.tbvaldir):
os.makedirs(self.tbvaldir)
self.pretrained_model = pretrained_model
def snapshot(self, sess, iter):
net = self.net
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
# Store the model snapshot
filename = cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_{:d}'.format(iter) + '.ckpt'
filename = os.path.join(self.output_dir, filename)
self.saver.save(sess, filename)
print('Wrote snapshot to: {:s}'.format(filename))
# Also store some meta information, random state, etc.
nfilename = cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_{:d}'.format(iter) + '.pkl'
nfilename = os.path.join(self.output_dir, nfilename)
# current state of numpy random
st0 = np.random.get_state()
# current position in the database
cur = self.data_layer._cur
# current shuffled indices of the database
perm = self.data_layer._perm
# current position in the validation database
cur_val = self.data_layer_val._cur
# current shuffled indices of the validation database
perm_val = self.data_layer_val._perm
# Dump the meta info
with open(nfilename, 'wb') as fid:
pickle.dump(st0, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(cur, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(perm, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(cur_val, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(perm_val, fid, pickle.HIGHEST_PROTOCOL)
pickle.dump(iter, fid, pickle.HIGHEST_PROTOCOL)
return filename, nfilename
def get_variables_in_checkpoint_file(self, file_name):
try:
reader = pywrap_tensorflow.NewCheckpointReader(file_name)
var_to_shape_map = reader.get_variable_to_shape_map()
return var_to_shape_map
except Exception as e: # pylint: disable=broad-except
print(str(e))
if "corrupted compressed block contents" in str(e):
print("It's likely that your checkpoint file has been compressed "
"with SNAPPY.")
def train_model(self, sess, max_iters):
# Build data layers for both training and validation set
self.data_layer = RoIDataLayer(self.roidb, self.imdb.num_classes)
self.data_layer_val = RoIDataLayer(self.valroidb, self.imdb.num_classes, random=True)
# Determine different scales for anchors, see paper
with sess.graph.as_default():
# Set the random seed for tensorflow
tf.set_random_seed(cfg.RNG_SEED)
# Build the main computation graph
layers = self.net.create_architecture(sess, 'TRAIN', self.imdb.num_classes, tag='default',
anchor_scales=cfg.ANCHOR_SCALES,
anchor_ratios=cfg.ANCHOR_RATIOS)
# Define the loss
loss = layers['total_loss']
# Set learning rate and momentum
lr = tf.Variable(cfg.TRAIN.LEARNING_RATE, trainable=False)
momentum = cfg.TRAIN.MOMENTUM
self.optimizer = tf.train.MomentumOptimizer(lr, momentum)
# Compute the gradients wrt the loss
gvs = self.optimizer.compute_gradients(loss)
# Double the gradient of the bias if set
if cfg.TRAIN.DOUBLE_BIAS:
final_gvs = []
with tf.variable_scope('Gradient_Mult') as scope:
for grad, var in gvs:
scale = 1.
if cfg.TRAIN.DOUBLE_BIAS and '/biases:' in var.name:
scale *= 2.
if not np.allclose(scale, 1.0):
grad = tf.multiply(grad, scale)
final_gvs.append((grad, var))
train_op = self.optimizer.apply_gradients(final_gvs)
else:
train_op = self.optimizer.apply_gradients(gvs)
# We will handle the snapshots ourselves
self.saver = tf.train.Saver(max_to_keep=100000)
# Write the train and validation information to tensorboard
self.writer = tf.summary.FileWriter(self.tbdir, sess.graph)
self.valwriter = tf.summary.FileWriter(self.tbvaldir)
# Find previous snapshots if there is any to restore from
sfiles = os.path.join(self.output_dir, cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_*.ckpt.meta')
sfiles = glob.glob(sfiles)
sfiles.sort(key=os.path.getmtime)
# Exclude the extra snapshot taken right before the learning rate is reduced
redstr = '_iter_{:d}.'.format(cfg.TRAIN.STEPSIZE+1)
sfiles = [ss.replace('.meta', '') for ss in sfiles]
sfiles = [ss for ss in sfiles if redstr not in ss]
nfiles = os.path.join(self.output_dir, cfg.TRAIN.SNAPSHOT_PREFIX + '_iter_*.pkl')
nfiles = glob.glob(nfiles)
nfiles.sort(key=os.path.getmtime)
nfiles = [nn for nn in nfiles if redstr not in nn]
lsf = len(sfiles)
assert len(nfiles) == lsf
np_paths = nfiles
ss_paths = sfiles
if lsf == 0:
# Fresh train directly from ImageNet weights
print('Loading initial model weights from {:s}'.format(self.pretrained_model))
variables = tf.global_variables()
# Initialize all variables first
sess.run(tf.variables_initializer(variables, name='init'))
var_keep_dic = self.get_variables_in_checkpoint_file(self.pretrained_model)
# Get the variables to restore, ignoring the variables to fix
variables_to_restore = self.net.get_variables_to_restore(variables, var_keep_dic)
restorer = tf.train.Saver(variables_to_restore)
restorer.restore(sess, self.pretrained_model)
print('Loaded.')
# Fix the excluded variables by loading them with conversion: the RGB
# weights are changed to BGR, and for VGG16 the convolutional fc6 and fc7
# weights are converted to fully connected weights
self.net.fix_variables(sess, self.pretrained_model)
print('Fixed.')
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE))
last_snapshot_iter = 0
else:
# Get the most recent snapshot and restore
ss_paths = [ss_paths[-1]]
np_paths = [np_paths[-1]]
print('Restoring model snapshots from {:s}'.format(sfiles[-1]))
self.saver.restore(sess, str(sfiles[-1]))
print('Restored.')
# Need to restore the other hyperparameters/states for training. (TODO xinlei)
# I have tried my best to save and recover the random states exactly;
# however, the TensorFlow internal state is currently not available
with open(str(nfiles[-1]), 'rb') as fid:
st0 = pickle.load(fid)
cur = pickle.load(fid)
perm = pickle.load(fid)
cur_val = pickle.load(fid)
perm_val = pickle.load(fid)
last_snapshot_iter = pickle.load(fid)
np.random.set_state(st0)
self.data_layer._cur = cur
self.data_layer._perm = perm
self.data_layer_val._cur = cur_val
self.data_layer_val._perm = perm_val
# Set the learning rate, only reduce once
if last_snapshot_iter > cfg.TRAIN.STEPSIZE:
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE * cfg.TRAIN.GAMMA))
else:
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE))
# Total number of trainable parameters (computed but unused here)
np.sum([np.prod(v.get_shape().as_list()) for v in tf.trainable_variables()])
timer = Timer()
iter = last_snapshot_iter + 1
last_summary_time = time.time()
while iter < max_iters + 1:
# Learning rate
if iter == cfg.TRAIN.STEPSIZE + 1:
# Add snapshot here before reducing the learning rate
self.snapshot(sess, iter)
sess.run(tf.assign(lr, cfg.TRAIN.LEARNING_RATE * cfg.TRAIN.GAMMA))
timer.tic()
# Get training data, one batch at a time
blobs = self.data_layer.forward()
now = time.time()
if now - last_summary_time > cfg.TRAIN.SUMMARY_INTERVAL:
# Compute the graph with summary
rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss, summary = \
self.net.train_step_with_summary(sess, blobs, train_op)
self.writer.add_summary(summary, float(iter))
# Also check the summary on the validation set
blobs_val = self.data_layer_val.forward()
summary_val = self.net.get_summary(sess, blobs_val)
self.valwriter.add_summary(summary_val, float(iter))
last_summary_time = now
else:
# Compute the graph without summary
rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss = \
self.net.train_step(sess, blobs, train_op)
timer.toc()
# Display training information
if iter % (cfg.TRAIN.DISPLAY) == 0:
print('iter: %d / %d, total loss: %.6f\n >>> rpn_loss_cls: %.6f\n '
'>>> rpn_loss_box: %.6f\n >>> loss_cls: %.6f\n >>> loss_box: %.6f\n >>> lr: %f' % \
(iter, max_iters, total_loss, rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, lr.eval()))
print('speed: {:.3f}s / iter'.format(timer.average_time))
if iter % cfg.TRAIN.SNAPSHOT_ITERS == 0:
last_snapshot_iter = iter
snapshot_path, np_path = self.snapshot(sess, iter)
np_paths.append(np_path)
ss_paths.append(snapshot_path)
# Remove the old snapshots if there are too many
if len(np_paths) > cfg.TRAIN.SNAPSHOT_KEPT:
to_remove = len(np_paths) - cfg.TRAIN.SNAPSHOT_KEPT
for c in range(to_remove):
nfile = np_paths[0]
os.remove(str(nfile))
np_paths.remove(nfile)
if len(ss_paths) > cfg.TRAIN.SNAPSHOT_KEPT:
to_remove = len(ss_paths) - cfg.TRAIN.SNAPSHOT_KEPT
for c in range(to_remove):
sfile = ss_paths[0]
# To make the code compatible with earlier versions of TensorFlow,
# where the naming convention for checkpoints is different
if os.path.exists(str(sfile)):
os.remove(str(sfile))
else:
os.remove(str(sfile + '.data-00000-of-00001'))
os.remove(str(sfile + '.index'))
sfile_meta = sfile + '.meta'
os.remove(str(sfile_meta))
ss_paths.remove(sfile)
iter += 1
if last_snapshot_iter != iter - 1:
self.snapshot(sess, iter - 1)
self.writer.close()
self.valwriter.close()
def get_training_roidb(imdb):
"""Returns a roidb (Region of Interest database) for use in training."""
if cfg.TRAIN.USE_FLIPPED:
print('Appending horizontally-flipped training examples...')
imdb.append_flipped_images()
print('done')
print('Preparing training data...')
rdl_roidb.prepare_roidb(imdb)
print('done')
return imdb.roidb
def filter_roidb(roidb):
"""Remove roidb entries that have no usable RoIs."""
def is_valid(entry):
# Valid images have:
# (1) At least one foreground RoI OR
# (2) At least one background RoI
overlaps = entry['max_overlaps']
# find boxes with sufficient overlap
fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
# Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
(overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
# image is only valid if such boxes exist
valid = len(fg_inds) > 0 or len(bg_inds) > 0
return valid
num = len(roidb)
filtered_roidb = [entry for entry in roidb if is_valid(entry)]
num_after = len(filtered_roidb)
print('Filtered {} roidb entries: {} -> {}'.format(num - num_after,
num, num_after))
return filtered_roidb
def train_net(network, imdb, roidb, valroidb, output_dir, tb_dir,
pretrained_model=None,
max_iters=40000):
"""Train a Fast R-CNN network."""
roidb = filter_roidb(roidb)
valroidb = filter_roidb(valroidb)
tfconfig = tf.ConfigProto(allow_soft_placement=True)
tfconfig.gpu_options.allow_growth = True
with tf.Session(config=tfconfig) as sess:
sw = SolverWrapper(sess, network, imdb, roidb, valroidb, output_dir, tb_dir,
pretrained_model=pretrained_model)
print('Solving...')
sw.train_model(sess, max_iters)
print('done solving')
================================================
FILE: lib/nets/__init__.py
================================================
================================================
FILE: lib/nets/network.py
================================================
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim import losses
from tensorflow.contrib.slim import arg_scope
import numpy as np
from layer_utils.snippets import generate_anchors_pre
from layer_utils.proposal_layer import proposal_layer
from layer_utils.proposal_top_layer import proposal_top_layer
from layer_utils.anchor_target_layer import anchor_target_layer
from layer_utils.proposal_target_layer import proposal_target_layer
from model.config import cfg
class Network(object):
def __init__(self, batch_size=1):
self._feat_stride = [16, ]
self._feat_compress = [1. / 16., ]
self._batch_size = batch_size
self._predictions = {}
self._losses = {}
self._anchor_targets = {}
self._proposal_targets = {}
self._layers = {}
self._act_summaries = []
self._score_summaries = {}
self._train_summaries = []
self._event_summaries = {}
self._variables_to_fix = {}
def _add_image_summary(self, image, boxes):
# add back mean
image += cfg.PIXEL_MEANS
# bgr to rgb (opencv uses bgr)
channels = tf.unstack(image, axis=-1)
image = tf.stack([channels[2], channels[1], channels[0]], axis=-1)
# dims for normalization
width = tf.to_float(tf.shape(image)[2])
height = tf.to_float(tf.shape(image)[1])
# from [x1, y1, x2, y2, cls] to normalized [y1, x1, y2, x2]
cols = tf.unstack(boxes, axis=1)
boxes = tf.stack([cols[1] / height,
cols[0] / width,
cols[3] / height,
cols[2] / width], axis=1)
# add batch dimension (assume batch_size==1)
assert image.get_shape()[0] == 1
boxes = tf.expand_dims(boxes, dim=0)
image = tf.image.draw_bounding_boxes(image, boxes)
return tf.summary.image('ground_truth', image)
def _add_act_summary(self, tensor):
tf.summary.histogram('ACT/' + tensor.op.name + '/activations', tensor)
tf.summary.scalar('ACT/' + tensor.op.name + '/zero_fraction',
tf.nn.zero_fraction(tensor))
def _add_score_summary(self, key, tensor):
tf.summary.histogram('SCORE/' + tensor.op.name + '/' + key + '/scores', tensor)
def _add_train_summary(self, var):
tf.summary.histogram('TRAIN/' + var.op.name, var)
def _reshape_layer(self, bottom, num_dim, name):
input_shape = tf.shape(bottom)
with tf.variable_scope(name) as scope:
# change the channel to the caffe format
to_caffe = tf.transpose(bottom, [0, 3, 1, 2])
# then force it to have channel 2
reshaped = tf.reshape(to_caffe,
tf.concat(axis=0, values=[[self._batch_size], [num_dim, -1], [input_shape[2]]]))
# then swap the channel back
to_tf = tf.transpose(reshaped, [0, 2, 3, 1])
return to_tf
def _softmax_layer(self, bottom, name):
if name == 'rpn_cls_prob_reshape':
input_shape = tf.shape(bottom)
bottom_reshaped = tf.reshape(bottom, [-1, input_shape[-1]])
reshaped_score = tf.nn.softmax(bottom_reshaped, name=name)
return tf.reshape(reshaped_score, input_shape)
return tf.nn.softmax(bottom, name=name)
def _proposal_top_layer(self, rpn_cls_prob, rpn_bbox_pred, name):
with tf.variable_scope(name) as scope:
rois, rpn_scores = tf.py_func(proposal_top_layer,
[rpn_cls_prob, rpn_bbox_pred, self._im_info,
self._feat_stride, self._anchors, self._num_anchors],
[tf.float32, tf.float32])
rois.set_shape([cfg.TEST.RPN_TOP_N, 5])
rpn_scores.set_shape([cfg.TEST.RPN_TOP_N, 1])
return rois, rpn_scores
def _proposal_layer(self, rpn_cls_prob, rpn_bbox_pred, name):
with tf.variable_scope(name) as scope:
rois, rpn_scores = tf.py_func(proposal_layer,
[rpn_cls_prob, rpn_bbox_pred, self._im_info, self._mode,
self._feat_stride, self._anchors, self._num_anchors],
[tf.float32, tf.float32])
rois.set_shape([None, 5])
rpn_scores.set_shape([None, 1])
return rois, rpn_scores
# Only use it if you have roi_pooling op written in tf.image
def _roi_pool_layer(self, bottom, rois, name):
with tf.variable_scope(name) as scope:
return tf.image.roi_pooling(bottom, rois,
pooled_height=cfg.POOLING_SIZE,
pooled_width=cfg.POOLING_SIZE,
spatial_scale=1. / 16.)[0]
def _crop_pool_layer(self, bottom, rois, name):
with tf.variable_scope(name) as scope:
batch_ids = tf.squeeze(tf.slice(rois, [0, 0], [-1, 1], name="batch_id"), [1])
# Get the normalized coordinates of bboxes
bottom_shape = tf.shape(bottom)
height = (tf.to_float(bottom_shape[1]) - 1.) * np.float32(self._feat_stride[0])
width = (tf.to_float(bottom_shape[2]) - 1.) * np.float32(self._feat_stride[0])
x1 = tf.slice(rois, [0, 1], [-1, 1], name="x1") / width
y1 = tf.slice(rois, [0, 2], [-1, 1], name="y1") / height
x2 = tf.slice(rois, [0, 3], [-1, 1], name="x2") / width
y2 = tf.slice(rois, [0, 4], [-1, 1], name="y2") / height
# Gradients won't be backpropagated to the rois anyway; stop_gradient just saves time
bboxes = tf.stop_gradient(tf.concat([y1, x1, y2, x2], axis=1))
pre_pool_size = cfg.POOLING_SIZE * 2
crops = tf.image.crop_and_resize(bottom, bboxes, tf.to_int32(batch_ids), [pre_pool_size, pre_pool_size], name="crops")
return slim.max_pool2d(crops, [2, 2], padding='SAME')
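# With the default POOLING_SIZE = 7, each RoI is therefore cropped and
# bilinearly resized to a 14x14 patch, then 2x2 max-pooled down to 7x7.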
def _dropout_layer(self, bottom, name, ratio=0.5):
return tf.nn.dropout(bottom, ratio, name=name)
def _anchor_target_layer(self, rpn_cls_score, name):
with tf.variable_scope(name) as scope:
rpn_labels, rpn_bbox_targets, rpn_bbox_inside_weights, rpn_bbox_outside_weights = tf.py_func(
anchor_target_layer,
[rpn_cls_score, self._gt_boxes, self._im_info, self._feat_stride, self._anchors, self._num_anchors],
[tf.float32, tf.float32, tf.float32, tf.float32])
rpn_labels.set_shape([1, 1, None, None])
rpn_bbox_targets.set_shape([1, None, None, self._num_anchors * 4])
rpn_bbox_inside_weights.set_shape([1, None, None, self._num_anchors * 4])
rpn_bbox_outside_weights.set_shape([1, None, None, self._num_anchors * 4])
rpn_labels = tf.to_int32(rpn_labels, name="to_int32")
self._anchor_targets['rpn_labels'] = rpn_labels
self._anchor_targets['rpn_bbox_targets'] = rpn_bbox_targets
self._anchor_targets['rpn_bbox_inside_weights'] = rpn_bbox_inside_weights
self._anchor_targets['rpn_bbox_outside_weights'] = rpn_bbox_outside_weights
self._score_summaries.update(self._anchor_targets)
return rpn_labels

  def _proposal_target_layer(self, rois, roi_scores, name):
    with tf.variable_scope(name) as scope:
      rois, roi_scores, labels, bbox_targets, bbox_inside_weights, bbox_outside_weights = tf.py_func(
        proposal_target_layer,
        [rois, roi_scores, self._gt_boxes, self._num_classes],
        [tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32])

      rois.set_shape([cfg.TRAIN.BATCH_SIZE, 5])
      roi_scores.set_shape([cfg.TRAIN.BATCH_SIZE])
      labels.set_shape([cfg.TRAIN.BATCH_SIZE, 1])
      bbox_targets.set_shape([cfg.TRAIN.BATCH_SIZE, self._num_classes * 4])
      bbox_inside_weights.set_shape([cfg.TRAIN.BATCH_SIZE, self._num_classes * 4])
      bbox_outside_weights.set_shape([cfg.TRAIN.BATCH_SIZE, self._num_classes * 4])

      self._proposal_targets['rois'] = rois
      self._proposal_targets['labels'] = tf.to_int32(labels, name="to_int32")
      self._proposal_targets['bbox_targets'] = bbox_targets
      self._proposal_targets['bbox_inside_weights'] = bbox_inside_weights
      self._proposal_targets['bbox_outside_weights'] = bbox_outside_weights

      self._score_summaries.update(self._proposal_targets)

      return rois, roi_scores
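
  # proposal_target_layer (lib/layer_utils/proposal_target_layer.py) samples a
  # fixed minibatch of cfg.TRAIN.BATCH_SIZE rois from the RPN proposals and
  # assigns each a class label plus per-class bbox regression targets, which is
  # why every output can be given a static first dimension here.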

  def _anchor_component(self):
    with tf.variable_scope('ANCHOR_' + self._tag) as scope:
      # just to get the shape right
      height = tf.to_int32(tf.ceil(self._im_info[0, 0] / np.float32(self._feat_stride[0])))
      width = tf.to_int32(tf.ceil(self._im_info[0, 1] / np.float32(self._feat_stride[0])))
      anchors, anchor_length = tf.py_func(generate_anchors_pre,
                                          [height, width,
                                           self._feat_stride, self._anchor_scales, self._anchor_ratios],
                                          [tf.float32, tf.int32], name="generate_anchors")
      anchors.set_shape([None, 4])
      anchor_length.set_shape([])
      self._anchors = anchors
      self._anchor_length = anchor_length
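
  # Worked example (assuming the default feat_stride of 16 and 9 anchors per
  # location): a 600x800 input gives height = ceil(600/16) = 38 and
  # width = ceil(800/16) = 50, so generate_anchors_pre returns
  # 38 * 50 * 9 = 17100 anchors in a (17100, 4) array.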

  def build_network(self, sess, is_training=True):
    raise NotImplementedError

  def _smooth_l1_loss(self, bbox_pred, bbox_targets, bbox_inside_weights, bbox_outside_weights, sigma=1.0, dim=[1]):
    sigma_2 = sigma ** 2
    box_diff = bbox_pred - bbox_targets
    in_box_diff = bbox_inside_weights * box_diff
    abs_in_box_diff = tf.abs(in_box_diff)
    smoothL1_sign = tf.stop_gradient(tf.to_float(tf.less(abs_in_box_diff, 1. / sigma_2)))
    in_loss_box = tf.pow(in_box_diff, 2) * (sigma_2 / 2.) * smoothL1_sign \
                  + (abs_in_box_diff - (0.5 / sigma_2)) * (1. - smoothL1_sign)
    out_loss_box = bbox_outside_weights * in_loss_box
    loss_box = tf.reduce_mean(tf.reduce_sum(
      out_loss_box,
      axis=dim
    ))
    return loss_box
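
  # The code above is the Fast R-CNN smooth L1 loss applied elementwise to the
  # weighted difference x = bbox_inside_weights * (bbox_pred - bbox_targets):
  #   f(x) = 0.5 * (sigma * x)^2      if |x| < 1 / sigma^2
  #   f(x) = |x| - 0.5 / sigma^2      otherwise
  # smoothL1_sign selects the branch, and tf.stop_gradient keeps the branch
  # selection itself out of the gradient computation.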

  def _add_losses(self, sigma_rpn=3.0):
    with tf.variable_scope('loss_' + self._tag) as scope:
      # RPN, class loss
      rpn_cls_score = tf.reshape(self._predictions['rpn_cls_score_reshape'], [-1, 2])
      rpn_label = tf.reshape(self._anchor_targets['rpn_labels'], [-1])
      rpn_select = tf.where(tf.not_equal(rpn_label, -1))
      rpn_cls_score = tf.reshape(tf.gather(rpn_cls_score, rpn_select), [-1, 2])
      rpn_label = tf.reshape(tf.gather(rpn_label, rpn_select), [-1])
      rpn_cross_entropy = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(logits=rpn_cls_score, labels=rpn_label))

      # RPN, bbox loss
      rpn_bbox_pred = self._predictions['rpn_bbox_pred']
      rpn_bbox_targets = self._anchor_targets['rpn_bbox_targets']
      rpn_bbox_inside_weights = self._anchor_targets['rpn_bbox_inside_weights']
      rpn_bbox_outside_weights = self._anchor_targets['rpn_bbox_outside_weights']
      rpn_loss_box = self._smooth_l1_loss(rpn_bbox_pred, rpn_bbox_targets, rpn_bbox_inside_weights,
                                          rpn_bbox_outside_weights, sigma=sigma_rpn, dim=[1, 2, 3])

      # RCNN, class loss
      cls_score = self._predictions["cls_score"]
      label = tf.reshape(self._proposal_targets["labels"], [-1])
      cross_entropy = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=tf.reshape(cls_score, [-1, self._num_classes]), labels=label))

      # RCNN, bbox loss
      bbox_pred = self._predictions['bbox_pred']
      bbox_targets = self._proposal_targets['bbox_targets']
      bbox_inside_weights = self._proposal_targets['bbox_inside_weights']
      bbox_outside_weights = self._proposal_targets['bbox_outside_weights']
      loss_box = self._smooth_l1_loss(bbox_pred, bbox_targets, bbox_inside_weights, bbox_outside_weights)

      self._losses['cross_entropy'] = cross_entropy
      self._losses['loss_box'] = loss_box
      self._losses['rpn_cross_entropy'] = rpn_cross_entropy
      self._losses['rpn_loss_box'] = rpn_loss_box

      loss = cross_entropy + loss_box + rpn_cross_entropy + rpn_loss_box
      self._losses['total_loss'] = loss

      self._event_summaries.update(self._losses)

    return loss
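
  # The total loss is the unweighted sum of the four terms (RPN classification,
  # RPN box regression, RCNN classification, RCNN box regression); each term is
  # also stored individually in self._losses so train_step can report it.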

  def create_architecture(self, sess, mode, num_classes, tag=None,
                          anchor_scales=(8, 16, 32), anchor_ratios=(0.5, 1, 2)):
    self._image = tf.placeholder(tf.float32, shape=[self._batch_size, None, None, 3])
    self._im_info = tf.placeholder(tf.float32, shape=[self._batch_size, 3])
    self._gt_boxes = tf.placeholder(tf.float32, shape=[None, 5])
    self._tag = tag

    self._num_classes = num_classes
    self._mode = mode
    self._anchor_scales = anchor_scales
    self._num_scales = len(anchor_scales)
    self._anchor_ratios = anchor_ratios
    self._num_ratios = len(anchor_ratios)
    self._num_anchors = self._num_scales * self._num_ratios

    training = mode == 'TRAIN'
    testing = mode == 'TEST'

    assert tag is not None

    # handle most of the regularizers here
    weights_regularizer = tf.contrib.layers.l2_regularizer(cfg.TRAIN.WEIGHT_DECAY)
    if cfg.TRAIN.BIAS_DECAY:
      biases_regularizer = weights_regularizer
    else:
      biases_regularizer = tf.no_regularizer

    # list as many types of layers as possible, even if they are not used now
    with arg_scope([slim.conv2d, slim.conv2d_in_plane,
                    slim.conv2d_transpose, slim.separable_conv2d, slim.fully_connected],
                   weights_regularizer=weights_regularizer,
                   biases_regularizer=biases_regularizer,
                   biases_initializer=tf.constant_initializer(0.0)):
      rois, cls_prob, bbox_pred = self.build_network(sess, training)

    layers_to_output = {'rois': rois}
    layers_to_output.update(self._predictions)

    for var in tf.trainable_variables():
      self._train_summaries.append(var)

    if mode == 'TEST':
      stds = np.tile(np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS), (self._num_classes))
      means = np.tile(np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS), (self._num_classes))
      self._predictions["bbox_pred"] *= stds
      self._predictions["bbox_pred"] += means
    else:
      self._add_losses()
      layers_to_output.update(self._losses)

    val_summaries = []
    with tf.device("/cpu:0"):
      val_summaries.append(self._add_image_summary(self._image, self._gt_boxes))
      for key, var in self._event_summaries.items():
        val_summaries.append(tf.summary.scalar(key, var))
      for key, var in self._score_summaries.items():
        self._add_score_summary(key, var)
      for var in self._act_summaries:
        self._add_act_summary(var)
      for var in self._train_summaries:
        self._add_train_summary(var)

    self._summary_op = tf.summary.merge_all()
    if not testing:
      self._summary_op_val = tf.summary.merge(val_summaries)

    return layers_to_output
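
  # Minimal usage sketch (hypothetical driver code in the style of
  # tools/demo_graspRGD.py; `vgg16` is the subclass from lib/nets/vgg16.py and
  # `ckpt_path` is assumed to point at a trained checkpoint):
  #
  #   net = vgg16(batch_size=1)
  #   net.create_architecture(sess, 'TEST', num_classes, tag='default',
  #                           anchor_scales=(8, 16, 32))
  #   saver = tf.train.Saver()
  #   saver.restore(sess, ckpt_path)
  #   scores, boxes = im_detect(sess, net, im)  # lib/model/test.py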

  def get_variables_to_restore(self, variables, var_keep_dic):
    raise NotImplementedError

  def fix_variables(self, sess, pretrained_model):
    raise NotImplementedError

  # Extract the head feature maps (e.g. conv5_3 for vgg16);
  # only useful during testing mode
  def extract_head(self, sess, image):
    feed_dict = {self._image: image}
    feat = sess.run(self._layers["head"], feed_dict=feed_dict)
    return feat

  # only useful during testing mode
  def test_image(self, sess, image, im_info):
    feed_dict = {self._image: image,
                 self._im_info: im_info}
    cls_score, cls_prob, bbox_pred, rois = sess.run([self._predictions["cls_score"],
                                                     self._predictions['cls_prob'],
                                                     self._predictions['bbox_pred'],
                                                     self._predictions['rois']],
                                                    feed_dict=feed_dict)
    return cls_score, cls_prob, bbox_pred, rois
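
  # Note that in TEST mode the bbox_pred returned here has already been
  # de-normalized (scaled by BBOX_NORMALIZE_STDS and shifted by
  # BBOX_NORMALIZE_MEANS) inside create_architecture.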

  def get_summary(self, sess, blobs):
    feed_dict = {self._image: blobs['data'], self._im_info: blobs['im_info'],
                 self._gt_boxes: blobs['gt_boxes']}
    summary = sess.run(self._summary_op_val, feed_dict=feed_dict)
    return summary

  def train_step(self, sess, blobs, train_op):
    feed_dict = {self._image: blobs['data'], self._im_info: blobs['im_info'],
                 self._gt_boxes: blobs['gt_boxes']}
    rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, loss, _ = sess.run([self._losses["rpn_cross_entropy"],
                                                                        self._losses['rpn_loss_box'],
                                                                        self._losses['cross_entropy'],
                                                                        self._losses['loss_box'],
                                                                        self._losses['total_loss'],
                                                                        train_op],
                                                                       feed_dict=feed_dict)
    return rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, loss

================================================
SYMBOL INDEX (484 symbols across 39 files)
================================================
FILE: lib/datasets/coco.py
class coco (line 27) | class coco(imdb):
method __init__ (line 28) | def __init__(self, image_set, year):
method _get_ann_file (line 65) | def _get_ann_file(self):
method _load_image_set_index (line 71) | def _load_image_set_index(self):
method _get_widths (line 78) | def _get_widths(self):
method image_path_at (line 83) | def image_path_at(self, i):
method image_path_from_index (line 89) | def image_path_from_index(self, index):
method gt_roidb (line 103) | def gt_roidb(self):
method _load_coco_annotation (line 123) | def _load_coco_annotation(self, index):
method _get_widths (line 181) | def _get_widths(self):
method append_flipped_images (line 184) | def append_flipped_images(self):
method _get_box_file (line 205) | def _get_box_file(self, index):
method _print_detection_eval_metrics (line 212) | def _print_detection_eval_metrics(self, coco_eval):
method _do_detection_eval (line 245) | def _do_detection_eval(self, res_file, output_dir):
method _coco_results_one_category (line 258) | def _coco_results_one_category(self, boxes, cat_id):
method _write_coco_results_file (line 276) | def _write_coco_results_file(self, all_boxes, res_file):
method evaluate_detections (line 294) | def evaluate_detections(self, all_boxes, output_dir):
method competition_mode (line 310) | def competition_mode(self, on):
FILE: lib/datasets/ds_utils.py
function unique_boxes (line 13) | def unique_boxes(boxes, scale=1.0):
function xywh_to_xyxy (line 21) | def xywh_to_xyxy(boxes):
function xyxy_to_xywh (line 26) | def xyxy_to_xywh(boxes):
function validate_boxes (line 31) | def validate_boxes(boxes, width=0, height=0):
function filter_small_boxes (line 45) | def filter_small_boxes(boxes, min_size):
FILE: lib/datasets/factory.py
function get_imdb (line 44) | def get_imdb(name):
function list_imdbs (line 51) | def list_imdbs():
FILE: lib/datasets/graspRGB.py
class graspRGB (line 23) | class graspRGB(imdb):
method __init__ (line 24) | def __init__(self, image_set, devkit_path):
method image_path_at (line 55) | def image_path_at(self, i):
method image_path_from_index (line 61) | def image_path_from_index(self, index):
method _load_image_set_index (line 74) | def _load_image_set_index(self):
method gt_roidb (line 89) | def gt_roidb(self):
method selective_search_roidb (line 110) | def selective_search_roidb(self):
method _load_selective_search_roidb (line 139) | def _load_selective_search_roidb(self, gt_roidb):
method selective_search_IJCV_roidb (line 153) | def selective_search_IJCV_roidb(self):
method _load_selective_search_IJCV_roidb (line 179) | def _load_selective_search_IJCV_roidb(self, gt_roidb):
method _load_graspRGB_annotation (line 195) | def _load_graspRGB_annotation(self, index):
method _write_graspRGB_results_file (line 255) | def _write_graspRGB_results_file(self, all_boxes):
method _do_matlab_eval (line 281) | def _do_matlab_eval(self, comp_id, output_dir='output'):
method evaluate_detections (line 295) | def evaluate_detections(self, all_boxes, output_dir):
method competition_mode (line 299) | def competition_mode(self, on):
FILE: lib/datasets/imdb.py
class imdb (line 20) | class imdb(object):
method __init__ (line 23) | def __init__(self, name, classes=None):
method name (line 38) | def name(self):
method num_classes (line 42) | def num_classes(self):
method classes (line 46) | def classes(self):
method image_index (line 50) | def image_index(self):
method roidb_handler (line 54) | def roidb_handler(self):
method roidb_handler (line 58) | def roidb_handler(self, val):
method set_proposal_method (line 61) | def set_proposal_method(self, method):
method roidb (line 66) | def roidb(self):
method cache_path (line 78) | def cache_path(self):
method num_images (line 85) | def num_images(self):
method image_path_at (line 88) | def image_path_at(self, i):
method default_roidb (line 91) | def default_roidb(self):
method evaluate_detections (line 94) | def evaluate_detections(self, all_boxes, output_dir=None):
method _get_widths (line 105) | def _get_widths(self):
method append_flipped_images (line 109) | def append_flipped_images(self):
method evaluate_recall (line 126) | def evaluate_recall(self, candidate_boxes=None, thresholds=None,
method create_roidb_from_box_list (line 216) | def create_roidb_from_box_list(self, box_list, gt_roidb):
method merge_roidbs (line 246) | def merge_roidbs(a, b):
method competition_mode (line 258) | def competition_mode(self, on):
FILE: lib/datasets/pascal_voc.py
class pascal_voc (line 26) | class pascal_voc(imdb):
method __init__ (line 27) | def __init__(self, image_set, year, devkit_path=None):
method image_path_at (line 60) | def image_path_at(self, i):
method image_path_from_index (line 66) | def image_path_from_index(self, index):
method _load_image_set_index (line 76) | def _load_image_set_index(self):
method _get_default_path (line 90) | def _get_default_path(self):
method gt_roidb (line 96) | def gt_roidb(self):
method rpn_roidb (line 120) | def rpn_roidb(self):
method _load_rpn_roidb (line 130) | def _load_rpn_roidb(self, gt_roidb):
method _load_pascal_annotation (line 139) | def _load_pascal_annotation(self, index):
method _get_comp_id (line 185) | def _get_comp_id(self):
method _get_voc_results_file_template (line 190) | def _get_voc_results_file_template(self):
method _write_voc_results_file (line 201) | def _write_voc_results_file(self, all_boxes):
method _do_python_eval (line 219) | def _do_python_eval(self, output_dir='output'):
method _do_matlab_eval (line 264) | def _do_matlab_eval(self, output_dir='output'):
method evaluate_detections (line 279) | def evaluate_detections(self, all_boxes, output_dir):
method competition_mode (line 291) | def competition_mode(self, on):
FILE: lib/datasets/tools/mcg_munge.py
function munge (line 15) | def munge(src_dir):
FILE: lib/datasets/voc_eval.py
function parse_rec (line 15) | def parse_rec(filename):
function voc_ap (line 35) | def voc_ap(rec, prec, use_07_metric=False):
function voc_eval (line 69) | def voc_eval(detpath,
FILE: lib/layer_utils/anchor_target_layer.py
function anchor_target_layer (line 18) | def anchor_target_layer(rpn_cls_score, gt_boxes, im_info, _feat_stride, ...
function _unmap (line 142) | def _unmap(data, count, inds, fill=0):
function _compute_targets (line 156) | def _compute_targets(ex_rois, gt_rois):
FILE: lib/layer_utils/generate_anchors.py
function generate_anchors (line 41) | def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
function _whctrs (line 55) | def _whctrs(anchor):
function _mkanchors (line 67) | def _mkanchors(ws, hs, x_ctr, y_ctr):
function _ratio_enum (line 82) | def _ratio_enum(anchor, ratios):
function _scale_enum (line 96) | def _scale_enum(anchor, scales):
FILE: lib/layer_utils/proposal_layer.py
function proposal_layer (line 16) | def proposal_layer(rpn_cls_prob, rpn_bbox_pred, im_info, cfg_key, _feat_...
FILE: lib/layer_utils/proposal_target_layer.py
function proposal_target_layer (line 18) | def proposal_target_layer(rpn_rois, rpn_scores, gt_boxes, _num_classes):
function _get_bbox_regression_labels (line 58) | def _get_bbox_regression_labels(bbox_target_data, num_classes):
function _compute_targets (line 83) | def _compute_targets(ex_rois, gt_rois, labels):
function _sample_rois (line 99) | def _sample_rois(all_rois, all_scores, gt_boxes, fg_rois_per_image, rois...
FILE: lib/layer_utils/proposal_top_layer.py
function proposal_top_layer (line 15) | def proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, im_info, _feat_strid...
FILE: lib/layer_utils/snippets.py
function generate_anchors_pre (line 17) | def generate_anchors_pre(height, width, feat_stride, anchor_scales=(8,16...
FILE: lib/model/bbox_transform.py
function bbox_transform (line 13) | def bbox_transform(ex_rois, gt_rois):
function bbox_transform_inv (line 34) | def bbox_transform_inv(boxes, deltas):
function clip_boxes (line 67) | def clip_boxes(boxes, im_shape):
FILE: lib/model/config.py
function get_output_dir (line 274) | def get_output_dir(imdb, weights_filename):
function get_output_tb_dir (line 290) | def get_output_tb_dir(imdb, weights_filename):
function _merge_a_into_b (line 306) | def _merge_a_into_b(a, b):
function cfg_from_file (line 339) | def cfg_from_file(filename):
function cfg_from_list (line 348) | def cfg_from_list(cfg_list):
FILE: lib/model/nms_wrapper.py
function nms (line 15) | def nms(dets, thresh, force_cpu=False):
FILE: lib/model/test.py
function _get_image_blob (line 27) | def _get_image_blob(im):
function _get_blobs (line 61) | def _get_blobs(im):
function _clip_boxes (line 68) | def _clip_boxes(boxes, im_shape):
function _rescale_boxes (line 80) | def _rescale_boxes(boxes, inds, scales):
function im_detect (line 87) | def im_detect(sess, net, im):
function apply_nms (line 113) | def apply_nms(all_boxes, thresh):
function test_net (line 142) | def test_net(sess, net, imdb, weights_filename, max_per_image=100, thres...
FILE: lib/model/train_val.py
class SolverWrapper (line 27) | class SolverWrapper(object):
method __init__ (line 32) | def __init__(self, sess, network, imdb, roidb, valroidb, output_dir, t...
method snapshot (line 45) | def snapshot(self, sess, iter):
method get_variables_in_checkpoint_file (line 82) | def get_variables_in_checkpoint_file(self, file_name):
method train_model (line 93) | def train_model(self, sess, max_iters):
function get_training_roidb (line 286) | def get_training_roidb(imdb):
function filter_roidb (line 300) | def filter_roidb(roidb):
function train_net (line 325) | def train_net(network, imdb, roidb, valroidb, output_dir, tb_dir,
FILE: lib/nets/network.py
class Network (line 25) | class Network(object):
method __init__ (line 26) | def __init__(self, batch_size=1):
method _add_image_summary (line 41) | def _add_image_summary(self, image, boxes):
method _add_act_summary (line 63) | def _add_act_summary(self, tensor):
method _add_score_summary (line 68) | def _add_score_summary(self, key, tensor):
method _add_train_summary (line 71) | def _add_train_summary(self, var):
method _reshape_layer (line 74) | def _reshape_layer(self, bottom, num_dim, name):
method _softmax_layer (line 86) | def _softmax_layer(self, bottom, name):
method _proposal_top_layer (line 94) | def _proposal_top_layer(self, rpn_cls_prob, rpn_bbox_pred, name):
method _proposal_layer (line 105) | def _proposal_layer(self, rpn_cls_prob, rpn_bbox_pred, name):
method _roi_pool_layer (line 117) | def _roi_pool_layer(self, bottom, rois, name):
method _crop_pool_layer (line 124) | def _crop_pool_layer(self, bottom, rois, name):
method _dropout_layer (line 142) | def _dropout_layer(self, bottom, name, ratio=0.5):
method _anchor_target_layer (line 145) | def _anchor_target_layer(self, rpn_cls_score, name):
method _proposal_target_layer (line 167) | def _proposal_target_layer(self, rois, roi_scores, name):
method _anchor_component (line 191) | def _anchor_component(self):
method build_network (line 205) | def build_network(self, sess, is_training=True):
method _smooth_l1_loss (line 208) | def _smooth_l1_loss(self, bbox_pred, bbox_targets, bbox_inside_weights...
method _add_losses (line 223) | def _add_losses(self, sigma_rpn=3.0):
method create_architecture (line 271) | def create_architecture(self, sess, mode, num_classes, tag=None,
method get_variables_to_restore (line 341) | def get_variables_to_restore(self, variables, var_keep_dic):
method fix_variables (line 344) | def fix_variables(self, sess, pretrained_model):
method extract_head (line 349) | def extract_head(self, sess, image):
method test_image (line 355) | def test_image(self, sess, image, im_info):
method get_summary (line 365) | def get_summary(self, sess, blobs):
method train_step (line 372) | def train_step(self, sess, blobs, train_op):
method train_step_with_summary (line 384) | def train_step_with_summary(self, sess, blobs, train_op):
method train_step_no_return (line 397) | def train_step_no_return(self, sess, blobs, train_op):
FILE: lib/nets/resnet_v1.py
function resnet_arg_scope (line 26) | def resnet_arg_scope(is_training=True,
class resnetv1 (line 53) | class resnetv1(Network):
method __init__ (line 54) | def __init__(self, batch_size=1, num_layers=50):
method _crop_pool_layer (line 59) | def _crop_pool_layer(self, bottom, rois, name):
method build_base (line 84) | def build_base(self):
method build_network (line 92) | def build_network(self, sess, is_training=True):
method get_variables_to_restore (line 241) | def get_variables_to_restore(self, variables, var_keep_dic):
method fix_variables (line 265) | def fix_variables(self, sess, pretrained_model):
FILE: lib/nets/vgg16.py
class vgg16 (line 19) | class vgg16(Network):
method __init__ (line 20) | def __init__(self, batch_size=1):
method build_network (line 23) | def build_network(self, sess, is_training=True):
method get_variables_to_restore (line 115) | def get_variables_to_restore(self, variables, var_keep_dic):
method fix_variables (line 133) | def fix_variables(self, sess, pretrained_model):
FILE: lib/nms/cpu_nms.c
type Py_ssize_t (line 61) | typedef int Py_ssize_t;
type Py_buffer (line 89) | typedef struct {
type Py_hash_t (line 241) | typedef long Py_hash_t;
function CYTHON_INLINE (line 310) | static CYTHON_INLINE float __PYX_NAN() {
type __Pyx_StringTabEntry (line 369) | typedef struct {PyObject **p; char *s; const Py_ssize_t n; const char* e...
function CYTHON_INLINE (line 409) | static CYTHON_INLINE size_t __Pyx_Py_UNICODE_strlen(const Py_UNICODE *u)
function __Pyx_init_sys_getdefaultencoding_params (line 435) | static int __Pyx_init_sys_getdefaultencoding_params(void) {
function __Pyx_init_sys_getdefaultencoding_params (line 484) | static int __Pyx_init_sys_getdefaultencoding_params(void) {
type __Pyx_StructField_ (line 559) | struct __Pyx_StructField_
type __Pyx_TypeInfo (line 561) | typedef struct {
type __Pyx_StructField (line 571) | typedef struct __Pyx_StructField_ {
type __Pyx_BufFmt_StackElem (line 576) | typedef struct {
type __Pyx_BufFmt_Context (line 580) | typedef struct {
type npy_int8 (line 601) | typedef npy_int8 __pyx_t_5numpy_int8_t;
type npy_int16 (line 610) | typedef npy_int16 __pyx_t_5numpy_int16_t;
type npy_int32 (line 619) | typedef npy_int32 __pyx_t_5numpy_int32_t;
type npy_int64 (line 628) | typedef npy_int64 __pyx_t_5numpy_int64_t;
type npy_uint8 (line 637) | typedef npy_uint8 __pyx_t_5numpy_uint8_t;
type npy_uint16 (line 646) | typedef npy_uint16 __pyx_t_5numpy_uint16_t;
type npy_uint32 (line 655) | typedef npy_uint32 __pyx_t_5numpy_uint32_t;
type npy_uint64 (line 664) | typedef npy_uint64 __pyx_t_5numpy_uint64_t;
type npy_float32 (line 673) | typedef npy_float32 __pyx_t_5numpy_float32_t;
type npy_float64 (line 682) | typedef npy_float64 __pyx_t_5numpy_float64_t;
type npy_long (line 691) | typedef npy_long __pyx_t_5numpy_int_t;
type npy_longlong (line 700) | typedef npy_longlong __pyx_t_5numpy_long_t;
type npy_longlong (line 709) | typedef npy_longlong __pyx_t_5numpy_longlong_t;
type npy_ulong (line 718) | typedef npy_ulong __pyx_t_5numpy_uint_t;
type npy_ulonglong (line 727) | typedef npy_ulonglong __pyx_t_5numpy_ulong_t;
type npy_ulonglong (line 736) | typedef npy_ulonglong __pyx_t_5numpy_ulonglong_t;
type npy_intp (line 745) | typedef npy_intp __pyx_t_5numpy_intp_t;
type npy_uintp (line 754) | typedef npy_uintp __pyx_t_5numpy_uintp_t;
type npy_double (line 763) | typedef npy_double __pyx_t_5numpy_float_t;
type npy_double (line 772) | typedef npy_double __pyx_t_5numpy_double_t;
type npy_longdouble (line 781) | typedef npy_longdouble __pyx_t_5numpy_longdouble_t;
type std (line 784) | typedef ::std::complex< float > __pyx_t_float_complex;
type __pyx_t_float_complex (line 786) | typedef float _Complex __pyx_t_float_complex;
type __pyx_t_float_complex (line 789) | typedef struct { float real, imag; } __pyx_t_float_complex;
type std (line 794) | typedef ::std::complex< double > __pyx_t_double_complex;
type __pyx_t_double_complex (line 796) | typedef double _Complex __pyx_t_double_complex;
type __pyx_t_double_complex (line 799) | typedef struct { double real, imag; } __pyx_t_double_complex;
type npy_cfloat (line 812) | typedef npy_cfloat __pyx_t_5numpy_cfloat_t;
type npy_cdouble (line 821) | typedef npy_cdouble __pyx_t_5numpy_cdouble_t;
type npy_clongdouble (line 830) | typedef npy_clongdouble __pyx_t_5numpy_clongdouble_t;
type npy_cdouble (line 839) | typedef npy_cdouble __pyx_t_5numpy_complex_t;
type __Pyx_RefNannyAPIStruct (line 844) | typedef struct {
function CYTHON_INLINE (line 903) | static CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStr(PyObject* obj, ...
function CYTHON_INLINE (line 949) | static CYTHON_INLINE int __Pyx_PyList_Append(PyObject* list, PyObject* x) {
type __Pyx_Buf_DimInfo (line 979) | typedef struct {
type __Pyx_Buffer (line 982) | typedef struct {
type __Pyx_LocalBuf_ND (line 986) | typedef struct {
type __Pyx_CodeObjectCacheEntry (line 1126) | typedef struct {
type __Pyx_CodeObjectCache (line 1130) | struct __Pyx_CodeObjectCache {
type __Pyx_CodeObjectCache (line 1135) | struct __Pyx_CodeObjectCache
function CYTHON_INLINE (line 1341) | static CYTHON_INLINE __pyx_t_5numpy_float32_t __pyx_f_3nms_7cpu_nms_max(...
function CYTHON_INLINE (line 1384) | static CYTHON_INLINE __pyx_t_5numpy_float32_t __pyx_f_3nms_7cpu_nms_min(...
function PyObject (line 1430) | static PyObject *__pyx_pw_3nms_7cpu_nms_1cpu_nms(PyObject *__pyx_self, P...
function PyObject (line 1495) | static PyObject *__pyx_pf_3nms_7cpu_nms_cpu_nms(CYTHON_UNUSED PyObject *...
function CYTHON_UNUSED (line 2348) | static CYTHON_UNUSED int __pyx_pw_5numpy_7ndarray_1__getbuffer__(PyObjec...
function __pyx_pf_5numpy_7ndarray___getbuffer__ (line 2359) | static int __pyx_pf_5numpy_7ndarray___getbuffer__(PyArrayObject *__pyx_v...
function CYTHON_UNUSED (line 3144) | static CYTHON_UNUSED void __pyx_pw_5numpy_7ndarray_3__releasebuffer__(Py...
function __pyx_pf_5numpy_7ndarray_2__releasebuffer__ (line 3153) | static void __pyx_pf_5numpy_7ndarray_2__releasebuffer__(PyArrayObject *_...
function CYTHON_INLINE (line 3222) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew1(PyOb...
function CYTHON_INLINE (line 3272) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew2(PyOb...
function CYTHON_INLINE (line 3322) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew3(PyOb...
function CYTHON_INLINE (line 3372) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew4(PyOb...
function CYTHON_INLINE (line 3422) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew5(PyOb...
function CYTHON_INLINE (line 3472) | static CYTHON_INLINE char *__pyx_f_5numpy__util_dtypestring(PyArray_Desc...
function CYTHON_INLINE (line 4176) | static CYTHON_INLINE void __pyx_f_5numpy_set_array_base(PyArrayObject *_...
function CYTHON_INLINE (line 4264) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_get_array_base(PyArrayObje...
type PyModuleDef (line 4325) | struct PyModuleDef
function __Pyx_InitCachedBuiltins (line 4397) | static int __Pyx_InitCachedBuiltins(void) {
function __Pyx_InitCachedConstants (line 4406) | static int __Pyx_InitCachedConstants(void) {
function __Pyx_InitGlobals (line 4575) | static int __Pyx_InitGlobals(void) {
function PyMODINIT_FUNC (line 4593) | PyMODINIT_FUNC PyInit_cpu_nms(void)
function __Pyx_RefNannyAPIStruct (line 4746) | static __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modn...
function PyObject (line 4761) | static PyObject *__Pyx_GetBuiltinName(PyObject *name) {
function __Pyx_RaiseArgtupleInvalid (line 4774) | static void __Pyx_RaiseArgtupleInvalid(
function __Pyx_RaiseDoubleKeywordsError (line 4799) | static void __Pyx_RaiseDoubleKeywordsError(
function __Pyx_ParseOptionalKeywords (line 4812) | static int __Pyx_ParseOptionalKeywords(
function __Pyx_RaiseArgumentTypeInvalid (line 4913) | static void __Pyx_RaiseArgumentTypeInvalid(const char* name, PyObject *o...
function CYTHON_INLINE (line 4918) | static CYTHON_INLINE int __Pyx_ArgTypeTest(PyObject *obj, PyTypeObject *...
function CYTHON_INLINE (line 4939) | static CYTHON_INLINE int __Pyx_IsLittleEndian(void) {
function __Pyx_BufFmt_Init (line 4943) | static void __Pyx_BufFmt_Init(__Pyx_BufFmt_Context* ctx,
function __Pyx_BufFmt_ParseNumber (line 4970) | static int __Pyx_BufFmt_ParseNumber(const char** ts) {
function __Pyx_BufFmt_ExpectNumber (line 4985) | static int __Pyx_BufFmt_ExpectNumber(const char **ts) {
function __Pyx_BufFmt_RaiseUnexpectedChar (line 4992) | static void __Pyx_BufFmt_RaiseUnexpectedChar(char ch) {
function __Pyx_BufFmt_TypeCharToStandardSize (line 5020) | static size_t __Pyx_BufFmt_TypeCharToStandardSize(char ch, int is_comple...
function __Pyx_BufFmt_TypeCharToNativeSize (line 5038) | static size_t __Pyx_BufFmt_TypeCharToNativeSize(char ch, int is_complex) {
type __Pyx_st_short (line 5057) | typedef struct { char c; short x; } __Pyx_st_short;
type __Pyx_st_int (line 5058) | typedef struct { char c; int x; } __Pyx_st_int;
type __Pyx_st_long (line 5059) | typedef struct { char c; long x; } __Pyx_st_long;
type __Pyx_st_float (line 5060) | typedef struct { char c; float x; } __Pyx_st_float;
type __Pyx_st_double (line 5061) | typedef struct { char c; double x; } __Pyx_st_double;
type __Pyx_st_longdouble (line 5062) | typedef struct { char c; long double x; } __Pyx_st_longdouble;
type __Pyx_st_void_p (line 5063) | typedef struct { char c; void *x; } __Pyx_st_void_p;
type __Pyx_st_longlong (line 5065) | typedef struct { char c; PY_LONG_LONG x; } __Pyx_st_longlong;
function __Pyx_BufFmt_TypeCharToAlignment (line 5067) | static size_t __Pyx_BufFmt_TypeCharToAlignment(char ch, CYTHON_UNUSED in...
type __Pyx_pad_short (line 5089) | typedef struct { short x; char c; } __Pyx_pad_short;
type __Pyx_pad_int (line 5090) | typedef struct { int x; char c; } __Pyx_pad_int;
type __Pyx_pad_long (line 5091) | typedef struct { long x; char c; } __Pyx_pad_long;
type __Pyx_pad_float (line 5092) | typedef struct { float x; char c; } __Pyx_pad_float;
type __Pyx_pad_double (line 5093) | typedef struct { double x; char c; } __Pyx_pad_double;
type __Pyx_pad_longdouble (line 5094) | typedef struct { long double x; char c; } __Pyx_pad_longdouble;
type __Pyx_pad_void_p (line 5095) | typedef struct { void *x; char c; } __Pyx_pad_void_p;
type __Pyx_pad_longlong (line 5097) | typedef struct { PY_LONG_LONG x; char c; } __Pyx_pad_longlong;
function __Pyx_BufFmt_TypeCharToPadding (line 5099) | static size_t __Pyx_BufFmt_TypeCharToPadding(char ch, CYTHON_UNUSED int ...
function __Pyx_BufFmt_TypeCharToGroup (line 5117) | static char __Pyx_BufFmt_TypeCharToGroup(char ch, int is_complex) {
function __Pyx_BufFmt_RaiseExpected (line 5138) | static void __Pyx_BufFmt_RaiseExpected(__Pyx_BufFmt_Context* ctx) {
function __Pyx_BufFmt_ProcessTypeChunk (line 5162) | static int __Pyx_BufFmt_ProcessTypeChunk(__Pyx_BufFmt_Context* ctx) {
function CYTHON_INLINE (line 5264) | static CYTHON_INLINE PyObject *
function CYTHON_INLINE (line 5437) | static CYTHON_INLINE void __Pyx_ZeroBuffer(Py_buffer* buf) {
function CYTHON_INLINE (line 5444) | static CYTHON_INLINE int __Pyx_GetBufferAndValidate(
function CYTHON_INLINE (line 5478) | static CYTHON_INLINE void __Pyx_SafeReleaseBuffer(Py_buffer* info) {
function CYTHON_INLINE (line 5484) | static CYTHON_INLINE int __Pyx_TypeTest(PyObject *obj, PyTypeObject *typ...
function CYTHON_INLINE (line 5497) | static CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObj...
function __Pyx_RaiseBufferIndexError (line 5536) | static void __Pyx_RaiseBufferIndexError(int axis) {
function CYTHON_INLINE (line 5541) | static CYTHON_INLINE void __Pyx_ErrRestore(PyObject *type, PyObject *val...
function CYTHON_INLINE (line 5558) | static CYTHON_INLINE void __Pyx_ErrFetch(PyObject **type, PyObject **val...
function CYTHON_INLINE (line 5737) | static CYTHON_INLINE void __Pyx_RaiseTooManyValuesError(Py_ssize_t expec...
function CYTHON_INLINE (line 5742) | static CYTHON_INLINE void __Pyx_RaiseNeedMoreValuesError(Py_ssize_t inde...
function CYTHON_INLINE (line 5748) | static CYTHON_INLINE void __Pyx_RaiseNoneNotIterableError(void) {
function __Pyx_GetBuffer (line 5753) | static int __Pyx_GetBuffer(PyObject *obj, Py_buffer *view, int flags) {
function __Pyx_ReleaseBuffer (line 5779) | static void __Pyx_ReleaseBuffer(Py_buffer *view) {
function PyObject (line 5817) | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int l...
function CYTHON_INLINE (line 5899) | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_int(int value) {
function CYTHON_INLINE (line 5946) | static CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *x) {
function CYTHON_INLINE (line 6041) | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value) {
function CYTHON_INLINE (line 6069) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 6073) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 6078) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 6088) | static CYTHON_INLINE int __Pyx_c_eqf(__pyx_t_float_complex a, __pyx_t_fl...
function CYTHON_INLINE (line 6091) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_sumf(__pyx_t_float_co...
function CYTHON_INLINE (line 6097) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_difff(__pyx_t_float_c...
function CYTHON_INLINE (line 6103) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_prodf(__pyx_t_float_c...
function CYTHON_INLINE (line 6109) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_quotf(__pyx_t_float_c...
function CYTHON_INLINE (line 6116) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_negf(__pyx_t_float_co...
function CYTHON_INLINE (line 6122) | static CYTHON_INLINE int __Pyx_c_is_zerof(__pyx_t_float_complex a) {
function CYTHON_INLINE (line 6125) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_conjf(__pyx_t_float_c...
function CYTHON_INLINE (line 6132) | static CYTHON_INLINE float __Pyx_c_absf(__pyx_t_float_complex z) {
function CYTHON_INLINE (line 6139) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_powf(__pyx_t_float_co...
function CYTHON_INLINE (line 6189) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 6193) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 6198) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 6208) | static CYTHON_INLINE int __Pyx_c_eq(__pyx_t_double_complex a, __pyx_t_do...
function CYTHON_INLINE (line 6211) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_sum(__pyx_t_double_c...
function CYTHON_INLINE (line 6217) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_diff(__pyx_t_double_...
function CYTHON_INLINE (line 6223) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_prod(__pyx_t_double_...
function CYTHON_INLINE (line 6229) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_quot(__pyx_t_double_...
function CYTHON_INLINE (line 6236) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_neg(__pyx_t_double_c...
function CYTHON_INLINE (line 6242) | static CYTHON_INLINE int __Pyx_c_is_zero(__pyx_t_double_complex a) {
function CYTHON_INLINE (line 6245) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_conj(__pyx_t_double_...
function CYTHON_INLINE (line 6252) | static CYTHON_INLINE double __Pyx_c_abs(__pyx_t_double_complex z) {
function CYTHON_INLINE (line 6259) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_pow(__pyx_t_double_c...
function __Pyx_PyInt_As_long (line 6312) | static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *x) {
function __Pyx_check_binary_version (line 6407) | static int __Pyx_check_binary_version(void) {
function PyObject (line 6428) | static PyObject *__Pyx_ImportModule(const char *name) {
function PyTypeObject (line 6445) | static PyTypeObject *__Pyx_ImportType(const char *module_name, const cha...
function __pyx_bisect_code_objects (line 6511) | static int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries...
function PyCodeObject (line 6532) | static PyCodeObject *__pyx_find_code_object(int code_line) {
function __pyx_insert_code_object (line 6546) | static void __pyx_insert_code_object(int code_line, PyCodeObject* code_o...
function PyCodeObject (line 6593) | static PyCodeObject* __Pyx_CreateCodeObjectForTraceback(
function __Pyx_AddTraceback (line 6645) | static void __Pyx_AddTraceback(const char *funcname, int c_line,
function __Pyx_InitStrings (line 6673) | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t) {
function CYTHON_INLINE (line 6703) | static CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(char* c_str) {
function CYTHON_INLINE (line 6706) | static CYTHON_INLINE char* __Pyx_PyObject_AsString(PyObject* o) {
function CYTHON_INLINE (line 6770) | static CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject* x) {
function CYTHON_INLINE (line 6825) | static CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject* b) {
function CYTHON_INLINE (line 6854) | static CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t ival) {
FILE: lib/nms/gpu_nms.cpp
function CYTHON_INLINE (line 310) | static CYTHON_INLINE float __PYX_NAN() {
function CYTHON_INLINE (line 410) | static CYTHON_INLINE size_t __Pyx_Py_UNICODE_strlen(const Py_UNICODE *u)
function __Pyx_init_sys_getdefaultencoding_params (line 436) | static int __Pyx_init_sys_getdefaultencoding_params(void) {
function __Pyx_init_sys_getdefaultencoding_params (line 485) | static int __Pyx_init_sys_getdefaultencoding_params(void) {
type __Pyx_StructField_ (line 560) | struct __Pyx_StructField_
type __Pyx_StructField_ (line 564) | struct __Pyx_StructField_
type __Pyx_StructField_ (line 572) | struct __Pyx_StructField_ {
function CYTHON_INLINE (line 920) | static CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStr(PyObject* obj, ...
type __Pyx_CodeObjectCache (line 1121) | struct __Pyx_CodeObjectCache {
type __Pyx_CodeObjectCache (line 1126) | struct __Pyx_CodeObjectCache
function PyObject (line 1287) | static PyObject *__pyx_pw_3nms_7gpu_nms_1gpu_nms(PyObject *__pyx_self, P...
function PyObject (line 1367) | static PyObject *__pyx_pf_3nms_7gpu_nms_gpu_nms(CYTHON_UNUSED PyObject *...
function CYTHON_UNUSED (line 1720) | static CYTHON_UNUSED int __pyx_pw_5numpy_7ndarray_1__getbuffer__(PyObjec...
function __pyx_pf_5numpy_7ndarray___getbuffer__ (line 1731) | static int __pyx_pf_5numpy_7ndarray___getbuffer__(PyArrayObject *__pyx_v...
function CYTHON_UNUSED (line 2516) | static CYTHON_UNUSED void __pyx_pw_5numpy_7ndarray_3__releasebuffer__(Py...
function __pyx_pf_5numpy_7ndarray_2__releasebuffer__ (line 2525) | static void __pyx_pf_5numpy_7ndarray_2__releasebuffer__(PyArrayObject *_...
function CYTHON_INLINE (line 2594) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew1(PyOb...
function CYTHON_INLINE (line 2644) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew2(PyOb...
function CYTHON_INLINE (line 2694) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew3(PyOb...
function CYTHON_INLINE (line 2744) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew4(PyOb...
function CYTHON_INLINE (line 2794) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_PyArray_MultiIterNew5(PyOb...
function CYTHON_INLINE (line 2844) | static CYTHON_INLINE char *__pyx_f_5numpy__util_dtypestring(PyArray_Desc...
function CYTHON_INLINE (line 3548) | static CYTHON_INLINE void __pyx_f_5numpy_set_array_base(PyArrayObject *_...
function CYTHON_INLINE (line 3636) | static CYTHON_INLINE PyObject *__pyx_f_5numpy_get_array_base(PyArrayObje...
type PyModuleDef (line 3697) | struct PyModuleDef
function __Pyx_InitCachedBuiltins (line 3750) | static int __Pyx_InitCachedBuiltins(void) {
function __Pyx_InitCachedConstants (line 3759) | static int __Pyx_InitCachedConstants(void) {
function __Pyx_InitGlobals (line 3883) | static int __Pyx_InitGlobals(void) {
function PyMODINIT_FUNC (line 3897) | PyMODINIT_FUNC PyInit_gpu_nms(void)
function __Pyx_RefNannyAPIStruct (line 4066) | static __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modn...
function __Pyx_RaiseArgtupleInvalid (line 4081) | static void __Pyx_RaiseArgtupleInvalid(
function __Pyx_RaiseDoubleKeywordsError (line 4106) | static void __Pyx_RaiseDoubleKeywordsError(
function __Pyx_ParseOptionalKeywords (line 4119) | static int __Pyx_ParseOptionalKeywords(
function __Pyx_RaiseArgumentTypeInvalid (line 4220) | static void __Pyx_RaiseArgumentTypeInvalid(const char* name, PyObject *o...
function CYTHON_INLINE (line 4225) | static CYTHON_INLINE int __Pyx_ArgTypeTest(PyObject *obj, PyTypeObject *...
function CYTHON_INLINE (line 4246) | static CYTHON_INLINE int __Pyx_IsLittleEndian(void) {
function __Pyx_BufFmt_Init (line 4250) | static void __Pyx_BufFmt_Init(__Pyx_BufFmt_Context* ctx,
function __Pyx_BufFmt_ParseNumber (line 4277) | static int __Pyx_BufFmt_ParseNumber(const char** ts) {
function __Pyx_BufFmt_ExpectNumber (line 4292) | static int __Pyx_BufFmt_ExpectNumber(const char **ts) {
function __Pyx_BufFmt_RaiseUnexpectedChar (line 4299) | static void __Pyx_BufFmt_RaiseUnexpectedChar(char ch) {
function __Pyx_BufFmt_TypeCharToStandardSize (line 4327) | static size_t __Pyx_BufFmt_TypeCharToStandardSize(char ch, int is_comple...
function __Pyx_BufFmt_TypeCharToNativeSize (line 4345) | static size_t __Pyx_BufFmt_TypeCharToNativeSize(char ch, int is_complex) {
function __Pyx_BufFmt_TypeCharToAlignment (line 4374) | static size_t __Pyx_BufFmt_TypeCharToAlignment(char ch, CYTHON_UNUSED in...
function __Pyx_BufFmt_TypeCharToPadding (line 4406) | static size_t __Pyx_BufFmt_TypeCharToPadding(char ch, CYTHON_UNUSED int ...
function __Pyx_BufFmt_TypeCharToGroup (line 4424) | static char __Pyx_BufFmt_TypeCharToGroup(char ch, int is_complex) {
function __Pyx_BufFmt_RaiseExpected (line 4445) | static void __Pyx_BufFmt_RaiseExpected(__Pyx_BufFmt_Context* ctx) {
function __Pyx_BufFmt_ProcessTypeChunk (line 4469) | static int __Pyx_BufFmt_ProcessTypeChunk(__Pyx_BufFmt_Context* ctx) {
function CYTHON_INLINE (line 4571) | static CYTHON_INLINE PyObject *
function CYTHON_INLINE (line 4744) | static CYTHON_INLINE void __Pyx_ZeroBuffer(Py_buffer* buf) {
function CYTHON_INLINE (line 4751) | static CYTHON_INLINE int __Pyx_GetBufferAndValidate(
function CYTHON_INLINE (line 4785) | static CYTHON_INLINE void __Pyx_SafeReleaseBuffer(Py_buffer* info) {
function PyObject (line 4791) | static PyObject *__Pyx_GetBuiltinName(PyObject *name) {
function CYTHON_INLINE (line 4822) | static CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObj...
function CYTHON_INLINE (line 4844) | static CYTHON_INLINE int __Pyx_TypeTest(PyObject *obj, PyTypeObject *typ...
function __Pyx_RaiseBufferIndexError (line 4856) | static void __Pyx_RaiseBufferIndexError(int axis) {
function CYTHON_INLINE (line 4861) | static CYTHON_INLINE PyObject* __Pyx_PyObject_GetSlice(
function __Pyx_RaiseBufferFallbackError (line 4958) | static void __Pyx_RaiseBufferFallbackError(void) {
function CYTHON_INLINE (line 4963) | static CYTHON_INLINE void __Pyx_ErrRestore(PyObject *type, PyObject *val...
function CYTHON_INLINE (line 4980) | static CYTHON_INLINE void __Pyx_ErrFetch(PyObject **type, PyObject **val...
function CYTHON_INLINE (line 5159) | static CYTHON_INLINE void __Pyx_RaiseTooManyValuesError(Py_ssize_t expec...
function CYTHON_INLINE (line 5164) | static CYTHON_INLINE void __Pyx_RaiseNeedMoreValuesError(Py_ssize_t inde...
function CYTHON_INLINE (line 5170) | static CYTHON_INLINE void __Pyx_RaiseNoneNotIterableError(void) {
function __Pyx_GetBuffer (line 5175) | static int __Pyx_GetBuffer(PyObject *obj, Py_buffer *view, int flags) {
function __Pyx_ReleaseBuffer (line 5201) | static void __Pyx_ReleaseBuffer(Py_buffer *view) {
function PyObject (line 5239) | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int l...
function CYTHON_INLINE (line 5342) | static CYTHON_INLINE npy_int32 __Pyx_PyInt_As_npy_int32(PyObject *x) {
function CYTHON_INLINE (line 5437) | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_int(int value) {
function CYTHON_INLINE (line 5465) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 5469) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 5474) | static CYTHON_INLINE __pyx_t_float_complex __pyx_t_float_complex_from_pa...
function CYTHON_INLINE (line 5484) | static CYTHON_INLINE int __Pyx_c_eqf(__pyx_t_float_complex a, __pyx_t_fl...
function CYTHON_INLINE (line 5487) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_sumf(__pyx_t_float_co...
function CYTHON_INLINE (line 5493) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_difff(__pyx_t_float_c...
function CYTHON_INLINE (line 5499) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_prodf(__pyx_t_float_c...
function CYTHON_INLINE (line 5505) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_quotf(__pyx_t_float_c...
function CYTHON_INLINE (line 5512) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_negf(__pyx_t_float_co...
function CYTHON_INLINE (line 5518) | static CYTHON_INLINE int __Pyx_c_is_zerof(__pyx_t_float_complex a) {
function CYTHON_INLINE (line 5521) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_conjf(__pyx_t_float_c...
function CYTHON_INLINE (line 5528) | static CYTHON_INLINE float __Pyx_c_absf(__pyx_t_float_complex z) {
function CYTHON_INLINE (line 5535) | static CYTHON_INLINE __pyx_t_float_complex __Pyx_c_powf(__pyx_t_float_co...
function CYTHON_INLINE (line 5585) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 5589) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 5594) | static CYTHON_INLINE __pyx_t_double_complex __pyx_t_double_complex_from_...
function CYTHON_INLINE (line 5604) | static CYTHON_INLINE int __Pyx_c_eq(__pyx_t_double_complex a, __pyx_t_do...
function CYTHON_INLINE (line 5607) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_sum(__pyx_t_double_c...
function CYTHON_INLINE (line 5613) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_diff(__pyx_t_double_...
function CYTHON_INLINE (line 5619) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_prod(__pyx_t_double_...
function CYTHON_INLINE (line 5625) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_quot(__pyx_t_double_...
function CYTHON_INLINE (line 5632) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_neg(__pyx_t_double_c...
function CYTHON_INLINE (line 5638) | static CYTHON_INLINE int __Pyx_c_is_zero(__pyx_t_double_complex a) {
function CYTHON_INLINE (line 5641) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_conj(__pyx_t_double_...
function CYTHON_INLINE (line 5648) | static CYTHON_INLINE double __Pyx_c_abs(__pyx_t_double_complex z) {
function CYTHON_INLINE (line 5655) | static CYTHON_INLINE __pyx_t_double_complex __Pyx_c_pow(__pyx_t_double_c...
function CYTHON_INLINE (line 5708) | static CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *x) {
function CYTHON_INLINE (line 5803) | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value) {
function __Pyx_PyInt_As_long (line 5834) | static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *x) {
function __Pyx_check_binary_version (line 5929) | static int __Pyx_check_binary_version(void) {
function PyObject (line 5950) | static PyObject *__Pyx_ImportModule(const char *name) {
function PyTypeObject (line 5967) | static PyTypeObject *__Pyx_ImportType(const char *module_name, const cha...
function __pyx_bisect_code_objects (line 6033) | static int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries...
function PyCodeObject (line 6054) | static PyCodeObject *__pyx_find_code_object(int code_line) {
function __pyx_insert_code_object (line 6068) | static void __pyx_insert_code_object(int code_line, PyCodeObject* code_o...
function PyCodeObject (line 6115) | static PyCodeObject* __Pyx_CreateCodeObjectForTraceback(
function __Pyx_AddTraceback (line 6167) | static void __Pyx_AddTraceback(const char *funcname, int c_line,
function __Pyx_InitStrings (line 6195) | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t) {
function CYTHON_INLINE (line 6225) | static CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(char* c_str) {
function CYTHON_INLINE (line 6228) | static CYTHON_INLINE char* __Pyx_PyObject_AsString(PyObject* o) {
function CYTHON_INLINE (line 6292) | static CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject* x) {
function CYTHON_INLINE (line 6347) | static CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject* b) {
function CYTHON_INLINE (line 6376) | static CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t ival) {
FILE: lib/nms/py_cpu_nms.py
function py_cpu_nms (line 10) | def py_cpu_nms(dets, thresh):
FILE: lib/roi_data_layer/layer.py
class RoIDataLayer (line 21) | class RoIDataLayer(object):
method __init__ (line 24) | def __init__(self, roidb, num_classes, random=False):
method _shuffle_roidb_inds (line 32) | def _shuffle_roidb_inds(self):
method _get_next_minibatch_inds (line 64) | def _get_next_minibatch_inds(self):
method _get_next_minibatch (line 77) | def _get_next_minibatch(self):
method forward (line 87) | def forward(self):
FILE: lib/roi_data_layer/minibatch.py
function get_minibatch (line 19) | def get_minibatch(roidb, num_classes):
function _get_image_blob (line 54) | def _get_image_blob(roidb, scale_inds):
FILE: lib/roi_data_layer/roidb.py
function prepare_roidb (line 19) | def prepare_roidb(imdb):
FILE: lib/setup.py
function find_in_path (line 15) | def find_in_path(name, path):
function locate_cuda (line 24) | def locate_cuda():
function customize_compiler_for_nvcc (line 63) | def customize_compiler_for_nvcc(self):
class custom_build_ext (line 102) | class custom_build_ext(build_ext):
method build_extensions (line 103) | def build_extensions(self):
FILE: lib/utils/blob.py
function im_list_to_blob (line 17) | def im_list_to_blob(ims):
function prep_im_for_blob (line 33) | def prep_im_for_blob(im, pixel_means, target_size, max_size):
FILE: lib/utils/boxes_grid.py
function get_boxes_grid (line 16) | def get_boxes_grid(image_height, image_width):
FILE: lib/utils/nms.py
function nms (line 10) | def nms(dets, thresh):
FILE: lib/utils/timer.py
class Timer (line 10) | class Timer(object):
method __init__ (line 12) | def __init__(self):
method tic (line 19) | def tic(self):
method toc (line 24) | def toc(self, average=True):
FILE: tools/_init_paths.py
function add_path (line 4) | def add_path(path):
FILE: tools/demo_graspRGD.py
function Rotate2D (line 50) | def Rotate2D(pts,cnt,ang=scipy.pi/4):
function vis_detections (line 54) | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
function demo (line 105) | def demo(sess, net, image_name):
function parse_args (line 154) | def parse_args():
FILE: tools/demo_graspRGD_socket.py
function Rotate2D (line 66) | def Rotate2D(pts,cnt,ang=scipy.pi/4):
function vis_detections (line 70) | def vis_detections(ax, im, class_name, dets, thresh=0.5):
function compute_imgRot (line 121) | def compute_imgRot(frame):
function coordinate_img2table (line 171) | def coordinate_img2table(frame, u, v, rot):
function demo_process (line 237) | def demo_process(sess, net):
function parse_args (line 395) | def parse_args():
FILE: tools/demo_graspRGD_vis_mask.py
function Rotate2D (line 50) | def Rotate2D(pts,cnt,ang=scipy.pi/4):
function vis_detections (line 54) | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
function demo (line 106) | def demo(sess, net, image_name, mask_name):
function parse_args (line 159) | def parse_args():
FILE: tools/demo_graspRGD_vis_select.py
function Rotate2D (line 53) | def Rotate2D(pts,cnt,ang=scipy.pi/4):
function vis_detections (line 57) | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
function demo (line 114) | def demo(sess, net, image_name):
function parse_args (line 165) | def parse_args():
FILE: tools/trainval_net.py
function parse_args (line 24) | def parse_args():
function combined_roidb (line 62) | def combined_roidb(imdb_names):

================================================
CONDENSED PREVIEW (86 files: path, character count, and a content snippet each)
================================================
[
{
"path": "README.md",
"chars": 2603,
"preview": "# grasp_multiObject_multiGrasp\n\nThis is the implementation of our RA-L work 'Real-world Multi-object, Multi-grasp Detect"
},
{
"path": "data/scripts/dataPreprocessingTest_fasterrcnn_split.m",
"chars": 4488,
"preview": "%% script to test dataPreprocessing\r\n%% created by Fu-Jen Chu on 09/15/2016\r\n\r\nclose all\r\nclear\r\n\r\n%parpool(4)\r\naddpath("
},
{
"path": "data/scripts/dataPreprocessing_fasterrcnn.m",
"chars": 2649,
"preview": "function [imagesOut bbsOut] = dataPreprocessing( imageIn, bbsIn_all, cropSize, translationShiftNumber, roatateAngleNumbe"
},
{
"path": "data/scripts/fetch_faster_rcnn_models.sh",
"chars": 956,
"preview": "#!/bin/bash\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )/../\" && pwd )\"\ncd $DIR\n\nNET=res101\nFILE=voc_0712_80k-110k.tgz\n"
},
{
"path": "experiments/cfgs/res101-lg.yml",
"chars": 472,
"preview": "EXP_DIR: res101-lg\nTRAIN:\n HAS_RPN: True\n IMS_PER_BATCH: 1\n BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True\n RPN_POSITIVE_O"
},
{
"path": "experiments/cfgs/res101.yml",
"chars": 347,
"preview": "EXP_DIR: res101\nTRAIN:\n HAS_RPN: True\n IMS_PER_BATCH: 1\n BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True\n RPN_POSITIVE_OVER"
},
{
"path": "experiments/cfgs/res50.yml",
"chars": 345,
"preview": "EXP_DIR: res50\nTRAIN:\n HAS_RPN: True\n IMS_PER_BATCH: 1\n BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True\n RPN_POSITIVE_OVERL"
},
{
"path": "experiments/cfgs/vgg16.yml",
"chars": 301,
"preview": "EXP_DIR: vgg16\nTRAIN:\n HAS_RPN: True\n IMS_PER_BATCH: 1\n BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True\n RPN_POSITIVE_OVERL"
},
{
"path": "experiments/logs/.gitignore",
"chars": 8,
"preview": "*.txt.*\n"
},
{
"path": "experiments/scripts/convert_vgg16.sh",
"chars": 1701,
"preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nDATASET=$2\nNET=vgg16\n\narray=( $@ )\nlen=${#array[@]"
},
{
"path": "experiments/scripts/test_faster_rcnn.sh",
"chars": 1743,
"preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nDATASET=$2\nNET=$3\n\narray=( $@ )\nlen=${#array[@]}\nE"
},
{
"path": "experiments/scripts/train_faster_rcnn.sh",
"chars": 2441,
"preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nDATASET=$2\nNET=$3\n\narray=( $@ )\nlen=${#array[@]}\nE"
},
{
"path": "experiments/scripts/train_faster_rcnn.sh~",
"chars": 2441,
"preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\n\nGPU_ID=$1\nDATASET=$2\nNET=$3\n\narray=( $@ )\nlen=${#array[@]}\nE"
},
{
"path": "lib/Makefile",
"chars": 94,
"preview": "all:\n\tpython setup.py build_ext --inplace\n\trm -rf build\nclean:\n\trm -rf */*.pyc\n\trm -rf */*.so\n"
},
{
"path": "lib/datasets/VOCdevkit-matlab-wrapper/get_voc_opts.m",
"chars": 231,
"preview": "function VOCopts = get_voc_opts(path)\n\ntmp = pwd;\ncd(path);\ntry\n addpath('VOCcode');\n VOCinit;\ncatch\n rmpath('VOCcode"
},
{
"path": "lib/datasets/VOCdevkit-matlab-wrapper/voc_eval.m",
"chars": 1332,
"preview": "function res = voc_eval(path, comp_id, test_set, output_dir)\n\nVOCopts = get_voc_opts(path);\nVOCopts.testset = test_set;\n"
},
{
"path": "lib/datasets/VOCdevkit-matlab-wrapper/xVOCap.m",
"chars": 258,
"preview": "function ap = xVOCap(rec,prec)\r\n% From the PASCAL VOC 2011 devkit\r\n\r\nmrec=[0 ; rec ; 1];\r\nmpre=[0 ; prec ; 0];\r\nfor i=nu"
},
{
"path": "lib/datasets/__init__.py",
"chars": 248,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/coco.py",
"chars": 11804,
"preview": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE"
},
{
"path": "lib/datasets/ds_utils.py",
"chars": 1402,
"preview": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE"
},
{
"path": "lib/datasets/factory.py",
"chars": 1811,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/factory.py~",
"chars": 1811,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/graspRGB.py",
"chars": 12227,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/graspRGB.py~",
"chars": 12229,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/imdb.py",
"chars": 8972,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/pascal_voc.py",
"chars": 11182,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/datasets/tools/mcg_munge.py",
"chars": 1451,
"preview": "import os\nimport sys\n\n\"\"\"Hacky tool to convert file system layout of MCG boxes downloaded from\nhttp://www.eecs.berkeley."
},
{
"path": "lib/datasets/voc_eval.py",
"chars": 6635,
"preview": "# --------------------------------------------------------\n# Fast/er R-CNN\n# Licensed under The MIT License [see LICENSE"
},
{
"path": "lib/layer_utils/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "lib/layer_utils/anchor_target_layer.py",
"chars": 6031,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed unde"
},
{
"path": "lib/layer_utils/generate_anchors.py",
"chars": 3129,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed unde"
},
{
"path": "lib/layer_utils/proposal_layer.py",
"chars": 1850,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Licensed under The MIT License [see LICENSE "
},
{
"path": "lib/layer_utils/proposal_target_layer.py",
"chars": 6017,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed unde"
},
{
"path": "lib/layer_utils/proposal_top_layer.py",
"chars": 1868,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Licensed under The MIT License [see LICENSE "
},
{
"path": "lib/layer_utils/snippets.py",
"chars": 1473,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/model/__init__.py",
"chars": 21,
"preview": "from . import config\n"
},
{
"path": "lib/model/bbox_transform.py",
"chars": 2544,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/model/config.py",
"chars": 11160,
"preview": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport os\n"
},
{
"path": "lib/model/config.py~",
"chars": 11159,
"preview": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport os\n"
},
{
"path": "lib/model/nms_wrapper.py",
"chars": 727,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/model/test.py",
"chars": 6518,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/model/test.py~",
"chars": 6534,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/model/train_val.py",
"chars": 13100,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/model/train_val.py~",
"chars": 13083,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/nets/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "lib/nets/network.py",
"chars": 18402,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/nets/resnet_v1.py",
"chars": 13227,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/nets/resnet_v1.py~",
"chars": 13218,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/nets/vgg16.py",
"chars": 7550,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
},
{
"path": "lib/nms/.gitignore",
"chars": 0,
"preview": ""
},
{
"path": "lib/nms/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "lib/nms/cpu_nms.c",
"chars": 292233,
"preview": "/* Generated by Cython 0.20.1 on Wed Oct 5 13:15:30 2016 */\n\n#define PY_SSIZE_T_CLEAN\n#ifndef CYTHON_USE_PYLONG_INTERNA"
},
{
"path": "lib/nms/cpu_nms.pyx",
"chars": 2241,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/nms/gpu_nms.cpp",
"chars": 269661,
"preview": "/* Generated by Cython 0.20.1 on Wed Oct 5 13:15:30 2016 */\n\n#define PY_SSIZE_T_CLEAN\n#ifndef CYTHON_USE_PYLONG_INTERNA"
},
{
"path": "lib/nms/gpu_nms.hpp",
"chars": 146,
"preview": "void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,\n int boxes_dim, float nms_overla"
},
{
"path": "lib/nms/gpu_nms.pyx",
"chars": 1110,
"preview": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed unde"
},
{
"path": "lib/nms/nms_kernel.cu",
"chars": 5064,
"preview": "// ------------------------------------------------------------------\n// Faster R-CNN\n// Copyright (c) 2015 Microsoft\n//"
},
{
"path": "lib/nms/py_cpu_nms.py",
"chars": 1051,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/roi_data_layer/__init__.py",
"chars": 248,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/roi_data_layer/layer.py",
"chars": 3001,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/roi_data_layer/minibatch.py",
"chars": 2602,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/roi_data_layer/minibatch.py~",
"chars": 2652,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/roi_data_layer/roidb.py",
"chars": 1963,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/setup.py",
"chars": 5587,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/setup.py~",
"chars": 5587,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/.gitignore",
"chars": 20,
"preview": "*.c\n*.cpp\n*.h\n*.hpp\n"
},
{
"path": "lib/utils/__init__.py",
"chars": 248,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/bbox.pyx",
"chars": 2994,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/blob.py",
"chars": 1504,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/boxes_grid.py",
"chars": 2599,
"preview": "# --------------------------------------------------------\n# Subcategory CNN\n# Copyright (c) 2015 CVGL Stanford\n# Licens"
},
{
"path": "lib/utils/nms.py",
"chars": 1008,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/nms.pyx",
"chars": 4103,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "lib/utils/timer.py",
"chars": 948,
"preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
},
{
"path": "tools/_init_paths.py",
"chars": 324,
"preview": "import os.path as osp\nimport sys\n\ndef add_path(path):\n if path not in sys.path:\n sys.path.insert(0, path)\n\nthi"
},
{
"path": "tools/demo.py~",
"chars": 5396,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD.py",
"chars": 7889,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD.py~",
"chars": 7866,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_socket.py",
"chars": 21248,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_socket.py~",
"chars": 18681,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_socket_drawer.py~",
"chars": 21574,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_socket_save_to_rgbd.py~",
"chars": 18111,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_vis_mask.py",
"chars": 8021,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/demo_graspRGD_vis_select.py",
"chars": 8270,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/eval_graspRGD.py~",
"chars": 11398,
"preview": "#!/usr/bin/env python\n\n# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed u"
},
{
"path": "tools/mask_gen.py",
"chars": 125,
"preview": "import numpy as np \nmask = np.zeros((227,227))\nmask[70:120,90:150]=1\nimport scipy.miscscipy.misc.imsave('mask.jpg', m"
},
{
"path": "tools/trainval_net.py",
"chars": 4531,
"preview": "# --------------------------------------------------------\n# Tensorflow Faster R-CNN\n# Licensed under The MIT License [s"
}
]
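One preview worth flagging: tools/mask_gen.py relies on scipy.misc.imsave, which has been removed from modern SciPy releases. A minimal runnable equivalent is sketched below, with imageio.imwrite substituted for the removed call; the substitution and the 0-255 scaling are my assumptions, not the repository's code.

import numpy as np
import imageio  # stands in for the removed scipy.misc.imsave

# 227x227 mask with a rectangular foreground region set to 1,
# matching the hard-coded indices in tools/mask_gen.py
mask = np.zeros((227, 227))
mask[70:120, 90:150] = 1

# Scale to 0-255 so the saved JPEG is visibly black and white
imageio.imwrite('mask.jpg', (mask * 255).astype(np.uint8))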
About this extraction
This document contains the full source code of the ivalab/grasp_multiObject_multiGrasp GitHub repository, formatted as plain text: 86 files (947.1 KB, approximately 298.3k tokens) plus a symbol index of 484 extracted functions, classes, methods, constants, and types.