[
  {
    "path": ".gitignore",
    "content": "*~\n"
  },
  {
    "path": "README.md",
    "content": "# Weakly Supervised Deep Detection Networks (WSDDN)\n\n\n## Installation\n1. Download and install [MatConvNet](http://www.vlfeat.org/matconvnet/install/)\n2. Install this module with MatConvNet's package manager, [`vl_contrib`](http://www.vlfeat.org/matconvnet/mfiles/vl_contrib/#notes):\n\n```\n    vl_contrib('install', 'WSDDN') ;\n    vl_contrib('setup', 'WSDDN') ;\n```\n\n3. If you want to train a WSDDN model, `wsddn_train` will automatically download the items below:\n\n    a.  [PASCAL VOC 2007 devkit and dataset](http://host.robots.ox.ac.uk/pascal/VOC/) under the `data` folder\n\n    b.  Pre-computed edge-boxes for [trainval](http://groups.inf.ed.ac.uk/hbilen-data/data/WSDDN/EdgeBoxesVOC2007trainval.mat) and [test](http://groups.inf.ed.ac.uk/hbilen-data/WSDDN/EdgeBoxesVOC2007test.mat) splits\n\n    c.  Pre-trained network from the [MatConvNet website](http://www.vlfeat.org/matconvnet/models)\n\n4. You can also download the pre-trained WSDDN model ([VGGF-EB-BoxSc-SpReg](http://groups.inf.ed.ac.uk/hbilen-data/data/WSDDN/wsddn.mat)). Note that its performance differs slightly from the figure reported in the paper (34.4% mAP instead of 34.5% mAP).\n\n\n## Demo\n\nAfter completing the installation and downloading the required files, you are ready to run the demo:\n\n```matlab\n            cd scripts;\n            opts.modelPath = '....' ;\n            opts.imdbPath = '....' ;\n            opts.gpu = .... ;\n            wsddn_demo(opts) ;\n                        \n```\n\n## Test\n\n```matlab\n            addpath scripts;\n            opts.modelPath = '....' ;\n            opts.imdbPath = '....' ;\n            opts.gpu = .... ;\n            opts.vis = true ; % visualize\n            wsddn_test(opts) ;\n                        \n```\n\n## Train\n\nDownload an ImageNet pre-trained model from [http://www.vlfeat.org/matconvnet/pretrained/](http://www.vlfeat.org/matconvnet/pretrained/) and point `opts.modelPath` to it:\n\n```matlab\n            addpath scripts;\n            opts.modelPath = '....' 
;\n            opts.imdbPath = '....' ;\n            opts.train.gpus = .... ;\n            [net,info] = wsddn_train(opts) ;\n                        \n```\n\n## Citing WSDDN\nIf you find the code useful, please cite:\n\n```latex\n    @inproceedings{Bilen16,\n      author     = \"Bilen, H. and Vedaldi, A.\",\n      title      = \"Weakly Supervised Deep Detection Networks\",\n      booktitle  = \"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition\",\n      year       = \"2016\"\n    }\n```\n\n## Acknowledgement\nMany thanks to Sam Albanie for his help with the contrib package manager, and to the other nameless heroes who diligently found my bugs.\n\n## License\nThe analysis work performed with the program(s) must be non-proprietary work. Licensee and its contract users must be or be affiliated with an academic facility. Licensee may additionally permit individuals who are students at such academic facility to access and use the program(s). Such students will be considered contract users of licensee. The program(s) may not be used for commercial competitive analysis (such as benchmarking) or for any commercial activity, including consulting.\n"
  },
  {
    "path": "core/wsddn_demo.m",
    "content": "function wsddn_demo(varargin)\n% @author: Hakan Bilen\n% wsddn_demo : this script shows a detection demo\n\nopts.dataDir = fullfile(vl_rootnn, 'data') ;\nopts.expDir = fullfile(vl_rootnn, 'exp') ;\nopts.imdbPath = fullfile(vl_rootnn, 'data', 'imdbs', 'imdb-eb.mat');\nopts.modelPath = fullfile(vl_rootnn, 'exp', 'net.mat') ;\nopts.proposalType = 'eb' ;\nopts.proposalDir = fullfile(vl_rootnn, 'data','EdgeBoxes') ;\n\n% if you have limited gpu memory (<6gb), you can change the next 2 params\nopts.maxNumProposals = inf; % limit number\n% opts.imageScales = [480,576,688,864,1200]; % scales\nopts.imageScales = [480,576,688,864,1200]; % scales\n\nopts.gpu = [] ;\nopts.train.prefetch = true ;\n\nopts.numFetchThreads = 1 ;\nopts = vl_argparse(opts, varargin) ;\n\ndisplay(opts);\nif ~exist(fullfile(opts.dataDir,'VOCdevkit','VOCcode','VOCinit.m'),'file')\n  error('VOCdevkit is not installed');\nend\naddpath(fullfile(opts.dataDir,'VOCdevkit','VOCcode'));\nopts.train.expDir = opts.expDir ;\n% -------------------------------------------------------------------------\n%                                                    Network initialization\n% -------------------------------------------------------------------------\n\nif ~exist(opts.modelPath, 'file')\n  url = 'http://groups.inf.ed.ac.uk/hbilen-data/data/WSDDN/wsddn.mat' ;\n  fprintf('Downloading %s to %s\\n', url, opts.modelPath) ;\n  urlwrite(url, opts.modelPath) ;\nend\n\nnet = load(opts.modelPath);\nnet = dagnn.DagNN.loadobj(net) ;\n\nnet.mode = 'test' ;\nif ~isempty(opts.gpu)\n  gpuDevice(opts.gpu) ;\n  net.move('gpu') ;\nend\n\nif isfield(net,'normalization')\n  bopts = net.normalization;\nelse\n  bopts = net.meta.normalization;\nend\n\nbopts.rgbVariance = [] ;\nbopts.interpolation = net.meta.normalization.interpolation;\nbopts.jitterBrightness = 0 ;\nbopts.imageScales = opts.imageScales;\nbopts.numThreads = opts.numFetchThreads;\nbs = find(arrayfun(@(a) isa(a.block, 'dagnn.BiasSamples'), 
net.layers)==1);\nbopts.addBiasSamples = ~isempty(bs) ;\nbopts.vgg16 = any(arrayfun(@(a) strcmp(a.name, 'relu5_1'), net.layers)==1) ;\n\n% -------------------------------------------------------------------------\n%                                                   Database initialization\n% -------------------------------------------------------------------------\nfprintf('loading imdb...');\nif exist(opts.imdbPath,'file')==2\n  imdb = load(opts.imdbPath) ;\nelse\n  imdb = setup_voc07_eb('dataDir',opts.dataDir, ...\n    'proposalDir',opts.proposalDir,'loadTest',1);\n    \n  save(opts.imdbPath,'-struct', 'imdb', '-v7.3');\nend\n\nfprintf('done\\n');\nminSize = 20;\nimdb = fixBBoxes(imdb, minSize, opts.maxNumProposals);\n\n% --------------------------------------------------------------------\n%                                                               Detect\n% --------------------------------------------------------------------\n% query images\ntestIdx = [12,15];\n\nVOCinit;\ncats = VOCopts.classes;\novTh = 0.4; % nms threshold\nscTh = 0.1; % det confidence threshold\n\nbopts.useGpu = numel(opts.gpu) >  0 ;\n\ndetLayer = find(arrayfun(@(a) strcmp(a.name, 'xTimes'), net.vars)==1);\n\nnet.vars(detLayer(1)).precious = 1;\n% run detection\nrcolors = randi(255,3,numel(cats));\nfor t=1:numel(testIdx)\n  batch = testIdx(t);  \n  \n  scoret = [];\n  for s=1:numel(opts.imageScales)\n    for f=1:2 % add flips\n      inputs = getBatch(bopts, imdb, batch, opts.imageScales(s), f-1 );\n      net.eval(inputs) ;\n  \n      if isempty(scoret)\n        scoret = squeeze(gather(net.vars(detLayer).value));\n      else\n        scoret = scoret + squeeze(gather(net.vars(detLayer).value));\n      end\n    end\n  end\n  \n  % divide by number of scales and flips\n  scoret = scoret / (2 * numel(opts.imageScales));\n  im = imread(fullfile(imdb.imageDir,imdb.images.name{testIdx(t)}));\n  \n  for cls = 1:numel(cats)\n    scores = scoret;\n    boxes  = 
double(imdb.images.boxes{testIdx(t)});\n    boxesSc = [boxes,scores(cls,:)'];\n    boxesSc = boxesSc(boxesSc(:,5)>scTh,:);\n    if isempty(boxesSc), continue; end;\n    \n    pick = nms(boxesSc, ovTh);\n    boxesSc = boxesSc(pick,:);\n    im = bbox_draw(im,boxesSc(1,1:4),rcolors(:,cls),2);\n    fprintf('%s %.2f\\n',cats{cls},boxesSc(1,5));\n  end\n  imshow(im);\n  pause() ;\n  if exist('zs_dispFig', 'file'), zs_dispFig ; end\nend\n\n\n\n% --------------------------------------------------------------------\nfunction inputs = getBatch(opts, imdb, batch, scale, flip)\n% --------------------------------------------------------------------\n\nopts.scale = scale;\nopts.flip = flip;\nis_vgg16 = opts.vgg16 ;\nopts = rmfield(opts,'vgg16') ;\n\nimages = strcat([imdb.imageDir filesep], imdb.images.name(batch)) ;\nopts.prefetch = (nargout == 0);\n\n[im,rois] = wsddn_get_batch(images, imdb, batch, opts);\n\n\nrois = single(rois');\nif opts.useGpu > 0\n  im = gpuArray(im) ;\n  rois = gpuArray(rois) ;\nend\nrois = rois([1 3 2 5 4],:) ;\n\n\nss = [16 16] ;\nif is_vgg16\n  o0 = 8.5 ;\n  o1 = 9.5 ;\nelse\n  o0 = 18 ;\n  o1 = 9.5 ;\nend\nrois = [ rois(1,:);\n        floor((rois(2,:) - o0 + o1) / ss(1) + 0.5) + 1;\n        floor((rois(3,:) - o0 + o1) / ss(2) + 0.5) + 1;\n        ceil((rois(4,:) - o0 - o1) / ss(1) - 0.5) + 1;\n        ceil((rois(5,:) - o0 - o1) / ss(2) - 0.5) + 1];\n\n      \ninputs = {'input', im, 'rois', rois} ;\n  \n  \nif opts.addBiasSamples && isfield(imdb.images,'boxScores')\n  boxScore = reshape(imdb.images.boxScores{batch},[1 1 1 numel(imdb.images.boxScores{batch})]);\n  inputs{end+1} = 'boxScore';\n  inputs{end+1} = boxScore ; \nend\n\n\n% -------------------------------------------------------------------------\nfunction imdb = fixBBoxes(imdb, minSize, maxNum)\n% -------------------------------------------------------------------------\n\nfor i=1:numel(imdb.images.name)\n  bbox = imdb.images.boxes{i};\n  % remove small bbox\n  isGood = 
(bbox(:,3)>=bbox(:,1)+minSize) & (bbox(:,4)>=bbox(:,2)+minSize);\n  bbox = bbox(isGood,:);\n  % remove duplicate ones\n  [dummy, uniqueIdx] = unique(bbox, 'rows', 'first');\n  uniqueIdx = sort(uniqueIdx);\n  bbox = bbox(uniqueIdx,:);\n  % limit number for training\n  if imdb.images.set(i)~=3\n    nB = min(size(bbox,1),maxNum);\n  else\n    nB = size(bbox,1);\n  end\n  \n  if isfield(imdb.images,'boxScores')\n    % apply the same filtering to the scores so they stay aligned with bbox\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(isGood);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(uniqueIdx);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(1:nB);\n  end\n  imdb.images.boxes{i} = bbox(1:nB,:);\n  %   [h,w,~] = size(imdb.images.data{i});\n  %   imdb.images.boxes{i} = [1 1 h w];\n  \nend\n\n% -------------------------------------------------------------------------\nfunction im = bbox_draw(im,roi,color,t)\n% BBOX_DRAW  Draw a rectangle outline into an image\n% IM : input image\n% ROI : rectangle [y1 x1 y2 x2] in pixels\n% COLOR : one value per image channel\n% T : half-width of the outline in pixels\n\n[h,w,d] = size(im);\nassert(d == numel(color));\nif any(roi(:,1)>h) || any(roi(:,3)>h) || any(roi(:,2)>w) || any(roi(:,4)>w)\n  error('Wrong bounding box coordinates.');\nend\nfor c=1:d\n  im(max(roi(1)-t,1):min(roi(1)+t,h),max(roi(2)-t,1):min(roi(4)+t,w),c) = color(c);\n  im(max(roi(3)-t,1):min(roi(3)+t,h),max(roi(2)-t,1):min(roi(4)+t,w),c) = color(c);\n  im(max(roi(1)-t,1):min(roi(3)+t,h),max(roi(2)-t,1):min(roi(2)+t,w),c) = color(c);\n  im(max(roi(1)-t,1):min(roi(3)+t,h),max(roi(4)-t,1):min(roi(4)+t,w),c) = color(c);\nend\n"
  },
  {
    "path": "core/wsddn_get_batch.m",
    "content": "function [imo,rois] = wsddn_get_batch(images, imdb, batch, opts)\n% cnn_wsddn_get_batch  Load, preprocess, and pack images for CNN evaluation\n\nif isempty(images)\n  imo = [] ;\n  rois = [] ;\n  return ;\nend\n\n% fetch is true if images is a list of filenames (instead of\n% a cell array of images)\nfetch = ischar(images{1}) ;\n\n% prefetch is used to load images in a separate thread\nprefetch = fetch & opts.prefetch ;\n\n% pick size\nimSize = imdb.images.size(batch(1),:);\nfactor = min(opts.scale(1)/imSize(1),opts.scale(1)/imSize(2));\nheight = floor(factor*imSize(1));\n\nif prefetch\n  vl_imreadjpeg(images, 'numThreads',opts.numThreads,'Resize',height,'prefetch') ;\n  imo = [] ;\n  rois = [] ;\n  return ;\nend\n\nif fetch\n  ims = vl_imreadjpeg(images,'numThreads',opts.numThreads,'Resize',height) ;\nelse\n  ims = images ;\nend\n\nfor i=1:numel(images)\n  % acquire image\n  if isempty(ims{i})\n    imt = imread(images{i}) ;\n    if size(imt,3) == 1\n      imt = cat(3, imt, imt, imt) ;\n    end\n    \n    ims{i} = imresize(imt,factor,'Method',opts.interpolation);\n    ims{i} = single(ims{i}) ; % faster than im2single (and multiplies by 255)\n  end\nend\n\n\n\nbboxes = cell(1,numel(batch));\nnBoxes = 0;\nfor b=1:numel(batch)\n  bboxes{b} = double(imdb.images.boxes{batch(b)});\n  nBoxes = nBoxes + size(bboxes{b},1);\nend\n \n\nrois = zeros(nBoxes,5);\ncountr = 0;\n\nmaxW = 0;\nmaxH = 0;\n\n\n\nfor b=1:numel(batch)\n  \n  hw = imdb.images.size(batch(b),:);\n  h = hw(1);\n  w = hw(2);\n  \n  imsz = size(ims{b});\n  \n  if opts.flip(b)\n    im = ims{b};\n    ims{b} = im(:,end:-1:1,:);\n    \n    bbox = bboxes{b};\n    bbox(:,[2,4]) = w + 1 - bbox(:,[4,2]);\n    bboxes{b} = bbox;\n  end\n  \n\n  maxH = max(imsz(1),maxH);\n  maxW = max(imsz(2),maxW);\n \n  % adapt bounding boxes into new coord\n  bbox = bboxes{b};\n  if any(bbox(:)<=0)\n    error('bbox error');\n  end\n  nB = size(bbox,1);\n  tbbox = scale_box(bbox,[h,w],imsz);\n  if any(tbbox(:)<=0)\n    
error('tbbox error');\n  end\n\n  rois(countr+1:countr+nB,:) = [b*ones(nB,1),tbbox];\n  countr = countr + nB;\nend\n\n% rois = single(rois);\ndepth = size(ims{1},3);\nimo = zeros(maxH,maxW,depth,numel(batch),'single');\n\nif isempty(opts.averageImage)\n  avgIm = [];\nelseif numel(opts.averageImage)==depth\n  avgIm = opts.averageImage;\nend\n\n\nfor b=1:numel(batch)\n  sz = size(ims{b});\n\n  imo(1:sz(1),1:sz(2),:,b) = single(ims{b});\n  \n  if ~isempty(avgIm)\n    imo(1:sz(1),1:sz(2),:,b) = single(bsxfun(@minus,imo(1:sz(1),1:sz(2),:,b),opts.averageImage));\n  end\n  if ~isempty(opts.rgbVariance)\n    imo(1:sz(1),1:sz(2),:,b) = bsxfun(@plus, imo(1:sz(1),1:sz(2),:,b), ...\n        reshape(opts.rgbVariance * randn(3,1), 1,1,3)) ;\n  end\nend\n\n\nfunction boxOut = scale_box(boxIn,szIn,szOut)\n  \n  h = szIn(1);\n  w = szIn(2);\n\n  bxr = 0.5 * (boxIn(:,2)+boxIn(:,4)) / w;\n  byr = 0.5 * (boxIn(:,1)+boxIn(:,3)) / h;\n \n  bwr = (boxIn(:,4)-boxIn(:,2)+1) / w;\n  bhr = (boxIn(:,3)-boxIn(:,1)+1) / h;\n  \n  % boxIn center in new coord\n  byhat = (szOut(1) * byr);\n  bxhat = (szOut(2) * bxr);\n  \n  % relative width, height\n  bhhat = szOut(1) * bhr;\n  bwhat = szOut(2) * bwr;\n  \n  % transformed boxIn\n  boxOut = [max(1,round(byhat - 0.5 * bhhat)),...\n    max(1,round(bxhat - 0.5 * bwhat)), ...\n    min(szOut(1),round(byhat + 0.5 * bhhat)),...\n    min(szOut(2),round(bxhat + 0.5 * bwhat))];\n\n"
  },
  {
    "path": "core/wsddn_init.m",
    "content": "% --------------------------------------------------------------------\nfunction net = wsddn_init(net,varargin)\n% --------------------------------------------------------------------\n% @author: Hakan Bilen\n% wsddn_init : this script initalise WSDDN model\n\nopts.addBiasSamples = 1 ;\nopts.softmaxTempCls = 1 ;\nopts.softmaxTempDet = 2 ;\nopts.addLossSmooth  = 1 ;\nopts.averageImage = [] ;\nopts.rgbVariance = [] ;\nopts.numClasses = 1 ;\nopts.classNames = {''} ;\n\nopts = vl_argparse(opts, varargin) ;\n\n% add drop-out layers\nrelu6p = find(cellfun(@(a) strcmp(a.name, 'relu6'), net.layers)==1);\nrelu7p = find(cellfun(@(a) strcmp(a.name, 'relu7'), net.layers)==1);\n\ndrop6 = struct('type', 'dropout', 'rate', 0.5, 'name','drop6');\ndrop7 = struct('type', 'dropout', 'rate', 0.5, 'name','drop7');\nnet.layers = [net.layers(1:relu6p) drop6 net.layers(relu6p+1:relu7p) drop7 net.layers(relu7p+1:end)];\n\n\n% change loss fc layer\nfc8p = (cellfun(@(a) strcmp(a.name, 'fc8'), net.layers)==1);\nnet.layers{fc8p}.weights{1} = 0.01 * ...\n  randn(1,1,size(net.layers{fc8p}.weights{1},3),opts.numClasses,'single');\n\nnet.layers{fc8p}.weights{2} = zeros(1, opts.numClasses, 'single');\nnet.layers{fc8p}.name = 'fc8C';\n\nnet.layers(end) = [] ;\n% add loss (this will be changed to binary log at the end)\n% net.layers{end} = struct('name','loss', 'type','softmaxloss') ;\n\n% add detection layer\nclsLayerPos  = (cellfun(@(a) strcmp(a.name, 'fc8C'), net.layers)==1);\ndetLayer = net.layers{clsLayerPos};\ndetLayer.weights{1} = 0.01 * randn(1,1,size(detLayer.weights{1},3),opts.numClasses,'single');\n% detLayer.weights{1} = zeros(1,1,size(detLayer.weights{1},3),opts.numClasses,'single');\ndetLayer.weights{2} = zeros(1, opts.numClasses, 'single');\n\ndetLayer.name = 'fc8R';\n\n% remove pool5\npPool5 = find(cellfun(@(a) strcmp(a.name, 'pool5'), net.layers)==1);\nnet.layers = [net.layers([1:pPool5-1,pPool5+1:end]) detLayer];\n\n% convert to dagnn\nnet = 
dagnn.DagNN.fromSimpleNN(net, 'canonicalNames', true) ;\n\n% fix fc8R\npFc8R = (arrayfun(@(a) strcmp(a.name, 'fc8R'), net.layers)==1);\npFc8C = (arrayfun(@(a) strcmp(a.name, 'fc8C'), net.layers)==1);\n\nnet.layers(pFc8R).inputs = net.layers(pFc8C).inputs;\nnet.layers(pFc8R).inputIndexes = net.layers(pFc8C).inputIndexes;\n\n% add spp\n\npRelu5 = (arrayfun(@(a) strcmp(a.name, 'relu5'), net.layers)==1);\nvggdeep = 0;\nif all(pRelu5==0)\n  pRelu5 = (arrayfun(@(a) strcmp(a.name, 'relu5_3'), net.layers)==1);\n  assert(any(pRelu5==1));\n  vggdeep = 1;\nend\npFc6 = (arrayfun(@(a) strcmp(a.name, 'fc6'), net.layers)==1);\n\n% add spp (offset1 = rf offset, offset2 = shrinking factor)\n% offset1=18  offset2=9.5 levels=6 for vgg-f and vgg-m-1024\n% offset1=8.5 offset2=9.5 levels=7 for vgg-very-deep-16\nif vggdeep\n  net.addLayer('SPP', dagnn.ROIPooling('subdivisions',[7 7],...\n    'transform',1), ...\n    {net.layers(pRelu5).outputs{1},'rois'}, ...\n    'xSPP');\nelse\n  net.addLayer('SPP', dagnn.ROIPooling('subdivisions',[6 6],...\n    'transform',1), ...\n    {net.layers(pRelu5).outputs{1},'rois'}, ...\n    'xSPP');\nend\n\n\nif opts.addBiasSamples\n  % add boost\n  net.addLayer('boostBox', ...\n    dagnn.BiasSamples('scale',10), ...\n    {'xSPP','boxScore'},'xBoostBox');\n  net.layers(pFc6).inputs{1} = 'xBoostBox';\nelse\n  net.layers(pFc6).inputs{1} = 'xSPP';\nend\n\n\n\n% add softmax layer for det\npFc8R = (arrayfun(@(a) strcmp(a.name, 'fc8R'), net.layers)==1);\nnet.addLayer('softmaxDet', ...\n  dagnn.SoftMax2('dim',4, 'temp',opts.softmaxTempDet), ...\n  net.layers(pFc8R).outputs{1},'xSoftmaxDet');\n\n% add softmax layers for cls\npFc8C = (arrayfun(@(a) strcmp(a.name, 'fc8C'), net.layers)==1);\nnet.layers(pFc8C).outputs{1} = 'xfc8C';\n\nnet.addLayer('softmaxCls', ...\n  dagnn.SoftMax2('dim',3, 'temp',opts.softmaxTempCls), ...\n  net.layers(pFc8C).outputs{1},'xSoftmaxCls');\n\n% add times layer\nnet.addLayer('timesCR', ...\n  dagnn.Times(), ...\n  
{'xSoftmaxCls','xSoftmaxDet'},'xTimes');\n\n% add sum layer\nnet.addLayer('sum', ...\n  dagnn.SumOverDim('dim',4), ...\n  'xTimes','prediction');\n\n\n\n% add classification AP\nnet.addLayer('mAP', dagnn.LayerAP('cls_index',1:opts.numClasses), ...\n  {'prediction','label', 'ids'}, 'mAP') ;\n\nnet.addLayer('loss', dagnn.Loss('loss','binarylog'), ...\n  {'prediction','label'}, 'objective') ;\n\n\n% no decay for bias\nfor i=2:2:numel(net.params)\n  net.params(i).weightDecay = 0;\nend\n\nif opts.addLossSmooth\n  net.addLayer('LossTopBoxSmooth',dagnn.LossTopBoxSmoothProb('minOverlap',0.6),...\n    {net.layers(pFc8R).inputs{1},'boxes','xTimes','label'},...\n    'lossTopB');\nend\nmeta = net.meta ; \nnet.meta = [] ;\nnet.meta.normalization.interpolation = meta.normalization.interpolation ;\nnet.meta.normalization.averageImage  = opts.averageImage ;\nnet.meta.normalization.rgbVariance   = opts.rgbVariance ;\nnet.meta.classes.name = {'aeroplane', 'bicycle', 'bird', ...\n    'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', ...\n    'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', ...\n    'sofa', 'train', 'tvmonitor', 'background' };"
  },
  {
    "path": "core/wsddn_test.m",
    "content": "function aps = wsddn_test(varargin)\n% @author: Hakan Bilen\n% wsddn_test : this script evaluates detection performance in PASCAL VOC\n% dataset for given a WSDDN model\n\nopts.dataDir = fullfile(vl_rootnn, 'data') ;\nopts.expDir = fullfile(vl_rootnn, 'exp') ;\nopts.imdbPath = fullfile(vl_rootnn, 'data', 'imdbs', 'imdb-eb.mat');\nopts.modelPath = fullfile(vl_rootnn, 'exp', 'net.mat') ;\nopts.proposalType = 'eb' ;\nopts.proposalDir = fullfile(vl_rootnn, 'data','EdgeBoxes') ;\n\n% if you have limited gpu memory (<6gb), you can change the next 2 params\nopts.maxNumProposals = inf; % limit number\nopts.imageScales = [480,576,688,864,1200]; % scales\n\nopts.gpu = [] ;\nopts.train.prefetch = true ;\nopts.vis = 0 ;\nopts.numFetchThreads = 1 ;\nopts = vl_argparse(opts, varargin) ;\n\ndisplay(opts);\nif ~exist(fullfile(opts.dataDir,'VOCdevkit','VOCcode','VOCinit.m'),'file')\n  error('VOCdevkit is not installed');\nend\naddpath(fullfile(opts.dataDir,'VOCdevkit','VOCcode'));\nopts.train.expDir = opts.expDir ;\n% -------------------------------------------------------------------------\n%                                                    Network initialization\n% -------------------------------------------------------------------------\nnet = load(opts.modelPath);\n% figure(2) ;\nif isfield(net,'net')\n  net = net.net;\nend\nnet = dagnn.DagNN.loadobj(net) ;\n\nnet.mode = 'test' ;\nif ~isempty(opts.gpu)\n  gpuDevice(opts.gpu) ;\n  net.move('gpu') ;\nend\n\nif isfield(net,'normalization')\n  bopts = net.normalization;\nelse\n  bopts = net.meta.normalization;\nend\n\nbopts.rgbVariance = [] ;\nbopts.interpolation = net.meta.normalization.interpolation;\nbopts.jitterBrightness = 0 ;\nbopts.imageScales = opts.imageScales;\nbopts.numThreads = opts.numFetchThreads;\nbs = find(arrayfun(@(a) isa(a.block, 'dagnn.BiasSamples'), net.layers)==1);\nbopts.addBiasSamples = ~isempty(bs) ;\nbopts.vgg16 = any(arrayfun(@(a) strcmp(a.name, 'relu5_1'), net.layers)==1) ;\n% 
-------------------------------------------------------------------------\n%                                                   Database initialization\n% -------------------------------------------------------------------------\nfprintf('loading imdb...');\nif exist(opts.imdbPath,'file')==2\n  imdb = load(opts.imdbPath) ;\nelse\n  imdb = cnn_voc07_eb_setup_data('dataDir',opts.dataDir, ...\n    'proposalDir',opts.proposalDir,'loadTest',1);\n  save(opts.imdbPath,'-struct', 'imdb', '-v7.3');\nend\n\nfprintf('done\\n');\nminSize = 20;\nimdb = fixBBoxes(imdb, minSize, opts.maxNumProposals);\n\nVOCinit;\nVOCopts.testset = 'test';\nVOCopts.annopath = fullfile(opts.dataDir,'VOCdevkit','VOC2007','Annotations','%s.xml');\nVOCopts.imgsetpath = fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','%s.txt');\nVOCopts.localdir = fullfile(opts.dataDir,'VOCdevkit','local','VOC2007');\ncats = VOCopts.classes;\novTh = 0.4;\nscTh = 1e-3;\n% --------------------------------------------------------------------\n%                                                               Detect\n% --------------------------------------------------------------------\nif strcmp(VOCopts.testset,'test')\n  testIdx = find(imdb.images.set == 3);\nelseif strcmp(VOCopts.testset,'trainval')\n  testIdx = find(imdb.images.set < 3);\nend\nbopts.useGpu = numel(opts.gpu) >  0 ;\n\nscores = cell(1,numel(testIdx));\nboxes = imdb.images.boxes(testIdx);\nnames = imdb.images.name(testIdx);\n\ndetLayer = find(arrayfun(@(a) strcmp(a.name, 'xTimes'), net.vars)==1);\nnet.vars(detLayer(1)).precious = 1;\n% run detection\nstart = tic ;\nfor t=1:numel(testIdx)\n  batch = testIdx(t);  \n  \n  scoret = [];\n  for s=1:numel(opts.imageScales)\n    for f=1:2 % add flips\n      inputs = getBatch(bopts, imdb, batch, opts.imageScales(s), f-1 );\n      net.eval(inputs) ;\n  \n      if isempty(scoret)\n        scoret = squeeze(gather(net.vars(detLayer).value));\n      else\n        scoret = scoret + 
squeeze(gather(net.vars(detLayer).value));\n      end\n    end\n  end\n  scores{t} = scoret;\n  % show speed\n  time = toc(start) ;\n  n = t * 2 * numel(opts.imageScales) ; % number of images processed overall\n  speed = n/time ;\n  if mod(t,10)==0\n    fprintf('test %d / %d speed %.1f Hz\\n',t,numel(testIdx),speed);\n  end\n  \n  \n  if opts.vis\n    for cls = 1:numel(cats)\n      idx = (scores{t}(cls,:)>0.05);\n      if sum(idx)==0, continue;end\n        % divide by number of scales and flips\n  \n      im = imread(fullfile(imdb.imageDir,imdb.images.name{testIdx(t)}));\n      boxest  = double(imdb.images.boxes{testIdx(t)}(idx,:));\n      scorest = scores{t}(cls,idx)' / (2 * numel(opts.imageScales));\n      boxesSc = [boxest,scorest];\n      pick = nms(boxesSc, ovTh);\n      boxesSc = boxesSc(pick,:);\n      figure(1) ;\n      im = bbox_draw(im,boxesSc(1,[2 1 4 3 5]));\n      fprintf('%s %.2f',cats{cls},boxesSc(1,5));\n     \n      fprintf('\\n') ;\n      title(cats{cls});\n      pause;\n\n    end\n  end  \nend\n\ndets.names  = names;\ndets.scores = scores;\ndets.boxes  = boxes;\n\n% --------------------------------------------------------------------\n%                                                PASCAL VOC evaluation\n% --------------------------------------------------------------------\n\naps = zeros(numel(cats),1);\nfor cls = 1:numel(cats)\n  \n  vocDets.confidence = [];\n  vocDets.bbox       = [];\n  vocDets.ids        = [];\n\n  for i=1:numel(dets.names)\n    \n    scores = double(dets.scores{i});\n    boxes  = double(dets.boxes{i});\n    \n    boxesSc = [boxes,scores(cls,:)'];\n    boxesSc = boxesSc(boxesSc(:,5)>scTh,:);\n    pick = nms(boxesSc, ovTh);\n    boxesSc = boxesSc(pick,:);\n    \n    vocDets.confidence = [vocDets.confidence;boxesSc(:,5)];\n    vocDets.bbox = [vocDets.bbox;boxesSc(:,[2 1 4 3])];\n    vocDets.ids = [vocDets.ids; repmat({dets.names{i}(1:6)},size(boxesSc,1),1)];\n    \n  end\n  [rec,prec,ap] = 
wsddnVOCevaldet(VOCopts,cats{cls},vocDets,0);\n  \n  fprintf('%s %.1f\\n',cats{cls},100*ap);\n  aps(cls) = ap;\nend\n\n% --------------------------------------------------------------------\nfunction inputs = getBatch(opts, imdb, batch, scale, flip)\n% --------------------------------------------------------------------\n\nopts.scale = scale;\nopts.flip = flip;\nis_vgg16 = opts.vgg16 ;\nopts = rmfield(opts,'vgg16') ;\n\nimages = strcat([imdb.imageDir filesep], imdb.images.name(batch)) ;\nopts.prefetch = (nargout == 0);\n\n[im,rois] = wsddn_get_batch(images, imdb, batch, opts);\n\n\nrois = single(rois');\nif opts.useGpu > 0\n  im = gpuArray(im) ;\n  rois = gpuArray(rois) ;\nend\nrois = rois([1 3 2 5 4],:) ;\n\n\nss = [16 16] ;\nif is_vgg16\n  o0 = 8.5 ;\n  o1 = 9.5 ;\nelse\n  o0 = 18 ;\n  o1 = 9.5 ;\nend\nrois = [ rois(1,:);\n        floor((rois(2,:) - o0 + o1) / ss(1) + 0.5) + 1;\n        floor((rois(3,:) - o0 + o1) / ss(2) + 0.5) + 1;\n        ceil((rois(4,:) - o0 - o1) / ss(1) - 0.5) + 1;\n        ceil((rois(5,:) - o0 - o1) / ss(2) - 0.5) + 1];\n\n      \ninputs = {'input', im, 'rois', rois} ;\n  \n  \nif opts.addBiasSamples && isfield(imdb.images,'boxScores')\n  boxScore = reshape(imdb.images.boxScores{batch},[1 1 1 numel(imdb.images.boxScores{batch})]);\n  inputs{end+1} = 'boxScore';\n  inputs{end+1} = boxScore ; \nend\n\n\n% -------------------------------------------------------------------------\nfunction imdb = fixBBoxes(imdb, minSize, maxNum)\n\nfor i=1:numel(imdb.images.name)\n  bbox = imdb.images.boxes{i};\n  % remove small bbox\n  isGood = (bbox(:,3)>=bbox(:,1)+minSize) & (bbox(:,4)>=bbox(:,2)+minSize);\n  bbox = bbox(isGood,:);\n  % remove duplicate ones\n  [dummy, uniqueIdx] = unique(bbox, 'rows', 'first');\n  uniqueIdx = sort(uniqueIdx);\n  bbox = bbox(uniqueIdx,:);\n  % limit number for training\n  if imdb.images.set(i)~=3\n    nB = min(size(bbox,1),maxNum);\n  else\n    nB = size(bbox,1);\n  end\n  \n  if isfield(imdb.images,'boxScores')\n    
imdb.images.boxScores{i} = imdb.images.boxScores{i}(isGood);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(uniqueIdx);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(1:nB);\n  end\n  imdb.images.boxes{i} = bbox(1:nB,:);\n  %   [h,w,~] = size(imdb.images.data{i});\n  %   imdb.images.boxes{i} = [1 1 h w];\n  \nend\n\n%-------------------------------------------------------------------------%\n\nfunction im = bbox_draw(im,boxes,c,t)\n\n% copied from Ross Girshick\n% Fast R-CNN\n% Copyright (c) 2015 Microsoft\n% Licensed under The MIT License [see LICENSE for details]\n% Written by Ross Girshick\n% --------------------------------------------------------\n% source: https://github.com/rbgirshick/fast-rcnn/blob/master/matlab/showboxes.m\n%\n%\n% Fast R-CNN\n% \n% Copyright (c) Microsoft Corporation\n% \n% All rights reserved.\n% \n% MIT License\n% \n% Permission is hereby granted, free of charge, to any person obtaining a\n% copy of this software and associated documentation files (the \"Software\"),\n% to deal in the Software without restriction, including without limitation\n% the rights to use, copy, modify, merge, publish, distribute, sublicense,\n% and/or sell copies of the Software, and to permit persons to whom the\n% Software is furnished to do so, subject to the following conditions:\n% \n% The above copyright notice and this permission notice shall be included\n% in all copies or substantial portions of the Software.\n% \n% THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL\n% THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR\n% OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,\n% ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR\n% OTHER DEALINGS IN THE SOFTWARE.\n\nimage(im);\naxis image;\naxis off;\nset(gcf, 'Color', 'white');\n\nif nargin<3\n  c = 'r';\n  t = 2;\nend\n\ns = '-';\nif ~isempty(boxes)\n    x1 = boxes(:, 1);\n    y1 = boxes(:, 2);\n    x2 = boxes(:, 3);\n    y2 = boxes(:, 4);\n    line([x1 x1 x2 x2 x1]', [y1 y2 y2 y1 y1]', ...\n        'color', c, 'linewidth', t, 'linestyle', s);\n    for i = 1:size(boxes, 1)\n        text(double(x1(i)), double(y1(i)) - 2, ...\n            sprintf('%.4f', boxes(i, end)), ...\n            'backgroundcolor', 'b', 'color', 'w', 'FontSize', 8);\n    end\nend\n"
  },
  {
    "path": "core/wsddn_train.m",
    "content": "function [net, info] = wsddn_train(varargin)\n% @author: Hakan Bilen\n% wsddn_train: training script for WSDDN\n\nopts.dataDir = fullfile(vl_rootnn, 'data') ;\nopts.expDir = fullfile(vl_rootnn, 'exp') ;\nopts.imdbPath = fullfile(vl_rootnn, 'data', 'imdbs', 'imdb-eb.mat');\nopts.modelPath = fullfile(vl_rootnn, 'models', 'imagenet-vgg-f.mat') ;\nopts.proposalType = 'eb' ;\nopts.proposalDir = fullfile(vl_rootnn, 'data', 'EdgeBoxes') ;\n\n\nopts.addBiasSamples = 1; % add Box Scores\nopts.addLossSmooth  = 1; % add Spatial Regulariser\nopts.softmaxTempCls = 1; % softmax temp for cls\nopts.softmaxTempDet = 2; % softmax temp for det\nopts.maxScale = 2000 ;\n\n% if you have limited gpu memory (<6gb), you can change the next 2 params\nopts.maxNumProposals = inf; % limit number (eg 1500)\nopts.imageScales = [480,576,688,864,1200]; % scales\nopts.minBoxSize = 20; % minimum bounding box size\nopts.train.gpus = [] ;\nopts.train.continue = true ;\nopts.train.prefetch = true ;\nopts.train.learningRate = 1e-5 * [ones(1,10) 0.1*ones(1,10)] ;\nopts.train.weightDecay = 0.0005;\nopts.train.numEpochs = 20;\nopts.train.derOutputs = {'objective', 1} ;\n\nopts.numFetchThreads = 1 ;\nopts = vl_argparse(opts, varargin) ;\n\ndisplay(opts);\n\nopts.train.batchSize = 1 ;\nopts.train.expDir = opts.expDir ;\nopts.train.numEpochs = numel(opts.train.learningRate) ;\n%% -------------------------------------------------------------------------\n%                                                   Database initialization\n% -------------------------------------------------------------------------\nfprintf('loading imdb...');\nif exist(opts.imdbPath,'file')==2\n  imdb = load(opts.imdbPath) ;\nelse\n  if strcmp(opts.proposalType,'ssw')\n    imdb = setup_voc07_ssw('dataDir',opts.dataDir, ...\n      'proposalDir',opts.proposalDir,'loadTest',1);\n  elseif strcmp(opts.proposalType,'eb')\n    imdb = setup_voc07_eb('dataDir',opts.dataDir, ...\n      
'proposalDir',opts.proposalDir,'loadTest',1);\n  else\n    error('undefined proposal type %s\\n',opts.proposalType)\n  end\n  \n  imdbFolder = fileparts(opts.imdbPath);\n  \n  if ~exist(imdbFolder,'dir')\n    mkdir(imdbFolder);\n  end\n  save(opts.imdbPath,'-struct', 'imdb', '-v7.3');\nend\n\nfprintf('done\\n');\n\nimdb = fixBBoxes(imdb, opts.minBoxSize, opts.maxNumProposals);\n\n% use train + val for training\nimdb.images.set(imdb.images.set == 2) = 1;\ntrainIdx = find(imdb.images.set == 1);\n\n%% Compute image statistics (mean, RGB covariances, etc.)\nimageStatsPath = fullfile(opts.dataDir, 'imageStats.mat') ;\nif exist(imageStatsPath,'file')\n  load(imageStatsPath, 'averageImage', 'rgbMean', 'rgbCovariance') ;\nelse\n \n  images = imdb.images.name(imdb.images.set == 1) ;\n  images = strcat([imdb.imageDir filesep],images) ;\n  \n  [averageImage, rgbMean, rgbCovariance] = getImageStats(images, ...\n    'imageSize', [256 256], ...\n    'numThreads', opts.numFetchThreads, ...\n    'gpus', opts.train.gpus) ;\n  save(imageStatsPath, 'averageImage', 'rgbMean', 'rgbCovariance') ;\nend\n[v,d] = eig(rgbCovariance) ;\nrgbDeviation = v*sqrt(d) ;\nclear v d ;\n\n\n%% ------------------------------------------------------------------------\n%                                                    Network initialization\n% -------------------------------------------------------------------------\nnopts.addBiasSamples = opts.addBiasSamples; % add Box Scores (only with Edge Boxes)\nnopts.addLossSmooth  = opts.addLossSmooth; % add Spatial Regulariser\nnopts.softmaxTempCls = opts.softmaxTempCls; % softmax temp for cls\nnopts.softmaxTempDet = opts.softmaxTempDet; % softmax temp for det\n\nnopts.averageImage = reshape(rgbMean,[1 1 3]) ;\n% nopts.rgbVariance = 0.1 * rgbDeviation ;\nnopts.rgbVariance = [] ;\nnopts.numClasses = numel(imdb.classes.name) ;\nnopts.classNames = imdb.classes.name ;\n\nif ~exist(opts.modelPath,'file')\n  [pname,fname,ext]  = fileparts(opts.modelPath) ;\n  if 
~exist(pname,'dir')\n    mkdir(pname) ;\n  end\n  fprintf('Downloading %s to %s\\n', [fname ext], pname) ;\n  urlwrite(sprintf('http://www.vlfeat.org/matconvnet/models/%s',[fname ext]),...\n    opts.modelPath) ;\nend\n\nnet = load(opts.modelPath);\nnet = wsddn_init(net,nopts);\n\nif nopts.addLossSmooth\n  opts.train.derOutputs = {'objective', 1, 'lossTopB', 1e-4} ;\nend\n\n\nif ~exist(opts.expDir,'dir')\n  mkdir(opts.expDir) ;\nend\n\n%% -------------------------------------------------------------------------\n%                                                   Database stats\n% -------------------------------------------------------------------------\nbopts = net.meta.normalization;\nnet.meta.augmentation.jitterBrightness = 0 ;\n% bopts.interpolation = 'bilinear';\nbopts.jitterBrightness = net.meta.augmentation.jitterBrightness ;\nbopts.imageScales = opts.imageScales;\nbopts.numThreads = opts.numFetchThreads;\nbopts.addLossSmooth = opts.addLossSmooth;\nbopts.addBiasSamples = opts.addBiasSamples;\nbopts.maxScale = opts.maxScale ;\nbopts.vgg16 = any(arrayfun(@(a) strcmp(a.name, 'relu5_1'), net.layers)==1) ;\n%% -------------------------------------------------------------------\n%                                                                Train\n% --------------------------------------------------------------------\n% avoid test data\nvalIdx = find(imdb.images.set == 3);\nvalIdx = valIdx(1:5:end) ;\n% valIdx = [];\n\n%% \nbopts.useGpu = numel(opts.train.gpus) >  0 ;\nbopts.prefetch = opts.train.prefetch;\n\ninfo = cnn_train_dag(net, imdb, @(i,b) ...\n  getBatch(bopts,i,b), ...\n  opts.train, 'train', trainIdx, ...\n  'val', valIdx) ;\n\n%% -------------------------------------------------------------------\n%                                                       Deploy network\n% --------------------------------------------------------------------\nif ~exist(fullfile(opts.expDir,'net.mat'),'file')\n  removeLoss = {'dagnn.Loss','dagnn.DropOut'};\n  for 
i=1:numel(removeLoss)\n    dagRemoveLayersOfType(net,removeLoss{i}) ;\n  end\n  \n  net.mode = 'test' ;\n  net_ = net ;\n  net = net_.saveobj() ;\n  save(fullfile(opts.expDir,'net.mat'), '-struct','net');\nend\n% --------------------------------------------------------------------\nfunction inputs = getBatch(opts, imdb, batch)\n% --------------------------------------------------------------------\nif isempty(batch)\n  inputs = {'input', [], 'label', [], 'rois', [], 'ids', []};\n  return;\nend\n\nopts.scale = opts.imageScales(randi(numel(opts.imageScales)));\nopts.flip = randi(2,numel(batch),1)-1; % random flip\nis_vgg16 = opts.vgg16 ;\nopts = rmfield(opts,'vgg16') ;\n\nimages = strcat([imdb.imageDir filesep], imdb.images.name(batch)) ;\nopts.prefetch = (nargout == 0);\n\n[im,rois] = wsddn_get_batch(images, imdb, batch, opts);\n\nif nargout>0\n  rois = single(rois') ;\n  labels = imdb.images.label(:,batch) ;\n  labels = reshape(labels,[1 1 size(labels,1) numel(batch)]);\n\n  if opts.useGpu > 0\n    im = gpuArray(im) ;\n    rois = gpuArray(rois) ;\n  end\n\n  if ~isempty(rois)\n   rois = rois([1 3 2 5 4],:) ;\n  end\n\n  ss = [16 16] ;\n\n  if is_vgg16\n    o0 = 8.5 ;\n    o1 = 9.5 ;\n  else\n    o0 = 18 ;\n    o1 = 9.5 ;\n  end\n\n  rois = [ rois(1,:); ...\n    floor((rois(2,:) - o0 + o1) / ss(1) + 0.5) + 1;\n    floor((rois(3,:) - o0 + o1) / ss(2) + 0.5) + 1;\n    ceil((rois(4,:) - o0 - o1) / ss(1) - 0.5) + 1;\n    ceil((rois(5,:) - o0 - o1) / ss(2) - 0.5) + 1];\n\n\n  inputs = {'input', im, 'label', labels, 'rois', rois, 'ids', batch} ;\n\n  if opts.addLossSmooth\n    inputs{end+1} = 'boxes' ;\n    inputs{end+1} = imdb.images.boxes{batch} ;\n  end\n\n  if opts.addBiasSamples==1\n    boxScore = reshape(imdb.images.boxScores{batch},[1 1 1 numel(imdb.images.boxScores{batch})]);\n    inputs{end+1} = 'boxScore';\n    inputs{end+1} = boxScore ;\n  end\nend\n\n% -------------------------------------------------------------------------\nfunction imdb = fixBBoxes(imdb, 
minSize, maxNum)\n% -------------------------------------------------------------------------\nfor i=1:numel(imdb.images.name)\n  bbox = imdb.images.boxes{i};\n  % remove small bbox\n  isGood = (bbox(:,3)>=bbox(:,1)+minSize) & (bbox(:,4)>=bbox(:,2)+minSize);\n  bbox = bbox(isGood,:);\n  % remove duplicate ones\n  [dummy, uniqueIdx] = unique(bbox, 'rows', 'first');\n  uniqueIdx = sort(uniqueIdx);\n  bbox = bbox(uniqueIdx,:);\n  % limit number for training\n  if imdb.images.set(i)~=3\n    nB = min(size(bbox,1),maxNum);\n  else\n    nB = size(bbox,1);\n  end\n  \n  if isfield(imdb.images,'boxScores')\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(isGood);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(uniqueIdx);\n    imdb.images.boxScores{i} = imdb.images.boxScores{i}(1:nB);\n  end\n  imdb.images.boxes{i} = bbox(1:nB,:);\n  %   [h,w,~] = size(imdb.images.data{i});\n  %   imdb.images.boxes{i} = [1 1 h w];\n  \nend\n\n% -------------------------------------------------------------------------\nfunction layers = dagFindLayersOfType(net, type)\n% -------------------------------------------------------------------------\nlayers = [] ;\nfor l = 1:numel(net.layers)\n  if isa(net.layers(l).block, type)\n    layers{1,end+1} = net.layers(l).name ;\n  end\nend\n% -------------------------------------------------------------------------\nfunction dagRemoveLayersOfType(net, type)\n% -------------------------------------------------------------------------\nnames = dagFindLayersOfType(net, type) ;\nfor i = 1:numel(names)\n  layer = net.layers(net.getLayerIndex(names{i})) ;\n  net.removeLayer(names{i}) ;\n  net.renameVar(layer.outputs{1}, layer.inputs{1}, 'quiet', true) ;\nend\n"
  },
  {
    "path": "matlab/+dagnn/BiasSamples.m",
    "content": "classdef BiasSamples < dagnn.ElementWise\n  % @author: Hakan Bilen\n  % BiasSamples scales inputs{1} by the per-sample coefficient\n  %   1 + scale * inputs{2}\n  properties\n    scale = single(1)\n  end\n  properties (Transient)\n    boxCoefs = []\n  end\n  methods\n    function outputs = forward(obj, inputs, params)\n      if numel(inputs) ~= 2\n        error('Number of inputs is not 2');\n      end\n      obj.boxCoefs = single(1)+obj.scale*inputs{2};\n      outputs{1} = bsxfun(@times,inputs{1},obj.boxCoefs);\n    end\n    \n    function [derInputs, derParams] = backward(obj, inputs, params, derOutputs)\n      % only inputs{1} receives a derivative; derInputs{2} is left empty\n      derInputs = cell(1,2) ;\n      obj.boxCoefs = single(1)+obj.scale*inputs{2};\n      derInputs{1} = bsxfun(@times,derOutputs{1},obj.boxCoefs) ;\n      derParams = {} ;\n    end\n    \n    function obj = BiasSamples(varargin)\n      obj.load(varargin) ;\n    end\n    \n    function reset(obj)\n      obj.boxCoefs = [] ;\n    end\n    \n    function rfs = getReceptiveFields(obj)\n      rfs.size = [1 1] ;\n      rfs.stride = [1 1] ;\n      rfs.offset = [1 1] ;\n    end\n\n    function outputSizes = getOutputSizes(obj, inputSizes)\n      outputSizes = inputSizes(1) ;\n    end\n    \n  end\n  \nend\n"
  },
  {
    "path": "matlab/+dagnn/LayerAP.m",
    "content": "classdef LayerAP < dagnn.Loss\n  % @author: Hakan Bilen\n  % 11 step average precision\n  properties\n    cls_index = 1\n    resetLayer = false \n    gtLabels = []\n    scores   = []\n    ids      = []\n    aps      = []\n    voc07    = true % 11 step\n    classNames = {} \n  end\n\n\n  methods\n    function outputs = forward(obj, inputs, params)\n      if obj.resetLayer \n        obj.gtLabels = [] ;\n        obj.scores   = [] ;\n        obj.ids      = [] ;\n        obj.aps      = [] ;\n        obj.resetLayer = false ;\n      end\n      \n      if numel(inputs)==2\n        obj.scores = [obj.scores gather(squeeze(inputs{1}(:,:,obj.cls_index,:)))];\n        obj.gtLabels = [obj.gtLabels gather(squeeze(inputs{2}(:,:,obj.cls_index,:)))];\n      elseif numel(inputs)>2\n        scoresCur = gather(squeeze(inputs{1}(:,:,obj.cls_index,:)));\n        gtLabelsCur = gather(squeeze(inputs{2}(:,:,obj.cls_index,:)));\n        \n        idsCur = gather(squeeze(inputs{3}));\n        \n        [lia,locb] = ismember(idsCur,obj.ids);\n        \n        if any(lia)\n          obj.scores = [obj.scores scoresCur(~lia,:)];\n          obj.gtLabels = [obj.gtLabels gtLabelsCur(~lia,:)];\n          obj.ids = [obj.ids(:) ; idsCur(~lia,:)];\n          \n          nz = find(lia);\n          for i=1:numel(nz)\n            obj.scores(locb(nz(i)),:) = obj.scores(locb(nz(i)),:) + ...\n              scoresCur(nz(i),:);\n          end\n        else\n          obj.scores = [obj.scores scoresCur];\n          obj.gtLabels = [obj.gtLabels gtLabelsCur];\n          obj.ids = [obj.ids(:) ; idsCur]';\n        end\n      else\n        error('wrong number of inputs');\n      end\n      \n      obj.aps = obj.compute_average_precision();\n      obj.average = 100 * mean(obj.aps);\n      outputs{1} =  100 * mean(obj.aps);\n    end\n\n    function [derInputs, derParams] = backward(obj, inputs, params, derOutputs)\n      derInputs = cell(1,numel(inputs));\n      derInputs{1} = derOutputs{1} ;\n      
derParams = {} ;\n    end\n\n    function reset(obj)\n      obj.resetLayer = true ;\n%       obj.average = 0 ;\n%       obj.aps = 0 ;\n%       obj.gtLabels = [];\n%       obj.scores   = [];\n%       obj.ids      = [];\n    end\n\n    function printAP(obj)\n      if isempty(obj.classNames)\n        for i=1:numel(obj.aps)\n          fprintf('class-%d %.1f\\n',i,100*obj.aps(i)) ;\n        end\n      else\n        for i=1:numel(obj.aps)\n          fprintf('%-50s %.1f\\n',obj.classNames{i},100*obj.aps(i)) ;\n        end\n      end\n    end\n    \n    function aps = compute_average_precision(obj)\n      assert(all(size(obj.scores)==size(obj.gtLabels)));\n      % nImg = size(obj.scores,1);\n      nCls = numel(obj.cls_index);\n\n      aps = zeros(1,nCls);\n\n      for c=1:nCls\n        gt = obj.gtLabels(c,:);\n        conf = obj.scores(c,:) ;\n        if sum(gt>0)==0, continue ; end\n        \n        % compute average precision\n        if obj.voc07\n          [rec,prec,ap]=obj.VOC07ap(conf,gt) ;\n        else\n          [rec,prec,ap]=obj.THUMOSeventclspr(conf,gt) ;\n        end\n        aps(c) = ap;\n      end\n    end\n\n    function [rec,prec,ap]=VOC07ap(obj,conf,gt)\n      [~,si]=sort(-conf);\n      tp=gt(si)>0;\n      fp=gt(si)<0;\n      \n      fp=cumsum(fp);\n      tp=cumsum(tp);\n      \n      rec=tp/sum(gt>0);\n      prec=tp./(fp+tp);\n      ap=0;\n      for t=0:0.1:1\n        p=max(prec(rec>=t));\n        if isempty(p)\n          p=0;\n        end\n        ap=ap+p/11;\n      end\n    end\n    \n    function [rec,prec,ap]=THUMOSeventclspr(obj,conf,gt)\n      [so,sortind]=sort(-conf);\n      tp=gt(sortind)==1;\n      fp=gt(sortind)~=1;\n      npos=length(find(gt==1));\n      \n      % compute precision/recall\n      fp=cumsum(fp);\n      tp=cumsum(tp);\n      rec=tp/npos;\n      prec=tp./(fp+tp);\n      \n      % compute average precision\n      \n      ap=0;\n      tmp=gt(sortind)==1;\n      for i=1:length(conf)\n        if tmp(i)==1\n          ap=ap+prec(i);\n   
     end\n      end\n      ap=ap/npos;\n    end\n    \n    function obj = LayerAP(varargin)\n      obj.load(varargin) ;\n      obj.loss = 'average_precision' ;\n    end\n  end\nend\n"
  },
  {
    "path": "matlab/+dagnn/LossTopBoxSmoothProb.m",
    "content": "classdef LossTopBoxSmoothProb < dagnn.Loss\n  % given top scoring box, it finds other boxes with at least overlap of\n  % minOverlap and calculates the euclidean dist between top and other\n  % boxes\n  \n  properties (Transient)\n    gtIdx = []\n    boxIdx = []\n    probs = []\n    minOverlap = 0.5\n    nBoxes = 10\n  end\n  \n  methods\n    function outputs = forward(obj, inputs, params)\n      if numel(inputs) ~= 4\n        error('Number of inputs is not 4');\n      end\n      obj.gtIdx = [];\n      obj.boxIdx = [];\n      obj.probs = [];\n      boxes  = double(gather(inputs{2})');\n      scores = gather(squeeze(inputs{3}));\n      labels = gather(squeeze(inputs{4}));\n      \n      if numel(boxes)<5\n        return;\n      end\n      \n      outputs{1} = zeros(1,'like',inputs{1});\n      for c=1:numel(labels)\n        if labels(c)<=0\n          continue;\n        end\n        \n        [so, si] = sort(scores(c,:),'descend');\n        obj.gtIdx{c} = si(1);\n        gtBox = boxes(:,obj.gtIdx{c});\n        gtArea = (gtBox(3)-gtBox(1)+1) .* (gtBox(4)-gtBox(2)+1);\n        \n        bbs = boxes(:,si(2:min(obj.nBoxes,end)))';\n        \n        y1 = bbs(:,1);\n        x1 = bbs(:,2);\n        y2 = bbs(:,3);\n        x2 = bbs(:,4);\n        \n        area = (x2-x1+1) .* (y2-y1+1);\n        \n        yy1 = max(gtBox(1), y1);\n        xx1 = max(gtBox(2), x1);\n        yy2 = min(gtBox(3), y2);\n        xx2 = min(gtBox(4), x2);\n        \n        w = max(0.0, xx2-xx1+1);\n        h = max(0.0, yy2-yy1+1);\n        \n        inter = w.*h;\n        o = find((inter ./ (gtArea + area - inter))>obj.minOverlap);\n        \n        if isempty(o)\n          continue;\n        end\n        \n        obj.boxIdx{c} = si(o+1);\n        obj.probs{c} = so(o+1);\n        d = bsxfun(@minus,inputs{1}(:,:,:,obj.boxIdx{c}),inputs{1}(:,:,:,obj.gtIdx{c}));\n        d = bsxfun(@times,d,obj.probs{c});\n        outputs{1} = outputs{1} + 0.5 * sum(d(:).^2);\n      end\n      \n      
n = obj.numAveraged ;\n      m = n + 1 ;\n      obj.average = (n * obj.average + gather(outputs{1})) / m ;\n      obj.numAveraged = m ;\n    end\n    \n    function [derInputs, derParams] = backward(obj, inputs, params, derOutputs)\n      derInputs = cell(1,4) ;\n      derInputs{1} = zeros(size(inputs{1}),'like',inputs{1});\n      for c=1:numel(obj.boxIdx)\n        if isempty(obj.boxIdx{c}), continue; end\n        derInputs{1}(:,:,:,obj.boxIdx{c}) = ...\n          bsxfun(@minus,inputs{1}(:,:,:,obj.boxIdx{c}),inputs{1}(:,:,:,obj.gtIdx{c}));\n        derInputs{1}(:,:,:,obj.boxIdx{c}) = bsxfun(@times,...\n          reshape(obj.probs{c},[1 1 1 numel(obj.probs{c})]),derInputs{1}(:,:,:,obj.boxIdx{c}));\n        derInputs{1}(:,:,:,obj.gtIdx{c}) = -sum(derInputs{1}(:,:,:,obj.boxIdx{c}),4);\n\n      end\n      derInputs{1} = derInputs{1} * derOutputs{1};\n%       fprintf('LossTopBox l2 %f ',sqrt(sum(derInputs{1}(:).^2)));\n      derParams = {} ;\n    end\n    \n    function obj = LossTopBoxSmoothProb(varargin)\n      obj.load(varargin) ;\n      obj.loss = 'LossTopBoxSmoothProb';\n    end\n    \n    function reset(obj)\n      obj.gtIdx = [];\n      obj.boxIdx = [];\n      obj.probs = [];\n      obj.average = 0 ;\n      obj.numAveraged = 0 ;\n    end\n    \n    \n  end\n  \nend\n"
  },
  {
    "path": "matlab/+dagnn/SoftMax2.m",
    "content": "classdef SoftMax2 < dagnn.ElementWise\n  % @author: Hakan Bilen\n  % SoftMax2: a generic softmax layer that divides its input by the\n  % temperature temp and normalizes over dimension dim\n  properties\n    dim = 3;\n    temp = 1;\n    scale = 1;\n  end\n  \n  methods\n    function outputs = forward(self, inputs, params)\n      inputs{1} = inputs{1} / self.temp;\n      order = 1:numel(size(inputs{1}));\n      % vl_nnsoftmax normalizes over dimension 3, so swap dim with 3\n      if self.dim~=3\n        order([3 self.dim]) = [self.dim 3];\n        inputs{1} = permute(inputs{1},order);\n      end\n      outputs{1} = vl_nnsoftmax(inputs{1}) ;\n      if self.dim~=3\n        outputs{1} = permute(outputs{1},order) ;\n      end\n    end\n    \n    function [derInputs, derParams] = backward(self, inputs, params, derOutputs)\n      \n      inputs{1} = inputs{1} / self.temp;\n      order = 1:numel(size(inputs{1}));\n      if self.dim~=3\n        order(3) = self.dim;\n        order(self.dim) = 3;\n        inputs{1} = permute(inputs{1},order);\n        derOutputs{1} = permute(derOutputs{1},order);\n      end\n      \n      derInputs{1} = vl_nnsoftmax(inputs{1}, derOutputs{1}) ;\n      if self.dim~=3\n        derInputs{1} = permute(derInputs{1},order) ;\n      end\n      derParams = {} ;\n    end\n    \n    function obj = SoftMax2(varargin)\n      obj.load(varargin) ;\n      obj.dim   = single(obj.dim);\n      obj.temp  = single(obj.temp);\n      obj.scale = single(obj.scale);\n    end\n  end\nend\n\n"
  },
  {
    "path": "matlab/+dagnn/SumOverDim.m",
    "content": "classdef SumOverDim < dagnn.ElementWise\n  % @author: Hakan Bilen\n  % SumOverDim sums the elements of inputs{1} over dimension dim\n  properties \n    dim = 3;\n  end\n  \n  methods\n    function outputs = forward(obj, inputs, params)\n      outputs{1} = sum(inputs{1},obj.dim) ;\n    end\n\n    function [derInputs, derParams] = backward(obj, inputs, params, derOutputs)\n      % replicate the output derivative along the summed dimension\n      reps = ones(1,numel(size(inputs{1})));\n      reps(obj.dim) = size(inputs{1},obj.dim); \n      derInputs{1} = repmat(derOutputs{1},reps);\n      \n      derParams = {} ;\n    end\n\n    function outputSizes = getOutputSizes(obj, inputSizes)\n      outputSizes{1} = inputSizes{1} ;\n      outputSizes{1}(obj.dim) = 1;\n    end\n\n    function obj = SumOverDim(varargin)\n      obj.load(varargin) ;\n    end\n  end\nend\n"
  },
  {
    "path": "matlab/+dagnn/Times.m",
    "content": "classdef Times < dagnn.ElementWise\n  % @author: Hakan Bilen\n  % Times (multiply) DagNN layer\n  %   The Times layer multiplies its two inputs element-wise and stores the\n  %   result as its only output.\n  methods\n    function outputs = forward(obj, inputs, params)\n      if numel(inputs) ~= 2\n        error('Number of inputs is not 2');\n      end\n      outputs{1} = inputs{1} .* inputs{2} ;\n    end\n    \n    function [derInputs, derParams] = backward(obj, inputs, params, derOutputs)\n      derInputs = cell(1,2) ;\n      derInputs{1} = derOutputs{1} .* inputs{2}  ;\n      derInputs{2} = derOutputs{1} .* inputs{1}  ;\n      derParams = {} ;\n    end\n    \n    function obj = Times(varargin)\n      obj.load(varargin) ;\n    end\n    \n    function rfs = getReceptiveFields(obj)\n      rfs.size = [1 1] ;\n      rfs.stride = [1 1] ;\n      rfs.offset = [1 1] ;\n    end\n\n    function outputSizes = getOutputSizes(obj, inputSizes)\n      outputSizes = inputSizes(1) ;\n    end\n  end\n  \nend"
  },
  {
    "path": "pascal/nms.m",
    "content": "function pick = nms(boxes, overlap)\n% pick = nms(boxes, overlap)\n% Non-maximum suppression. (FAST VERSION)\n% Greedily select high-scoring detections and skip detections\n% that are significantly covered by a previously selected\n% detection.\n%\n% boxes holds one box per row as [x1 y1 x2 y2], optionally followed by\n% a score in the last column; pick holds the indices of the kept boxes.\n%\n% NOTE: This is adapted from Pedro Felzenszwalb's version (nms.m),\n% but an inner loop has been eliminated to significantly speed it\n% up in the case of a large number of boxes\n\n% Copyright (C) 2011-12 by Tomasz Malisiewicz\n% All rights reserved.\n%\n% This file is part of the Exemplar-SVM library and is made\n% available under the terms of the MIT license (see COPYING file).\n% Project homepage: https://github.com/quantombone/exemplarsvm\n\n\nif isempty(boxes)\n  pick = [];\n  return;\nend\n\nx1 = boxes(:,1);\ny1 = boxes(:,2);\nx2 = boxes(:,3);\ny2 = boxes(:,4);\nif size(boxes,2)==4\n  s = ones(1,size(boxes,1));\nelse\n  s = boxes(:,end);\nend\n\narea = (x2-x1+1) .* (y2-y1+1);\n[~, I] = sort(s);\n\npick = s*0;\ncounter = 1;\nwhile ~isempty(I)\n  last = length(I);\n  i = I(last);\n  pick(counter) = i;\n  counter = counter + 1;\n  \n  xx1 = max(x1(i), x1(I(1:last-1)));\n  yy1 = max(y1(i), y1(I(1:last-1)));\n  xx2 = min(x2(i), x2(I(1:last-1)));\n  yy2 = min(y2(i), y2(I(1:last-1)));\n  \n  w = max(0.0, xx2-xx1+1);\n  h = max(0.0, yy2-yy1+1);\n  \n  inter = w.*h;\n  o = inter ./ (area(i) + area(I(1:last-1)) - inter);\n  \n%   I = I(find(o<=overlap));\n  I = I((o<=overlap));\nend\n\npick = pick(1:(counter-1));\n"
  },
  {
    "path": "pascal/setup_voc07_eb.m",
    "content": "function imdb = setup_voc07_eb(varargin)\n% setup_voc07_eb  Initialize PASCAL VOC2007 data with edge\n% boxes\n\n% Warning! boxes are in the format of ([y1 x1 y2 x2])\n\nopts.dataDir = fullfile('data') ;\nopts.proposalDir = fullfile(opts.dataDir,'EB');\nopts.loadTest = 1;\nopts = vl_argparse(opts, varargin) ;\n\n% -------------------------------------------------------------------------\n%                                                           Load edge boxes\n% -------------------------------------------------------------------------\n%% Get edge boxes\nfiles = {'EdgeBoxesVOC2007trainval.mat', ...\n  'EdgeBoxesVOC2007test.mat'} ;\n\nif ~exist(opts.proposalDir, 'dir')\n  mkdir(opts.proposalDir) ;\nend\n\nfor i=1:numel(files)\n  outPath = fullfile(opts.proposalDir, files{i}) ;\n  if ~exist(outPath, 'file')\n    url = sprintf('http://groups.inf.ed.ac.uk/hbilen-data/data/WSDDN/%s',files{i}) ;\n    fprintf('Downloading %s to %s\\n', url, outPath) ;\n    urlwrite(url,outPath) ;\n  end\nend\n\n\nif ~isempty(opts.proposalDir)\n  t1 = load([opts.proposalDir,filesep,files{1}]);\n  if opts.loadTest\n    t2 = load([opts.proposalDir,filesep,files{2}]);\n    ssw.id = [str2double(t1.images) str2double(t2.images)];\n    ssw.boxes = cat(2,t1.boxes,t2.boxes);\n    ssw.boxScores = cat(2,t1.boxScores,t2.boxScores);\n  else\n    ssw.id = str2double(t1.images);\n    ssw.boxes = t1.boxes;\n    ssw.boxScores = t1.boxScores;\n  end\n  \n  [~,si] = sort(ssw.id);\n  ssw.id = ssw.id(si);\n  ssw.boxes = ssw.boxes(si);\n  ssw.boxScores = ssw.boxScores(si);\nend\n\n% -------------------------------------------------------------------------\n%                                                  Load categories metadata\n% -------------------------------------------------------------------------\ncats = {'aeroplane','bicycle','bird','boat','bottle','bus','car',...\n  'cat','chair','cow','diningtable','dog','horse','motorbike','person',...\n  
'pottedplant','sheep','sofa','train','tvmonitor'};\n\nif ~exist(opts.dataDir,'dir')\n  error('wrong data folder!');\nend\n\n% Download VOC Devkit and data\nif ~exist(fullfile(opts.dataDir,'VOCdevkit'),'dir')\n  files = {'VOCtest_06-Nov-2007.tar',...\n           'VOCtrainval_06-Nov-2007.tar',...\n           'VOCdevkit_08-Jun-2007.tar'} ;\n  for i=1:numel(files)\n    if ~exist(fullfile(opts.dataDir, files{i}), 'file')\n      outPath = fullfile(opts.dataDir,files{i}) ;\n      url = sprintf('http://host.robots.ox.ac.uk/pascal/VOC/voc2007/%s',files{i}) ;\n      fprintf('Downloading %s to %s\\n', url, outPath) ;\n      urlwrite(url,outPath) ;\n      untar(outPath,opts.dataDir);\n    end\n  end\nend\naddpath(fullfile(opts.dataDir, 'VOCdevkit', 'VOCcode'));\n\ntraindata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','train.txt'));\nvaldata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','val.txt'));\ntestdata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','test.txt'));\n\nassert(numel(traindata)==2501);\nassert(numel(valdata)==2510);\nassert(numel(testdata)==4952);\n\nimdb.classes.name = cats ;\nimdb.classes.description = cats ;\nimdb.imageDir = fullfile(opts.dataDir, fullfile('VOCdevkit','VOC2007','JPEGImages')) ;\n\n% -------------------------------------------------------------------------\n%                                                           Training images\n% -------------------------------------------------------------------------%\nnames = cell(1,numel(traindata));\nlabels = zeros(numel(traindata),numel(cats));\n\n\n% load image names\nfor t=1:numel(traindata)\n  names{t} = sprintf('%06d.jpg',traindata(t));\n  %   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\nend\n\n% load binary labels\nfor c=1:numel(cats)\n  t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_train.txt']));\n  labels(:,c) = t(:,2);\nend\n\nimdb.images.id = 
traindata';\nimdb.images.name = names ;\nimdb.images.set = ones(1, numel(names)) ;\nimdb.images.label = labels' ;\n% imdb.images.data = data;\n\n% -------------------------------------------------------------------------\n%                                                         Validation images\n% -------------------------------------------------------------------------\n\nnames = cell(1,numel(valdata));\nlabels = zeros(numel(valdata),numel(cats));\n% data = cell(1,numel(valdata));\n\n% load image names\nfor t=1:numel(valdata)\n  names{t} = sprintf('%06d.jpg',valdata(t));\n  %   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\nend\n\n% load binary labels\nfor c=1:numel(cats)\n  t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_val.txt']));\n  labels(:,c) = t(:,2);\nend\n\n\nimdb.images.id = horzcat(imdb.images.id, valdata') ;\nimdb.images.name = horzcat(imdb.images.name, names) ;\nimdb.images.set = horzcat(imdb.images.set, 2*ones(1,numel(names))) ;\nimdb.images.label = horzcat(imdb.images.label, labels') ;\n% imdb.images.data = horzcat(imdb.images.data, data) ;\n\n% % -------------------------------------------------------------------------\n% %                                                               Test images\n% % -------------------------------------------------------------------------\n%\n%\nif opts.loadTest\n  names = cell(1,numel(testdata));\n  labels = zeros(numel(testdata),numel(cats));\n  % data = cell(1,numel(testdata));\n  \n  % load image names\n  for t=1:numel(testdata)\n    names{t} = sprintf('%06d.jpg',testdata(t));\n    %   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\n  end\n  \n  % load binary labels\n  for c=1:numel(cats)\n    t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_test.txt']));\n    labels(:,c) = t(:,2);\n  end\n  \n  imdb.images.id = horzcat(imdb.images.id, testdata') ;\n  imdb.images.name = horzcat(imdb.images.name, names) 
;\n  imdb.images.set = horzcat(imdb.images.set, 3 * ones(1,numel(names))) ;\n  imdb.images.label = horzcat(imdb.images.label, labels') ;\n  % imdb.images.data = horzcat(imdb.images.data, data) ;\nend\n% -------------------------------------------------------------------------\n%                                                            Postprocessing\n% -------------------------------------------------------------------------\n[~,sorti] = sort(imdb.images.id);\n\n\nimdb.images.id = imdb.images.id(sorti);\nimdb.images.name = imdb.images.name(sorti) ;\nimdb.images.set = imdb.images.set(sorti) ;\nimdb.images.label = single(imdb.images.label(:,sorti)) ;\nimdb.images.size = zeros(numel(imdb.images.name),2);\n\nif ~isempty(opts.proposalDir)\n  imdb.images.boxes = ssw.boxes;\n  imdb.images.boxScores = ssw.boxScores;\n  assert(all(ssw.id==imdb.images.id));\nend\n\n% this is zero as scores of selective search windows are not much\n% informative\nif ~isempty(opts.proposalDir)\n  % imdb.images.boxScores = cell(size(imdb.images.boxes));\n  for i=1:numel(imdb.images.boxes)\n    imdb.images.boxes{i} = int16(imdb.images.boxes{i});\n    imdb.images.boxScores{i} = single(imdb.images.boxScores{i});\n    \n    imf = imfinfo(fullfile(imdb.imageDir,imdb.images.name{i}));\n    imdb.images.size(i,:) = [imf.Height,imf.Width];\n    \n    maxBoxes = max(imdb.images.boxes{i});\n    if imdb.images.size(i,1)< max(maxBoxes([1,3]))\n      error('Wrong box coordinates');\n    end\n    if imdb.images.size(i,2)< max(maxBoxes([2,4]))\n      error('Wrong box coordinates');\n    end\n    \n  end\nend\nend\n"
  },
  {
    "path": "pascal/setup_voc07_ssw.m",
    "content": "function imdb = setup_voc07_ssw(varargin)\n% setup_voc07_ssw  Initialize PASCAL VOC2007 data with selective\n% search windows \n\n% Warning! boxes are in the format of ([y1 x1 y2 x2])\n\nopts.dataDir = fullfile('data') ;\nopts.proposalDir = fullfile(opts.dataDir,'SSW');\nopts.loadTest = 1;\nopts = vl_argparse(opts, varargin) ;\n\n% -------------------------------------------------------------------------\n%                                                 Load selective search win\n% -------------------------------------------------------------------------\n%% get selective search windows\nfiles = {'SelectiveSearchVOC2007trainval.mat', ...\n  'SelectiveSearchVOC2007test.mat'} ;\n\nif ~exist(opts.proposalDir, 'dir')\n  mkdir(opts.proposalDir) ;\nend\n\nfor i=1:numel(files)\n  if ~exist(fullfile(opts.proposalDir, files{i}), 'file')\n    url = sprintf('http://koen.me/research/downloads/%s',files{i}) ;\n    fprintf('downloading %s\\n', url) ;\n    urlwrite(url,[opts.proposalDir filesep files{i}]);\n  end\nend\n\nif ~isempty(opts.proposalDir)\n  t1 = load([opts.proposalDir,filesep,files{1}]);\n  if opts.loadTest\n    t2 = load([opts.proposalDir,filesep,files{2}]);\n    ssw.id = [str2double(t1.images);str2double(t2.images)]';\n    ssw.boxes = cat(2,t1.boxes,t2.boxes);\n  else\n    ssw.id = str2double(t1.images)';\n    ssw.boxes = t1.boxes;\n  end\n\n  [~,si] = sort(ssw.id);\n  ssw.id = ssw.id(si);\n  ssw.boxes = ssw.boxes(si);\nend\n\n% -------------------------------------------------------------------------\n%                                                  Load categories metadata\n% -------------------------------------------------------------------------\ncats = {'aeroplane','bicycle','bird','boat','bottle','bus','car',...\n  'cat','chair','cow','diningtable','dog','horse','motorbike','person',...\n  'pottedplant','sheep','sofa','train','tvmonitor'};\n    \nif ~exist(opts.dataDir,'dir')\n  error('wrong data folder!');\nend\n\nif 
~exist(opts.dataDir,'dir')\n  error('wrong data folder!');\nend\n\n% Download VOC Devkit and data\nif ~exist(fullfile(opts.dataDir,'VOCdevkit'),'dir')\n  files = {'VOCtest_06-Nov-2007.tar',...\n           'VOCtrainval_06-Nov-2007.tar',...\n           'VOCdevkit_08-Jun-2007.tar'} ;\n  for i=1:numel(files)\n    if ~exist(fullfile(opts.dataDir, files{i}), 'file')\n      outPath = fullfile(opts.dataDir,files{i}) ;\n      url = sprintf('http://host.robots.ox.ac.uk/pascal/VOC/voc2007/%s',files{i}) ;\n      fprintf('Downloading %s to %s\\n', url, outPath) ;\n      urlwrite(url,outPath) ;\n      untar(outPath,opts.dataDir);\n    end\n  end\nend\naddpath(fullfile(opts.dataDir, 'VOCdevkit', 'VOCcode'));\n\ntraindata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','train.txt'));\nvaldata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','val.txt'));\ntestdata = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main','test.txt'));\n\nassert(numel(traindata)==2501);\nassert(numel(valdata)==2510);\nassert(numel(testdata)==4952);\n\nimdb.classes.name = cats ;\nimdb.classes.description = cats ;\nimdb.imageDir = fullfile(opts.dataDir, fullfile('VOCdevkit','VOC2007','JPEGImages')) ;\n\n% -------------------------------------------------------------------------\n%                                                           Training images\n% -------------------------------------------------------------------------% \nnames = cell(1,numel(traindata));\nlabels = zeros(numel(traindata),numel(cats));\n\n\n% load image names\nfor t=1:numel(traindata)\n  names{t} = sprintf('%06d.jpg',traindata(t));\n%   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\nend\n\n% load binary labels\nfor c=1:numel(cats)\n  t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_train.txt']));\n  labels(:,c) = t(:,2);\nend\n\nimdb.images.id = traindata';\nimdb.images.name = names ;\nimdb.images.set 
= ones(1, numel(names)) ;\nimdb.images.label = labels' ;\n% imdb.images.data = data;\n\n% -------------------------------------------------------------------------\n%                                                         Validation images\n% -------------------------------------------------------------------------\n\nnames = cell(1,numel(valdata));\nlabels = zeros(numel(valdata),numel(cats));\n% data = cell(1,numel(valdata));\n\n% load image names\nfor t=1:numel(valdata)\n  names{t} = sprintf('%06d.jpg',valdata(t));\n%   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\nend\n\n% load binary labels\nfor c=1:numel(cats)\n  t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_val.txt']));\n  labels(:,c) = t(:,2);\nend\n\n\nimdb.images.id = horzcat(imdb.images.id, valdata') ;\nimdb.images.name = horzcat(imdb.images.name, names) ;\nimdb.images.set = horzcat(imdb.images.set, 2*ones(1,numel(names))) ;\nimdb.images.label = horzcat(imdb.images.label, labels') ;\n% imdb.images.data = horzcat(imdb.images.data, data) ;\n\n% % -------------------------------------------------------------------------\n% %                                                               Test images\n% % -------------------------------------------------------------------------\n% \n%\nif opts.loadTest\n  names = cell(1,numel(testdata));\n  labels = zeros(numel(testdata),numel(cats));\n  % data = cell(1,numel(testdata));\n\n  % load image names\n  for t=1:numel(testdata)\n    names{t} = sprintf('%06d.jpg',testdata(t));\n  %   data{t} = imread(sprintf('%s/%s',imdb.imageDir,names{t}));\n  end\n\n  % load binary labels\n  for c=1:numel(cats)\n    t = importdata(fullfile(opts.dataDir,'VOCdevkit','VOC2007','ImageSets','Main',[cats{c},'_test.txt']));\n    labels(:,c) = t(:,2);\n  end\n\n  imdb.images.id = horzcat(imdb.images.id, testdata') ;\n  imdb.images.name = horzcat(imdb.images.name, names) ;\n  imdb.images.set = horzcat(imdb.images.set, 3 * 
ones(1,numel(names))) ;\n  imdb.images.label = horzcat(imdb.images.label, labels') ;\n  % imdb.images.data = horzcat(imdb.images.data, data) ;\nend\n% -------------------------------------------------------------------------\n%                                                            Postprocessing\n% -------------------------------------------------------------------------\n[~,sorti] = sort(imdb.images.id);\n\n\nimdb.images.id = imdb.images.id(sorti);\nimdb.images.name = imdb.images.name(sorti) ;\nimdb.images.set = imdb.images.set(sorti) ;\nimdb.images.label = single(imdb.images.label(:,sorti)) ;\nimdb.images.size = zeros(numel(imdb.images.name),2);\n\nif ~isempty(opts.proposalDir)\n  imdb.images.boxes = ssw.boxes;\n  assert(all(ssw.id==imdb.images.id));\nend\n\n% box scores are set to zero because selective search window scores are\n% not very informative\nif ~isempty(opts.proposalDir)\n  imdb.images.boxScores = cell(size(imdb.images.boxes));\n  for i=1:numel(imdb.images.boxes)\n    imdb.images.boxes{i} = int16(imdb.images.boxes{i});\n    imdb.images.boxScores{i} = zeros(size(imdb.images.boxes{i},1),1,'single');\n    imf = imfinfo(fullfile(imdb.imageDir,imdb.images.name{i}));\n    imdb.images.size(i,:) = [imf.Height,imf.Width];\n  end\nend\nend\n"
  },
  {
    "path": "pascal/wsddnVOCap.m",
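    "content": "function ap = wsddnVOCap(rec,prec)\n% From the PASCAL VOC 2011 devkit\n% Area under the precision/recall curve.\n\n% pad both ends with sentinel values\nmrec=[0 ; rec ; 1];\nmpre=[0 ; prec ; 0];\n% make precision monotonically non-increasing\nfor i=numel(mpre)-1:-1:1\n    mpre(i)=max(mpre(i),mpre(i+1));\nend\n% sum areas where recall changes\ni=find(mrec(2:end)~=mrec(1:end-1))+1;\nap=sum((mrec(i)-mrec(i-1)).*mpre(i));\n"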
  },
  {
    "path": "pascal/wsddnVOCevaldet.m",
    "content": "function [rec,prec,ap] = wsddnVOCevaldet(VOCopts,cls,res,draw)\n\n% load test set\ntic;\nVOCopts.annocachepath=[VOCopts.localdir '%s_anno_cache.mat'];\ncp=sprintf(VOCopts.annocachepath,VOCopts.testset);\nif exist(cp,'file')\n  fprintf('%s: pr: loading ground truth\\n',cls);\n  load(cp,'gtids','recs');\nelse\n  [gtids,t]=textread(sprintf(VOCopts.imgsetpath,VOCopts.testset),'%s %d');\n  for i=1:length(gtids)\n    % display progress\n    if toc>1\n      fprintf('%s: pr: load: %d/%d\\n',cls,i,length(gtids));\n      drawnow;\n      tic;\n    end\n    \n    % read annotation\n    recs(i)=PASreadrecord(sprintf(VOCopts.annopath,gtids{i}));\n  end\n  save(cp,'gtids','recs');\nend\n\nfprintf('%s: pr: evaluating detections\\n',cls);\n\n% hash image ids\nhash=wsddnVOChash_init(gtids);\n\n% extract ground truth objects\n\nnpos=0;\ngt(length(gtids))=struct('BB',[],'diff',[],'det',[]);\nfor i=1:length(gtids)\n  % extract objects of class\n  clsinds=strmatch(cls,{recs(i).objects(:).class},'exact');\n  gt(i).BB=cat(1,recs(i).objects(clsinds).bbox)';\n  gt(i).diff=[recs(i).objects(clsinds).difficult];\n  gt(i).det=false(length(clsinds),1);\n  npos=npos+sum(~gt(i).diff);\nend\n\n% load results\nids        = res.ids;\nconfidence = res.confidence;\nBB         = res.bbox';\n\n% sort detections by decreasing confidence\n[sc,si]=sort(-confidence);\nids=ids(si);\nBB=BB(:,si);\n\n% assign detections to ground truth objects\nnd=length(confidence);\ntp=zeros(nd,1);\nfp=zeros(nd,1);\ntic;\nfor d=1:nd\n  % display progress\n  if toc>1\n    fprintf('%s: pr: compute: %d/%d\\n',cls,d,nd);\n    drawnow;\n    tic;\n  end\n  \n  % find ground truth image\n  i=wsddnVOChash_lookup(hash,ids{d});\n  if isempty(i)\n    error('unrecognized image \"%s\"',ids{d});\n  elseif length(i)>1\n    error('multiple image \"%s\"',ids{d});\n  end\n  \n  % assign detection to ground truth object if any\n  bb=BB(:,d);\n  ovmax=-inf;\n  for j=1:size(gt(i).BB,2)\n    bbgt=gt(i).BB(:,j);\n    
bi=[max(bb(1),bbgt(1)) ; max(bb(2),bbgt(2)) ; min(bb(3),bbgt(3)) ; min(bb(4),bbgt(4))];\n    iw=bi(3)-bi(1)+1;\n    ih=bi(4)-bi(2)+1;\n    if iw>0 && ih>0\n      % compute overlap as area of intersection / area of union\n      ua=(bb(3)-bb(1)+1)*(bb(4)-bb(2)+1)+...\n        (bbgt(3)-bbgt(1)+1)*(bbgt(4)-bbgt(2)+1)-...\n        iw*ih;\n      ov=iw*ih/ua;\n      if ov>ovmax\n        ovmax=ov;\n        jmax=j;\n      end\n    end\n  end\n  % assign detection as true positive/don't care/false positive\n  if ovmax>=VOCopts.minoverlap\n    if ~gt(i).diff(jmax)\n      if ~gt(i).det(jmax)\n        tp(d)=1;            % true positive\n        gt(i).det(jmax)=true;\n      else\n        fp(d)=1;            % false positive (multiple detection)\n      end\n    end\n  else\n    fp(d)=1;                    % false positive\n  end\nend\n\n% compute precision/recall\nfp=cumsum(fp);\ntp=cumsum(tp);\nrec=tp/npos;\nprec=tp./(fp+tp);\n\nap=wsddnVOCap(rec,prec);\n\nif draw\n  % plot precision/recall\n  plot(rec,prec,'-');\n  grid;\n  xlabel 'recall'\n  ylabel 'precision'\n  title(sprintf('class: %s, subset: %s, AP = %.3f',cls,VOCopts.testset,ap));\nend\n"
  },
  {
    "path": "pascal/wsddnVOChash_init.m",
    "content": "function hash = wsddnVOChash_init(strs)\n% From the PASCAL VOC 2011 devkit\n\nhsize=4999;\nhash.key=cell(hsize,1);\nhash.val=cell(hsize,1);\n\nfor i=1:numel(strs)\n    s=strs{i};\n    h=mod(str2double(s([4 6:end])),hsize)+1;\n    j=numel(hash.key{h})+1;\n    hash.key{h}{j}=strs{i};\n    hash.val{h}(j)=i;\nend\n\n"
  },
  {
    "path": "pascal/wsddnVOChash_lookup.m",
    "content": "function ind = wsddnVOChash_lookup(hash,s)\n% From the PASCAL VOC 2011 devkit\n\nhsize=numel(hash.key);\nh=mod(str2double(s([4 6:end])),hsize)+1;\nind=hash.val{h}(strmatch(s,hash.key{h},'exact'));\n"
  },
  {
    "path": "setup_WSDDN.m",
    "content": "function setup_WSDDN()\n%SETUP_WSDDN Sets up WSDDN by adding its folders to the MATLAB path\n\nroot = fileparts(mfilename('fullpath')) ;\naddpath(root, [root '/matlab'], [root '/pascal'], [root '/core']) ;\naddpath([vl_rootnn '/examples/']) ;\naddpath([vl_rootnn '/examples/imagenet/']) ;\n\n"
  }
]