[
  {
    "path": ".gitignore",
    "content": "#\n*~\n\n*.DS_Store\ncache/\nresults/\ncheckpoints/\n\n# luarocks build files\n*.src.rock\n*.zip\n*.tar.gz\n*.t7\n\n# Object files\n*.o\n*.os\n*.ko\n*.obj\n*.elf\n\n# Precompiled Headers\n*.gch\n*.pch\n\n# Libraries\n*.lib\n*.a\n*.la\n*.lo\n*.def\n*.exp\n\n# Shared objects (inc. Windows DLLs)\n*.dll\n*.so\n*.so.*\n*.dylib\n\n# Executables\n*.exe\n*.out\n*.app\n*.i*86\n*.x86_64\n*.hex"
  },
  {
    "path": "LICENSE",
    "content": "Copyright (c) 2016, Phillip Isola and Jun-Yan Zhu\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n\n\n----------------------------- LICENSE FOR DCGAN --------------------------------\nBSD License\n\nFor dcgan.torch software\n\nCopyright (c) 2015, Facebook, Inc. 
All rights reserved.\n\nRedistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:\n\nRedistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.\n\nRedistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.\n\nNeither the name Facebook nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "README.md",
    "content": "\n# pix2pix\n[Project](https://phillipi.github.io/pix2pix/) | [Arxiv](https://arxiv.org/abs/1611.07004) |\n[PyTorch](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)\n\nTorch implementation for learning a mapping from input images to output images, for example:\n\n<img src=\"imgs/examples.jpg\" width=\"900px\"/>\n\nImage-to-Image Translation with Conditional Adversarial Networks  \n [Phillip Isola](http://web.mit.edu/phillipi/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/), [Tinghui Zhou](https://people.eecs.berkeley.edu/~tinghuiz/), [Alexei A. Efros](https://people.eecs.berkeley.edu/~efros/)   \n CVPR, 2017.\n\nOn some tasks, decent results can be obtained fairly quickly and on small datasets. For example, to learn to generate facades (example shown above), we trained on just 400 images for about 2 hours (on a single Pascal Titan X GPU). However, for harder problems it may be important to train on far larger datasets, and for many hours or even days.\n\n**Note**: Please check out our [PyTorch](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) implementation for pix2pix and CycleGAN. 
The PyTorch version is under active development and can produce results comparable to or better than this Torch version.\n\n## Setup\n\n### Prerequisites\n- Linux or OSX\n- NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN may work with minimal modification, but this is untested)\n\n### Getting Started\n- Install torch and dependencies from https://github.com/torch/distro\n- Install torch packages `nngraph` and `display`\n```bash\nluarocks install nngraph\nluarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec\n```\n- Clone this repo:\n```bash\ngit clone git@github.com:phillipi/pix2pix.git\ncd pix2pix\n```\n- Download the dataset (e.g., [CMP Facades](http://cmp.felk.cvut.cz/~tylecr1/facade/)):\n```bash\nbash ./datasets/download_dataset.sh facades\n```\n- Train the model\n```bash\nDATA_ROOT=./datasets/facades name=facades_generation which_direction=BtoA th train.lua\n```\n- (CPU only) Run the same training command without a GPU or CuDNN. Setting the environment variables `gpu=0 cudnn=0` forces CPU-only mode:\n```bash\nDATA_ROOT=./datasets/facades name=facades_generation which_direction=BtoA gpu=0 cudnn=0 batchSize=10 save_epoch_freq=5 th train.lua\n```\n- (Optionally) start the display server to view results as the model trains. 
(See [Display UI](#display-ui) for more details):\n```bash\nth -ldisplay.start 8000 0.0.0.0\n```\n\n- Finally, test the model:\n```bash\nDATA_ROOT=./datasets/facades name=facades_generation which_direction=BtoA phase=val th test.lua\n```\nThe test results will be saved to an HTML file here: `./results/facades_generation/latest_net_G_val/index.html`.\n\n## Train\n```bash\nDATA_ROOT=/path/to/data/ name=expt_name which_direction=AtoB th train.lua\n```\nSwitch `AtoB` to `BtoA` to train translation in the opposite direction.\n\nModels are saved to `./checkpoints/expt_name` (can be changed by passing `checkpoint_dir=your_dir` in train.lua).\n\nSee `opt` in train.lua for additional training options.\n\n## Test\n```bash\nDATA_ROOT=/path/to/data/ name=expt_name which_direction=AtoB phase=val th test.lua\n```\n\nThis will run the model named `expt_name` in direction `AtoB` on all images in `/path/to/data/val`.\n\nResult images, and a webpage to view them, are saved to `./results/expt_name` (can be changed by passing `results_dir=your_dir` in test.lua).\n\nSee `opt` in test.lua for additional testing options.\n\n\n## Datasets\nDownload the datasets using the following script. Some of the datasets were collected by other researchers. Please cite their papers if you use the data.\n```bash\nbash ./datasets/download_dataset.sh dataset_name\n```\n- `facades`: 400 images from [CMP Facades dataset](http://cmp.felk.cvut.cz/~tylecr1/facade/). [[Citation](datasets/bibtex/facades.tex)]\n- `cityscapes`: 2975 images from the [Cityscapes training set](https://www.cityscapes-dataset.com/).  [[Citation](datasets/bibtex/cityscapes.tex)]\n- `maps`: 1096 training images scraped from Google Maps\n- `edges2shoes`: 50k training images from [UT Zappos50K dataset](http://vision.cs.utexas.edu/projects/finegrained/utzap50k/). 
Edges are computed by [HED](https://github.com/s9xie/hed) edge detector + post-processing.\n[[Citation](datasets/bibtex/shoes.tex)]\n- `edges2handbags`: 137K Amazon Handbag images from [iGAN project](https://github.com/junyanz/iGAN). Edges are computed by [HED](https://github.com/s9xie/hed) edge detector + post-processing. [[Citation](datasets/bibtex/handbags.tex)]\n- `night2day`: around 20K natural scene images from  [Transient Attributes dataset](http://transattr.cs.brown.edu/) [[Citation](datasets/bibtex/transattr.tex)]. To train a `day2night` pix2pix model, you need to add `which_direction=BtoA`.\n\n## Models\nDownload the pre-trained models with the following script. You need to rename the model (e.g., `facades_label2image` to `/checkpoints/facades/latest_net_G.t7`) after the download has finished.\n```bash\nbash ./models/download_model.sh model_name\n```\n- `facades_label2image` (label -> facade): trained on the CMP Facades dataset.\n- `cityscapes_label2image` (label -> street scene): trained on the Cityscapes dataset.\n- `cityscapes_image2label` (street scene -> label): trained on the Cityscapes dataset.\n- `edges2shoes` (edge -> photo): trained on UT Zappos50K dataset.\n- `edges2handbags` (edge -> photo): trained on Amazon handbags images.\n- `day2night` (daytime scene -> nighttime scene): trained on around 100 [webcams](http://transattr.cs.brown.edu/).\n\n## Setup Training and Test data\n### Generating Pairs\nWe provide a python script to generate training data in the form of pairs of images {A,B}, where A and B are two different depictions of the same underlying scene. For example, these might be pairs {label map, photo} or {bw image, color image}. Then we can learn to translate A to B or B to A:\n\nCreate folder `/path/to/data` with subfolders `A` and `B`. `A` and `B` should each have their own subfolders `train`, `val`, `test`, etc. In `/path/to/data/A/train`, put training images in style A. 
In `/path/to/data/B/train`, put the corresponding images in style B. Repeat the same for other data splits (`val`, `test`, etc).\n\nCorresponding images in a pair {A,B} must be the same size and have the same filename, e.g., `/path/to/data/A/train/1.jpg` is considered to correspond to `/path/to/data/B/train/1.jpg`.\n\nOnce the data is formatted this way, call:\n```bash\npython scripts/combine_A_and_B.py --fold_A /path/to/data/A --fold_B /path/to/data/B --fold_AB /path/to/data\n```\n\nThis will combine each pair of images (A,B) into a single image file, ready for training.\n\n### Notes on Colorization\nNo need to run `combine_A_and_B.py` for colorization. Instead, you need to prepare some natural images and set `preprocess=colorization` in the script. The program will automatically convert each RGB image into Lab color space and create `L -> ab` image pairs during training. Also set `input_nc=1` and `output_nc=2`.\n\n### Extracting Edges\nWe provide Python and Matlab scripts to extract coarse edges from photos. Run `scripts/edges/batch_hed.py` to compute [HED](https://github.com/s9xie/hed) edges. Run `scripts/edges/PostprocessHED.m` to simplify edges with additional post-processing steps. Check the code documentation for more details.\n\n### Evaluating Labels2Photos on Cityscapes\nWe provide scripts for running the evaluation of the Labels2Photos task on the Cityscapes **validation** set. We assume that you have installed `caffe` (and `pycaffe`) in your system. If not, see the [official website](http://caffe.berkeleyvision.org/installation.html) for installation instructions. Once `caffe` is successfully installed, download the pre-trained FCN-8s semantic segmentation model (512MB) by running\n```bash\nbash ./scripts/eval_cityscapes/download_fcn8s.sh\n```\nThen make sure `./scripts/eval_cityscapes/` is in your system's Python path. 
If not, run the following command to add it\n```bash\nexport PYTHONPATH=${PYTHONPATH}:./scripts/eval_cityscapes/\n```\nNow you can run the following command to evaluate your predictions:\n```bash\npython ./scripts/eval_cityscapes/evaluate.py --cityscapes_dir /path/to/original/cityscapes/dataset/ --result_dir /path/to/your/predictions/ --output_dir /path/to/output/directory/\n```\nImages stored under `--result_dir` should contain your model predictions on the Cityscapes **validation** split, and have the original Cityscapes naming convention (e.g., `frankfurt_000001_038418_leftImg8bit.png`). The script will output a text file under `--output_dir` containing the metric.\n\n**Further notes**: Our pre-trained FCN model is **not** supposed to work on Cityscapes in the original resolution (1024x2048) as it was trained on 256x256 images that are then upsampled to 1024x2048 during training. The purpose of the resizing during training was to 1) keep the label maps in the original high resolution untouched and 2) avoid the need to change the standard FCN training code and the architecture for Cityscapes. During test time, you need to synthesize 256x256 results. Our test code will automatically upsample your results to 1024x2048 before feeding them to the pre-trained FCN model. The output is at 1024x2048 resolution and will be compared to 1024x2048 ground truth labels. You do not need to resize the ground truth labels. The best way to verify whether everything is correct is to reproduce the numbers for real images in the paper first. 
To do so, you need to resize the original/real Cityscapes images (**not** labels) to 256x256 and feed them to the evaluation code.\n\n\n## Display UI\nOptionally, for displaying images during training and test, use the [display package](https://github.com/szym/display).\n\n- Install it with: `luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec`\n- Then start the server with: `th -ldisplay.start`\n- Open this URL in your browser: [http://localhost:8000](http://localhost:8000)\n\nBy default, the server listens on localhost. Pass `0.0.0.0` to allow external connections on any interface:\n```bash\nth -ldisplay.start 8000 0.0.0.0\n```\nThen open `http://(hostname):(port)/` in your browser to load the remote desktop.\n\nL1 error is plotted to the display by default. Set the environment variable `display_plot` to a comma-separated list of values `errL1`, `errG` and `errD` to visualize the L1, generator, and discriminator errors, respectively. For example, to plot only the generator and discriminator errors to the display instead of the default L1 error, set `display_plot=\"errG,errD\"`.\n\n## Citation\nIf you use this code for your research, please cite our paper <a href=\"https://arxiv.org/pdf/1611.07004v1.pdf\">Image-to-Image Translation with Conditional Adversarial Networks</a>:\n\n```\n@article{pix2pix2017,\n  title={Image-to-Image Translation with Conditional Adversarial Networks},\n  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},\n  journal={CVPR},\n  year={2017}\n}\n```\n\n## Cat Paper Collection\nIf you love cats, and love reading cool graphics, vision, and learning papers, please check out the Cat Paper Collection:  \n[[Github]](https://github.com/junyanz/CatPapers) [[Webpage]](https://www.cs.cmu.edu/~junyanz/cat/cat_papers.html)\n\n## Acknowledgments\nCode borrows heavily from [DCGAN](https://github.com/soumith/dcgan.torch). 
The data loader is modified from [DCGAN](https://github.com/soumith/dcgan.torch) and [Context-Encoder](https://github.com/pathak22/context-encoder).\n"
  },
  {
    "path": "data/data.lua",
    "content": "--[[\n    This data loader is a modified version of the one from dcgan.torch\n    (see https://github.com/soumith/dcgan.torch/blob/master/data/data.lua).\n\n    Copyright (c) 2016, Deepak Pathak [See LICENSE file for details]\n]]--\n\nlocal Threads = require 'threads'\nThreads.serialization('threads.sharedserialize')\n\nlocal data = {}\n\nlocal result = {}\nlocal unpack = unpack and unpack or table.unpack\n\nfunction data.new(n, opt_)\n   opt_ = opt_ or {}\n   local self = {}\n   for k,v in pairs(data) do\n      self[k] = v\n   end\n\n   local donkey_file = 'donkey_folder.lua'\n   if n > 0 then\n      local options = opt_\n      self.threads = Threads(n,\n                             function() require 'torch' end,\n                             function(idx)\n                                opt = options\n                                tid = idx\n                                local seed = (opt.manualSeed and opt.manualSeed or 0) + idx\n                                torch.manualSeed(seed)\n                                torch.setnumthreads(1)\n                                print(string.format('Starting donkey with id: %d seed: %d', tid, seed))\n                                assert(options, 'options not found')\n                                assert(opt, 'opt not given')\n                                print(opt)\n                                paths.dofile(donkey_file)\n                             end\n\n      )\n   else\n      if donkey_file then paths.dofile(donkey_file) end\n      self.threads = {}\n      function self.threads:addjob(f1, f2) f2(f1()) end\n      function self.threads:dojob() end\n      function self.threads:synchronize() end\n   end\n\n   local nSamples = 0\n   self.threads:addjob(function() return trainLoader:size() end,\n         function(c) nSamples = c end)\n   self.threads:synchronize()\n   self._size = nSamples\n\n   for i = 1, n do\n      self.threads:addjob(self._getFromThreads,\n                          
self._pushResult)\n   end\n   return self\nend\n\n-- runs inside a donkey thread: sample one batch from the loader\nfunction data._getFromThreads()\n   assert(opt.batchSize, 'opt.batchSize not found')\n   return trainLoader:sample(opt.batchSize)\nend\n\n-- runs in the main thread: stash the values returned by _getFromThreads\nfunction data._pushResult(...)\n   -- {...} is always a table, so no nil check is needed here\n   local res = {...}\n   result[1] = res\nend\n\n\n\nfunction data:getBatch()\n   self.threads:addjob(self._getFromThreads, self._pushResult)\n   self.threads:dojob()\n   local res = result[1]\n\n   -- res = {data, scalarLabels, samplePaths}\n   local img_data = res[1]\n   local img_paths = res[3]\n\n   result[1] = nil\n   if torch.type(img_data) == 'table' then\n      img_data = unpack(img_data)\n   end\n\n   return img_data, img_paths\nend\n\nfunction data:size()\n   return self._size\nend\n\nreturn data\n"
  },
  {
    "path": "data/dataset.lua",
    "content": "--[[\n    Copyright (c) 2015-present, Facebook, Inc.\n    All rights reserved.\n\n    This source code is licensed under the BSD-style license found in the\n    LICENSE file in the root directory of this source tree. An additional grant\n    of patent rights can be found in the PATENTS file in the same directory.\n]]--\n\nrequire 'torch'\ntorch.setdefaulttensortype('torch.FloatTensor')\nlocal ffi = require 'ffi'\nlocal class = require('pl.class')\nlocal dir = require 'pl.dir'\nlocal tablex = require 'pl.tablex'\nlocal argcheck = require 'argcheck'\nrequire 'sys'\nrequire 'xlua'\nrequire 'image'\n\nlocal dataset = torch.class('dataLoader')\n\nlocal initcheck = argcheck{\n   pack=true,\n   help=[[\n     A dataset class for images in a flat folder structure (folder-name is class-name).\n     Optimized for extremely large datasets (upwards of 14 million images).\n     Tested only on Linux (as it uses command-line linux utilities to scale up)\n]],\n   {check=function(paths)\n       local out = true;\n       for k,v in ipairs(paths) do\n          if type(v) ~= 'string' then\n             print('paths can only be of string input');\n             out = false\n          end\n       end\n       return out\n   end,\n    name=\"paths\",\n    type=\"table\",\n    help=\"Multiple paths of directories with images\"},\n\n   {name=\"sampleSize\",\n    type=\"table\",\n    help=\"a consistent sample size to resize the images\"},\n\n   {name=\"split\",\n    type=\"number\",\n    help=\"Percentage of split to go to Training\"\n   },\n   {name=\"serial_batches\",\n    type=\"number\",\n    help=\"if randomly sample training images\"},\n\n   {name=\"samplingMode\",\n    type=\"string\",\n    help=\"Sampling mode: random | balanced \",\n    default = \"balanced\"},\n\n   {name=\"verbose\",\n    type=\"boolean\",\n    help=\"Verbose mode during initialization\",\n    default = false},\n\n   {name=\"loadSize\",\n    type=\"table\",\n    help=\"a size to load the images to, 
initially\",\n    opt = true},\n\n   {name=\"forceClasses\",\n    type=\"table\",\n    help=\"If you want this loader to map certain classes to certain indices, \"\n       .. \"pass a classes table that has {classname : classindex} pairs.\"\n       .. \" For example: {3 : 'dog', 5 : 'cat'}\"\n       .. \"This function is very useful when you want two loaders to have the same \"\n    .. \"class indices (trainLoader/testLoader for example)\",\n    opt = true},\n\n   {name=\"sampleHookTrain\",\n    type=\"function\",\n    help=\"applied to sample during training(ex: for lighting jitter). \"\n       .. \"It takes the image path as input\",\n    opt = true},\n\n   {name=\"sampleHookTest\",\n    type=\"function\",\n    help=\"applied to sample during testing\",\n    opt = true},\n}\n\nfunction dataset:__init(...)\n\n   -- argcheck\n   local args =  initcheck(...)\n   print(args)\n   for k,v in pairs(args) do self[k] = v end\n\n   if not self.loadSize then self.loadSize = self.sampleSize; end\n\n   if not self.sampleHookTrain then self.sampleHookTrain = self.defaultSampleHook end\n   if not self.sampleHookTest then self.sampleHookTest = self.defaultSampleHook end\n   self.image_count = 1\n   -- find class names\n   self.classes = {}\n   local classPaths = {}\n   if self.forceClasses then\n      for k,v in pairs(self.forceClasses) do\n         self.classes[k] = v\n         classPaths[k] = {}\n      end\n   end\n   local function tableFind(t, o) for k,v in pairs(t) do if v == o then return k end end end\n   -- loop over each paths folder, get list of unique class names,\n   -- also store the directory paths per class\n   -- for each class,\n   for k,path in ipairs(self.paths) do\n      local dirs = {} -- hack\n      dirs[1] = path\n      for k,dirpath in ipairs(dirs) do\n         local class = paths.basename(dirpath)\n         local idx = tableFind(self.classes, class)\n         if not idx then\n            table.insert(self.classes, class)\n            idx = 
#self.classes\n            classPaths[idx] = {}\n         end\n         if not tableFind(classPaths[idx], dirpath) then\n            table.insert(classPaths[idx], dirpath);\n         end\n      end\n   end\n\n   self.classIndices = {}\n   for k,v in ipairs(self.classes) do\n      self.classIndices[v] = k\n   end\n\n   -- define command-line tools, try your best to maintain OSX compatibility\n   local wc = 'wc'\n   local cut = 'cut'\n   local find = 'find -H'  -- if folder name is symlink, do find inside it after dereferencing\n   if ffi.os == 'OSX' then\n      wc = 'gwc'\n      cut = 'gcut'\n      find = 'gfind'\n   end\n   ----------------------------------------------------------------------\n   -- Options for the GNU find command\n   local extensionList = {'jpg', 'png','JPG','PNG','JPEG', 'ppm', 'PPM', 'bmp', 'BMP'}\n   local findOptions = ' -iname \"*.' .. extensionList[1] .. '\"'\n   for i=2,#extensionList do\n      findOptions = findOptions .. ' -o -iname \"*.' .. extensionList[i] .. '\"'\n   end\n\n   -- find the image path names\n   self.imagePath = torch.CharTensor()  -- path to each image in dataset\n   self.imageClass = torch.LongTensor() -- class index of each image (class index in self.classes)\n   self.classList = {}                  -- index of imageList to each image of a particular class\n   self.classListSample = self.classList -- the main list used when sampling data\n\n   print('running \"find\" on each class directory, and concatenate all'\n         .. 
' those filenames into a single file containing all image paths for a given class')\n   -- so, generates one file per class\n   local classFindFiles = {}\n   for i=1,#self.classes do\n      classFindFiles[i] = os.tmpname()\n   end\n   local combinedFindList = os.tmpname();\n\n   local tmpfile = os.tmpname()\n   local tmphandle = assert(io.open(tmpfile, 'w'))\n   -- iterate over classes\n   for i, class in ipairs(self.classes) do\n      -- iterate over classPaths\n      for j,path in ipairs(classPaths[i]) do\n         local command = find .. ' \"' .. path .. '\" ' .. findOptions\n            .. ' >>\"' .. classFindFiles[i] .. '\" \\n'\n         tmphandle:write(command)\n      end\n   end\n   io.close(tmphandle)\n   os.execute('bash ' .. tmpfile)\n   os.execute('rm -f ' .. tmpfile)\n\n   print('now combine all the files to a single large file')\n   local tmpfile = os.tmpname()\n   local tmphandle = assert(io.open(tmpfile, 'w'))\n   -- concat all finds to a single large file in the order of self.classes\n   for i=1,#self.classes do\n      local command = 'cat \"' .. classFindFiles[i] .. '\" >>' .. combinedFindList .. ' \\n'\n      tmphandle:write(command)\n   end\n   io.close(tmphandle)\n   os.execute('bash ' .. tmpfile)\n   os.execute('rm -f ' .. tmpfile)\n\n   --==========================================================================\n   print('load the large concatenated list of sample paths to self.imagePath')\n   local cmd = wc .. \" -L '\"\n                                                  .. combinedFindList .. \"' |\"\n                                                  .. cut .. \" -f1 -d' '\"\n   print('cmd..' .. cmd)\n   local maxPathLength = tonumber(sys.fexecute(wc .. \" -L '\"\n                                                  .. combinedFindList .. \"' |\"\n                                                  .. cut .. \" -f1 -d' '\")) + 1\n   local length = tonumber(sys.fexecute(wc .. \" -l '\"\n                                           .. 
combinedFindList .. \"' |\"\n                                           .. cut .. \" -f1 -d' '\"))\n   assert(length > 0, \"Could not find any image file in the given input paths\")\n   assert(maxPathLength > 0, \"paths of files are length 0?\")\n   self.imagePath:resize(length, maxPathLength):fill(0)\n   local s_data = self.imagePath:data()\n   local count = 0\n   for line in io.lines(combinedFindList) do\n      ffi.copy(s_data, line)\n      s_data = s_data + maxPathLength\n      if self.verbose and count % 10000 == 0 then\n         xlua.progress(count, length)\n      end;\n      count = count + 1\n   end\n\n   self.numSamples = self.imagePath:size(1)\n   if self.verbose then print(self.numSamples ..  ' samples found.') end\n   --==========================================================================\n   print('Updating classList and imageClass appropriately')\n   self.imageClass:resize(self.numSamples)\n   local runningIndex = 0\n   for i=1,#self.classes do\n      if self.verbose then xlua.progress(i, #(self.classes)) end\n      local length = tonumber(sys.fexecute(wc .. \" -l '\"\n                                              .. classFindFiles[i] .. \"' |\"\n                                              .. cut .. \" -f1 -d' '\"))\n      if length == 0 then\n         error('Class has zero samples')\n      else\n         self.classList[i] = torch.range(runningIndex + 1, runningIndex + length):long()\n         self.imageClass[{{runningIndex + 1, runningIndex + length}}]:fill(i)\n      end\n      runningIndex = runningIndex + length\n   end\n\n   --==========================================================================\n   -- clean up temporary files\n   print('Cleaning up temporary files')\n   local tmpfilelistall = ''\n   for i=1,#(classFindFiles) do\n      tmpfilelistall = tmpfilelistall .. ' \"' .. classFindFiles[i] .. '\"'\n      if i % 1000 == 0 then\n         os.execute('rm -f ' .. 
tmpfilelistall)\n         tmpfilelistall = ''\n      end\n   end\n   os.execute('rm -f '  .. tmpfilelistall)\n   os.execute('rm -f \"' .. combinedFindList .. '\"')\n   --==========================================================================\n\n   if self.split == 100 then\n      self.testIndicesSize = 0\n   else\n      print('Splitting training and test sets to a ratio of '\n               .. self.split .. '/' .. (100-self.split))\n      self.classListTrain = {}\n      self.classListTest  = {}\n      self.classListSample = self.classListTrain\n      local totalTestSamples = 0\n      -- split the classList into classListTrain and classListTest\n      for i=1,#self.classes do\n         local list = self.classList[i]\n         local count = self.classList[i]:size(1)\n         local splitidx = math.floor((count * self.split / 100) + 0.5) -- +round\n         local perm = torch.randperm(count)\n         self.classListTrain[i] = torch.LongTensor(splitidx)\n         for j=1,splitidx do\n            self.classListTrain[i][j] = list[perm[j]]\n         end\n         if splitidx == count then -- all samples were allocated to train set\n            self.classListTest[i]  = torch.LongTensor()\n         else\n            self.classListTest[i]  = torch.LongTensor(count-splitidx)\n            totalTestSamples = totalTestSamples + self.classListTest[i]:size(1)\n            local idx = 1\n            for j=splitidx+1,count do\n               self.classListTest[i][idx] = list[perm[j]]\n               idx = idx + 1\n            end\n         end\n      end\n      -- Now combine classListTest into a single tensor\n      self.testIndices = torch.LongTensor(totalTestSamples)\n      self.testIndicesSize = totalTestSamples\n      local tdata = self.testIndices:data()\n      local tidx = 0\n      for i=1,#self.classes do\n         local list = self.classListTest[i]\n         if list:dim() ~= 0 then\n            local ldata = list:data()\n            for j=0,list:size(1)-1 do\n            
   tdata[tidx] = ldata[j]\n               tidx = tidx + 1\n            end\n         end\n      end\n   end\nend\n\n-- size(), size(class)\nfunction dataset:size(class, list)\n   list = list or self.classList\n   if not class then\n      return self.numSamples\n   elseif type(class) == 'string' then\n      return list[self.classIndices[class]]:size(1)\n   elseif type(class) == 'number' then\n      return list[class]:size(1)\n   end\nend\n\n-- getByClass\nfunction dataset:getByClass(class)\n   local index = 0\n   if self.serial_batches == 1 then\n     index = math.fmod(self.image_count-1, self.classListSample[class]:nElement())+1\n     self.image_count = self.image_count +1\n   else\n    index = math.ceil(torch.uniform() * self.classListSample[class]:nElement())\n   end\n   local imgpath = ffi.string(torch.data(self.imagePath[self.classListSample[class][index]]))\n   return self:sampleHookTrain(imgpath),  imgpath\nend\n\n-- converts a table of samples (and corresponding labels) to a clean tensor\nlocal function tableToOutput(self, dataTable, scalarTable)\n   local data, scalarLabels, labels\n   local quantity = #scalarTable\n   assert(dataTable[1]:dim() == 3)\n   data = torch.Tensor(quantity,\n\t\t       self.sampleSize[1], self.sampleSize[2], self.sampleSize[3])\n   scalarLabels = torch.LongTensor(quantity):fill(-1111)\n   for i=1,#dataTable do\n      data[i]:copy(dataTable[i])\n      scalarLabels[i] = scalarTable[i]\n   end\n   return data, scalarLabels\nend\n\n-- sampler, samples from the training set.\nfunction dataset:sample(quantity)\n   assert(quantity)\n   local dataTable = {}\n   local scalarTable = {}\n   local samplePaths = {}\n   for i=1,quantity do\n      local class = torch.random(1, #self.classes)\n      local out, imgpath = self:getByClass(class)\n      table.insert(dataTable, out)\n      table.insert(scalarTable, class)\n      samplePaths[i] = imgpath\n   end\n   local data, scalarLabels = tableToOutput(self, dataTable, scalarTable)\n   return data, 
scalarLabels, samplePaths-- filePaths\nend\n\nfunction dataset:get(i1, i2)\n   local indices = torch.range(i1, i2);\n   local quantity = i2 - i1 + 1;\n   assert(quantity > 0)\n   -- now that indices has been initialized, get the samples\n   local dataTable = {}\n   local scalarTable = {}\n   for i=1,quantity do\n      -- load the sample\n      local imgpath = ffi.string(torch.data(self.imagePath[indices[i]]))\n      local out = self:sampleHookTest(imgpath)\n      table.insert(dataTable, out)\n      table.insert(scalarTable, self.imageClass[indices[i]])\n   end\n   local data, scalarLabels = tableToOutput(self, dataTable, scalarTable)\n   return data, scalarLabels\nend\n\nreturn dataset\n"
  },
  {
    "path": "data/donkey_folder.lua",
    "content": "\n--[[\n    This data loader is a modified version of the one from dcgan.torch\n    (see https://github.com/soumith/dcgan.torch/blob/master/data/donkey_folder.lua).\n    Copyright (c) 2016, Deepak Pathak [See LICENSE file for details]\n    Copyright (c) 2015-present, Facebook, Inc.\n    All rights reserved.\n    This source code is licensed under the BSD-style license found in the\n    LICENSE file in the root directory of this source tree. An additional grant\n    of patent rights can be found in the PATENTS file in the same directory.\n]]--\n\nrequire 'image'\npaths.dofile('dataset.lua')\n-- This file contains the data-loading logic and details.\n-- It is run by each data-loader thread.\n------------------------------------------\n-------- COMMON CACHES and PATHS\n-- Check for existence of opt.data\nprint(os.getenv('DATA_ROOT'))\nopt.data = paths.concat(os.getenv('DATA_ROOT'), opt.phase)\n\nif not paths.dirp(opt.data) then\n    error('Did not find directory: ' .. opt.data)\nend\n\n-- a cache file of the training metadata (if doesnt exist, will be created)\nlocal cache = \"cache\"\nlocal cache_prefix = opt.data:gsub('/', '_')\nos.execute('mkdir -p cache')\nlocal trainCache = paths.concat(cache, cache_prefix .. 
'_trainCache.t7')\n\n--------------------------------------------------------------------------------------------\nlocal input_nc = opt.input_nc -- input channels\nlocal output_nc = opt.output_nc\nlocal loadSize   = {input_nc, opt.loadSize}\nlocal sampleSize = {input_nc, opt.fineSize}\n\nlocal preprocessAandB = function(imA, imB)\n  imA = image.scale(imA, loadSize[2], loadSize[2])\n  imB = image.scale(imB, loadSize[2], loadSize[2])\n  local perm = torch.LongTensor{3, 2, 1}\n  imA = imA:index(1, perm)--:mul(256.0): brg, rgb\n  imA = imA:mul(2):add(-1)\n  imB = imB:index(1, perm)\n  imB = imB:mul(2):add(-1)\n--   print(img:size())\n  assert(imA:max()<=1,\"A: badly scaled inputs\")\n  assert(imA:min()>=-1,\"A: badly scaled inputs\")\n  assert(imB:max()<=1,\"B: badly scaled inputs\")\n  assert(imB:min()>=-1,\"B: badly scaled inputs\")\n\n\n  local oW = sampleSize[2]\n  local oH = sampleSize[2]\n  local iH = imA:size(2)\n  local iW = imA:size(3)\n\n  if iH~=oH then\n    h1 = math.ceil(torch.uniform(1e-2, iH-oH))\n  end\n\n  if iW~=oW then\n    w1 = math.ceil(torch.uniform(1e-2, iW-oW))\n  end\n  if iH ~= oH or iW ~= oW then\n    imA = image.crop(imA, w1, h1, w1 + oW, h1 + oH)\n    imB = image.crop(imB, w1, h1, w1 + oW, h1 + oH)\n  end\n\n  if opt.flip == 1 and torch.uniform() > 0.5 then\n    imA = image.hflip(imA)\n    imB = image.hflip(imB)\n  end\n\n  return imA, imB\nend\n\n\n\nlocal function loadImageChannel(path)\n    local input = image.load(path, 3, 'float')\n    input = image.scale(input, loadSize[2], loadSize[2])\n\n    local oW = sampleSize[2]\n    local oH = sampleSize[2]\n    local iH = input:size(2)\n    local iW = input:size(3)\n\n    if iH~=oH then\n      h1 = math.ceil(torch.uniform(1e-2, iH-oH))\n    end\n\n    if iW~=oW then\n      w1 = math.ceil(torch.uniform(1e-2, iW-oW))\n    end\n    if iH ~= oH or iW ~= oW then\n      input = image.crop(input, w1, h1, w1 + oW, h1 + oH)\n    end\n\n\n    if opt.flip == 1 and torch.uniform() > 0.5 then\n      input 
= image.hflip(input)\n    end\n\n    local input_lab = image.rgb2lab(input)\n    local imA = input_lab[{{1}, {}, {} }]:div(50.0) - 1.0\n    local imB = input_lab[{{2,3},{},{}}]:div(110.0)\n    local imAB = torch.cat(imA, imB, 1)\n    assert(imAB:max()<=1,\"A: badly scaled inputs\")\n    assert(imAB:min()>=-1,\"A: badly scaled inputs\")\n\n    return imAB\nend\n\n--local function loadImage\n\nlocal function loadImage(path)\n   local input = image.load(path, 3, 'float')\n   local h = input:size(2)\n   local w = input:size(3)\n\n   local imA = image.crop(input, 0, 0, w/2, h)\n   local imB = image.crop(input, w/2, 0, w, h)\n\n   return imA, imB\nend\n\nlocal function loadImageInpaint(path)\n  local imB = image.load(path, 3, 'float')\n  imB = image.scale(imB, loadSize[2], loadSize[2])\n  local perm = torch.LongTensor{3, 2, 1}\n  imB = imB:index(1, perm)--:mul(256.0): brg, rgb\n  imB = imB:mul(2):add(-1)\n  assert(imB:max()<=1,\"A: badly scaled inputs\")\n  assert(imB:min()>=-1,\"A: badly scaled inputs\")\n  local oW = sampleSize[2]\n  local oH = sampleSize[2]\n  local iH = imB:size(2)\n  local iW = imB:size(3)\n  if iH~=oH then\n    h1 = math.ceil(torch.uniform(1e-2, iH-oH))\n  end\n\n  if iW~=oW then\n    w1 = math.ceil(torch.uniform(1e-2, iW-oW))\n  end\n  if iH ~= oH or iW ~= oW then\n    imB = image.crop(imB, w1, h1, w1 + oW, h1 + oH)\n  end\n  local imA = imB:clone()\n  imA[{{},{1 + oH/4, oH/2 + oH/4},{1 + oW/4, oW/2 + oW/4}}] = 1.0\n  if opt.flip == 1 and torch.uniform() > 0.5 then\n    imA = image.hflip(imA)\n    imB = image.hflip(imB)\n  end\n  imAB = torch.cat(imA, imB, 1)\n  return imAB\nend\n\n-- channel-wise mean and std. 
Calculate or load them from disk later in the script.\nlocal mean,std\n--------------------------------------------------------------------------------\n-- Hooks that are used for each image that is loaded\n\n-- function to load the image, jitter it appropriately (random crops etc.)\nlocal trainHook = function(self, path)\n   collectgarbage()\n   if opt.preprocess == 'regular' then\n     local imA, imB = loadImage(path)\n     imA, imB = preprocessAandB(imA, imB)\n     imAB = torch.cat(imA, imB, 1)\n   end\n\n   if opt.preprocess == 'colorization' then\n     imAB = loadImageChannel(path)\n   end\n\n   if opt.preprocess == 'inpaint' then\n     imAB = loadImageInpaint(path)\n   end\n   return imAB\nend\n\n--------------------------------------\n-- trainLoader\nprint('trainCache', trainCache)\nprint('Creating train metadata')\nprint('serial batch:, ', opt.serial_batches)\ntrainLoader = dataLoader{\n    paths = {opt.data},\n    loadSize = {input_nc, loadSize[2], loadSize[2]},\n    sampleSize = {input_nc+output_nc, sampleSize[2], sampleSize[2]},\n    split = 100,\n    serial_batches = opt.serial_batches,\n    verbose = true\n }\n\ntrainLoader.sampleHookTrain = trainHook\ncollectgarbage()\n\n-- do some sanity checks on trainLoader\ndo\n   local class = trainLoader.imageClass\n   local nClasses = #trainLoader.classes\n   assert(class:max() <= nClasses, \"class logic has error\")\n   assert(class:min() >= 1, \"class logic has error\")\nend\n"
  },
  {
    "path": "datasets/bibtex/cityscapes.tex",
    "content": "@inproceedings{Cordts2016Cityscapes,\ntitle={The Cityscapes Dataset for Semantic Urban Scene Understanding},\nauthor={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},\nbooktitle={Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\nyear={2016}\n}\n"
  },
  {
    "path": "datasets/bibtex/facades.tex",
    "content": "@INPROCEEDINGS{Tylecek13,\n  author = {Radim Tyle{\\v c}ek, Radim {\\v S}{\\' a}ra},\n  title = {Spatial Pattern Templates for Recognition of Objects with Regular Structure},\n  booktitle = {Proc. GCPR},\n  year = {2013},\n  address = {Saarbrucken, Germany},\n}\n"
  },
  {
    "path": "datasets/bibtex/handbags.tex",
    "content": "@inproceedings{zhu2016generative,\n  title={Generative Visual Manipulation on the Natural Image Manifold},\n  author={Zhu, Jun-Yan and Kr{\\\"a}henb{\\\"u}hl, Philipp and Shechtman, Eli and Efros, Alexei A.},\n  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},\n  year={2016}\n}\n\n@InProceedings{xie15hed,\n  author = {\"Xie, Saining and Tu, Zhuowen\"},\n  Title = {Holistically-Nested Edge Detection},\n  Booktitle = \"Proceedings of IEEE International Conference on Computer Vision\",\n  Year  = {2015},\n}\n"
  },
  {
    "path": "datasets/bibtex/shoes.tex",
    "content": "@InProceedings{fine-grained,\n  author = {A. Yu and K. Grauman},\n  title = {{F}ine-{G}rained {V}isual {C}omparisons with {L}ocal {L}earning},\n  booktitle = {Computer Vision and Pattern Recognition (CVPR)},\n  month = {June},\n  year = {2014}\n}\n\n@InProceedings{xie15hed,\n  author = {\"Xie, Saining and Tu, Zhuowen\"},\n  Title = {Holistically-Nested Edge Detection},\n  Booktitle = \"Proceedings of IEEE International Conference on Computer Vision\",\n  Year  = {2015},\n}\n"
  },
  {
    "path": "datasets/bibtex/transattr.tex",
    "content": "@article {Laffont14,\n    title = {Transient Attributes for High-Level Understanding and Editing of Outdoor Scenes},\n    author = {Pierre-Yves Laffont and Zhile Ren and Xiaofeng Tao and Chao Qian and James Hays},\n    journal = {ACM Transactions on Graphics (proceedings of SIGGRAPH)},\n    volume = {33},\n    number = {4},\n    year = {2014}\n}\n"
  },
  {
    "path": "datasets/download_dataset.sh",
    "content": "FILE=$1\n\nif [[ $FILE != \"cityscapes\" &&  $FILE != \"night2day\" &&  $FILE != \"edges2handbags\" && $FILE != \"edges2shoes\" && $FILE != \"facades\" && $FILE != \"maps\" ]]; then\n  echo \"Available datasets are cityscapes, night2day, edges2handbags, edges2shoes, facades, maps\"\n  exit 1\nfi\n\necho \"Specified [$FILE]\"\n\nURL=http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/$FILE.tar.gz\nTAR_FILE=./datasets/$FILE.tar.gz\nTARGET_DIR=./datasets/$FILE/\nwget -N $URL -O $TAR_FILE\nmkdir -p $TARGET_DIR\ntar -zxvf $TAR_FILE -C ./datasets/\nrm $TAR_FILE\n"
  },
  {
    "path": "models/download_model.sh",
    "content": "FILE=$1\nURL=http://efrosgans.eecs.berkeley.edu/pix2pix/models/$FILE.t7\nMODEL_FILE=./models/$FILE.t7\nwget -N $URL -O $MODEL_FILE\n"
  },
  {
    "path": "models.lua",
    "content": "require 'nngraph'\n\nfunction defineG_encoder_decoder(input_nc, output_nc, ngf)\n    local netG = nil \n    -- input is (nc) x 256 x 256\n    local e1 = - nn.SpatialConvolution(input_nc, ngf, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf) x 128 x 128\n    local e2 = e1 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 2) x 64 x 64\n    local e3 = e2 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 4) x 32 x 32\n    local e4 = e3 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 4, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 16 x 16\n    local e5 = e4 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 8 x 8\n    local e6 = e5 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 4 x 4\n    local e7 = e6 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 2 x 2\n    local e8 = e7 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf * 8) x 1 x 1\n    \n    local d1 = e8 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 2 x 2\n    local d2 = d1 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 4 x 4\n    local d3 = d2 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - 
nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 8 x 8\n    local d4 = d3 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 16 x 16\n    local d5 = d4 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 4) x 32 x 32\n    local d6 = d5 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 4, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 2) x 64 x 64\n    local d7 = d6 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2, ngf, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf)\n    -- input is (ngf) x128 x 128\n    local d8 = d7 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf, output_nc, 4, 4, 2, 2, 1, 1)\n    -- input is (nc) x 256 x 256\n    \n    local o1 = d8 - nn.Tanh()\n    \n    netG = nn.gModule({e1},{o1})\n\n    return netG\nend\n\nfunction defineG_unet(input_nc, output_nc, ngf)\n    local netG = nil\n    -- input is (nc) x 256 x 256\n    local e1 = - nn.SpatialConvolution(input_nc, ngf, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf) x 128 x 128\n    local e2 = e1 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 2) x 64 x 64\n    local e3 = e2 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 4) x 32 x 32\n    local e4 = e3 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 4, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 16 x 16\n    local e5 = e4 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 8 x 8\n    local e6 = e5 - nn.LeakyReLU(0.2, true) - 
nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 4 x 4\n    local e7 = e6 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 2 x 2\n    local e8 = e7 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf * 8) x 1 x 1\n    \n    local d1_ = e8 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 2 x 2\n    local d1 = {d1_,e7} - nn.JoinTable(2)\n    local d2_ = d1 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 4 x 4\n    local d2 = {d2_,e6} - nn.JoinTable(2)\n    local d3_ = d2 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 8 x 8\n    local d3 = {d3_,e5} - nn.JoinTable(2)\n    local d4_ = d3 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 16 x 16\n    local d4 = {d4_,e4} - nn.JoinTable(2)\n    local d5_ = d4 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 4) x 32 x 32\n    local d5 = {d5_,e3} - nn.JoinTable(2)\n    local d6_ = d5 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 4 * 2, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 2) x 64 x 64\n    local d6 = {d6_,e2} - nn.JoinTable(2)\n    local d7_ = d6 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2 * 2, ngf, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf)\n    -- input is 
(ngf) x128 x 128\n    local d7 = {d7_,e1} - nn.JoinTable(2)\n    local d8 = d7 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2, output_nc, 4, 4, 2, 2, 1, 1)\n    -- input is (nc) x 256 x 256\n    \n    local o1 = d8 - nn.Tanh()\n    \n    netG = nn.gModule({e1},{o1})\n    \n    --graph.dot(netG.fg,'netG')\n    \n    return netG\nend\n\nfunction defineG_unet_128(input_nc, output_nc, ngf)\n    -- Two layer less than the default unet to handle 128x128 input\n    local netG = nil\n    -- input is (nc) x 128 x 128\n    local e1 = - nn.SpatialConvolution(input_nc, ngf, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf) x 64 x 64\n    local e2 = e1 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 2) x 32 x 32\n    local e3 = e2 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 4) x 16 x 16\n    local e4 = e3 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 4, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 8 x 8\n    local e5 = e4 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 4 x 4\n    local e6 = e5 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8)\n    -- input is (ngf * 8) x 2 x 2\n    local e7 = e6 - nn.LeakyReLU(0.2, true) - nn.SpatialConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf * 8) x 1 x 1\n    \n    local d1_ = e7 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 2 x 2\n    local d1 = {d1_,e6} - nn.JoinTable(2)\n    local d2_ = d1 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 8, 4, 4, 
2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 4 x 4\n    local d2 = {d2_,e5} - nn.JoinTable(2)\n    local d3_ = d2 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 8, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 8) - nn.Dropout(0.5)\n    -- input is (ngf * 8) x 8 x 8\n    local d3 = {d3_,e4} - nn.JoinTable(2)\n    local d4_ = d3 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 8 * 2, ngf * 4, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 4)\n    -- input is (ngf * 8) x 16 x 16\n    local d4 = {d4_,e3} - nn.JoinTable(2)\n    local d5_ = d4 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 4 * 2, ngf * 2, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf * 2)\n    -- input is (ngf * 4) x 32 x 32\n    local d5 = {d5_,e2} - nn.JoinTable(2)\n    local d6_ = d5 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2 * 2, ngf, 4, 4, 2, 2, 1, 1) - nn.SpatialBatchNormalization(ngf)\n    -- input is (ngf * 2) x 64 x 64\n    local d6 = {d6_,e1} - nn.JoinTable(2)\n    local d7 = d6 - nn.ReLU(true) - nn.SpatialFullConvolution(ngf * 2, output_nc, 4, 4, 2, 2, 1, 1)\n    -- input is (ngf) x128 x 128\n    \n    local o1 = d7 - nn.Tanh()\n    \n    netG = nn.gModule({e1},{o1})\n    \n    --graph.dot(netG.fg,'netG')\n    \n    return netG\nend\n\nfunction defineD_basic(input_nc, output_nc, ndf)\n    n_layers = 3\n    return defineD_n_layers(input_nc, output_nc, ndf, n_layers)\nend\n\n-- rf=1\nfunction defineD_pixelGAN(input_nc, output_nc, ndf)\n    local netD = nn.Sequential()\n    \n    -- input is (nc) x 256 x 256\n    netD:add(nn.SpatialConvolution(input_nc+output_nc, ndf, 1, 1, 1, 1, 0, 0))\n    netD:add(nn.LeakyReLU(0.2, true))\n    -- state size: (ndf) x 256 x 256\n    netD:add(nn.SpatialConvolution(ndf, ndf * 2, 1, 1, 1, 1, 0, 0))\n    netD:add(nn.SpatialBatchNormalization(ndf * 2)):add(nn.LeakyReLU(0.2, true))\n    -- state size: (ndf*2) x 256 x 256\n    netD:add(nn.SpatialConvolution(ndf * 
2, 1, 1, 1, 1, 1, 0, 0))\n    -- state size: 1 x 256 x 256\n    netD:add(nn.Sigmoid())\n    -- state size: 1 x 256 x 256\n        \n    return netD\nend\n\n-- if n=0, then use pixelGAN (rf=1)\n-- else rf is 16 if n=1\n--            34 if n=2\n--            70 if n=3\n--            142 if n=4\n--            286 if n=5\n--            574 if n=6\nfunction defineD_n_layers(input_nc, output_nc, ndf, n_layers)\n    if n_layers==0 then\n        return defineD_pixelGAN(input_nc, output_nc, ndf)\n    else\n    \n        local netD = nn.Sequential()\n        \n        -- input is (nc) x 256 x 256\n        netD:add(nn.SpatialConvolution(input_nc+output_nc, ndf, 4, 4, 2, 2, 1, 1))\n        netD:add(nn.LeakyReLU(0.2, true))\n        \n        local nf_mult = 1\n        local nf_mult_prev = 1\n        for n = 1, n_layers-1 do \n            nf_mult_prev = nf_mult\n            nf_mult = math.min(2^n,8)\n            netD:add(nn.SpatialConvolution(ndf * nf_mult_prev, ndf * nf_mult, 4, 4, 2, 2, 1, 1))\n            netD:add(nn.SpatialBatchNormalization(ndf * nf_mult)):add(nn.LeakyReLU(0.2, true))\n        end\n        \n        -- state size: (ndf*M) x N x N\n        nf_mult_prev = nf_mult\n        nf_mult = math.min(2^n_layers,8)\n        netD:add(nn.SpatialConvolution(ndf * nf_mult_prev, ndf * nf_mult, 4, 4, 1, 1, 1, 1))\n        netD:add(nn.SpatialBatchNormalization(ndf * nf_mult)):add(nn.LeakyReLU(0.2, true))\n        -- state size: (ndf*M*2) x (N-1) x (N-1)\n        netD:add(nn.SpatialConvolution(ndf * nf_mult, 1, 4, 4, 1, 1, 1, 1))\n        -- state size: 1 x (N-2) x (N-2)\n        \n        netD:add(nn.Sigmoid())\n        -- state size: 1 x (N-2) x (N-2)\n        \n        return netD\n    end\nend\n"
  },
  {
    "path": "scripts/combine_A_and_B.py",
    "content": "from pdb import set_trace as st\nimport os\nimport numpy as np\nimport cv2\nimport argparse\n\nparser = argparse.ArgumentParser('create image pairs')\nparser.add_argument('--fold_A', dest='fold_A', help='input directory for image A', type=str, default='../dataset/50kshoes_edges')\nparser.add_argument('--fold_B', dest='fold_B', help='input directory for image B', type=str, default='../dataset/50kshoes_jpg')\nparser.add_argument('--fold_AB', dest='fold_AB', help='output directory', type=str, default='../dataset/test_AB')\nparser.add_argument('--num_imgs', dest='num_imgs', help='number of images',type=int, default=1000000)\nparser.add_argument('--use_AB', dest='use_AB', help='if true: (0001_A, 0001_B) to (0001_AB)',action='store_true')\nargs = parser.parse_args()\n\nfor arg in vars(args):\n    print('[%s] = ' % arg,  getattr(args, arg))\n\nsplits = filter( lambda f: not f.startswith('.'), os.listdir(args.fold_A)) # ignore hidden folders like .DS_Store\n\nfor sp in splits:\n    img_fold_A = os.path.join(args.fold_A, sp)\n    img_fold_B = os.path.join(args.fold_B, sp)\n    img_list = filter( lambda f: not f.startswith('.'), os.listdir(img_fold_A)) # ignore hidden folders like .DS_Store\n    img_list = list(img_list)\n    if args.use_AB: \n        img_list = [img_path for img_path in img_list if '_A.' 
in img_path]\n\n    num_imgs = min(args.num_imgs, len(img_list))\n    print('split = %s, use %d/%d images' % (sp, num_imgs, len(img_list)))\n    img_fold_AB = os.path.join(args.fold_AB, sp)\n    if not os.path.isdir(img_fold_AB):\n        os.makedirs(img_fold_AB)\n    print('split = %s, number of images = %d' % (sp, num_imgs))\n    for n in range(num_imgs):\n        name_A = img_list[n]\n        path_A = os.path.join(img_fold_A, name_A)\n        if args.use_AB:\n            name_B = name_A.replace('_A.', '_B.')\n        else:\n            name_B = name_A\n        path_B = os.path.join(img_fold_B, name_B)\n        if os.path.isfile(path_A) and os.path.isfile(path_B):\n            name_AB = name_A\n            if args.use_AB:\n                name_AB = name_AB.replace('_A.', '.') # remove _A\n            path_AB = os.path.join(img_fold_AB, name_AB)\n            im_A = cv2.imread(path_A, cv2.IMREAD_COLOR)\n            im_B = cv2.imread(path_B, cv2.IMREAD_COLOR)\n            im_AB = np.concatenate([im_A, im_B], 1)\n            cv2.imwrite(path_AB, im_AB)\n\n"
  },
  {
    "path": "scripts/edges/PostprocessHED.m",
    "content": "%%% Prerequisites\n% You need to get the cpp file edgesNmsMex.cpp from https://raw.githubusercontent.com/pdollar/edges/master/private/edgesNmsMex.cpp\n% and compile it in Matlab: mex edgesNmsMex.cpp\n% You also need to download and install Piotr's Computer Vision Matlab Toolbox:  https://pdollar.github.io/toolbox/\n\n%%% parameters\n% hed_mat_dir: the hed mat file directory (the output of 'batch_hed.py')\n% edge_dir: the output HED edges directory\n% image_width: resize the edge map to [image_width, image_width]\n% threshold: threshold for image binarization (default 25.0/255.0)\n% small_edge: remove small edges (default 5)\n\nfunction [] = PostprocessHED(hed_mat_dir, edge_dir, image_width, threshold, small_edge)\n\nif ~exist(edge_dir, 'dir')\n    mkdir(edge_dir);\nend\nfileList = dir(fullfile(hed_mat_dir, '*.mat'));\nnFiles = numel(fileList);\nfprintf('find %d mat files\\n', nFiles);\n\nfor n = 1 : nFiles\n    if mod(n, 1000) == 0\n        fprintf('process %d/%d images\\n', n, nFiles);\n    end\n    fileName = fileList(n).name;\n    filePath = fullfile(hed_mat_dir, fileName);\n    jpgName = strrep(fileName, '.mat', '.jpg');\n    edge_path = fullfile(edge_dir, jpgName);\n\n    if ~exist(edge_path, 'file')\n        E = GetEdge(filePath);\n        E = imresize(E,[image_width,image_width]);\n        E_simple = SimpleEdge(E, threshold, small_edge);\n        E_simple = uint8(E_simple*255);\n        imwrite(E_simple, edge_path, 'Quality',100);\n    end\nend\nend\n\n\n\n\nfunction [E] = GetEdge(filePath)\nload(filePath);\nE = 1-edge_predict;\nend\n\nfunction [E4] = SimpleEdge(E, threshold, small_edge)\nif nargin <= 1\n    threshold = 25.0/255.0;\nend\n\nif nargin <= 2\n    small_edge = 5;\nend\n\nif ndims(E) == 3\n    E = E(:,:,1);\nend\n\nE1 = 1 - E;\nE2 = EdgeNMS(E1);\nE3 = double(E2>=max(eps,threshold));\nE3 = bwmorph(E3,'thin',inf);\nE4 = bwareaopen(E3, small_edge);\nE4=1-E4;\nend\n\nfunction [E_nms] = EdgeNMS( E )\nE=single(E);\n[Ox,Oy] = 
gradient2(convTri(E,4));\n[Oxx,~] = gradient2(Ox);\n[Oxy,Oyy] = gradient2(Oy);\nO = mod(atan(Oyy.*sign(-Oxy)./(Oxx+1e-5)),pi);\nE_nms = edgesNmsMex(E,O,1,5,1.01,1);\nend\n"
  },
  {
    "path": "scripts/edges/batch_hed.py",
    "content": "# HED batch processing script; modified from https://github.com/s9xie/hed/blob/master/examples/hed/HED-tutorial.ipynb\n# Step 1: download the hed repo: https://github.com/s9xie/hed\n# Step 2: download the models and protoxt, and put them under {caffe_root}/examples/hed/\n# Step 3: put this script under {caffe_root}/examples/hed/\n# Step 4: run the following script:\n#       python batch_hed.py --images_dir=/data/to/path/photos/ --hed_mat_dir=/data/to/path/hed_mat_files/\n# The code sometimes crashes after computation is done. Error looks like \"Check failed: ... driver shutting down\". You can just kill the job.\n# For large images, it will produce gpu memory issue. Therefore, you better resize the images before running this script.\n# Step 5: run the MATLAB post-processing script \"PostprocessHED.m\"\nimport scipy.io as sio\nimport caffe\nimport sys\nimport numpy as np\nfrom PIL import Image\nimport os\nimport argparse\n\n\ndef parse_args():\n    parser = argparse.ArgumentParser(description='batch proccesing: photos->edges')\n    parser.add_argument('--caffe_root', dest='caffe_root', help='caffe root', default='../../', type=str)\n    parser.add_argument('--caffemodel', dest='caffemodel', help='caffemodel', default='./hed_pretrained_bsds.caffemodel', type=str)\n    parser.add_argument('--prototxt', dest='prototxt', help='caffe prototxt file', default='./deploy.prototxt', type=str)\n    parser.add_argument('--images_dir', dest='images_dir', help='directory to store input photos', type=str)\n    parser.add_argument('--hed_mat_dir', dest='hed_mat_dir', help='directory to store output hed edges in mat file', type=str)\n    parser.add_argument('--border', dest='border', help='padding border', type=int, default=128)\n    parser.add_argument('--gpu_id', dest='gpu_id', help='gpu id', type=int, default=1)\n    args = parser.parse_args()\n    return args\n\n\nargs = parse_args()\nfor arg in vars(args):\n    print('[%s] =' % arg, getattr(args, arg))\n# Make 
sure that caffe is on the python path:\ncaffe_root = args.caffe_root   # this file is expected to be in {caffe_root}/examples/hed/\nsys.path.insert(0, caffe_root + 'python')\n\n\nif not os.path.exists(args.hed_mat_dir):\n    print('create output directory %s' % args.hed_mat_dir)\n    os.makedirs(args.hed_mat_dir)\n\nimgList = os.listdir(args.images_dir)\nnImgs = len(imgList)\nprint('#images = %d' % nImgs)\n\n# remove the following two lines if testing with cpu\ncaffe.set_mode_gpu()\ncaffe.set_device(args.gpu_id)\n# load net\nnet = caffe.Net(args.prototxt, args.caffemodel, caffe.TEST)\n# pad border\nborder = args.border\n\nfor i in range(nImgs):\n    if i % 500 == 0:\n        print('processing image %d/%d' % (i, nImgs))\n    im = Image.open(os.path.join(args.images_dir, imgList[i]))\n\n    in_ = np.array(im, dtype=np.float32)\n    in_ = np.pad(in_, ((border, border), (border, border), (0, 0)), 'reflect')\n\n    in_ = in_[:, :, 0:3]\n    in_ = in_[:, :, ::-1]\n    in_ -= np.array((104.00698793, 116.66876762, 122.67891434))\n    in_ = in_.transpose((2, 0, 1))\n\n    # shape for input (data blob is N x C x H x W), set data\n    net.blobs['data'].reshape(1, *in_.shape)\n    net.blobs['data'].data[...] = in_\n    # run the net and read the fused sigmoid edge map\n    net.forward()\n    fuse = net.blobs['sigmoid-fuse'].data[0][0, :, :]\n    # get rid of the border\n    fuse = fuse[(border+35):(-border+35), (border+35):(-border+35)]\n    # save hed file to the disk\n    name, ext = os.path.splitext(imgList[i])\n    sio.savemat(os.path.join(args.hed_mat_dir, name + '.mat'), {'edge_predict': fuse})\n"
  },
  {
    "path": "scripts/eval_cityscapes/caffemodel/deploy.prototxt",
    "content": "layer {\n  name: \"data\"\n  type: \"Input\"\n  top: \"data\"\n  input_param {\n    shape {\n      dim: 1\n      dim: 3\n      dim: 500\n      dim: 500\n    }\n  }\n}\nlayer {\n  name: \"conv1_1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 100\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu1_1\"\n  type: \"ReLU\"\n  bottom: \"conv1_1\"\n  top: \"conv1_1\"\n}\nlayer {\n  name: \"conv1_2\"\n  type: \"Convolution\"\n  bottom: \"conv1_1\"\n  top: \"conv1_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 64\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu1_2\"\n  type: \"ReLU\"\n  bottom: \"conv1_2\"\n  top: \"conv1_2\"\n}\nlayer {\n  name: \"pool1\"\n  type: \"Pooling\"\n  bottom: \"conv1_2\"\n  top: \"pool1\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv2_1\"\n  type: \"Convolution\"\n  bottom: \"pool1\"\n  top: \"conv2_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu2_1\"\n  type: \"ReLU\"\n  bottom: \"conv2_1\"\n  top: \"conv2_1\"\n}\nlayer {\n  name: \"conv2_2\"\n  type: \"Convolution\"\n  
bottom: \"conv2_1\"\n  top: \"conv2_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 128\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu2_2\"\n  type: \"ReLU\"\n  bottom: \"conv2_2\"\n  top: \"conv2_2\"\n}\nlayer {\n  name: \"pool2\"\n  type: \"Pooling\"\n  bottom: \"conv2_2\"\n  top: \"pool2\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv3_1\"\n  type: \"Convolution\"\n  bottom: \"pool2\"\n  top: \"conv3_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu3_1\"\n  type: \"ReLU\"\n  bottom: \"conv3_1\"\n  top: \"conv3_1\"\n}\nlayer {\n  name: \"conv3_2\"\n  type: \"Convolution\"\n  bottom: \"conv3_1\"\n  top: \"conv3_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu3_2\"\n  type: \"ReLU\"\n  bottom: \"conv3_2\"\n  top: \"conv3_2\"\n}\nlayer {\n  name: \"conv3_3\"\n  type: \"Convolution\"\n  bottom: \"conv3_2\"\n  top: \"conv3_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 256\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    
weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu3_3\"\n  type: \"ReLU\"\n  bottom: \"conv3_3\"\n  top: \"conv3_3\"\n}\nlayer {\n  name: \"pool3\"\n  type: \"Pooling\"\n  bottom: \"conv3_3\"\n  top: \"pool3\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv4_1\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"conv4_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu4_1\"\n  type: \"ReLU\"\n  bottom: \"conv4_1\"\n  top: \"conv4_1\"\n}\nlayer {\n  name: \"conv4_2\"\n  type: \"Convolution\"\n  bottom: \"conv4_1\"\n  top: \"conv4_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu4_2\"\n  type: \"ReLU\"\n  bottom: \"conv4_2\"\n  top: \"conv4_2\"\n}\nlayer {\n  name: \"conv4_3\"\n  type: \"Convolution\"\n  bottom: \"conv4_2\"\n  top: \"conv4_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu4_3\"\n  type: \"ReLU\"\n  bottom: \"conv4_3\"\n  top: 
\"conv4_3\"\n}\nlayer {\n  name: \"pool4\"\n  type: \"Pooling\"\n  bottom: \"conv4_3\"\n  top: \"pool4\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"conv5_1\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"conv5_1\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu5_1\"\n  type: \"ReLU\"\n  bottom: \"conv5_1\"\n  top: \"conv5_1\"\n}\nlayer {\n  name: \"conv5_2\"\n  type: \"Convolution\"\n  bottom: \"conv5_1\"\n  top: \"conv5_2\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu5_2\"\n  type: \"ReLU\"\n  bottom: \"conv5_2\"\n  top: \"conv5_2\"\n}\nlayer {\n  name: \"conv5_3\"\n  type: \"Convolution\"\n  bottom: \"conv5_2\"\n  top: \"conv5_3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 512\n    pad: 1\n    kernel_size: 3\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu5_3\"\n  type: \"ReLU\"\n  bottom: \"conv5_3\"\n  top: \"conv5_3\"\n}\nlayer {\n  name: \"pool5\"\n  type: \"Pooling\"\n  bottom: \"conv5_3\"\n  top: \"pool5\"\n  pooling_param {\n    pool: MAX\n    kernel_size: 2\n    stride: 2\n  }\n}\nlayer {\n  name: \"fc6_cs\"\n  type: \"Convolution\"\n  
bottom: \"pool5\"\n  top: \"fc6_cs\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 4096\n    pad: 0\n    kernel_size: 7\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu6_cs\"\n  type: \"ReLU\"\n  bottom: \"fc6_cs\"\n  top: \"fc6_cs\"\n}\nlayer {\n  name: \"fc7_cs\"\n  type: \"Convolution\"\n  bottom: \"fc6_cs\"\n  top: \"fc7_cs\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 4096\n    pad: 0\n    kernel_size: 1\n    stride: 1\n    weight_filler {\n      type: \"gaussian\"\n      std: 0.01\n    }\n    bias_filler {\n      type: \"constant\"\n      value: 0\n    }\n  }\n}\nlayer {\n  name: \"relu7_cs\"\n  type: \"ReLU\"\n  bottom: \"fc7_cs\"\n  top: \"fc7_cs\"\n}\nlayer {\n  name: \"score_fr\"\n  type: \"Convolution\"\n  bottom: \"fc7_cs\"\n  top: \"score_fr\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 20\n    pad: 0\n    kernel_size: 1\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"upscore2\"\n  type: \"Deconvolution\"\n  bottom: \"score_fr\"\n  top: \"upscore2\"\n  param {\n    lr_mult: 1\n  }\n  convolution_param {\n    num_output: 20\n    bias_term: false\n    kernel_size: 4\n    stride: 2\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"score_pool4\"\n  type: \"Convolution\"\n  bottom: \"pool4\"\n  top: \"score_pool4\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 20\n    pad: 0\n   
 kernel_size: 1\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"score_pool4c\"\n  type: \"Crop\"\n  bottom: \"score_pool4\"\n  bottom: \"upscore2\"\n  top: \"score_pool4c\"\n  crop_param {\n    axis: 2\n    offset: 5\n  }\n}\nlayer {\n  name: \"fuse_pool4\"\n  type: \"Eltwise\"\n  bottom: \"upscore2\"\n  bottom: \"score_pool4c\"\n  top: \"fuse_pool4\"\n  eltwise_param {\n    operation: SUM\n  }\n}\nlayer {\n  name: \"upscore_pool4\"\n  type: \"Deconvolution\"\n  bottom: \"fuse_pool4\"\n  top: \"upscore_pool4\"\n  param {\n    lr_mult: 1\n  }\n  convolution_param {\n    num_output: 20\n    bias_term: false\n    kernel_size: 4\n    stride: 2\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"score_pool3\"\n  type: \"Convolution\"\n  bottom: \"pool3\"\n  top: \"score_pool3\"\n  param {\n    lr_mult: 1\n    decay_mult: 1\n  }\n  param {\n    lr_mult: 2\n    decay_mult: 0\n  }\n  convolution_param {\n    num_output: 20\n    pad: 0\n    kernel_size: 1\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"score_pool3c\"\n  type: \"Crop\"\n  bottom: \"score_pool3\"\n  bottom: \"upscore_pool4\"\n  top: \"score_pool3c\"\n  crop_param {\n    axis: 2\n    offset: 9\n  }\n}\nlayer {\n  name: \"fuse_pool3\"\n  type: \"Eltwise\"\n  bottom: \"upscore_pool4\"\n  bottom: \"score_pool3c\"\n  top: \"fuse_pool3\"\n  eltwise_param {\n    operation: SUM\n  }\n}\nlayer {\n  name: \"upscore8\"\n  type: \"Deconvolution\"\n  bottom: \"fuse_pool3\"\n  top: \"upscore8\"\n  param {\n    lr_mult: 1\n  }\n  convolution_param {\n    num_output: 20\n    bias_term: false\n    kernel_size: 16\n    stride: 8\n    weight_filler {\n      type: \"xavier\"\n    }\n    bias_filler {\n      type: \"constant\"\n    }\n  }\n}\nlayer {\n  name: \"score\"\n  type: 
\"Crop\"\n  bottom: \"upscore8\"\n  bottom: \"data\"\n  top: \"score\"\n  crop_param {\n    axis: 2\n    offset: 31\n  }\n}\n"
  },
  {
    "path": "scripts/eval_cityscapes/cityscapes.py",
    "content": "# The following code is modified from https://github.com/shelhamer/clockwork-fcn\nimport sys\nimport os\nimport glob\nimport numpy as np\nfrom PIL import Image\n\nclass cityscapes:\n    def __init__(self, data_path):\n        # data_path something like /data2/cityscapes\n        self.dir = data_path\n        self.classes = ['road', 'sidewalk', 'building', 'wall', 'fence', \n                        'pole', 'traffic light', 'traffic sign', 'vegetation', 'terrain', \n                        'sky', 'person', 'rider', 'car', 'truck', \n                        'bus', 'train', 'motorcycle', 'bicycle']\n        self.mean = np.array((72.78044, 83.21195, 73.45286), dtype=np.float32)\n        # import cityscapes label helper and set up label mappings\n        sys.path.insert(0, '{}/scripts/helpers/'.format(self.dir))\n        labels = __import__('labels')\n        self.id2trainId = {label.id: label.trainId for label in labels.labels}  # dictionary mapping from raw IDs to train IDs\n        self.trainId2color = {label.trainId: label.color for label in labels.labels}  # dictionary mapping train IDs to colors as 3-tuples\n\n    def get_dset(self, split):\n        '''\n        List images as (city, id) for the specified split\n\n        TODO(shelhamer) generate splits from cityscapes itself, instead of\n        relying on these separately made text files.\n        '''\n        if split == 'train':\n            dataset = open('{}/ImageSets/segFine/train.txt'.format(self.dir)).read().splitlines()\n        else:\n            dataset = open('{}/ImageSets/segFine/val.txt'.format(self.dir)).read().splitlines()\n        return [(item.split('/')[0], item.split('/')[1]) for item in dataset]\n\n    def load_image(self, split, city, idx):\n        im = Image.open('{}/leftImg8bit_sequence/{}/{}/{}_leftImg8bit.png'.format(self.dir, split, city, idx))\n        return im\n\n    def assign_trainIds(self, label):\n        \"\"\"\n        Map the given label IDs to the train IDs 
appropriate for training\n        Use the label mapping provided in labels.py from the cityscapes scripts\n        \"\"\"\n        label = np.array(label, dtype=np.float32)\n        # write into a copy so that a pixel that was already mapped is never remapped by a later (id, trainId) pair\n        mapped = label.copy()\n        for k, v in self.id2trainId.items():  # .items() works on both Python 2 and 3\n            mapped[label == k] = v\n        return mapped\n\n    def load_label(self, split, city, idx):\n        \"\"\"\n        Load label image as 1 x height x width integer array of label indices.\n        The leading singleton dimension is required by the loss.\n        \"\"\"\n        label = Image.open('{}/gtFine/{}/{}/{}_gtFine_labelIds.png'.format(self.dir, split, city, idx))\n        label = self.assign_trainIds(label)  # get proper labels for eval\n        label = np.array(label, dtype=np.uint8)\n        label = label[np.newaxis, ...]\n        return label\n\n    def preprocess(self, im):\n        \"\"\"\n        Preprocess loaded image (by load_image) for Caffe:\n        - cast to float\n        - switch channels RGB -> BGR\n        - subtract mean\n        - transpose to channel x height x width order\n        \"\"\"\n        in_ = np.array(im, dtype=np.float32)\n        in_ = in_[:, :, ::-1]\n        in_ -= self.mean\n        in_ = in_.transpose((2, 0, 1))\n        return in_\n\n    def palette(self, label):\n        '''\n        Map trainIds to colors as specified in labels.py\n        '''\n        if label.ndim == 3:\n            label = label[0]\n        # start from zeros so pixels whose value has no entry in trainId2color stay black\n        color = np.zeros((label.shape[0], label.shape[1], 3))\n        for k, v in self.trainId2color.items():  # .items() works on both Python 2 and 3\n            color[label == k, :] = v\n        return color\n\n    @staticmethod\n    def make_boundaries(label, thickness=None):\n        \"\"\"\n        Input is an image label, output is a 
numpy array mask encoding the boundaries of the objects.\n        Extract pixels at the true boundary by dilation - erosion of label.\n        Don't just pick the void label as it is not exclusive to the boundaries.\n        \"\"\"\n        assert thickness is not None\n        import skimage.morphology as skm\n        void = 255\n        mask = np.logical_and(label > 0, label != void)[0]\n        selem = skm.disk(thickness)\n        boundaries = np.logical_xor(skm.dilation(mask, selem),\n                                    skm.erosion(mask, selem))\n        return boundaries\n\n    def list_label_frames(self, split):\n        \"\"\"\n        Select labeled frames from a split for evaluation,\n        collected as 'city_shot_frame' ID strings\n        \"\"\"\n        def file2idx(f):\n            \"\"\"Helper to convert file path into frame ID\"\"\"\n            city, shot, frame = (os.path.basename(f).split('_')[:3])\n            return \"_\".join([city, shot, frame])\n        frames = []\n        cities = [os.path.basename(f) for f in glob.glob('{}/gtFine/{}/*'.format(self.dir, split))]\n        for c in cities:\n            files = sorted(glob.glob('{}/gtFine/{}/{}/*labelIds.png'.format(self.dir, split, c)))\n            frames.extend([file2idx(f) for f in files])\n        return frames\n\n    def collect_frame_sequence(self, split, idx, length):\n        \"\"\"\n        Collect sequence of frames preceding (and including) a labeled frame\n        as a list of Images.\n\n        Note: 19 preceding frames are provided for each labeled frame.\n        \"\"\"\n        SEQ_LEN = length\n        city, shot, frame = idx.split('_')\n        frame = int(frame)\n        frame_seq = []\n        for i in range(frame - SEQ_LEN, frame + 1):\n            # use the requested split rather than a hard-coded 'val' directory\n            frame_path = '{0}/leftImg8bit_sequence/{1}/{2}/{2}_{3}_{4:0>6d}_leftImg8bit.png'.format(\n                self.dir, split, city, shot, i)\n            frame_seq.append(Image.open(frame_path))\n        return frame_seq\n"
  },
  {
    "path": "scripts/eval_cityscapes/download_fcn8s.sh",
    "content": "URL=http://people.eecs.berkeley.edu/~tinghuiz/projects/pix2pix/fcn-8s-cityscapes/fcn-8s-cityscapes.caffemodel\nOUTPUT_FILE=./scripts/eval_cityscapes/caffemodel/fcn-8s-cityscapes.caffemodel\n# -N (timestamping) has no effect in combination with -O, so it is omitted\nwget \"$URL\" -O \"$OUTPUT_FILE\"\n"
  },
  {
    "path": "scripts/eval_cityscapes/evaluate.py",
    "content": "import os\nimport sys\nimport caffe\nimport argparse\nimport numpy as np\nimport scipy.misc\nfrom PIL import Image\nfrom util import *\nfrom cityscapes import cityscapes\n\nparser = argparse.ArgumentParser()\nparser.add_argument(\"--cityscapes_dir\", type=str, required=True, help=\"Path to the original cityscapes dataset\")\nparser.add_argument(\"--result_dir\", type=str, required=True, help=\"Path to the generated images to be evaluated\")\nparser.add_argument(\"--output_dir\", type=str, required=True, help=\"Where to save the evaluation results\")\nparser.add_argument(\"--caffemodel_dir\", type=str, default='./scripts/eval_cityscapes/caffemodel/', help=\"Where the FCN-8s caffemodel is stored\")\nparser.add_argument(\"--gpu_id\", type=int, default=0, help=\"Which gpu id to use\")\nparser.add_argument(\"--split\", type=str, default='val', help=\"Data split to be evaluated\")\nparser.add_argument(\"--save_output_images\", type=int, default=0, help=\"Whether to save the FCN output images\")\nargs = parser.parse_args()\n\ndef main():\n    if not os.path.isdir(args.output_dir):\n        os.makedirs(args.output_dir)\n    if args.save_output_images > 0:\n        # os.path.join keeps the path valid whether or not output_dir ends with a slash\n        output_image_dir = os.path.join(args.output_dir, 'image_outputs')\n        if not os.path.isdir(output_image_dir):\n            os.makedirs(output_image_dir)\n    CS = cityscapes(args.cityscapes_dir)\n    n_cl = len(CS.classes)\n    label_frames = CS.list_label_frames(args.split)\n    caffe.set_device(args.gpu_id)\n    caffe.set_mode_gpu()\n    net = caffe.Net(os.path.join(args.caffemodel_dir, 'deploy.prototxt'),\n                    os.path.join(args.caffemodel_dir, 'fcn-8s-cityscapes.caffemodel'),\n                    caffe.TEST)\n\n    hist_perframe = np.zeros((n_cl, n_cl))\n    for i, idx in enumerate(label_frames):\n        if i % 10 == 0:\n            print('Evaluating: %d/%d' % (i, len(label_frames)))\n        city = idx.split('_')[0]\n        # idx is city_shot_frame\n        label = CS.load_label(args.split, city, idx)\n        
im_file = args.result_dir + '/' + idx + '_leftImg8bit.png'\n        im = np.array(Image.open(im_file))\n        # im = scipy.misc.imresize(im, (256, 256))\n        # resize the generated image to the label resolution before running the FCN\n        im = scipy.misc.imresize(im, (label.shape[1], label.shape[2]))\n        out = segrun(net, CS.preprocess(im))\n        # accumulate the confusion matrix over all frames\n        hist_perframe += fast_hist(label.flatten(), out.flatten(), n_cl)\n        if args.save_output_images > 0:\n            label_im = CS.palette(label)\n            pred_im = CS.palette(out)\n            scipy.misc.imsave(output_image_dir + '/' + str(i) + '_pred.jpg', pred_im)\n            scipy.misc.imsave(output_image_dir + '/' + str(i) + '_gt.jpg', label_im)\n            scipy.misc.imsave(output_image_dir + '/' + str(i) + '_input.jpg', im)\n\n    mean_pixel_acc, mean_class_acc, mean_class_iou, per_class_acc, per_class_iou = get_scores(hist_perframe)\n    with open(args.output_dir + '/evaluation_results.txt', 'w') as f:\n        f.write('Mean pixel accuracy: %f\\n' % mean_pixel_acc)\n        f.write('Mean class accuracy: %f\\n' % mean_class_acc)\n        f.write('Mean class IoU: %f\\n' % mean_class_iou)\n        f.write('************ Per class numbers below ************\\n')\n        for i, cl in enumerate(CS.classes):\n            # pad class names to a fixed width so the columns line up\n            f.write('%s: acc = %f, iou = %f\\n' % (cl.ljust(15), per_class_acc[i], per_class_iou[i]))\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "scripts/eval_cityscapes/util.py",
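    "content": "# The following code is modified from https://github.com/shelhamer/clockwork-fcn\nimport numpy as np\nimport scipy.io as sio\n\ndef get_out_scoremap(net):\n    \"\"\"\n    Return the per-pixel argmax over the channels of the 'score' blob.\n    \"\"\"\n    return net.blobs['score'].data[0].argmax(axis=0).astype(np.uint8)\n\ndef feed_net(net, in_):\n    \"\"\"\n    Load prepared input into net.\n    \"\"\"\n    net.blobs['data'].reshape(1, *in_.shape)\n    net.blobs['data'].data[...] = in_\n\ndef segrun(net, in_):\n    \"\"\"\n    Run one forward pass on a preprocessed image and return the predicted label map.\n    \"\"\"\n    feed_net(net, in_)\n    net.forward()\n    return get_out_scoremap(net)\n\ndef fast_hist(a, b, n):\n    \"\"\"\n    Build the n x n confusion matrix for one frame.\n    a: flattened ground-truth labels, b: flattened predictions.\n    Ground-truth pixels outside [0, n), such as the void label 255, are ignored.\n    \"\"\"\n    # print('saving')\n    # sio.savemat('/tmp/fcn_debug/xx.mat', {'a':a, 'b':b, 'n':n})\n\n    k = np.where((a >= 0) & (a < n))[0]\n    # entry n * true_class + predicted_class, so rows index ground truth and columns index predictions\n    bc = np.bincount(n * a[k].astype(int) + b[k], minlength=n**2)\n    if len(bc) != n**2:\n        # ignore this example if dimension mismatch\n        return 0\n    return bc.reshape(n, n)\n\ndef get_scores(hist):\n    \"\"\"\n    Compute accuracy and IoU from a confusion matrix in which hist[i, j] counts\n    pixels of true class i predicted as class j.\n    \"\"\"\n    # Mean pixel accuracy\n    acc = np.diag(hist).sum() / (hist.sum() + 1e-12)\n\n    # Per class accuracy\n    cl_acc = np.diag(hist) / (hist.sum(1) + 1e-12)\n\n    # Per class IoU\n    iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist) + 1e-12)\n\n    return acc, np.nanmean(cl_acc), np.nanmean(iu), cl_acc, iu"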
  },
  {
    "path": "scripts/receptive_field_sizes.m",
    "content": "% modified from: https://github.com/rbgirshick/rcnn/blob/master/utils/receptive_field_sizes.m\n% \n% RCNN LICENSE:\n%\n% Copyright (c) 2014, The Regents of the University of California (Regents)\n% All rights reserved.\n% \n% Redistribution and use in source and binary forms, with or without\n% modification, are permitted provided that the following conditions are met: \n% \n% 1. Redistributions of source code must retain the above copyright notice, this\n%    list of conditions and the following disclaimer. \n% 2. Redistributions in binary form must reproduce the above copyright notice,\n%    this list of conditions and the following disclaimer in the documentation\n%    and/or other materials provided with the distribution. \n% \n% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND\n% ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\n% WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\n% DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR\n% ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES\n% (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n% LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND\n% ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n% (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\n% SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\nfunction receptive_field_sizes()\n\n\n% compute input size from a given output size\nf = @(output_size, ksize, stride) (output_size - 1) * stride + ksize;\n\n\n%% n=1 discriminator\n\n% fix the output size to 1 and derive the receptive field in the input\nout = ...\nf(f(f(1, 4, 1), ...   % conv2 -> conv3\n             4, 1), ...   
% conv1 -> conv2\n             4, 2);       % input -> conv1\n\nfprintf('n=1 discriminator receptive field size: %d\\n', out);\n\n\n%% n=2 discriminator\n\n% fix the output size to 1 and derive the receptive field in the input\nout = ...\nf(f(f(f(1, 4, 1), ...   % conv3 -> conv4\n             4, 1), ...   % conv2 -> conv3\n             4, 2), ...   % conv1 -> conv2\n             4, 2);       % input -> conv1\n\nfprintf('n=2 discriminator receptive field size: %d\\n', out);\n\n\n%% n=3 discriminator\n\n% fix the output size to 1 and derive the receptive field in the input\nout = ...\nf(f(f(f(f(1, 4, 1), ...   % conv4 -> conv5\n             4, 1), ...   % conv3 -> conv4\n             4, 2), ...   % conv2 -> conv3\n             4, 2), ...   % conv1 -> conv2\n             4, 2);       % input -> conv1\n\nfprintf('n=3 discriminator receptive field size: %d\\n', out);\n\n\n%% n=4 discriminator\n\n% fix the output size to 1 and derive the receptive field in the input\nout = ...\nf(f(f(f(f(f(1, 4, 1), ...   % conv5 -> conv6\n             4, 1), ...   % conv4 -> conv5\n             4, 2), ...   % conv3 -> conv4\n             4, 2), ...   % conv2 -> conv3\n             4, 2), ...   % conv1 -> conv2\n             4, 2);       % input -> conv1\n\nfprintf('n=4 discriminator receptive field size: %d\\n', out);\n\n\n%% n=5 discriminator\n\n% fix the output size to 1 and derive the receptive field in the input\nout = ...\nf(f(f(f(f(f(f(1, 4, 1), ...   % conv6 -> conv7\n             4, 1), ...   % conv5 -> conv6\n             4, 2), ...   % conv4 -> conv5\n             4, 2), ...   % conv3 -> conv4\n             4, 2), ...   % conv2 -> conv3\n             4, 2), ...   % conv1 -> conv2\n             4, 2);       % input -> conv1\n\nfprintf('n=5 discriminator receptive field size: %d\\n', out);"
  },
  {
    "path": "test.lua",
    "content": "-- usage: DATA_ROOT=/path/to/data/ name=expt1 which_direction=BtoA th test.lua\n--\n-- code derived from https://github.com/soumith/dcgan.torch\n--\n\nrequire 'image'\nrequire 'nn'\nrequire 'nngraph'\nutil = paths.dofile('util/util.lua')\ntorch.setdefaulttensortype('torch.FloatTensor')\n\nopt = {\n    DATA_ROOT = '',           -- path to images (should have subfolders 'train', 'val', etc)\n    batchSize = 1,            -- # images in batch\n    loadSize = 256,           -- scale images to this size\n    fineSize = 256,           --  then crop to this size\n    flip=0,                   -- horizontal mirroring data augmentation\n    display = 1,              -- display samples while testing. 0 = false\n    display_id = 200,         -- display window id.\n    gpu = 1,                  -- gpu = 0 is CPU mode. gpu=X is GPU mode on GPU X\n    how_many = 'all',         -- how many test images to run (set to all to run on every image found in the data/phase folder)\n    which_direction = 'AtoB', -- AtoB or BtoA\n    phase = 'val',            -- train, val, test, etc\n    preprocess = 'regular',   -- for special purpose preprocessing, e.g., for colorization, change this (selects preprocessing functions in util.lua)\n    aspect_ratio = 1.0,       -- aspect ratio of result images\n    name = '',                -- name of experiment, selects which model to run, should generally be passed on the command line\n    input_nc = 3,             -- #  of input image channels\n    output_nc = 3,            -- #  of output image channels\n    serial_batches = 1,       -- if 1, takes images in order to make batches, otherwise takes them randomly\n    serial_batch_iter = 1,    -- iter into serial image list\n    cudnn = 1,                -- set to 0 to not use cudnn (untested)\n    checkpoints_dir = './checkpoints', -- loads models from here\n    results_dir='./results/',          -- saves results here\n    which_epoch = 'latest',            -- which epoch to test? 
set to 'latest' to use latest cached model\n}\n\n\n-- one-line argument parser. parses environment variables to override the defaults\nfor k,v in pairs(opt) do opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k] end\nopt.nThreads = 1 -- test only works with 1 thread...\nprint(opt)\nif opt.display == 0 then opt.display = false end\n\nopt.manualSeed = torch.random(1, 10000) -- set seed\nprint(\"Random Seed: \" .. opt.manualSeed)\ntorch.manualSeed(opt.manualSeed)\ntorch.setdefaulttensortype('torch.FloatTensor')\n\nopt.netG_name = opt.name .. '/' .. opt.which_epoch .. '_net_G'\n\nlocal data_loader = paths.dofile('data/data.lua')\nprint('#threads...' .. opt.nThreads)\nlocal data = data_loader.new(opt.nThreads, opt)\nprint(\"Dataset Size: \", data:size())\n\n-- translation direction\nlocal idx_A = nil\nlocal idx_B = nil\nlocal input_nc = opt.input_nc\nlocal output_nc = opt.output_nc\nif opt.which_direction=='AtoB' then\n  idx_A = {1, input_nc}\n  idx_B = {input_nc+1, input_nc+output_nc}\nelseif opt.which_direction=='BtoA' then\n  idx_A = {input_nc+1, input_nc+output_nc}\n  idx_B = {1, input_nc}\nelse\n  error(string.format('bad direction %s',opt.which_direction))\nend\n----------------------------------------------------------------------------\n\nlocal input = torch.FloatTensor(opt.batchSize,3,opt.fineSize,opt.fineSize)\nlocal target = torch.FloatTensor(opt.batchSize,3,opt.fineSize,opt.fineSize)\n\nprint('checkpoints_dir', opt.checkpoints_dir)\nlocal netG = util.load(paths.concat(opt.checkpoints_dir, opt.netG_name .. '.t7'), opt)\n--netG:evaluate()\n\nprint(netG)\n\n\nfunction TableConcat(t1,t2)\n    for i=1,#t2 do\n        t1[#t1+1] = t2[i]\n    end\n    return t1\nend\n\nif opt.how_many=='all' then\n    opt.how_many=data:size()\nend\nopt.how_many=math.min(opt.how_many, data:size())\n\nlocal filepaths = {} -- paths to images tested on\nfor n=1,math.floor(opt.how_many/opt.batchSize) do\n    print('processing batch ' .. 
n)\n\n    local data_curr, filepaths_curr = data:getBatch()\n    filepaths_curr = util.basename_batch(filepaths_curr)\n    print('filepaths_curr: ', filepaths_curr)\n\n    input = data_curr[{ {}, idx_A, {}, {} }]\n    target = data_curr[{ {}, idx_B, {}, {} }]\n\n    if opt.gpu > 0 then\n        input = input:cuda()\n    end\n\n    if opt.preprocess == 'colorization' then\n       local output_AB = netG:forward(input):float()\n       local input_L = input:float()\n       output = util.deprocessLAB_batch(input_L, output_AB)\n       local target_AB = target:float()\n       target = util.deprocessLAB_batch(input_L, target_AB)\n       input = util.deprocessL_batch(input_L)\n    else\n        output = util.deprocess_batch(netG:forward(input))\n        input = util.deprocess_batch(input):float()\n        output = output:float()\n        target = util.deprocess_batch(target):float()\n    end\n    paths.mkdir(paths.concat(opt.results_dir, opt.netG_name .. '_' .. opt.phase))\n    local image_dir = paths.concat(opt.results_dir, opt.netG_name .. '_' .. 
opt.phase, 'images')\n    paths.mkdir(image_dir)\n    paths.mkdir(paths.concat(image_dir,'input'))\n    paths.mkdir(paths.concat(image_dir,'output'))\n    paths.mkdir(paths.concat(image_dir,'target'))\n    for i=1, opt.batchSize do\n        image.save(paths.concat(image_dir,'input',filepaths_curr[i]), image.scale(input[i],input[i]:size(2),input[i]:size(3)/opt.aspect_ratio))\n        image.save(paths.concat(image_dir,'output',filepaths_curr[i]), image.scale(output[i],output[i]:size(2),output[i]:size(3)/opt.aspect_ratio))\n        image.save(paths.concat(image_dir,'target',filepaths_curr[i]), image.scale(target[i],target[i]:size(2),target[i]:size(3)/opt.aspect_ratio))\n    end\n    print('Saved images to: ', image_dir)\n\n    if opt.display then\n      if opt.preprocess == 'regular' then\n        disp = require 'display'\n        disp.image(util.scaleBatch(input,100,100),{win=opt.display_id, title='input'})\n        disp.image(util.scaleBatch(output,100,100),{win=opt.display_id+1, title='output'})\n        disp.image(util.scaleBatch(target,100,100),{win=opt.display_id+2, title='target'})\n\n        print('Displayed images')\n      end\n    end\n\n    filepaths = TableConcat(filepaths, filepaths_curr)\nend\n\n-- make webpage\nio.output(paths.concat(opt.results_dir,opt.netG_name .. '_' .. opt.phase, 'index.html'))\n\nio.write('<table style=\"text-align:center;\">')\n\nio.write('<tr><td>Image #</td><td>Input</td><td>Output</td><td>Ground Truth</td></tr>')\nfor i=1, #filepaths do\n    io.write('<tr>')\n    io.write('<td>' .. filepaths[i] .. '</td>')\n    io.write('<td><img src=\"./images/input/' .. filepaths[i] .. '\"/></td>')\n    io.write('<td><img src=\"./images/output/' .. filepaths[i] .. '\"/></td>')\n    io.write('<td><img src=\"./images/target/' .. filepaths[i] .. '\"/></td>')\n    io.write('</tr>')\nend\n\nio.write('</table>')\n"
  },
  {
    "path": "train.lua",
    "content": "-- usage example: DATA_ROOT=/path/to/data/ which_direction=BtoA name=expt1 th train.lua\n--\n-- code derived from https://github.com/soumith/dcgan.torch\n--\n\nrequire 'torch'\nrequire 'nn'\nrequire 'optim'\nutil = paths.dofile('util/util.lua')\nrequire 'image'\nrequire 'models'\n\n\nopt = {\n   DATA_ROOT = '',         -- path to images (should have subfolders 'train', 'val', etc)\n   batchSize = 1,          -- # images in batch\n   loadSize = 286,         -- scale images to this size\n   fineSize = 256,         --  then crop to this size\n   ngf = 64,               -- #  of gen filters in first conv layer\n   ndf = 64,               -- #  of discrim filters in first conv layer\n   input_nc = 3,           -- #  of input image channels\n   output_nc = 3,          -- #  of output image channels\n   niter = 200,            -- #  of iter at starting learning rate\n   lr = 0.0002,            -- initial learning rate for adam\n   beta1 = 0.5,            -- momentum term of adam\n   ntrain = math.huge,     -- #  of examples per epoch. math.huge for full dataset\n   flip = 1,               -- whether to flip the images for data augmentation\n   display = 1,            -- display samples while training. 0 = false\n   display_id = 10,        -- display window id.\n   display_plot = 'errL1',    -- which loss values to plot over time. Accepted values include a comma-separated list of: errL1, errG, and errD\n   gpu = 1,                -- gpu = 0 is CPU mode. 
gpu=X is GPU mode on GPU X\n   name = '',              -- name of the experiment, should generally be passed on the command line\n   which_direction = 'AtoB',    -- AtoB or BtoA\n   phase = 'train',             -- train, val, test, etc\n   preprocess = 'regular',      -- for special purpose preprocessing, e.g., for colorization, change this (selects preprocessing functions in util.lua)\n   nThreads = 2,                -- # threads for loading data\n   save_epoch_freq = 50,        -- save a model every save_epoch_freq epochs (does not overwrite previously saved models)\n   save_latest_freq = 5000,     -- save the latest model every latest_freq sgd iterations (overwrites the previous latest model)\n   print_freq = 50,             -- print the debug information every print_freq iterations\n   display_freq = 100,          -- display the current results every display_freq iterations\n   save_display_freq = 5000,    -- save the current display of results every save_display_freq_iterations\n   continue_train=0,            -- if continue training, load the latest model: 1: true, 0: false\n   serial_batches = 0,          -- if 1, takes images in order to make batches, otherwise takes them randomly\n   serial_batch_iter = 1,       -- iter into serial image list\n   checkpoints_dir = './checkpoints', -- models are saved here\n   cudnn = 1,                         -- set to 0 to not use cudnn\n   condition_GAN = 1,                 -- set to 0 to use unconditional discriminator\n   use_GAN = 1,                       -- set to 0 to turn off GAN term\n   use_L1 = 1,                        -- set to 0 to turn off L1 term\n   which_model_netD = 'basic', -- selects model to use for netD\n   which_model_netG = 'unet',  -- selects model to use for netG\n   n_layers_D = 0,             -- only used if which_model_netD=='n_layers'\n   lambda = 100,               -- weight on L1 term in objective\n}\n\n-- one-line argument parser. 
parses environment variables to override the defaults\nfor k,v in pairs(opt) do opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k] end\nprint(opt)\n\nlocal input_nc = opt.input_nc\nlocal output_nc = opt.output_nc\n-- translation direction\nlocal idx_A = nil\nlocal idx_B = nil\n\nif opt.which_direction=='AtoB' then\n    idx_A = {1, input_nc}\n    idx_B = {input_nc+1, input_nc+output_nc}\nelseif opt.which_direction=='BtoA' then\n    idx_A = {input_nc+1, input_nc+output_nc}\n    idx_B = {1, input_nc}\nelse\n    error(string.format('bad direction %s',opt.which_direction))\nend\n\nif opt.display == 0 then opt.display = false end\n\nopt.manualSeed = torch.random(1, 10000) -- fix seed\nprint(\"Random Seed: \" .. opt.manualSeed)\ntorch.manualSeed(opt.manualSeed)\ntorch.setdefaulttensortype('torch.FloatTensor')\n\n-- create data loader\nlocal data_loader = paths.dofile('data/data.lua')\nprint('#threads...' .. opt.nThreads)\nlocal data = data_loader.new(opt.nThreads, opt)\nprint(\"Dataset Size: \", data:size())\n\n----------------------------------------------------------------------------\nlocal function weights_init(m)\n   local name = torch.type(m)\n   if name:find('Convolution') then\n      m.weight:normal(0.0, 0.02)\n      m.bias:fill(0)\n   elseif name:find('BatchNormalization') then\n      if m.weight then m.weight:normal(1.0, 0.02) end\n      if m.bias then m.bias:fill(0) end\n   end\nend\n\n\nlocal ndf = opt.ndf\nlocal ngf = opt.ngf\nlocal real_label = 1\nlocal fake_label = 0\n\nfunction defineG(input_nc, output_nc, ngf)\n    local netG = nil\n    if     opt.which_model_netG == \"encoder_decoder\" then netG = defineG_encoder_decoder(input_nc, output_nc, ngf)\n    elseif opt.which_model_netG == \"unet\" then netG = defineG_unet(input_nc, output_nc, ngf)\n    elseif opt.which_model_netG == \"unet_128\" then netG = defineG_unet_128(input_nc, output_nc, ngf)\n    else error(\"unsupported netG model\")\n    end\n   \n    netG:apply(weights_init)\n  \n    return 
netG\nend\n\nfunction defineD(input_nc, output_nc, ndf)\n    local netD = nil\n    if opt.condition_GAN==1 then\n        input_nc_tmp = input_nc\n    else\n        input_nc_tmp = 0 -- only penalizes structure in output channels\n    end\n    \n    if     opt.which_model_netD == \"basic\" then netD = defineD_basic(input_nc_tmp, output_nc, ndf)\n    elseif opt.which_model_netD == \"n_layers\" then netD = defineD_n_layers(input_nc_tmp, output_nc, ndf, opt.n_layers_D)\n    else error(\"unsupported netD model\")\n    end\n    \n    netD:apply(weights_init)\n    \n    return netD\nend\n\n\n-- load saved models and finetune\nif opt.continue_train == 1 then\n   print('loading previously trained netG...')\n   netG = util.load(paths.concat(opt.checkpoints_dir, opt.name, 'latest_net_G.t7'), opt)\n   print('loading previously trained netD...')\n   netD = util.load(paths.concat(opt.checkpoints_dir, opt.name, 'latest_net_D.t7'), opt)\nelse\n  print('define model netG...')\n  netG = defineG(input_nc, output_nc, ngf)\n  print('define model netD...')\n  netD = defineD(input_nc, output_nc, ndf)\nend\n\nprint(netG)\nprint(netD)\n\n\nlocal criterion = nn.BCECriterion()\nlocal criterionAE = nn.AbsCriterion()\n---------------------------------------------------------------------------\noptimStateG = {\n   learningRate = opt.lr,\n   beta1 = opt.beta1,\n}\noptimStateD = {\n   learningRate = opt.lr,\n   beta1 = opt.beta1,\n}\n----------------------------------------------------------------------------\nlocal real_A = torch.Tensor(opt.batchSize, input_nc, opt.fineSize, opt.fineSize)\nlocal real_B = torch.Tensor(opt.batchSize, output_nc, opt.fineSize, opt.fineSize)\nlocal fake_B = torch.Tensor(opt.batchSize, output_nc, opt.fineSize, opt.fineSize)\nlocal real_AB = torch.Tensor(opt.batchSize, output_nc + input_nc*opt.condition_GAN, opt.fineSize, opt.fineSize)\nlocal fake_AB = torch.Tensor(opt.batchSize, output_nc + input_nc*opt.condition_GAN, opt.fineSize, opt.fineSize)\nlocal errD, errG, 
errL1 = 0, 0, 0\nlocal epoch_tm = torch.Timer()\nlocal tm = torch.Timer()\nlocal data_tm = torch.Timer()\n----------------------------------------------------------------------------\n\nif opt.gpu > 0 then\n   print('transferring to gpu...')\n   require 'cunn'\n   cutorch.setDevice(opt.gpu)\n   real_A = real_A:cuda();\n   real_B = real_B:cuda(); fake_B = fake_B:cuda();\n   real_AB = real_AB:cuda(); fake_AB = fake_AB:cuda();\n   if opt.cudnn==1 then\n      netG = util.cudnn(netG); netD = util.cudnn(netD);\n   end\n   netD:cuda(); netG:cuda(); criterion:cuda(); criterionAE:cuda();\n   print('done')\nelse\n\tprint('running model on CPU')\nend\n\n\nlocal parametersD, gradParametersD = netD:getParameters()\nlocal parametersG, gradParametersG = netG:getParameters()\n\n\n\nif opt.display then disp = require 'display' end\n\n\nfunction createRealFake()\n    -- load real\n    data_tm:reset(); data_tm:resume()\n    local real_data, data_path = data:getBatch()\n    data_tm:stop()\n    \n    real_A:copy(real_data[{ {}, idx_A, {}, {} }])\n    real_B:copy(real_data[{ {}, idx_B, {}, {} }])\n    \n    if opt.condition_GAN==1 then\n        real_AB = torch.cat(real_A,real_B,2)\n    else\n        real_AB = real_B -- unconditional GAN, only penalizes structure in B\n    end\n    \n    -- create fake\n    fake_B = netG:forward(real_A)\n    \n    if opt.condition_GAN==1 then\n        fake_AB = torch.cat(real_A,fake_B,2)\n    else\n        fake_AB = fake_B -- unconditional GAN, only penalizes structure in B\n    end\nend\n\n-- create closure to evaluate f(X) and df/dX of discriminator\nlocal fDx = function(x)\n    netD:apply(function(m) if torch.type(m):find('Convolution') then m.bias:zero() end end)\n    netG:apply(function(m) if torch.type(m):find('Convolution') then m.bias:zero() end end)\n    \n    gradParametersD:zero()\n    \n    -- Real\n    local output = netD:forward(real_AB)\n    local label = torch.FloatTensor(output:size()):fill(real_label)\n    if opt.gpu>0 then \n    
\tlabel = label:cuda()\n    end\n    \n    local errD_real = criterion:forward(output, label)\n    local df_do = criterion:backward(output, label)\n    netD:backward(real_AB, df_do)\n    \n    -- Fake\n    local output = netD:forward(fake_AB)\n    label:fill(fake_label)\n    local errD_fake = criterion:forward(output, label)\n    local df_do = criterion:backward(output, label)\n    netD:backward(fake_AB, df_do)\n    \n    errD = (errD_real + errD_fake)/2\n    \n    return errD, gradParametersD\nend\n\n-- create closure to evaluate f(X) and df/dX of generator\nlocal fGx = function(x)\n    netD:apply(function(m) if torch.type(m):find('Convolution') then m.bias:zero() end end)\n    netG:apply(function(m) if torch.type(m):find('Convolution') then m.bias:zero() end end)\n    \n    gradParametersG:zero()\n    \n    -- GAN loss\n    local df_dg = torch.zeros(fake_B:size())\n    if opt.gpu>0 then \n    \tdf_dg = df_dg:cuda();\n    end\n    \n    if opt.use_GAN==1 then\n       local output = netD.output -- netD:forward{input_A,input_B} was already executed in fDx, so save computation\n       local label = torch.FloatTensor(output:size()):fill(real_label) -- fake labels are real for generator cost\n       if opt.gpu>0 then \n       \tlabel = label:cuda();\n       \tend\n       errG = criterion:forward(output, label)\n       local df_do = criterion:backward(output, label)\n       df_dg = netD:updateGradInput(fake_AB, df_do):narrow(2,fake_AB:size(2)-output_nc+1, output_nc)\n    else\n        errG = 0\n    end\n    \n    -- unary loss\n    local df_do_AE = torch.zeros(fake_B:size())\n    if opt.gpu>0 then \n    \tdf_do_AE = df_do_AE:cuda();\n    end\n    if opt.use_L1==1 then\n       errL1 = criterionAE:forward(fake_B, real_B)\n       df_do_AE = criterionAE:backward(fake_B, real_B)\n    else\n        errL1 = 0\n    end\n    \n    netG:backward(real_A, df_dg + df_do_AE:mul(opt.lambda))\n    \n    return errG, gradParametersG\nend\n\n\n\n\n-- train\nlocal best_err = 
nil\npaths.mkdir(opt.checkpoints_dir)\npaths.mkdir(opt.checkpoints_dir .. '/' .. opt.name)\n\n-- save opt\nfile = torch.DiskFile(paths.concat(opt.checkpoints_dir, opt.name, 'opt.txt'), 'w')\nfile:writeObject(opt)\nfile:close()\n\n-- parse display_plot string into table\nopt.display_plot = string.split(string.gsub(opt.display_plot, \"%s+\", \"\"), \",\")\nfor k, v in ipairs(opt.display_plot) do\n    if not util.containsValue({\"errG\", \"errD\", \"errL1\"}, v) then \n        error(string.format('bad display_plot value \"%s\"', v)) \n    end\nend\n\n-- display plot config\nlocal plot_config = {\n  title = \"Loss over time\",\n  labels = {\"epoch\", unpack(opt.display_plot)},\n  ylabel = \"loss\",\n}\n\n-- display plot vars\nlocal plot_data = {}\nlocal plot_win\n\nlocal counter = 0\nfor epoch = 1, opt.niter do\n    epoch_tm:reset()\n    for i = 1, math.min(data:size(), opt.ntrain), opt.batchSize do\n        tm:reset()\n        \n        -- load a batch and run G on that batch\n        createRealFake()\n        \n        -- (1) Update D network: maximize log(D(x,y)) + log(1 - D(x,G(x)))\n        if opt.use_GAN==1 then optim.adam(fDx, parametersD, optimStateD) end\n        \n        -- (2) Update G network: minimize -log(D(x,G(x))) + lambda * L1(y,G(x))\n        optim.adam(fGx, parametersG, optimStateG)\n\n        -- display\n        counter = counter + 1\n        if counter % opt.display_freq == 0 and opt.display then\n            createRealFake()\n            if opt.preprocess == 'colorization' then \n                local real_A_s = util.scaleBatch(real_A:float(),100,100)\n                local fake_B_s = util.scaleBatch(fake_B:float(),100,100)\n                local real_B_s = util.scaleBatch(real_B:float(),100,100)\n                disp.image(util.deprocessL_batch(real_A_s), {win=opt.display_id, title=opt.name .. ' input'})\n                disp.image(util.deprocessLAB_batch(real_A_s, fake_B_s), {win=opt.display_id+1, title=opt.name .. 
' output'})\n                disp.image(util.deprocessLAB_batch(real_A_s, real_B_s), {win=opt.display_id+2, title=opt.name .. ' target'})\n            else\n                disp.image(util.deprocess_batch(util.scaleBatch(real_A:float(),100,100)), {win=opt.display_id, title=opt.name .. ' input'})\n                disp.image(util.deprocess_batch(util.scaleBatch(fake_B:float(),100,100)), {win=opt.display_id+1, title=opt.name .. ' output'})\n                disp.image(util.deprocess_batch(util.scaleBatch(real_B:float(),100,100)), {win=opt.display_id+2, title=opt.name .. ' target'})\n            end\n        end\n      \n        -- write display visualization to disk\n        --  runs on the first batchSize images in the opt.phase set\n        if counter % opt.save_display_freq == 0 and opt.display then\n            local serial_batches=opt.serial_batches\n            opt.serial_batches=1\n            opt.serial_batch_iter=1\n            \n            local image_out = nil\n            local N_save_display = 10 \n            local N_save_iter = torch.max(torch.Tensor({1, torch.floor(N_save_display/opt.batchSize)}))\n            for i3=1, N_save_iter do\n            \n                createRealFake()\n                print('save to the disk')\n                if opt.preprocess == 'colorization' then \n                    for i2=1, fake_B:size(1) do\n                        if image_out==nil then image_out = torch.cat(util.deprocessL(real_A[i2]:float()),util.deprocessLAB(real_A[i2]:float(), fake_B[i2]:float()),3)/255.0\n                        else image_out = torch.cat(image_out, torch.cat(util.deprocessL(real_A[i2]:float()),util.deprocessLAB(real_A[i2]:float(), fake_B[i2]:float()),3)/255.0, 2) end\n                    end\n                else\n                    for i2=1, fake_B:size(1) do\n                        if image_out==nil then image_out = torch.cat(util.deprocess(real_A[i2]:float()),util.deprocess(fake_B[i2]:float()),3)\n                        else 
image_out = torch.cat(image_out, torch.cat(util.deprocess(real_A[i2]:float()),util.deprocess(fake_B[i2]:float()),3), 2) end\n                    end\n                end\n            end\n            image.save(paths.concat(opt.checkpoints_dir,  opt.name , counter .. '_train_res.png'), image_out)\n            \n            opt.serial_batches=serial_batches\n        end\n        \n        -- logging and display plot\n        if counter % opt.print_freq == 0 then\n            local loss = {errG=errG and errG or -1, errD=errD and errD or -1, errL1=errL1 and errL1 or -1}\n            local curItInBatch = ((i-1) / opt.batchSize)\n            local totalItInBatch = math.floor(math.min(data:size(), opt.ntrain) / opt.batchSize)\n            print(('Epoch: [%d][%8d / %8d]\\t Time: %.3f  DataTime: %.3f  '\n                    .. '  Err_G: %.4f  Err_D: %.4f  ErrL1: %.4f'):format(\n                     epoch, curItInBatch, totalItInBatch,\n                     tm:time().real / opt.batchSize, data_tm:time().real / opt.batchSize,\n                     errG, errD, errL1))\n           \n            local plot_vals = { epoch + curItInBatch / totalItInBatch }\n            for k, v in ipairs(opt.display_plot) do\n              if loss[v] ~= nil then\n               plot_vals[#plot_vals + 1] = loss[v] \n             end\n            end\n\n            -- update display plot\n            if opt.display then\n              table.insert(plot_data, plot_vals)\n              plot_config.win = plot_win\n              plot_win = disp.plot(plot_data, plot_config)\n            end\n        end\n        \n        -- save latest model\n        if counter % opt.save_latest_freq == 0 then\n            print(('saving the latest model (epoch %d, iters %d)'):format(epoch, counter))\n            torch.save(paths.concat(opt.checkpoints_dir, opt.name, 'latest_net_G.t7'), netG:clearState())\n            torch.save(paths.concat(opt.checkpoints_dir, opt.name, 'latest_net_D.t7'), netD:clearState())\n        
end\n        \n    end\n    \n    \n    parametersD, gradParametersD = nil, nil -- nil them to avoid spiking memory\n    parametersG, gradParametersG = nil, nil\n    \n    if epoch % opt.save_epoch_freq == 0 then\n        torch.save(paths.concat(opt.checkpoints_dir, opt.name,  epoch .. '_net_G.t7'), netG:clearState())\n        torch.save(paths.concat(opt.checkpoints_dir, opt.name, epoch .. '_net_D.t7'), netD:clearState())\n    end\n    \n    print(('End of epoch %d / %d \\t Time Taken: %.3f'):format(\n            epoch, opt.niter, epoch_tm:time().real))\n    parametersD, gradParametersD = netD:getParameters() -- reflatten the params and get them\n    parametersG, gradParametersG = netG:getParameters()\nend\n"
  },
  {
    "path": "util/cudnn_convert_custom.lua",
    "content": "-- modified from https://github.com/NVIDIA/torch-cudnn/blob/master/convert.lua\n-- removed error on nngraph\n\n-- modules that can be converted to nn seamlessly\nlocal layer_list = {\n  'BatchNormalization',\n  'SpatialBatchNormalization',\n  'SpatialConvolution',\n  'SpatialCrossMapLRN',\n  'SpatialFullConvolution',\n  'SpatialMaxPooling',\n  'SpatialAveragePooling',\n  'ReLU',\n  'Tanh',\n  'Sigmoid',\n  'SoftMax',\n  'LogSoftMax',\n  'VolumetricBatchNormalization',\n  'VolumetricConvolution',\n  'VolumetricFullConvolution',\n  'VolumetricMaxPooling',\n  'VolumetricAveragePooling',\n}\n\n-- goes over a given net and converts all layers to dst backend\n-- for example: net = cudnn_convert_custom(net, cudnn)\n-- same as cudnn.convert with gModule check commented out\nfunction cudnn_convert_custom(net, dst, exclusion_fn)\n  return net:replace(function(x)\n    --if torch.type(x) == 'nn.gModule' then\n    --  io.stderr:write('Warning: cudnn.convert does not work with nngraph yet. Ignoring nn.gModule')\n    --  return x\n    --end\n    local y = 0\n    local src = dst == nn and cudnn or nn\n    local src_prefix = src == nn and 'nn.' or 'cudnn.'\n    local dst_prefix = dst == nn and 'nn.' 
or 'cudnn.'\n\n    local function convert(v)\n      local y = {}\n      torch.setmetatable(y, dst_prefix..v)\n      if v == 'ReLU' then y = dst.ReLU() end -- because parameters\n      for k,u in pairs(x) do y[k] = u end\n      if src == cudnn and x.clearDesc then x.clearDesc(y) end\n      if src == cudnn and v == 'SpatialAveragePooling' then\n        y.divide = true\n        y.count_include_pad = v.mode == 'CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING'\n      end\n      if src == nn and string.find(v, 'Convolution') then\n         y.groups = 1\n      end\n      return y\n    end\n\n    if exclusion_fn and exclusion_fn(x) then\n      return x\n    end\n    local t = torch.typename(x)\n    if t == 'nn.SpatialConvolutionMM' then\n      y = convert('SpatialConvolution')\n    elseif t == 'inn.SpatialCrossResponseNormalization' then\n      y = convert('SpatialCrossMapLRN')\n    else\n      for i,v in ipairs(layer_list) do\n        if torch.typename(x) == src_prefix..v then\n          y = convert(v)\n        end\n      end\n    end\n    return y == 0 and x or y\n  end)\nend\n"
  },
  {
    "path": "util/util.lua",
    "content": "--\n-- code derived from https://github.com/soumith/dcgan.torch\n--\n\nlocal util = {}\n\nrequire 'torch'\n\nfunction util.normalize(img)\n  -- rescale image to 0 .. 1\n  local min = img:min()\n  local max = img:max()\n  \n  img = torch.FloatTensor(img:size()):copy(img)\n  img:add(-min):mul(1/(max-min))\n  return img\nend\n\nfunction util.normalizeBatch(batch)\n  for i = 1, batch:size(1) do\n    batch[i] = util.normalize(batch[i]:squeeze())\n  end\n  return batch\nend\n\nfunction util.basename_batch(batch)\n  for i = 1, #batch do\n    batch[i] = paths.basename(batch[i])\n  end\n  return batch\nend\n\n\n\n-- default preprocessing\n--\n-- Preprocesses an image before passing it to a net\n-- Converts from RGB to BGR and rescales from [0,1] to [-1,1]\nfunction util.preprocess(img)\n    -- RGB to BGR\n    local perm = torch.LongTensor{3, 2, 1}\n    img = img:index(1, perm)\n    \n    -- [0,1] to [-1,1]\n    img = img:mul(2):add(-1)\n    \n    -- check that input is in expected range\n    assert(img:max()<=1,\"badly scaled inputs\")\n    assert(img:min()>=-1,\"badly scaled inputs\")\n    \n    return img\nend\n\n-- Undo the above preprocessing.\nfunction util.deprocess(img)\n    -- BGR to RGB\n    local perm = torch.LongTensor{3, 2, 1}\n    img = img:index(1, perm)\n    \n    -- [-1,1] to [0,1]\n    \n    img = img:add(1):div(2)\n    \n    return img\nend\n\nfunction util.preprocess_batch(batch)\n  for i = 1, batch:size(1) do\n    batch[i] = util.preprocess(batch[i]:squeeze())\n  end\n  return batch\nend\n\nfunction util.deprocess_batch(batch)\n  for i = 1, batch:size(1) do\n   batch[i] = util.deprocess(batch[i]:squeeze())\n  end\nreturn batch\nend\n\n\n\n-- preprocessing specific to colorization\n\nfunction util.deprocessLAB(L, AB)\n    local L2 = torch.Tensor(L:size()):copy(L)\n    if L2:dim() == 3 then\n      L2 = L2[{1, {}, {} }]\n    end\n    local AB2 = torch.Tensor(AB:size()):copy(AB)\n    AB2 = torch.clamp(AB2, -1.0, 1.0)\n--    local AB2 = AB\n   
 L2 = L2:add(1):mul(50.0)\n    AB2 = AB2:mul(110.0)\n    \n    L2 = L2:reshape(1, L2:size(1), L2:size(2))\n    \n    im_lab = torch.cat(L2, AB2, 1)\n    im_rgb = torch.clamp(image.lab2rgb(im_lab):mul(255.0), 0.0, 255.0)/255.0\n    \n    return im_rgb\nend\n\nfunction util.deprocessL(L)\n    local L2 = torch.Tensor(L:size()):copy(L)\n    L2 = L2:add(1):mul(255.0/2.0)\n    \n    if L2:dim()==2 then\n      L2 = L2:reshape(1,L2:size(1),L2:size(2))\n    end\n    L2 = L2:repeatTensor(L2,3,1,1)/255.0\n    \n    return L2\nend\n\nfunction util.deprocessL_batch(batch)\n  local batch_new = {}\n  for i = 1, batch:size(1) do\n    batch_new[i] = util.deprocessL(batch[i]:squeeze())\n  end\n  return batch_new\nend\n\nfunction util.deprocessLAB_batch(batchL, batchAB)\n  local batch = {}\n  \n  for i = 1, batchL:size(1) do\n    batch[i] = util.deprocessLAB(batchL[i]:squeeze(), batchAB[i]:squeeze())\n  end\n  \n  return batch\nend\n\n\nfunction util.scaleBatch(batch,s1,s2)\n  local scaled_batch = torch.Tensor(batch:size(1),batch:size(2),s1,s2)\n  for i = 1, batch:size(1) do\n   scaled_batch[i] = image.scale(batch[i],s1,s2):squeeze()\n  end\n  return scaled_batch\nend\n\n\n\nfunction util.toTrivialBatch(input)\n    return input:reshape(1,input:size(1),input:size(2),input:size(3))\nend\nfunction util.fromTrivialBatch(input)\n    return input[1]\nend\n\n\n\nfunction util.scaleImage(input, loadSize)\n    -- replicate bw images to 3 channels\n    if input:size(1)==1 then\n      input = torch.repeatTensor(input,3,1,1)\n    end\n    \n    input = image.scale(input, loadSize, loadSize)\n    \n    return input\nend\n\nfunction util.getAspectRatio(path)\n  local input = image.load(path, 3, 'float')\n  local ar = input:size(3)/input:size(2)\n  return ar\nend\n\nfunction util.loadImage(path, loadSize, nc)\n    local input = image.load(path, 3, 'float')\n    input= util.preprocess(util.scaleImage(input, loadSize))\n    \n    if nc == 1 then\n        input = input[{{1}, {}, {}}]\n    end\n    \n  
  return input \nend\n\n\n\n-- TO DO: loading code is rather hacky; clean it up and make sure it works on all types of nets / cpu/gpu configurations\nfunction util.load(filename, opt)\n  if opt.cudnn>0 then\n    require 'cudnn'\n  end\n  \n  if opt.gpu > 0 then \n    require 'cunn'\n  end\n  \n  local net = torch.load(filename)\n\n  if opt.gpu > 0 then\n  \tnet:cuda()\n\n    -- calling cuda on cudnn saved nngraphs doesn't change all variables to cuda, so do it below\n    if net.forwardnodes then\n      for i=1,#net.forwardnodes do\n          if net.forwardnodes[i].data.module then\n            net.forwardnodes[i].data.module:cuda()\n          end\n      end\n    end\n  else\n    net:float()\n  end\n  net:apply(function(m) if m.weight then \n  m.gradWeight = m.weight:clone():zero(); \n  m.gradBias = m.bias:clone():zero(); end end)\n  return net\nend\n\nfunction util.cudnn(net)\n  require 'cudnn'\n  require 'util/cudnn_convert_custom'\n  return cudnn_convert_custom(net, cudnn)\nend\n\nfunction util.containsValue(table, value)\n  for k, v in pairs(table) do \n    if v == value then return true end\n  end\n  return false\nend\n\nreturn util\n"
  }
]