[
  {
    "path": ".gitignore",
    "content": "debug*\ncheckpoints/\nresults/\nbuild/\ndist/\ntorch.egg-info/\n*/**/__pycache__\ntorch/version.py\ntorch/csrc/generic/TensorMethods.cpp\ntorch/lib/*.so*\ntorch/lib/*.dylib*\ntorch/lib/*.h\ntorch/lib/build\ntorch/lib/tmp_install\ntorch/lib/include\ntorch/lib/torch_shm_manager\ntorch/csrc/cudnn/cuDNN.cpp\ntorch/csrc/nn/THNN.cwrap\ntorch/csrc/nn/THNN.cpp\ntorch/csrc/nn/THCUNN.cwrap\ntorch/csrc/nn/THCUNN.cpp\ntorch/csrc/nn/THNN_generic.cwrap\ntorch/csrc/nn/THNN_generic.cpp\ntorch/csrc/nn/THNN_generic.h\ndocs/src/**/*\ntest/data/legacy_modules.t7\ntest/data/gpu_tensors.pt\ntest/htmlcov\ntest/.coverage\n*/*.pyc\n*/**/*.pyc\n*/**/**/*.pyc\n*/**/**/**/*.pyc\n*/**/**/**/**/*.pyc\n*/*.so*\n*/**/*.so*\n*/**/*.dylib*\ntest/data/legacy_serialized.pt\n*.DS_Store\n*~\n"
  },
  {
    "path": "LICENSE.txt",
    "content": "Copyright (C) 2019 NVIDIA Corporation. Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu.\nBSD License. All rights reserved. \n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\nTHE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL \nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE. \nIN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL \nDAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, \nWHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING \nOUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n\n\n--------------------------- LICENSE FOR pytorch-CycleGAN-and-pix2pix ----------------\nCopyright (c) 2017, Jun-Yan Zhu and Taesung Park\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 
ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "README.md",
    "content": "<img src='imgs/teaser_720.gif' align=\"right\" width=360>\r\n\r\n<br><br><br><br>\r\n\r\n# pix2pixHD\r\n### [Project](https://tcwang0509.github.io/pix2pixHD/) | [Youtube](https://youtu.be/3AIpPlzM_qs) | [Paper](https://arxiv.org/pdf/1711.11585.pdf) <br>\r\nPytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used for turning semantic label maps into photo-realistic images or synthesizing portraits from face label maps. <br><br>\r\n[High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs](https://tcwang0509.github.io/pix2pixHD/)  \r\n [Ting-Chun Wang](https://tcwang0509.github.io/)<sup>1</sup>, [Ming-Yu Liu](http://mingyuliu.net/)<sup>1</sup>, [Jun-Yan Zhu](http://people.eecs.berkeley.edu/~junyanz/)<sup>2</sup>, Andrew Tao<sup>1</sup>, [Jan Kautz](http://jankautz.com/)<sup>1</sup>, [Bryan Catanzaro](http://catanzaro.name/)<sup>1</sup>  \r\n <sup>1</sup>NVIDIA Corporation, <sup>2</sup>UC Berkeley  \r\n In CVPR 2018.  
\r\n\r\n## Image-to-image translation at 2k/1k resolution\r\n- Our label-to-streetview results\r\n<p align='center'>  \r\n  <img src='imgs/teaser_label.png' width='400'/>\r\n  <img src='imgs/teaser_ours.jpg' width='400'/>\r\n</p>\r\n- Interactive editing results\r\n<p align='center'>  \r\n  <img src='imgs/teaser_style.gif' width='400'/>\r\n  <img src='imgs/teaser_label.gif' width='400'/>\r\n</p>\r\n- Additional streetview results\r\n<p align='center'>\r\n  <img src='imgs/cityscapes_1.jpg' width='400'/>\r\n  <img src='imgs/cityscapes_2.jpg' width='400'/>\r\n</p>\r\n<p align='center'>\r\n  <img src='imgs/cityscapes_3.jpg' width='400'/>\r\n  <img src='imgs/cityscapes_4.jpg' width='400'/>\r\n</p>\r\n\r\n- Label-to-face and interactive editing results\r\n<p align='center'>\r\n  <img src='imgs/face1_1.jpg' width='250'/>\r\n  <img src='imgs/face1_2.jpg' width='250'/>\r\n  <img src='imgs/face1_3.jpg' width='250'/>\r\n</p>\r\n<p align='center'>\r\n  <img src='imgs/face2_1.jpg' width='250'/>\r\n  <img src='imgs/face2_2.jpg' width='250'/>\r\n  <img src='imgs/face2_3.jpg' width='250'/>\r\n</p>\r\n\r\n- Our editing interface\r\n<p align='center'>\r\n  <img src='imgs/city_short.gif' width='330'/>\r\n  <img src='imgs/face_short.gif' width='450'/>\r\n</p>\r\n\r\n## Prerequisites\r\n- Linux or macOS\r\n- Python 2 or 3\r\n- NVIDIA GPU (11G memory or larger) + CUDA cuDNN\r\n\r\n## Getting Started\r\n### Installation\r\n- Install PyTorch and dependencies from http://pytorch.org\r\n- Install python libraries [dominate](https://github.com/Knio/dominate).\r\n```bash\r\npip install dominate\r\n```\r\n- Clone this repo:\r\n```bash\r\ngit clone https://github.com/NVIDIA/pix2pixHD\r\ncd pix2pixHD\r\n```\r\n\r\n\r\n### Testing\r\n- A few example Cityscapes test images are included in the `datasets` folder.\r\n- Please download the pre-trained Cityscapes model from [here](https://drive.google.com/file/d/1OR-2aEPHOxZKuoOV34DvQxreqGCSLcW9/view?usp=drive_link) (google drive link), and put it 
under `./checkpoints/label2city_1024p/`\r\n- Test the model (`bash ./scripts/test_1024p.sh`):\r\n```bash\r\n#!./scripts/test_1024p.sh\r\npython test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none\r\n```\r\nThe test results will be saved to an HTML file here: `./results/label2city_1024p/test_latest/index.html`.\r\n\r\nMore example scripts can be found in the `scripts` directory.\r\n\r\n\r\n### Dataset\r\n- We use the Cityscapes dataset. To train a model on the full dataset, please download it from the [official website](https://www.cityscapes-dataset.com/) (registration required).\r\nAfter downloading, please put it under the `datasets` folder in the same way the example images are provided.\r\n\r\n\r\n### Training\r\n- Train a model at 1024 x 512 resolution (`bash ./scripts/train_512p.sh`):\r\n```bash\r\n#!./scripts/train_512p.sh\r\npython train.py --name label2city_512p\r\n```\r\n- To view training results, please check out the intermediate results in `./checkpoints/label2city_512p/web/index.html`.\r\nIf you have tensorflow installed, you can see tensorboard logs in `./checkpoints/label2city_512p/logs` by adding `--tf_log` to the training scripts.\r\n\r\n### Multi-GPU training\r\n- Train a model using multiple GPUs (`bash ./scripts/train_512p_multigpu.sh`):\r\n```bash\r\n#!./scripts/train_512p_multigpu.sh\r\npython train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7\r\n```\r\nNote: this is not tested; we trained our model using a single GPU only. Please use at your own discretion.\r\n\r\n### Training with Automatic Mixed Precision (AMP) for faster speed\r\n- To train with mixed precision support, please first install apex from: https://github.com/NVIDIA/apex\r\n- You can then train the model by adding `--fp16`. 
For example,\r\n```bash\r\n#!./scripts/train_512p_fp16.sh\r\npython -m torch.distributed.launch train.py --name label2city_512p --fp16\r\n```\r\nIn our test case, it trains about 80% faster with AMP on a Volta machine.\r\n\r\n### Training at full resolution\r\n- Training at full resolution (2048 x 1024) requires a GPU with 24G memory (`bash ./scripts/train_1024p_24G.sh`), or 16G memory if using mixed precision (AMP).\r\n- If only GPUs with 12G memory are available, please use the 12G script (`bash ./scripts/train_1024p_12G.sh`), which will crop the images during training. Performance is not guaranteed using this script.\r\n\r\n### Training with your own dataset\r\n- If you want to train with your own dataset, please generate one-channel label maps whose pixel values correspond to the object labels (i.e. 0,1,...,N-1, where N is the number of labels). This is because we need to generate one-hot vectors from the label maps. Please also specify `--label_nc N` during both training and testing.\r\n- If your input is not a label map, please just specify `--label_nc 0`, which will directly use the RGB colors as input. The folders should then be named `train_A`, `train_B` instead of `train_label`, `train_img`, where the goal is to translate images from A to B.\r\n- If you don't have instance maps or don't want to use them, please specify `--no_instance`.\r\n- The default setting for preprocessing is `scale_width`, which will scale the width of all training images to `opt.loadSize` (1024) while keeping the aspect ratio. If you want a different setting, please change it by using the `--resize_or_crop` option. For example, `scale_width_and_crop` first resizes the image to have width `opt.loadSize` and then does random cropping of size `(opt.fineSize, opt.fineSize)`. `crop` skips the resizing step and only performs random cropping. 
If you don't want any preprocessing, please specify `none`, which will do nothing other than making sure the image is divisible by 32.\r\n\r\n## More Training/Test Details\r\n- Flags: see `options/train_options.py` and `options/base_options.py` for all the training flags; see `options/test_options.py` and `options/base_options.py` for all the test flags.\r\n- Instance map: we take in both label maps and instance maps as input. If you don't want to use instance maps, please specify the flag `--no_instance`.\r\n\r\n\r\n## Citation\r\n\r\nIf you find this useful for your research, please use the following.\r\n\r\n```\r\n@inproceedings{wang2018pix2pixHD,\r\n  title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},\r\n  author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},  \r\n  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},\r\n  year={2018}\r\n}\r\n```\r\n\r\n## Acknowledgments\r\nThis code borrows heavily from [pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix).\r\n"
  },
  {
    "path": "_config.yml",
    "content": "theme: jekyll-theme-minimal"
  },
  {
    "path": "data/__init__.py",
    "content": ""
  },
  {
    "path": "data/aligned_dataset.py",
    "content": "import os.path\nfrom data.base_dataset import BaseDataset, get_params, get_transform, normalize\nfrom data.image_folder import make_dataset\nfrom PIL import Image\n\nclass AlignedDataset(BaseDataset):\n    def initialize(self, opt):\n        self.opt = opt\n        self.root = opt.dataroot    \n\n        ### input A (label maps)\n        dir_A = '_A' if self.opt.label_nc == 0 else '_label'\n        self.dir_A = os.path.join(opt.dataroot, opt.phase + dir_A)\n        self.A_paths = sorted(make_dataset(self.dir_A))\n\n        ### input B (real images)\n        if opt.isTrain or opt.use_encoded_image:\n            dir_B = '_B' if self.opt.label_nc == 0 else '_img'\n            self.dir_B = os.path.join(opt.dataroot, opt.phase + dir_B)  \n            self.B_paths = sorted(make_dataset(self.dir_B))\n\n        ### instance maps\n        if not opt.no_instance:\n            self.dir_inst = os.path.join(opt.dataroot, opt.phase + '_inst')\n            self.inst_paths = sorted(make_dataset(self.dir_inst))\n\n        ### load precomputed instance-wise encoded features\n        if opt.load_features:                              \n            self.dir_feat = os.path.join(opt.dataroot, opt.phase + '_feat')\n            print('----------- loading features from %s ----------' % self.dir_feat)\n            self.feat_paths = sorted(make_dataset(self.dir_feat))\n\n        self.dataset_size = len(self.A_paths) \n      \n    def __getitem__(self, index):        \n        ### input A (label maps)\n        A_path = self.A_paths[index]              \n        A = Image.open(A_path)        \n        params = get_params(self.opt, A.size)\n        if self.opt.label_nc == 0:\n            transform_A = get_transform(self.opt, params)\n            A_tensor = transform_A(A.convert('RGB'))\n        else:\n            transform_A = get_transform(self.opt, params, method=Image.NEAREST, normalize=False)\n            A_tensor = transform_A(A) * 255.0\n\n        B_tensor = inst_tensor = 
feat_tensor = 0\n        ### input B (real images)\n        if self.opt.isTrain or self.opt.use_encoded_image:\n            B_path = self.B_paths[index]   \n            B = Image.open(B_path).convert('RGB')\n            transform_B = get_transform(self.opt, params)      \n            B_tensor = transform_B(B)\n\n        ### if using instance maps        \n        if not self.opt.no_instance:\n            inst_path = self.inst_paths[index]\n            inst = Image.open(inst_path)\n            inst_tensor = transform_A(inst)\n\n            if self.opt.load_features:\n                feat_path = self.feat_paths[index]            \n                feat = Image.open(feat_path).convert('RGB')\n                norm = normalize()\n                feat_tensor = norm(transform_A(feat))                            \n\n        input_dict = {'label': A_tensor, 'inst': inst_tensor, 'image': B_tensor, \n                      'feat': feat_tensor, 'path': A_path}\n\n        return input_dict\n\n    def __len__(self):\n        return len(self.A_paths) // self.opt.batchSize * self.opt.batchSize\n\n    def name(self):\n        return 'AlignedDataset'"
  },
  {
    "path": "data/base_data_loader.py",
    "content": "\nclass BaseDataLoader():\n    def __init__(self):\n        pass\n    \n    def initialize(self, opt):\n        self.opt = opt\n        pass\n\n    def load_data():\n        return None\n\n        \n        \n"
  },
  {
    "path": "data/base_dataset.py",
    "content": "import torch.utils.data as data\nfrom PIL import Image\nimport torchvision.transforms as transforms\nimport numpy as np\nimport random\n\nclass BaseDataset(data.Dataset):\n    def __init__(self):\n        super(BaseDataset, self).__init__()\n\n    def name(self):\n        return 'BaseDataset'\n\n    def initialize(self, opt):\n        pass\n\ndef get_params(opt, size):\n    w, h = size\n    new_h = h\n    new_w = w\n    if opt.resize_or_crop == 'resize_and_crop':\n        new_h = new_w = opt.loadSize            \n    elif opt.resize_or_crop == 'scale_width_and_crop':\n        new_w = opt.loadSize\n        new_h = opt.loadSize * h // w\n\n    x = random.randint(0, np.maximum(0, new_w - opt.fineSize))\n    y = random.randint(0, np.maximum(0, new_h - opt.fineSize))\n    \n    flip = random.random() > 0.5\n    return {'crop_pos': (x, y), 'flip': flip}\n\ndef get_transform(opt, params, method=Image.BICUBIC, normalize=True):\n    transform_list = []\n    if 'resize' in opt.resize_or_crop:\n        osize = [opt.loadSize, opt.loadSize]\n        transform_list.append(transforms.Scale(osize, method))   \n    elif 'scale_width' in opt.resize_or_crop:\n        transform_list.append(transforms.Lambda(lambda img: __scale_width(img, opt.loadSize, method)))\n        \n    if 'crop' in opt.resize_or_crop:\n        transform_list.append(transforms.Lambda(lambda img: __crop(img, params['crop_pos'], opt.fineSize)))\n\n    if opt.resize_or_crop == 'none':\n        base = float(2 ** opt.n_downsample_global)\n        if opt.netG == 'local':\n            base *= (2 ** opt.n_local_enhancers)\n        transform_list.append(transforms.Lambda(lambda img: __make_power_2(img, base, method)))\n\n    if opt.isTrain and not opt.no_flip:\n        transform_list.append(transforms.Lambda(lambda img: __flip(img, params['flip'])))\n\n    transform_list += [transforms.ToTensor()]\n\n    if normalize:\n        transform_list += [transforms.Normalize((0.5, 0.5, 0.5),\n                      
                          (0.5, 0.5, 0.5))]\n    return transforms.Compose(transform_list)\n\ndef normalize():    \n    return transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))\n\ndef __make_power_2(img, base, method=Image.BICUBIC):\n    ow, oh = img.size        \n    h = int(round(oh / base) * base)\n    w = int(round(ow / base) * base)\n    if (h == oh) and (w == ow):\n        return img\n    return img.resize((w, h), method)\n\ndef __scale_width(img, target_width, method=Image.BICUBIC):\n    ow, oh = img.size\n    if (ow == target_width):\n        return img    \n    w = target_width\n    h = int(target_width * oh / ow)    \n    return img.resize((w, h), method)\n\ndef __crop(img, pos, size):\n    ow, oh = img.size\n    x1, y1 = pos\n    tw = th = size\n    if (ow > tw or oh > th):        \n        return img.crop((x1, y1, x1 + tw, y1 + th))\n    return img\n\ndef __flip(img, flip):\n    if flip:\n        return img.transpose(Image.FLIP_LEFT_RIGHT)\n    return img\n"
  },
  {
    "path": "data/custom_dataset_data_loader.py",
    "content": "import torch.utils.data\nfrom data.base_data_loader import BaseDataLoader\n\n\ndef CreateDataset(opt):\n    dataset = None\n    from data.aligned_dataset import AlignedDataset\n    dataset = AlignedDataset()\n\n    print(\"dataset [%s] was created\" % (dataset.name()))\n    dataset.initialize(opt)\n    return dataset\n\nclass CustomDatasetDataLoader(BaseDataLoader):\n    def name(self):\n        return 'CustomDatasetDataLoader'\n\n    def initialize(self, opt):\n        BaseDataLoader.initialize(self, opt)\n        self.dataset = CreateDataset(opt)\n        self.dataloader = torch.utils.data.DataLoader(\n            self.dataset,\n            batch_size=opt.batchSize,\n            shuffle=not opt.serial_batches,\n            num_workers=int(opt.nThreads))\n\n    def load_data(self):\n        return self.dataloader\n\n    def __len__(self):\n        return min(len(self.dataset), self.opt.max_dataset_size)\n"
  },
  {
    "path": "data/data_loader.py",
    "content": "\ndef CreateDataLoader(opt):\n    from data.custom_dataset_data_loader import CustomDatasetDataLoader\n    data_loader = CustomDatasetDataLoader()\n    print(data_loader.name())\n    data_loader.initialize(opt)\n    return data_loader\n"
  },
  {
    "path": "data/image_folder.py",
    "content": "###############################################################################\n# Code from\n# https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py\n# Modified the original code so that it also loads images from the current\n# directory as well as the subdirectories\n###############################################################################\nimport torch.utils.data as data\nfrom PIL import Image\nimport os\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP', '.tiff'\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef make_dataset(dir):\n    images = []\n    assert os.path.isdir(dir), '%s is not a valid directory' % dir\n\n    for root, _, fnames in sorted(os.walk(dir)):\n        for fname in fnames:\n            if is_image_file(fname):\n                path = os.path.join(root, fname)\n                images.append(path)\n\n    return images\n\n\ndef default_loader(path):\n    return Image.open(path).convert('RGB')\n\n\nclass ImageFolder(data.Dataset):\n\n    def __init__(self, root, transform=None, return_paths=False,\n                 loader=default_loader):\n        imgs = make_dataset(root)\n        if len(imgs) == 0:\n            raise(RuntimeError(\"Found 0 images in: \" + root + \"\\n\"\n                               \"Supported image extensions are: \" +\n                               \",\".join(IMG_EXTENSIONS)))\n\n        self.root = root\n        self.imgs = imgs\n        self.transform = transform\n        self.return_paths = return_paths\n        self.loader = loader\n\n    def __getitem__(self, index):\n        path = self.imgs[index]\n        img = self.loader(path)\n        if self.transform is not None:\n            img = self.transform(img)\n        if self.return_paths:\n            return img, path\n        else:\n            return img\n\n    def __len__(self):\n    
    return len(self.imgs)\n"
  },
  {
    "path": "encode_features.py",
    "content": "from options.train_options import TrainOptions\r\nfrom data.data_loader import CreateDataLoader\r\nfrom models.models import create_model\r\nimport numpy as np\r\nimport os\r\n\r\nopt = TrainOptions().parse()\r\nopt.nThreads = 1\r\nopt.batchSize = 1 \r\nopt.serial_batches = True \r\nopt.no_flip = True\r\nopt.instance_feat = True\r\nopt.continue_train = True\r\n\r\nname = 'features'\r\nsave_path = os.path.join(opt.checkpoints_dir, opt.name)\r\n\r\n############ Initialize #########\r\ndata_loader = CreateDataLoader(opt)\r\ndataset = data_loader.load_data()\r\ndataset_size = len(data_loader)\r\nmodel = create_model(opt)\r\n\r\n########### Encode features ###########\r\nreencode = True\r\nif reencode:\r\n\tfeatures = {}\r\n\tfor label in range(opt.label_nc):\r\n\t\tfeatures[label] = np.zeros((0, opt.feat_num+1))\r\n\tfor i, data in enumerate(dataset):    \r\n\t    feat = model.module.encode_features(data['image'], data['inst'])\r\n\t    for label in range(opt.label_nc):\r\n\t    \tfeatures[label] = np.append(features[label], feat[label], axis=0) \r\n\t        \r\n\t    print('%d / %d images' % (i+1, dataset_size))    \r\n\tsave_name = os.path.join(save_path, name + '.npy')\r\n\tnp.save(save_name, features)\r\n\r\n############## Clustering ###########\r\nn_clusters = opt.n_clusters\r\nload_name = os.path.join(save_path, name + '.npy')\r\nfeatures = np.load(load_name).item()\r\nfrom sklearn.cluster import KMeans\r\ncenters = {}\r\nfor label in range(opt.label_nc):\r\n\tfeat = features[label]\r\n\tfeat = feat[feat[:,-1] > 0.5, :-1]\t\t\r\n\tif feat.shape[0]:\r\n\t\tn_clusters = min(feat.shape[0], opt.n_clusters)\r\n\t\tkmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(feat)\r\n\t\tcenters[label] = kmeans.cluster_centers_\r\nsave_name = os.path.join(save_path, name + '_clustered_%03d.npy' % opt.n_clusters)\r\nnp.save(save_name, centers)\r\nprint('saving to %s' % save_name)"
  },
  {
    "path": "models/__init__.py",
    "content": ""
  },
  {
    "path": "models/base_model.py",
    "content": "import os\nimport torch\nimport sys\n\nclass BaseModel(torch.nn.Module):\n    def name(self):\n        return 'BaseModel'\n\n    def initialize(self, opt):\n        self.opt = opt\n        self.gpu_ids = opt.gpu_ids\n        self.isTrain = opt.isTrain\n        self.Tensor = torch.cuda.FloatTensor if self.gpu_ids else torch.Tensor\n        self.save_dir = os.path.join(opt.checkpoints_dir, opt.name)\n\n    def set_input(self, input):\n        self.input = input\n\n    def forward(self):\n        pass\n\n    # used in test time, no backprop\n    def test(self):\n        pass\n\n    def get_image_paths(self):\n        pass\n\n    def optimize_parameters(self):\n        pass\n\n    def get_current_visuals(self):\n        return self.input\n\n    def get_current_errors(self):\n        return {}\n\n    def save(self, label):\n        pass\n\n    # helper saving function that can be used by subclasses\n    def save_network(self, network, network_label, epoch_label, gpu_ids):\n        save_filename = '%s_net_%s.pth' % (epoch_label, network_label)\n        save_path = os.path.join(self.save_dir, save_filename)\n        torch.save(network.cpu().state_dict(), save_path)\n        if len(gpu_ids) and torch.cuda.is_available():\n            network.cuda()\n\n    # helper loading function that can be used by subclasses\n    def load_network(self, network, network_label, epoch_label, save_dir=''):        \n        save_filename = '%s_net_%s.pth' % (epoch_label, network_label)\n        if not save_dir:\n            save_dir = self.save_dir\n        save_path = os.path.join(save_dir, save_filename)        \n        if not os.path.isfile(save_path):\n            print('%s not exists yet!' 
% save_path)\n            if network_label == 'G':\n                raise RuntimeError('Generator must exist!')\n        else:\n            #network.load_state_dict(torch.load(save_path))\n            try:\n                network.load_state_dict(torch.load(save_path))\n            except:   \n                pretrained_dict = torch.load(save_path)                \n                model_dict = network.state_dict()\n                try:\n                    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}                    \n                    network.load_state_dict(pretrained_dict)\n                    if self.opt.verbose:\n                        print('Pretrained network %s has excessive layers; Only loading layers that are used' % network_label)\n                except:\n                    print('Pretrained network %s has fewer layers; The following are not initialized:' % network_label)\n                    for k, v in pretrained_dict.items():                      \n                        if v.size() == model_dict[k].size():\n                            model_dict[k] = v\n\n                    if sys.version_info >= (3,0):\n                        not_initialized = set()\n                    else:\n                        from sets import Set\n                        not_initialized = Set()                    \n\n                    for k, v in model_dict.items():\n                        if k not in pretrained_dict or v.size() != pretrained_dict[k].size():\n                            not_initialized.add(k.split('.')[0])\n                    \n                    print(sorted(not_initialized))\n                    network.load_state_dict(model_dict)                  \n\n    def update_learning_rate(self):\n        pass\n"
  },
  {
    "path": "models/models.py",
    "content": "import torch\n\ndef create_model(opt):\n    if opt.model == 'pix2pixHD':\n        from .pix2pixHD_model import Pix2PixHDModel, InferenceModel\n        if opt.isTrain:\n            model = Pix2PixHDModel()\n        else:\n            model = InferenceModel()\n    else:\n    \tfrom .ui_model import UIModel\n    \tmodel = UIModel()\n    model.initialize(opt)\n    if opt.verbose:\n        print(\"model [%s] was created\" % (model.name()))\n\n    if opt.isTrain and len(opt.gpu_ids) and not opt.fp16:\n        model = torch.nn.DataParallel(model, device_ids=opt.gpu_ids)\n\n    return model\n"
  },
  {
    "path": "models/networks.py",
    "content": "import torch\nimport torch.nn as nn\nimport functools\nfrom torch.autograd import Variable\nimport numpy as np\n\n###############################################################################\n# Functions\n###############################################################################\ndef weights_init(m):\n    classname = m.__class__.__name__\n    if classname.find('Conv') != -1:\n        m.weight.data.normal_(0.0, 0.02)\n    elif classname.find('BatchNorm2d') != -1:\n        m.weight.data.normal_(1.0, 0.02)\n        m.bias.data.fill_(0)\n\ndef get_norm_layer(norm_type='instance'):\n    if norm_type == 'batch':\n        norm_layer = functools.partial(nn.BatchNorm2d, affine=True)\n    elif norm_type == 'instance':\n        norm_layer = functools.partial(nn.InstanceNorm2d, affine=False)\n    else:\n        raise NotImplementedError('normalization layer [%s] is not found' % norm_type)\n    return norm_layer\n\ndef define_G(input_nc, output_nc, ngf, netG, n_downsample_global=3, n_blocks_global=9, n_local_enhancers=1, \n             n_blocks_local=3, norm='instance', gpu_ids=[]):    \n    norm_layer = get_norm_layer(norm_type=norm)     \n    if netG == 'global':    \n        netG = GlobalGenerator(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, norm_layer)       \n    elif netG == 'local':        \n        netG = LocalEnhancer(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, \n                                  n_local_enhancers, n_blocks_local, norm_layer)\n    elif netG == 'encoder':\n        netG = Encoder(input_nc, output_nc, ngf, n_downsample_global, norm_layer)\n    else:\n        raise('generator not implemented!')\n    print(netG)\n    if len(gpu_ids) > 0:\n        assert(torch.cuda.is_available())   \n        netG.cuda(gpu_ids[0])\n    netG.apply(weights_init)\n    return netG\n\ndef define_D(input_nc, ndf, n_layers_D, norm='instance', use_sigmoid=False, num_D=1, getIntermFeat=False, gpu_ids=[]):        \n    
norm_layer = get_norm_layer(norm_type=norm)   \n    netD = MultiscaleDiscriminator(input_nc, ndf, n_layers_D, norm_layer, use_sigmoid, num_D, getIntermFeat)   \n    print(netD)\n    if len(gpu_ids) > 0:\n        assert(torch.cuda.is_available())\n        netD.cuda(gpu_ids[0])\n    netD.apply(weights_init)\n    return netD\n\ndef print_network(net):\n    if isinstance(net, list):\n        net = net[0]\n    num_params = 0\n    for param in net.parameters():\n        num_params += param.numel()\n    print(net)\n    print('Total number of parameters: %d' % num_params)\n\n##############################################################################\n# Losses\n##############################################################################\nclass GANLoss(nn.Module):\n    def __init__(self, use_lsgan=True, target_real_label=1.0, target_fake_label=0.0,\n                 tensor=torch.FloatTensor):\n        super(GANLoss, self).__init__()\n        self.real_label = target_real_label\n        self.fake_label = target_fake_label\n        self.real_label_var = None\n        self.fake_label_var = None\n        self.Tensor = tensor\n        if use_lsgan:\n            self.loss = nn.MSELoss()\n        else:\n            self.loss = nn.BCELoss()\n\n    def get_target_tensor(self, input, target_is_real):\n        target_tensor = None\n        if target_is_real:\n            create_label = ((self.real_label_var is None) or\n                            (self.real_label_var.numel() != input.numel()))\n            if create_label:\n                real_tensor = self.Tensor(input.size()).fill_(self.real_label)\n                self.real_label_var = Variable(real_tensor, requires_grad=False)\n            target_tensor = self.real_label_var\n        else:\n            create_label = ((self.fake_label_var is None) or\n                            (self.fake_label_var.numel() != input.numel()))\n            if create_label:\n                fake_tensor = 
self.Tensor(input.size()).fill_(self.fake_label)\n                self.fake_label_var = Variable(fake_tensor, requires_grad=False)\n            target_tensor = self.fake_label_var\n        return target_tensor\n\n    def __call__(self, input, target_is_real):\n        if isinstance(input[0], list):\n            loss = 0\n            for input_i in input:\n                pred = input_i[-1]\n                target_tensor = self.get_target_tensor(pred, target_is_real)\n                loss += self.loss(pred, target_tensor)\n            return loss\n        else:            \n            target_tensor = self.get_target_tensor(input[-1], target_is_real)\n            return self.loss(input[-1], target_tensor)\n\nclass VGGLoss(nn.Module):\n    def __init__(self, gpu_ids):\n        super(VGGLoss, self).__init__()        \n        self.vgg = Vgg19().cuda()\n        self.criterion = nn.L1Loss()\n        self.weights = [1.0/32, 1.0/16, 1.0/8, 1.0/4, 1.0]        \n\n    def forward(self, x, y):              \n        x_vgg, y_vgg = self.vgg(x), self.vgg(y)\n        loss = 0\n        for i in range(len(x_vgg)):\n            loss += self.weights[i] * self.criterion(x_vgg[i], y_vgg[i].detach())        \n        return loss\n\n##############################################################################\n# Generator\n##############################################################################\nclass LocalEnhancer(nn.Module):\n    def __init__(self, input_nc, output_nc, ngf=32, n_downsample_global=3, n_blocks_global=9, \n                 n_local_enhancers=1, n_blocks_local=3, norm_layer=nn.BatchNorm2d, padding_type='reflect'):        \n        super(LocalEnhancer, self).__init__()\n        self.n_local_enhancers = n_local_enhancers\n        \n        ###### global generator model #####           \n        ngf_global = ngf * (2**n_local_enhancers)\n        model_global = GlobalGenerator(input_nc, output_nc, ngf_global, n_downsample_global, n_blocks_global, norm_layer).model      
  \n        model_global = [model_global[i] for i in range(len(model_global)-3)] # get rid of final convolution layers        \n        self.model = nn.Sequential(*model_global)                \n\n        ###### local enhancer layers #####\n        for n in range(1, n_local_enhancers+1):\n            ### downsample            \n            ngf_global = ngf * (2**(n_local_enhancers-n))\n            model_downsample = [nn.ReflectionPad2d(3), nn.Conv2d(input_nc, ngf_global, kernel_size=7, padding=0), \n                                norm_layer(ngf_global), nn.ReLU(True),\n                                nn.Conv2d(ngf_global, ngf_global * 2, kernel_size=3, stride=2, padding=1), \n                                norm_layer(ngf_global * 2), nn.ReLU(True)]\n            ### residual blocks\n            model_upsample = []\n            for i in range(n_blocks_local):\n                model_upsample += [ResnetBlock(ngf_global * 2, padding_type=padding_type, norm_layer=norm_layer)]\n\n            ### upsample\n            model_upsample += [nn.ConvTranspose2d(ngf_global * 2, ngf_global, kernel_size=3, stride=2, padding=1, output_padding=1), \n                               norm_layer(ngf_global), nn.ReLU(True)]      \n\n            ### final convolution\n            if n == n_local_enhancers:                \n                model_upsample += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0), nn.Tanh()]                       \n            \n            setattr(self, 'model'+str(n)+'_1', nn.Sequential(*model_downsample))\n            setattr(self, 'model'+str(n)+'_2', nn.Sequential(*model_upsample))                  \n        \n        self.downsample = nn.AvgPool2d(3, stride=2, padding=[1, 1], count_include_pad=False)\n\n    def forward(self, input): \n        ### create input pyramid\n        input_downsampled = [input]\n        for i in range(self.n_local_enhancers):\n            input_downsampled.append(self.downsample(input_downsampled[-1]))\n\n 
       ### output at coarsest level\n        output_prev = self.model(input_downsampled[-1])        \n        ### build up one layer at a time\n        for n_local_enhancers in range(1, self.n_local_enhancers+1):\n            model_downsample = getattr(self, 'model'+str(n_local_enhancers)+'_1')\n            model_upsample = getattr(self, 'model'+str(n_local_enhancers)+'_2')            \n            input_i = input_downsampled[self.n_local_enhancers-n_local_enhancers]            \n            output_prev = model_upsample(model_downsample(input_i) + output_prev)\n        return output_prev\n\nclass GlobalGenerator(nn.Module):\n    def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d, \n                 padding_type='reflect'):\n        assert(n_blocks >= 0)\n        super(GlobalGenerator, self).__init__()        \n        activation = nn.ReLU(True)        \n\n        model = [nn.ReflectionPad2d(3), nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0), norm_layer(ngf), activation]\n        ### downsample\n        for i in range(n_downsampling):\n            mult = 2**i\n            model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1),\n                      norm_layer(ngf * mult * 2), activation]\n\n        ### resnet blocks\n        mult = 2**n_downsampling\n        for i in range(n_blocks):\n            model += [ResnetBlock(ngf * mult, padding_type=padding_type, activation=activation, norm_layer=norm_layer)]\n        \n        ### upsample         \n        for i in range(n_downsampling):\n            mult = 2**(n_downsampling - i)\n            model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1, output_padding=1),\n                       norm_layer(int(ngf * mult / 2)), activation]\n        model += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0), nn.Tanh()]        \n        self.model = nn.Sequential(*model)\n     
       \n    def forward(self, input):\n        return self.model(input)             \n        \n# Define a resnet block\nclass ResnetBlock(nn.Module):\n    def __init__(self, dim, padding_type, norm_layer, activation=nn.ReLU(True), use_dropout=False):\n        super(ResnetBlock, self).__init__()\n        self.conv_block = self.build_conv_block(dim, padding_type, norm_layer, activation, use_dropout)\n\n    def build_conv_block(self, dim, padding_type, norm_layer, activation, use_dropout):\n        conv_block = []\n        p = 0\n        if padding_type == 'reflect':\n            conv_block += [nn.ReflectionPad2d(1)]\n        elif padding_type == 'replicate':\n            conv_block += [nn.ReplicationPad2d(1)]\n        elif padding_type == 'zero':\n            p = 1\n        else:\n            raise NotImplementedError('padding [%s] is not implemented' % padding_type)\n\n        conv_block += [nn.Conv2d(dim, dim, kernel_size=3, padding=p),\n                       norm_layer(dim),\n                       activation]\n        if use_dropout:\n            conv_block += [nn.Dropout(0.5)]\n\n        p = 0\n        if padding_type == 'reflect':\n            conv_block += [nn.ReflectionPad2d(1)]\n        elif padding_type == 'replicate':\n            conv_block += [nn.ReplicationPad2d(1)]\n        elif padding_type == 'zero':\n            p = 1\n        else:\n            raise NotImplementedError('padding [%s] is not implemented' % padding_type)\n        conv_block += [nn.Conv2d(dim, dim, kernel_size=3, padding=p),\n                       norm_layer(dim)]\n\n        return nn.Sequential(*conv_block)\n\n    def forward(self, x):\n        out = x + self.conv_block(x)\n        return out\n\nclass Encoder(nn.Module):\n    def __init__(self, input_nc, output_nc, ngf=32, n_downsampling=4, norm_layer=nn.BatchNorm2d):\n        super(Encoder, self).__init__()        \n        self.output_nc = output_nc        \n\n        model = [nn.ReflectionPad2d(3), nn.Conv2d(input_nc, ngf, 
kernel_size=7, padding=0), \n                 norm_layer(ngf), nn.ReLU(True)]             \n        ### downsample\n        for i in range(n_downsampling):\n            mult = 2**i\n            model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1),\n                      norm_layer(ngf * mult * 2), nn.ReLU(True)]\n\n        ### upsample         \n        for i in range(n_downsampling):\n            mult = 2**(n_downsampling - i)\n            model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1, output_padding=1),\n                       norm_layer(int(ngf * mult / 2)), nn.ReLU(True)]        \n\n        model += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0), nn.Tanh()]\n        self.model = nn.Sequential(*model) \n\n    def forward(self, input, inst):\n        outputs = self.model(input)\n\n        # instance-wise average pooling\n        outputs_mean = outputs.clone()\n        inst_list = np.unique(inst.cpu().numpy().astype(int))        \n        for i in inst_list:\n            for b in range(input.size()[0]):\n                indices = (inst[b:b+1] == int(i)).nonzero() # n x 4            \n                for j in range(self.output_nc):\n                    output_ins = outputs[indices[:,0] + b, indices[:,1] + j, indices[:,2], indices[:,3]]                    \n                    mean_feat = torch.mean(output_ins).expand_as(output_ins)                                        \n                    outputs_mean[indices[:,0] + b, indices[:,1] + j, indices[:,2], indices[:,3]] = mean_feat                       \n        return outputs_mean\n\nclass MultiscaleDiscriminator(nn.Module):\n    def __init__(self, input_nc, ndf=64, n_layers=3, norm_layer=nn.BatchNorm2d, \n                 use_sigmoid=False, num_D=3, getIntermFeat=False):\n        super(MultiscaleDiscriminator, self).__init__()\n        self.num_D = num_D\n        self.n_layers = n_layers\n        
self.getIntermFeat = getIntermFeat\n     \n        for i in range(num_D):\n            netD = NLayerDiscriminator(input_nc, ndf, n_layers, norm_layer, use_sigmoid, getIntermFeat)\n            if getIntermFeat:                                \n                for j in range(n_layers+2):\n                    setattr(self, 'scale'+str(i)+'_layer'+str(j), getattr(netD, 'model'+str(j)))                                   \n            else:\n                setattr(self, 'layer'+str(i), netD.model)\n\n        self.downsample = nn.AvgPool2d(3, stride=2, padding=[1, 1], count_include_pad=False)\n\n    def singleD_forward(self, model, input):\n        if self.getIntermFeat:\n            result = [input]\n            for i in range(len(model)):\n                result.append(model[i](result[-1]))\n            return result[1:]\n        else:\n            return [model(input)]\n\n    def forward(self, input):        \n        num_D = self.num_D\n        result = []\n        input_downsampled = input\n        for i in range(num_D):\n            if self.getIntermFeat:\n                model = [getattr(self, 'scale'+str(num_D-1-i)+'_layer'+str(j)) for j in range(self.n_layers+2)]\n            else:\n                model = getattr(self, 'layer'+str(num_D-1-i))\n            result.append(self.singleD_forward(model, input_downsampled))\n            if i != (num_D-1):\n                input_downsampled = self.downsample(input_downsampled)\n        return result\n        \n# Defines the PatchGAN discriminator with the specified arguments.\nclass NLayerDiscriminator(nn.Module):\n    def __init__(self, input_nc, ndf=64, n_layers=3, norm_layer=nn.BatchNorm2d, use_sigmoid=False, getIntermFeat=False):\n        super(NLayerDiscriminator, self).__init__()\n        self.getIntermFeat = getIntermFeat\n        self.n_layers = n_layers\n\n        kw = 4\n        padw = int(np.ceil((kw-1.0)/2))\n        sequence = [[nn.Conv2d(input_nc, ndf, kernel_size=kw, stride=2, padding=padw), 
nn.LeakyReLU(0.2, True)]]\n\n        nf = ndf\n        for n in range(1, n_layers):\n            nf_prev = nf\n            nf = min(nf * 2, 512)\n            sequence += [[\n                nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=2, padding=padw),\n                norm_layer(nf), nn.LeakyReLU(0.2, True)\n            ]]\n\n        nf_prev = nf\n        nf = min(nf * 2, 512)\n        sequence += [[\n            nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=1, padding=padw),\n            norm_layer(nf),\n            nn.LeakyReLU(0.2, True)\n        ]]\n\n        sequence += [[nn.Conv2d(nf, 1, kernel_size=kw, stride=1, padding=padw)]]\n\n        if use_sigmoid:\n            sequence += [[nn.Sigmoid()]]\n\n        if getIntermFeat:\n            for n in range(len(sequence)):\n                setattr(self, 'model'+str(n), nn.Sequential(*sequence[n]))\n        else:\n            sequence_stream = []\n            for n in range(len(sequence)):\n                sequence_stream += sequence[n]\n            self.model = nn.Sequential(*sequence_stream)\n\n    def forward(self, input):\n        if self.getIntermFeat:\n            res = [input]\n            for n in range(self.n_layers+2):\n                model = getattr(self, 'model'+str(n))\n                res.append(model(res[-1]))\n            return res[1:]\n        else:\n            return self.model(input)        \n\nfrom torchvision import models\nclass Vgg19(torch.nn.Module):\n    def __init__(self, requires_grad=False):\n        super(Vgg19, self).__init__()\n        vgg_pretrained_features = models.vgg19(pretrained=True).features\n        self.slice1 = torch.nn.Sequential()\n        self.slice2 = torch.nn.Sequential()\n        self.slice3 = torch.nn.Sequential()\n        self.slice4 = torch.nn.Sequential()\n        self.slice5 = torch.nn.Sequential()\n        for x in range(2):\n            self.slice1.add_module(str(x), vgg_pretrained_features[x])\n        for x in range(2, 7):\n            
self.slice2.add_module(str(x), vgg_pretrained_features[x])\n        for x in range(7, 12):\n            self.slice3.add_module(str(x), vgg_pretrained_features[x])\n        for x in range(12, 21):\n            self.slice4.add_module(str(x), vgg_pretrained_features[x])\n        for x in range(21, 30):\n            self.slice5.add_module(str(x), vgg_pretrained_features[x])\n        if not requires_grad:\n            for param in self.parameters():\n                param.requires_grad = False\n\n    def forward(self, X):\n        h_relu1 = self.slice1(X)\n        h_relu2 = self.slice2(h_relu1)        \n        h_relu3 = self.slice3(h_relu2)        \n        h_relu4 = self.slice4(h_relu3)        \n        h_relu5 = self.slice5(h_relu4)                \n        out = [h_relu1, h_relu2, h_relu3, h_relu4, h_relu5]\n        return out\n"
  },
  {
    "path": "models/pix2pixHD_model.py",
    "content": "import numpy as np\nimport torch\nimport os\nfrom torch.autograd import Variable\nfrom util.image_pool import ImagePool\nfrom .base_model import BaseModel\nfrom . import networks\n\nclass Pix2PixHDModel(BaseModel):\n    def name(self):\n        return 'Pix2PixHDModel'\n    \n    def init_loss_filter(self, use_gan_feat_loss, use_vgg_loss):\n        flags = (True, use_gan_feat_loss, use_vgg_loss, True, True)\n        def loss_filter(g_gan, g_gan_feat, g_vgg, d_real, d_fake):\n            return [l for (l,f) in zip((g_gan,g_gan_feat,g_vgg,d_real,d_fake),flags) if f]\n        return loss_filter\n    \n    def initialize(self, opt):\n        BaseModel.initialize(self, opt)\n        if opt.resize_or_crop != 'none' or not opt.isTrain: # when training at full res this causes OOM\n            torch.backends.cudnn.benchmark = True\n        self.isTrain = opt.isTrain\n        self.use_features = opt.instance_feat or opt.label_feat\n        self.gen_features = self.use_features and not self.opt.load_features\n        input_nc = opt.label_nc if opt.label_nc != 0 else opt.input_nc\n\n        ##### define networks        \n        # Generator network\n        netG_input_nc = input_nc        \n        if not opt.no_instance:\n            netG_input_nc += 1\n        if self.use_features:\n            netG_input_nc += opt.feat_num                  \n        self.netG = networks.define_G(netG_input_nc, opt.output_nc, opt.ngf, opt.netG, \n                                      opt.n_downsample_global, opt.n_blocks_global, opt.n_local_enhancers, \n                                      opt.n_blocks_local, opt.norm, gpu_ids=self.gpu_ids)        \n\n        # Discriminator network\n        if self.isTrain:\n            use_sigmoid = opt.no_lsgan\n            netD_input_nc = input_nc + opt.output_nc\n            if not opt.no_instance:\n                netD_input_nc += 1\n            self.netD = networks.define_D(netD_input_nc, opt.ndf, opt.n_layers_D, opt.norm, use_sigmoid, 
\n                                          opt.num_D, not opt.no_ganFeat_loss, gpu_ids=self.gpu_ids)\n\n        ### Encoder network\n        if self.gen_features:          \n            self.netE = networks.define_G(opt.output_nc, opt.feat_num, opt.nef, 'encoder', \n                                          opt.n_downsample_E, norm=opt.norm, gpu_ids=self.gpu_ids)  \n        if self.opt.verbose:\n                print('---------- Networks initialized -------------')\n\n        # load networks\n        if not self.isTrain or opt.continue_train or opt.load_pretrain:\n            pretrained_path = '' if not self.isTrain else opt.load_pretrain\n            self.load_network(self.netG, 'G', opt.which_epoch, pretrained_path)            \n            if self.isTrain:\n                self.load_network(self.netD, 'D', opt.which_epoch, pretrained_path)  \n            if self.gen_features:\n                self.load_network(self.netE, 'E', opt.which_epoch, pretrained_path)              \n\n        # set loss functions and optimizers\n        if self.isTrain:\n            if opt.pool_size > 0 and (len(self.gpu_ids)) > 1:\n                raise NotImplementedError(\"Fake Pool Not Implemented for MultiGPU\")\n            self.fake_pool = ImagePool(opt.pool_size)\n            self.old_lr = opt.lr\n\n            # define loss functions\n            self.loss_filter = self.init_loss_filter(not opt.no_ganFeat_loss, not opt.no_vgg_loss)\n            \n            self.criterionGAN = networks.GANLoss(use_lsgan=not opt.no_lsgan, tensor=self.Tensor)   \n            self.criterionFeat = torch.nn.L1Loss()\n            if not opt.no_vgg_loss:             \n                self.criterionVGG = networks.VGGLoss(self.gpu_ids)\n                \n        \n            # Names so we can breakout loss\n            self.loss_names = self.loss_filter('G_GAN','G_GAN_Feat','G_VGG','D_real', 'D_fake')\n\n            # initialize optimizers\n            # optimizer G\n            if 
opt.niter_fix_global > 0:                \n                import sys\n                if sys.version_info >= (3,0):\n                    finetune_list = set()\n                else:\n                    from sets import Set\n                    finetune_list = Set()\n\n                params_dict = dict(self.netG.named_parameters())\n                params = []\n                for key, value in params_dict.items():       \n                    if key.startswith('model' + str(opt.n_local_enhancers)):                    \n                        params += [value]\n                        finetune_list.add(key.split('.')[0])  \n                print('------------- Only training the local enhancer network (for %d epochs) ------------' % opt.niter_fix_global)\n                print('The layers that are finetuned are ', sorted(finetune_list))                         \n            else:\n                params = list(self.netG.parameters())\n            if self.gen_features:              \n                params += list(self.netE.parameters())         \n            self.optimizer_G = torch.optim.Adam(params, lr=opt.lr, betas=(opt.beta1, 0.999))                            \n\n            # optimizer D                        \n            params = list(self.netD.parameters())    \n            self.optimizer_D = torch.optim.Adam(params, lr=opt.lr, betas=(opt.beta1, 0.999))\n\n    def encode_input(self, label_map, inst_map=None, real_image=None, feat_map=None, infer=False):             \n        if self.opt.label_nc == 0:\n            input_label = label_map.data.cuda()\n        else:\n            # create one-hot vector for label map \n            size = label_map.size()\n            oneHot_size = (size[0], self.opt.label_nc, size[2], size[3])\n            input_label = torch.cuda.FloatTensor(torch.Size(oneHot_size)).zero_()\n            input_label = input_label.scatter_(1, label_map.data.long().cuda(), 1.0)\n            if self.opt.data_type == 16:\n                
input_label = input_label.half()\n\n        # get edges from instance map\n        if not self.opt.no_instance:\n            inst_map = inst_map.data.cuda()\n            edge_map = self.get_edges(inst_map)\n            input_label = torch.cat((input_label, edge_map), dim=1)         \n        input_label = Variable(input_label, volatile=infer)\n\n        # real images for training\n        if real_image is not None:\n            real_image = Variable(real_image.data.cuda())\n\n        # instance map for feature encoding\n        if self.use_features:\n            # get precomputed feature maps\n            if self.opt.load_features:\n                feat_map = Variable(feat_map.data.cuda())\n            if self.opt.label_feat:\n                inst_map = label_map.cuda()\n\n        return input_label, inst_map, real_image, feat_map\n\n    def discriminate(self, input_label, test_image, use_pool=False):\n        input_concat = torch.cat((input_label, test_image.detach()), dim=1)\n        if use_pool:            \n            fake_query = self.fake_pool.query(input_concat)\n            return self.netD.forward(fake_query)\n        else:\n            return self.netD.forward(input_concat)\n\n    def forward(self, label, inst, image, feat, infer=False):\n        # Encode Inputs\n        input_label, inst_map, real_image, feat_map = self.encode_input(label, inst, image, feat)  \n\n        # Fake Generation\n        if self.use_features:\n            if not self.opt.load_features:\n                feat_map = self.netE.forward(real_image, inst_map)                     \n            input_concat = torch.cat((input_label, feat_map), dim=1)                        \n        else:\n            input_concat = input_label\n        fake_image = self.netG.forward(input_concat)\n\n        # Fake Detection and Loss\n        pred_fake_pool = self.discriminate(input_label, fake_image, use_pool=True)\n        loss_D_fake = self.criterionGAN(pred_fake_pool, False)        \n\n        # 
Real Detection and Loss        \n        pred_real = self.discriminate(input_label, real_image)\n        loss_D_real = self.criterionGAN(pred_real, True)\n\n        # GAN loss (Fake Passability Loss)        \n        pred_fake = self.netD.forward(torch.cat((input_label, fake_image), dim=1))        \n        loss_G_GAN = self.criterionGAN(pred_fake, True)               \n        \n        # GAN feature matching loss\n        loss_G_GAN_Feat = 0\n        if not self.opt.no_ganFeat_loss:\n            feat_weights = 4.0 / (self.opt.n_layers_D + 1)\n            D_weights = 1.0 / self.opt.num_D\n            for i in range(self.opt.num_D):\n                for j in range(len(pred_fake[i])-1):\n                    loss_G_GAN_Feat += D_weights * feat_weights * \\\n                        self.criterionFeat(pred_fake[i][j], pred_real[i][j].detach()) * self.opt.lambda_feat\n                   \n        # VGG feature matching loss\n        loss_G_VGG = 0\n        if not self.opt.no_vgg_loss:\n            loss_G_VGG = self.criterionVGG(fake_image, real_image) * self.opt.lambda_feat\n        \n        # Only return the fake_B image if necessary to save BW\n        return [ self.loss_filter( loss_G_GAN, loss_G_GAN_Feat, loss_G_VGG, loss_D_real, loss_D_fake ), None if not infer else fake_image ]\n\n    def inference(self, label, inst, image=None):\n        # Encode Inputs        \n        image = Variable(image) if image is not None else None\n        input_label, inst_map, real_image, _ = self.encode_input(Variable(label), Variable(inst), image, infer=True)\n\n        # Fake Generation\n        if self.use_features:\n            if self.opt.use_encoded_image:\n                # encode the real image to get feature map\n                feat_map = self.netE.forward(real_image, inst_map)\n            else:\n                # sample clusters from precomputed features             \n                feat_map = self.sample_features(inst_map)\n            input_concat = 
torch.cat((input_label, feat_map), dim=1)                        \n        else:\n            input_concat = input_label        \n           \n        if torch.__version__.startswith('0.4'):\n            with torch.no_grad():\n                fake_image = self.netG.forward(input_concat)\n        else:\n            fake_image = self.netG.forward(input_concat)\n        return fake_image\n\n    def sample_features(self, inst): \n        # read precomputed feature clusters \n        cluster_path = os.path.join(self.opt.checkpoints_dir, self.opt.name, self.opt.cluster_path)        \n        features_clustered = np.load(cluster_path, encoding='latin1').item()\n\n        # randomly sample from the feature clusters\n        inst_np = inst.cpu().numpy().astype(int)                                      \n        feat_map = self.Tensor(inst.size()[0], self.opt.feat_num, inst.size()[2], inst.size()[3])\n        for i in np.unique(inst_np):    \n            label = i if i < 1000 else i//1000\n            if label in features_clustered:\n                feat = features_clustered[label]\n                cluster_idx = np.random.randint(0, feat.shape[0]) \n                                            \n                idx = (inst == int(i)).nonzero()\n                for k in range(self.opt.feat_num):                                    \n                    feat_map[idx[:,0], idx[:,1] + k, idx[:,2], idx[:,3]] = feat[cluster_idx, k]\n        if self.opt.data_type==16:\n            feat_map = feat_map.half()\n        return feat_map\n\n    def encode_features(self, image, inst):\n        image = Variable(image.cuda(), volatile=True)\n        feat_num = self.opt.feat_num\n        h, w = inst.size()[2], inst.size()[3]\n        block_num = 32\n        feat_map = self.netE.forward(image, inst.cuda())\n        inst_np = inst.cpu().numpy().astype(int)\n        feature = {}\n        for i in range(self.opt.label_nc):\n            feature[i] = np.zeros((0, feat_num+1))\n        for i in 
np.unique(inst_np):\n            label = i if i < 1000 else i//1000\n            idx = (inst == int(i)).nonzero()\n            num = idx.size()[0]\n            idx = idx[num//2,:]\n            val = np.zeros((1, feat_num+1))                        \n            for k in range(feat_num):\n                val[0, k] = feat_map[idx[0], idx[1] + k, idx[2], idx[3]].data[0]            \n            val[0, feat_num] = float(num) / (h * w // block_num)\n            feature[label] = np.append(feature[label], val, axis=0)\n        return feature\n\n    def get_edges(self, t):\n        edge = torch.cuda.ByteTensor(t.size()).zero_()\n        edge[:,:,:,1:] = edge[:,:,:,1:] | (t[:,:,:,1:] != t[:,:,:,:-1])\n        edge[:,:,:,:-1] = edge[:,:,:,:-1] | (t[:,:,:,1:] != t[:,:,:,:-1])\n        edge[:,:,1:,:] = edge[:,:,1:,:] | (t[:,:,1:,:] != t[:,:,:-1,:])\n        edge[:,:,:-1,:] = edge[:,:,:-1,:] | (t[:,:,1:,:] != t[:,:,:-1,:])\n        if self.opt.data_type==16:\n            return edge.half()\n        else:\n            return edge.float()\n\n    def save(self, which_epoch):\n        self.save_network(self.netG, 'G', which_epoch, self.gpu_ids)\n        self.save_network(self.netD, 'D', which_epoch, self.gpu_ids)\n        if self.gen_features:\n            self.save_network(self.netE, 'E', which_epoch, self.gpu_ids)\n\n    def update_fixed_params(self):\n        # after fixing the global generator for a number of iterations, also start finetuning it\n        params = list(self.netG.parameters())\n        if self.gen_features:\n            params += list(self.netE.parameters())           \n        self.optimizer_G = torch.optim.Adam(params, lr=self.opt.lr, betas=(self.opt.beta1, 0.999))\n        if self.opt.verbose:\n            print('------------ Now also finetuning global generator -----------')\n\n    def update_learning_rate(self):\n        lrd = self.opt.lr / self.opt.niter_decay\n        lr = self.old_lr - lrd        \n        for param_group in 
self.optimizer_D.param_groups:\n            param_group['lr'] = lr\n        for param_group in self.optimizer_G.param_groups:\n            param_group['lr'] = lr\n        if self.opt.verbose:\n            print('update learning rate: %f -> %f' % (self.old_lr, lr))\n        self.old_lr = lr\n\nclass InferenceModel(Pix2PixHDModel):\n    def forward(self, inp):\n        label, inst = inp\n        return self.inference(label, inst)\n\n        \n"
  },
  {
    "path": "models/ui_model.py",
    "content": "import torch\nfrom torch.autograd import Variable\nfrom collections import OrderedDict\nimport numpy as np\nimport os\nfrom PIL import Image\nimport util.util as util\nfrom .base_model import BaseModel\nfrom . import networks\n\nclass UIModel(BaseModel):\n    def name(self):\n        return 'UIModel'\n\n    def initialize(self, opt):\n        assert(not opt.isTrain)\n        BaseModel.initialize(self, opt)\n        self.use_features = opt.instance_feat or opt.label_feat\n\n        netG_input_nc = opt.label_nc\n        if not opt.no_instance:\n            netG_input_nc += 1            \n        if self.use_features:   \n            netG_input_nc += opt.feat_num           \n\n        self.netG = networks.define_G(netG_input_nc, opt.output_nc, opt.ngf, opt.netG, \n                                      opt.n_downsample_global, opt.n_blocks_global, opt.n_local_enhancers, \n                                      opt.n_blocks_local, opt.norm, gpu_ids=self.gpu_ids)            \n        self.load_network(self.netG, 'G', opt.which_epoch)\n\n        print('---------- Networks initialized -------------')\n\n    def toTensor(self, img, normalize=False):\n        tensor = torch.from_numpy(np.array(img, np.int32, copy=False))\n        tensor = tensor.view(1, img.size[1], img.size[0], len(img.mode))    \n        tensor = tensor.transpose(1, 2).transpose(1, 3).contiguous()\n        if normalize:\n            return (tensor.float()/255.0 - 0.5) / 0.5        \n        return tensor.float()\n\n    def load_image(self, label_path, inst_path, feat_path):\n        opt = self.opt\n        # read label map\n        label_img = Image.open(label_path)    \n        if label_path.find('face') != -1:\n            label_img = label_img.convert('L')\n        ow, oh = label_img.size    \n        w = opt.loadSize\n        h = int(w * oh / ow)    \n        label_img = label_img.resize((w, h), Image.NEAREST)\n        label_map = self.toTensor(label_img)           \n        \n        # 
onehot vector input for label map\n        self.label_map = label_map.cuda()\n        oneHot_size = (1, opt.label_nc, h, w)\n        input_label = self.Tensor(torch.Size(oneHot_size)).zero_()\n        self.input_label = input_label.scatter_(1, label_map.long().cuda(), 1.0)\n\n        # read instance map\n        if not opt.no_instance:\n            inst_img = Image.open(inst_path)        \n            inst_img = inst_img.resize((w, h), Image.NEAREST)            \n            self.inst_map = self.toTensor(inst_img).cuda()\n            self.edge_map = self.get_edges(self.inst_map)          \n            self.net_input = Variable(torch.cat((self.input_label, self.edge_map), dim=1), volatile=True)\n        else:\n            self.net_input = Variable(self.input_label, volatile=True)  \n        \n        self.features_clustered = np.load(feat_path).item()\n        self.object_map = self.inst_map if opt.instance_feat else self.label_map \n                       \n        object_np = self.object_map.cpu().numpy().astype(int) \n        self.feat_map = self.Tensor(1, opt.feat_num, h, w).zero_()                 \n        self.cluster_indices = np.zeros(self.opt.label_nc, np.uint8)\n        for i in np.unique(object_np):    \n            label = i if i < 1000 else i//1000\n            if label in self.features_clustered:\n                feat = self.features_clustered[label]\n                np.random.seed(i+1)\n                cluster_idx = np.random.randint(0, feat.shape[0])\n                self.cluster_indices[label] = cluster_idx\n                idx = (self.object_map == i).nonzero()                    \n                self.set_features(idx, feat, cluster_idx)\n\n        self.net_input_original = self.net_input.clone()        \n        self.label_map_original = self.label_map.clone()\n        self.feat_map_original = self.feat_map.clone()\n        if not opt.no_instance:\n            self.inst_map_original = self.inst_map.clone()        \n\n    def reset(self):\n       
 self.net_input = self.net_input_prev = self.net_input_original.clone()        \n        self.label_map = self.label_map_prev = self.label_map_original.clone()\n        self.feat_map = self.feat_map_prev = self.feat_map_original.clone()\n        if not self.opt.no_instance:\n            self.inst_map = self.inst_map_prev = self.inst_map_original.clone()\n        self.object_map = self.inst_map if self.opt.instance_feat else self.label_map \n\n    def undo(self):        \n        self.net_input = self.net_input_prev\n        self.label_map = self.label_map_prev\n        self.feat_map = self.feat_map_prev\n        if not self.opt.no_instance:\n            self.inst_map = self.inst_map_prev\n        self.object_map = self.inst_map if self.opt.instance_feat else self.label_map \n            \n    # get boundary map from instance map\n    def get_edges(self, t):\n        edge = torch.cuda.ByteTensor(t.size()).zero_()\n        edge[:,:,:,1:] = edge[:,:,:,1:] | (t[:,:,:,1:] != t[:,:,:,:-1])\n        edge[:,:,:,:-1] = edge[:,:,:,:-1] | (t[:,:,:,1:] != t[:,:,:,:-1])\n        edge[:,:,1:,:] = edge[:,:,1:,:] | (t[:,:,1:,:] != t[:,:,:-1,:])\n        edge[:,:,:-1,:] = edge[:,:,:-1,:] | (t[:,:,1:,:] != t[:,:,:-1,:])\n        return edge.float()\n\n    # change the label at the source position to the label at the target position\n    def change_labels(self, click_src, click_tgt): \n        y_src, x_src = click_src[0], click_src[1]\n        y_tgt, x_tgt = click_tgt[0], click_tgt[1]\n        label_src = int(self.label_map[0, 0, y_src, x_src])\n        inst_src = self.inst_map[0, 0, y_src, x_src]\n        label_tgt = int(self.label_map[0, 0, y_tgt, x_tgt])\n        inst_tgt = self.inst_map[0, 0, y_tgt, x_tgt]\n\n        idx_src = (self.inst_map == inst_src).nonzero()         \n        # need to change 3 things: label map, instance map, and feature map\n        if idx_src.shape:\n            # backup current maps\n            self.backup_current_state() \n\n            # change both 
the label map and the network input\n            self.label_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = label_tgt\n            self.net_input[idx_src[:,0], idx_src[:,1] + label_src, idx_src[:,2], idx_src[:,3]] = 0\n            self.net_input[idx_src[:,0], idx_src[:,1] + label_tgt, idx_src[:,2], idx_src[:,3]] = 1                                    \n            \n            # update the instance map (and the network input)\n            if inst_tgt > 1000:\n                # if different instances have different ids, give the new object a new id\n                tgt_indices = (self.inst_map > label_tgt * 1000) & (self.inst_map < (label_tgt+1) * 1000)\n                inst_tgt = self.inst_map[tgt_indices].max() + 1\n            self.inst_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = inst_tgt\n            self.net_input[:,-1,:,:] = self.get_edges(self.inst_map)\n\n            # also copy the source features to the target position      \n            idx_tgt = (self.inst_map == inst_tgt).nonzero()    \n            if idx_tgt.shape:\n                self.copy_features(idx_src, idx_tgt[0,:])\n\n        self.fake_image = util.tensor2im(self.single_forward(self.net_input, self.feat_map))\n\n    # add strokes of target label in the image\n    def add_strokes(self, click_src, label_tgt, bw, save):\n        # get the region of the new strokes (bw is the brush width)        \n        size = self.net_input.size()\n        h, w = size[2], size[3]\n        idx_src = torch.LongTensor(bw**2, 4).fill_(0)\n        for i in range(bw):\n            idx_src[i*bw:(i+1)*bw, 2] = min(h-1, max(0, click_src[0]-bw//2 + i))\n            for j in range(bw):\n                idx_src[i*bw+j, 3] = min(w-1, max(0, click_src[1]-bw//2 + j))\n        idx_src = idx_src.cuda()\n        \n        # again, need to update 3 things\n        if idx_src.shape:\n            # backup current maps\n            if save:\n                self.backup_current_state()\n\n            # 
update the label map (and the network input) in the stroke region            \n            self.label_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = label_tgt\n            for k in range(self.opt.label_nc):\n                self.net_input[idx_src[:,0], idx_src[:,1] + k, idx_src[:,2], idx_src[:,3]] = 0\n            self.net_input[idx_src[:,0], idx_src[:,1] + label_tgt, idx_src[:,2], idx_src[:,3]] = 1                 \n\n            # update the instance map (and the network input)\n            self.inst_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = label_tgt\n            self.net_input[:,-1,:,:] = self.get_edges(self.inst_map)\n            \n            # also update the features if available\n            if self.opt.instance_feat:                                            \n                feat = self.features_clustered[label_tgt]\n                #np.random.seed(label_tgt+1)   \n                #cluster_idx = np.random.randint(0, feat.shape[0])\n                cluster_idx = self.cluster_indices[label_tgt]\n                self.set_features(idx_src, feat, cluster_idx)                                                  \n        \n        self.fake_image = util.tensor2im(self.single_forward(self.net_input, self.feat_map))\n\n    # add an object to the clicked position with selected style\n    def add_objects(self, click_src, label_tgt, mask, style_id=0):\n        y, x = click_src[0], click_src[1]\n        mask = np.transpose(mask, (2, 0, 1))[np.newaxis,...]        
\n        idx_src = torch.from_numpy(mask).cuda().nonzero()        \n        idx_src[:,2] += y\n        idx_src[:,3] += x\n\n        # backup current maps\n        self.backup_current_state()\n\n        # update label map\n        self.label_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = label_tgt        \n        for k in range(self.opt.label_nc):\n            self.net_input[idx_src[:,0], idx_src[:,1] + k, idx_src[:,2], idx_src[:,3]] = 0\n        self.net_input[idx_src[:,0], idx_src[:,1] + label_tgt, idx_src[:,2], idx_src[:,3]] = 1            \n\n        # update instance map\n        self.inst_map[idx_src[:,0], idx_src[:,1], idx_src[:,2], idx_src[:,3]] = label_tgt\n        self.net_input[:,-1,:,:] = self.get_edges(self.inst_map)\n                \n        # update feature map\n        self.set_features(idx_src, self.feat, style_id)                \n        \n        self.fake_image = util.tensor2im(self.single_forward(self.net_input, self.feat_map))\n\n    def single_forward(self, net_input, feat_map):\n        net_input = torch.cat((net_input, feat_map), dim=1)\n        fake_image = self.netG.forward(net_input)\n\n        if fake_image.size()[0] == 1:\n            return fake_image.data[0]        \n        return fake_image.data\n\n\n    # generate all outputs for different styles\n    def style_forward(self, click_pt, style_id=-1):           \n        if click_pt is None:            \n            self.fake_image = util.tensor2im(self.single_forward(self.net_input, self.feat_map))\n            self.crop = None\n            self.mask = None        \n        else:                       \n            instToChange = int(self.object_map[0, 0, click_pt[0], click_pt[1]])\n            self.instToChange = instToChange\n            label = instToChange if instToChange < 1000 else instToChange//1000        \n            self.feat = self.features_clustered[label]\n            self.fake_image = []\n            self.mask = self.object_map == instToChange\n      
      idx = self.mask.nonzero()\n            self.get_crop_region(idx)            \n            if idx.size():                \n                if style_id == -1:\n                    (min_y, min_x, max_y, max_x) = self.crop\n                    ### original\n                    for cluster_idx in range(self.opt.multiple_output):\n                        self.set_features(idx, self.feat, cluster_idx)\n                        fake_image = self.single_forward(self.net_input, self.feat_map)\n                        fake_image = util.tensor2im(fake_image[:,min_y:max_y,min_x:max_x])\n                        self.fake_image.append(fake_image)    \n                    \"\"\"### To speed up previewing different style results, either crop or downsample the label maps\n                    if instToChange > 1000:\n                        (min_y, min_x, max_y, max_x) = self.crop                                                \n                        ### crop                                                \n                        _, _, h, w = self.net_input.size()\n                        offset = 512\n                        y_start, x_start = max(0, min_y-offset), max(0, min_x-offset)\n                        y_end, x_end = min(h, (max_y + offset)), min(w, (max_x + offset))\n                        y_region = slice(y_start, y_start+(y_end-y_start)//16*16)\n                        x_region = slice(x_start, x_start+(x_end-x_start)//16*16)\n                        net_input = self.net_input[:,:,y_region,x_region]                    \n                        for cluster_idx in range(self.opt.multiple_output):  \n                            self.set_features(idx, self.feat, cluster_idx)\n                            fake_image = self.single_forward(net_input, self.feat_map[:,:,y_region,x_region])                            \n                            fake_image = util.tensor2im(fake_image[:,min_y-y_start:max_y-y_start,min_x-x_start:max_x-x_start])\n                            
self.fake_image.append(fake_image)\n                    else:\n                        ### downsample\n                        (min_y, min_x, max_y, max_x) = [crop//2 for crop in self.crop]                    \n                        net_input = self.net_input[:,:,::2,::2]                    \n                        size = net_input.size()\n                        net_input_batch = net_input.expand(self.opt.multiple_output, size[1], size[2], size[3])             \n                        for cluster_idx in range(self.opt.multiple_output):  \n                            self.set_features(idx, self.feat, cluster_idx)\n                            feat_map = self.feat_map[:,:,::2,::2]\n                            if cluster_idx == 0:\n                                feat_map_batch = feat_map\n                            else:\n                                feat_map_batch = torch.cat((feat_map_batch, feat_map), dim=0)\n                        fake_image_batch = self.single_forward(net_input_batch, feat_map_batch)\n                        for i in range(self.opt.multiple_output):\n                            self.fake_image.append(util.tensor2im(fake_image_batch[i,:,min_y:max_y,min_x:max_x]))\"\"\"\n                                        \n                else:\n                    self.set_features(idx, self.feat, style_id)\n                    self.cluster_indices[label] = style_id\n                    self.fake_image = util.tensor2im(self.single_forward(self.net_input, self.feat_map))        \n\n    def backup_current_state(self):\n        self.net_input_prev = self.net_input.clone()\n        self.label_map_prev = self.label_map.clone() \n        self.inst_map_prev = self.inst_map.clone() \n        self.feat_map_prev = self.feat_map.clone() \n\n    # crop the ROI and get the mask of the object\n    def get_crop_region(self, idx):\n        size = self.net_input.size()\n        h, w = size[2], size[3]\n        min_y, min_x = idx[:,2].min(), idx[:,3].min()\n        
max_y, max_x = idx[:,2].max(), idx[:,3].max()             \n        crop_min = 128\n        if max_y - min_y < crop_min:\n            min_y = max(0, (max_y + min_y) // 2 - crop_min // 2)\n            max_y = min(h-1, min_y + crop_min)\n        if max_x - min_x < crop_min:\n            min_x = max(0, (max_x + min_x) // 2 - crop_min // 2)\n            max_x = min(w-1, min_x + crop_min)\n        self.crop = (min_y, min_x, max_y, max_x)           \n        self.mask = self.mask[:,:, min_y:max_y, min_x:max_x]\n\n    # update the feature map once a new object is added or the label is changed\n    def update_features(self, cluster_idx, mask=None, click_pt=None):        \n        self.feat_map_prev = self.feat_map.clone()\n        # adding a new object\n        if mask is not None:\n            y, x = click_pt[0], click_pt[1]\n            mask = np.transpose(mask, (2,0,1))[np.newaxis,...]        \n            idx = torch.from_numpy(mask).cuda().nonzero()        \n            idx[:,2] += y\n            idx[:,3] += x    \n        # changing the label of an existing object \n        else:            \n            idx = (self.object_map == self.instToChange).nonzero()              \n\n        # update feature map\n        self.set_features(idx, self.feat, cluster_idx)        \n\n    # set the class features to the target feature\n    def set_features(self, idx, feat, cluster_idx):        \n        for k in range(self.opt.feat_num):\n            self.feat_map[idx[:,0], idx[:,1] + k, idx[:,2], idx[:,3]] = feat[cluster_idx, k] \n\n    # copy the features at the target position to the source position\n    def copy_features(self, idx_src, idx_tgt):        \n        for k in range(self.opt.feat_num):\n            val = self.feat_map[idx_tgt[0], idx_tgt[1] + k, idx_tgt[2], idx_tgt[3]]\n            self.feat_map[idx_src[:,0], idx_src[:,1] + k, idx_src[:,2], idx_src[:,3]] = val \n\n    def get_current_visuals(self, getLabel=False):                              \n        mask = 
self.mask     \n        if self.mask is not None:\n            mask = np.transpose(self.mask[0].cpu().float().numpy(), (1,2,0)).astype(np.uint8)        \n\n        dict_list = [('fake_image', self.fake_image), ('mask', mask)]\n\n        if getLabel: # only output label map if needed to save bandwidth\n            label = util.tensor2label(self.net_input.data[0], self.opt.label_nc)                    \n            dict_list += [('label', label)]\n\n        return OrderedDict(dict_list)"
  },
  {
    "path": "options/__init__.py",
    "content": ""
  },
  {
    "path": "options/base_options.py",
    "content": "import argparse\nimport os\nfrom util import util\nimport torch\n\nclass BaseOptions():\n    def __init__(self):\n        self.parser = argparse.ArgumentParser()\n        self.initialized = False\n\n    def initialize(self):    \n        # experiment specifics\n        self.parser.add_argument('--name', type=str, default='label2city', help='name of the experiment. It decides where to store samples and models')        \n        self.parser.add_argument('--gpu_ids', type=str, default='0', help='gpu ids: e.g. 0  0,1,2, 0,2. use -1 for CPU')\n        self.parser.add_argument('--checkpoints_dir', type=str, default='./checkpoints', help='models are saved here')\n        self.parser.add_argument('--model', type=str, default='pix2pixHD', help='which model to use')\n        self.parser.add_argument('--norm', type=str, default='instance', help='instance normalization or batch normalization')        \n        self.parser.add_argument('--use_dropout', action='store_true', help='use dropout for the generator')\n        self.parser.add_argument('--data_type', default=32, type=int, choices=[8, 16, 32], help=\"Supported data type i.e. 
8, 16, 32 bit\")\n        self.parser.add_argument('--verbose', action='store_true', default=False, help='toggles verbose')\n        self.parser.add_argument('--fp16', action='store_true', default=False, help='train with AMP')\n        self.parser.add_argument('--local_rank', type=int, default=0, help='local rank for distributed training')\n\n        # input/output sizes       \n        self.parser.add_argument('--batchSize', type=int, default=1, help='input batch size')\n        self.parser.add_argument('--loadSize', type=int, default=1024, help='scale images to this size')\n        self.parser.add_argument('--fineSize', type=int, default=512, help='then crop to this size')\n        self.parser.add_argument('--label_nc', type=int, default=35, help='# of input label channels')\n        self.parser.add_argument('--input_nc', type=int, default=3, help='# of input image channels')\n        self.parser.add_argument('--output_nc', type=int, default=3, help='# of output image channels')\n\n        # for setting inputs\n        self.parser.add_argument('--dataroot', type=str, default='./datasets/cityscapes/') \n        self.parser.add_argument('--resize_or_crop', type=str, default='scale_width', help='scaling and cropping of images at load time [resize_and_crop|crop|scale_width|scale_width_and_crop]')\n        self.parser.add_argument('--serial_batches', action='store_true', help='if true, takes images in order to make batches, otherwise takes them randomly')        \n        self.parser.add_argument('--no_flip', action='store_true', help='if specified, do not flip the images for data argumentation') \n        self.parser.add_argument('--nThreads', default=2, type=int, help='# threads for loading data')                \n        self.parser.add_argument('--max_dataset_size', type=int, default=float(\"inf\"), help='Maximum number of samples allowed per dataset. 
If the dataset directory contains more than max_dataset_size, only a subset is loaded.')\n\n        # for displays\n        self.parser.add_argument('--display_winsize', type=int, default=512,  help='display window size')\n        self.parser.add_argument('--tf_log', action='store_true', help='if specified, use tensorboard logging. Requires tensorflow installed')\n\n        # for generator\n        self.parser.add_argument('--netG', type=str, default='global', help='selects model to use for netG')\n        self.parser.add_argument('--ngf', type=int, default=64, help='# of gen filters in first conv layer')\n        self.parser.add_argument('--n_downsample_global', type=int, default=4, help='number of downsampling layers in netG') \n        self.parser.add_argument('--n_blocks_global', type=int, default=9, help='number of residual blocks in the global generator network')\n        self.parser.add_argument('--n_blocks_local', type=int, default=3, help='number of residual blocks in the local enhancer network')\n        self.parser.add_argument('--n_local_enhancers', type=int, default=1, help='number of local enhancers to use')        \n        self.parser.add_argument('--niter_fix_global', type=int, default=0, help='number of epochs that we only train the outmost local enhancer')        \n\n        # for instance-wise features\n        self.parser.add_argument('--no_instance', action='store_true', help='if specified, do *not* add instance map as input')        \n        self.parser.add_argument('--instance_feat', action='store_true', help='if specified, add encoded instance features as input')\n        self.parser.add_argument('--label_feat', action='store_true', help='if specified, add encoded label features as input')        \n        self.parser.add_argument('--feat_num', type=int, default=3, help='vector length for encoded features')        \n        self.parser.add_argument('--load_features', action='store_true', help='if specified, load precomputed feature 
maps')\n        self.parser.add_argument('--n_downsample_E', type=int, default=4, help='# of downsampling layers in encoder') \n        self.parser.add_argument('--nef', type=int, default=16, help='# of encoder filters in the first conv layer')        \n        self.parser.add_argument('--n_clusters', type=int, default=10, help='number of clusters for features')        \n\n        self.initialized = True\n\n    def parse(self, save=True):\n        if not self.initialized:\n            self.initialize()\n        self.opt = self.parser.parse_args()\n        self.opt.isTrain = self.isTrain   # train or test\n\n        str_ids = self.opt.gpu_ids.split(',')\n        self.opt.gpu_ids = []\n        for str_id in str_ids:\n            id = int(str_id)\n            if id >= 0:\n                self.opt.gpu_ids.append(id)\n        \n        # set gpu ids\n        if len(self.opt.gpu_ids) > 0:\n            torch.cuda.set_device(self.opt.gpu_ids[0])\n\n        args = vars(self.opt)\n\n        print('------------ Options -------------')\n        for k, v in sorted(args.items()):\n            print('%s: %s' % (str(k), str(v)))\n        print('-------------- End ----------------')\n\n        # save to the disk        \n        expr_dir = os.path.join(self.opt.checkpoints_dir, self.opt.name)\n        util.mkdirs(expr_dir)\n        if save and not self.opt.continue_train:\n            file_name = os.path.join(expr_dir, 'opt.txt')\n            with open(file_name, 'wt') as opt_file:\n                opt_file.write('------------ Options -------------\\n')\n                for k, v in sorted(args.items()):\n                    opt_file.write('%s: %s\\n' % (str(k), str(v)))\n                opt_file.write('-------------- End ----------------\\n')\n        return self.opt\n"
  },
  {
    "path": "options/test_options.py",
    "content": "from .base_options import BaseOptions\n\nclass TestOptions(BaseOptions):\n    def initialize(self):\n        BaseOptions.initialize(self)\n        self.parser.add_argument('--ntest', type=int, default=float(\"inf\"), help='# of test examples.')\n        self.parser.add_argument('--results_dir', type=str, default='./results/', help='saves results here.')\n        self.parser.add_argument('--aspect_ratio', type=float, default=1.0, help='aspect ratio of result images')\n        self.parser.add_argument('--phase', type=str, default='test', help='train, val, test, etc')\n        self.parser.add_argument('--which_epoch', type=str, default='latest', help='which epoch to load? set to latest to use latest cached model')\n        self.parser.add_argument('--how_many', type=int, default=50, help='how many test images to run')       \n        self.parser.add_argument('--cluster_path', type=str, default='features_clustered_010.npy', help='the path for clustered results of encoded features')\n        self.parser.add_argument('--use_encoded_image', action='store_true', help='if specified, encode the real image to get the feature map')\n        self.parser.add_argument(\"--export_onnx\", type=str, help=\"export ONNX model to a given file\")\n        self.parser.add_argument(\"--engine\", type=str, help=\"run serialized TRT engine\")\n        self.parser.add_argument(\"--onnx\", type=str, help=\"run ONNX model via TRT\")        \n        self.isTrain = False\n"
  },
  {
    "path": "options/train_options.py",
    "content": "from .base_options import BaseOptions\n\nclass TrainOptions(BaseOptions):\n    def initialize(self):\n        BaseOptions.initialize(self)\n        # for displays\n        self.parser.add_argument('--display_freq', type=int, default=100, help='frequency of showing training results on screen')\n        self.parser.add_argument('--print_freq', type=int, default=100, help='frequency of showing training results on console')\n        self.parser.add_argument('--save_latest_freq', type=int, default=1000, help='frequency of saving the latest results')\n        self.parser.add_argument('--save_epoch_freq', type=int, default=10, help='frequency of saving checkpoints at the end of epochs')        \n        self.parser.add_argument('--no_html', action='store_true', help='do not save intermediate training results to [opt.checkpoints_dir]/[opt.name]/web/')\n        self.parser.add_argument('--debug', action='store_true', help='only do one epoch and displays at each iteration')\n\n        # for training\n        self.parser.add_argument('--continue_train', action='store_true', help='continue training: load the latest model')\n        self.parser.add_argument('--load_pretrain', type=str, default='', help='load the pretrained model from the specified location')\n        self.parser.add_argument('--which_epoch', type=str, default='latest', help='which epoch to load? 
set to latest to use latest cached model')\n        self.parser.add_argument('--phase', type=str, default='train', help='train, val, test, etc')\n        self.parser.add_argument('--niter', type=int, default=100, help='# of iter at starting learning rate')\n        self.parser.add_argument('--niter_decay', type=int, default=100, help='# of iter to linearly decay learning rate to zero')\n        self.parser.add_argument('--beta1', type=float, default=0.5, help='momentum term of adam')\n        self.parser.add_argument('--lr', type=float, default=0.0002, help='initial learning rate for adam')\n\n        # for discriminators        \n        self.parser.add_argument('--num_D', type=int, default=2, help='number of discriminators to use')\n        self.parser.add_argument('--n_layers_D', type=int, default=3, help='only used if which_model_netD==n_layers')\n        self.parser.add_argument('--ndf', type=int, default=64, help='# of discrim filters in first conv layer')    \n        self.parser.add_argument('--lambda_feat', type=float, default=10.0, help='weight for feature matching loss')                \n        self.parser.add_argument('--no_ganFeat_loss', action='store_true', help='if specified, do *not* use discriminator feature matching loss')\n        self.parser.add_argument('--no_vgg_loss', action='store_true', help='if specified, do *not* use VGG feature matching loss')        \n        self.parser.add_argument('--no_lsgan', action='store_true', help='do *not* use least square GAN, if false, use vanilla GAN')\n        self.parser.add_argument('--pool_size', type=int, default=0, help='the size of image buffer that stores previously generated images')\n\n        self.isTrain = True\n"
  },
  {
    "path": "precompute_feature_maps.py",
    "content": "from options.train_options import TrainOptions\r\nfrom data.data_loader import CreateDataLoader\r\nfrom models.models import create_model\r\nimport os\r\nimport util.util as util\r\nfrom torch.autograd import Variable\r\nimport torch.nn as nn\r\n\r\nopt = TrainOptions().parse()\r\nopt.nThreads = 1\r\nopt.batchSize = 1 \r\nopt.serial_batches = True \r\nopt.no_flip = True\r\nopt.instance_feat = True\r\n\r\nname = 'features'\r\nsave_path = os.path.join(opt.checkpoints_dir, opt.name)\r\n\r\n############ Initialize #########\r\ndata_loader = CreateDataLoader(opt)\r\ndataset = data_loader.load_data()\r\ndataset_size = len(data_loader)\r\nmodel = create_model(opt)\r\nutil.mkdirs(os.path.join(opt.dataroot, opt.phase + '_feat'))\r\n\r\n######## Save precomputed feature maps for 1024p training #######\r\nfor i, data in enumerate(dataset):\r\n\tprint('%d / %d images' % (i+1, dataset_size)) \r\n\tfeat_map = model.module.netE.forward(Variable(data['image'].cuda(), volatile=True), data['inst'].cuda())\r\n\tfeat_map = nn.Upsample(scale_factor=2, mode='nearest')(feat_map)\r\n\timage_numpy = util.tensor2im(feat_map.data[0])\r\n\tsave_path = data['path'][0].replace('/train_label/', '/train_feat/')\r\n\tutil.save_image(image_numpy, save_path)"
  },
  {
    "path": "run_engine.py",
    "content": "import os\nimport sys\nfrom random import randint\nimport numpy as np\nimport tensorrt\n\ntry:\n    from PIL import Image\n    import pycuda.driver as cuda\n    import pycuda.gpuarray as gpuarray\n    import pycuda.autoinit\n    import argparse\nexcept ImportError as err:\n    sys.stderr.write(\"\"\"ERROR: failed to import module ({})\nPlease make sure you have pycuda and the example dependencies installed.\nhttps://wiki.tiker.net/PyCuda/Installation/Linux\npip(3) install tensorrt[examples]\n\"\"\".format(err))\n    exit(1)\n\ntry:\n    import tensorrt as trt\n    from tensorrt.parsers import caffeparser\n    from tensorrt.parsers import onnxparser    \nexcept ImportError as err:\n    sys.stderr.write(\"\"\"ERROR: failed to import module ({})\nPlease make sure you have the TensorRT Library installed\nand accessible in your LD_LIBRARY_PATH\n\"\"\".format(err))\n    exit(1)\n\n\nG_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.INFO)\n\nclass Profiler(trt.infer.Profiler):\n    \"\"\"\n    Example implementation of a Profiler.\n    It is identical to the Profiler class in trt.infer, so that class\n    can be used directly instead of this one if no further\n    functionality is needed.\n    \"\"\"\n    def __init__(self, timing_iter):\n        trt.infer.Profiler.__init__(self)\n        self.timing_iterations = timing_iter\n        self.profile = []\n\n    def report_layer_time(self, layerName, ms):\n        record = next((r for r in self.profile if r[0] == layerName), (None, None))\n        if record == (None, None):\n            self.profile.append((layerName, ms))\n        else:\n            self.profile[self.profile.index(record)] = (record[0], record[1] + ms)\n\n    def print_layer_times(self):\n        totalTime = 0\n        for i in range(len(self.profile)):\n            print(\"{:40.40} {:4.3f}ms\".format(self.profile[i][0], self.profile[i][1] / self.timing_iterations))\n            totalTime += self.profile[i][1]\n        print(\"Time 
over all layers: {:4.2f} ms per iteration\".format(totalTime / self.timing_iterations))\n\n\ndef get_input_output_names(trt_engine):\n    nbindings = trt_engine.get_nb_bindings()\n    maps = []\n\n    for b in range(0, nbindings):\n        dims = trt_engine.get_binding_dimensions(b).to_DimsCHW()\n        name = trt_engine.get_binding_name(b)\n        type = trt_engine.get_binding_data_type(b)\n        \n        if (trt_engine.binding_is_input(b)):\n            maps.append(name)\n            print(\"Found input: \", name)\n        else:\n            maps.append(name)\n            print(\"Found output: \", name)\n\n        print(\"shape=\" + str(dims.C()) + \" , \" + str(dims.H()) + \" , \" + str(dims.W()))\n        print(\"dtype=\" + str(type))\n    return maps\n\ndef create_memory(engine, name,  buf, mem, batchsize, inp, inp_idx):\n    binding_idx = engine.get_binding_index(name)\n    if binding_idx == -1:\n        raise AttributeError(\"Not a valid binding\")\n    print(\"Binding: name={}, bindingIndex={}\".format(name, str(binding_idx)))\n    dims = engine.get_binding_dimensions(binding_idx).to_DimsCHW()\n    eltCount = dims.C() * dims.H() * dims.W() * batchsize\n\n    if engine.binding_is_input(binding_idx):\n        h_mem = inp[inp_idx]\n        inp_idx = inp_idx + 1\n    else:\n        h_mem = np.random.uniform(0.0, 255.0, eltCount).astype(np.dtype('f4'))\n\n    d_mem = cuda.mem_alloc(eltCount * 4)\n    cuda.memcpy_htod(d_mem, h_mem)\n    buf.insert(binding_idx, int(d_mem))\n    mem.append(d_mem)\n    return inp_idx\n\n\n# Run inference on device\ndef time_inference(engine, batch_size, inp):\n    bindings = []\n    mem = []\n    inp_idx = 0\n    for io in get_input_output_names(engine):\n        inp_idx = create_memory(engine, io,  bindings, mem,\n                                batch_size, inp, inp_idx)\n\n    context = engine.create_execution_context()\n    g_prof = Profiler(500)\n    context.set_profiler(g_prof)\n    # run as many iterations as the profiler averages over\n    for i in range(g_prof.timing_iterations):\n        context.execute(batch_size, bindings)\n    g_prof.print_layer_times()\n    \n    context.destroy() \n    return\n\n\ndef convert_to_datatype(v):\n    if v==8:\n        return trt.infer.DataType.INT8\n    elif v==16:\n        return trt.infer.DataType.HALF\n    elif v==32:\n        return trt.infer.DataType.FLOAT\n    else:\n        print(\"ERROR: Invalid model data type bit depth: \" + str(v))\n        return trt.infer.DataType.INT8\n\ndef run_trt_engine(engine_file, bs, it):\n    engine = trt.utils.load_engine(G_LOGGER, engine_file)\n    time_inference(engine, bs, it)\n\ndef run_onnx(onnx_file, data_type, bs, inp):\n    # Create onnx_config\n    apex = onnxparser.create_onnxconfig()\n    apex.set_model_file_name(onnx_file)\n    apex.set_model_dtype(convert_to_datatype(data_type))\n\n    # create parser\n    trt_parser = onnxparser.create_onnxparser(apex)\n    assert(trt_parser)\n    data_type = apex.get_model_dtype()\n    onnx_filename = apex.get_model_file_name()\n    trt_parser.parse(onnx_filename, data_type)\n    trt_parser.report_parsing_info()\n    trt_parser.convert_to_trtnetwork()\n    trt_network = trt_parser.get_trtnetwork()\n    assert(trt_network)\n\n    # create infer builder\n    trt_builder = trt.infer.create_infer_builder(G_LOGGER)\n    # the original referenced undefined max_batch_size / max_workspace_size;\n    # using the requested batch size and an assumed 1 GiB workspace instead\n    trt_builder.set_max_batch_size(bs)\n    trt_builder.set_max_workspace_size(1 << 30)\n    \n    if (apex.get_model_dtype() == trt.infer.DataType_kHALF):\n        print(\"-------------------  Running FP16 -----------------------------\")\n        trt_builder.set_half2_mode(True)\n    elif (apex.get_model_dtype() == trt.infer.DataType_kINT8): \n        print(\"-------------------  Running INT8 -----------------------------\")\n        trt_builder.set_int8_mode(True)\n    else:\n        print(\"-------------------  Running FP32 -----------------------------\")\n        \n    print(\"----- Builder is Done -----\")\n    print(\"----- Creating Engine -----\")\n    trt_engine = 
trt_builder.build_cuda_engine(trt_network)\n    print(\"----- Engine is built -----\")\n    time_inference(trt_engine, bs, inp)\n"
  },
  {
    "path": "scripts/test_1024p.sh",
    "content": "#!/bin/bash\n################################ Testing ################################\n# labels only\npython test.py --name label2city_1024p --netG local --ngf 32 --resize_or_crop none \"$@\"\n"
  },
  {
    "path": "scripts/test_1024p_feat.sh",
    "content": "################################ Testing ################################\r\n# first precompute and cluster all features\r\npython encode_features.py --name label2city_1024p_feat --netG local --ngf 32 --resize_or_crop none;\r\n# use instance-wise features\r\npython test.py --name label2city_1024p_feat --netG local --ngf 32 --resize_or_crop none --instance_feat"
  },
  {
    "path": "scripts/test_512p.sh",
    "content": "################################ Testing ################################\r\n# labels only\r\npython test.py --name label2city_512p"
  },
  {
    "path": "scripts/test_512p_feat.sh",
    "content": "################################ Testing ################################\r\n# first precompute and cluster all features\r\npython encode_features.py --name label2city_512p_feat;\r\n# use instance-wise features\r\npython test.py --name label2city_512p_feat --instance_feat"
  },
  {
    "path": "scripts/train_1024p_12G.sh",
    "content": "############## To train images at 2048 x 1024 resolution after training 1024 x 512 resolution models #############\r\n##### Using GPUs with 12G memory (not tested)\r\n# Using labels only\r\npython train.py --name label2city_1024p --netG local --ngf 32 --num_D 3 --load_pretrain checkpoints/label2city_512p/ --niter_fix_global 20 --resize_or_crop crop --fineSize 1024"
  },
  {
    "path": "scripts/train_1024p_24G.sh",
    "content": "############## To train images at 2048 x 1024 resolution after training 1024 x 512 resolution models #############\r\n######## Using GPUs with 24G memory\r\n# Using labels only\r\npython train.py --name label2city_1024p --netG local --ngf 32 --num_D 3 --load_pretrain checkpoints/label2city_512p/ --niter 50 --niter_decay 50 --niter_fix_global 10 --resize_or_crop none"
  },
  {
    "path": "scripts/train_1024p_feat_12G.sh",
    "content": "############## To train images at 2048 x 1024 resolution after training 1024 x 512 resolution models #############\r\n##### Using GPUs with 12G memory (not tested)\r\n# First precompute feature maps and save them\r\npython precompute_feature_maps.py --name label2city_512p_feat;\r\n# Adding instances and encoded features\r\npython train.py --name label2city_1024p_feat --netG local --ngf 32 --num_D 3 --load_pretrain checkpoints/label2city_512p_feat/ --niter_fix_global 20 --resize_or_crop crop --fineSize 896 --instance_feat --load_features"
  },
  {
    "path": "scripts/train_1024p_feat_24G.sh",
    "content": "############## To train images at 2048 x 1024 resolution after training 1024 x 512 resolution models #############\r\n######## Using GPUs with 24G memory\r\n# First precompute feature maps and save them\r\npython precompute_feature_maps.py --name label2city_512p_feat;\r\n# Adding instances and encoded features\r\npython train.py --name label2city_1024p_feat --netG local --ngf 32 --num_D 3 --load_pretrain checkpoints/label2city_512p_feat/ --niter 50 --niter_decay 50 --niter_fix_global 10 --resize_or_crop none --instance_feat --load_features"
  },
  {
    "path": "scripts/train_512p.sh",
    "content": "### Using labels only\r\npython train.py --name label2city_512p"
  },
  {
    "path": "scripts/train_512p_feat.sh",
    "content": "### Adding instances and encoded features\r\npython train.py --name label2city_512p_feat --instance_feat"
  },
  {
    "path": "scripts/train_512p_fp16.sh",
    "content": "### Using labels only\r\n python -m torch.distributed.launch train.py --name label2city_512p --fp16"
  },
  {
    "path": "scripts/train_512p_fp16_multigpu.sh",
    "content": "######## Multi-GPU training example #######\r\npython -m torch.distributed.launch train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7 --fp16"
  },
  {
    "path": "scripts/train_512p_multigpu.sh",
    "content": "######## Multi-GPU training example #######\r\npython train.py --name label2city_512p --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7"
  },
  {
    "path": "test.py",
    "content": "import os\nfrom collections import OrderedDict\nfrom torch.autograd import Variable\nfrom options.test_options import TestOptions\nfrom data.data_loader import CreateDataLoader\nfrom models.models import create_model\nimport util.util as util\nfrom util.visualizer import Visualizer\nfrom util import html\nimport torch\n\nopt = TestOptions().parse(save=False)\nopt.nThreads = 1   # test code only supports nThreads = 1\nopt.batchSize = 1  # test code only supports batchSize = 1\nopt.serial_batches = True  # no shuffle\nopt.no_flip = True  # no flip\n\ndata_loader = CreateDataLoader(opt)\ndataset = data_loader.load_data()\nvisualizer = Visualizer(opt)\n# create website\nweb_dir = os.path.join(opt.results_dir, opt.name, '%s_%s' % (opt.phase, opt.which_epoch))\nwebpage = html.HTML(web_dir, 'Experiment = %s, Phase = %s, Epoch = %s' % (opt.name, opt.phase, opt.which_epoch))\n\n# test\nif not opt.engine and not opt.onnx:\n    model = create_model(opt)\n    if opt.data_type == 16:\n        model.half()\n    elif opt.data_type == 8:\n        model.type(torch.uint8)\n            \n    if opt.verbose:\n        print(model)\nelse:\n    from run_engine import run_trt_engine, run_onnx\n    \nfor i, data in enumerate(dataset):\n    if i >= opt.how_many:\n        break\n    if opt.data_type == 16:\n        data['label'] = data['label'].half()\n        data['inst']  = data['inst'].half()\n    elif opt.data_type == 8:\n        data['label'] = data['label'].uint8()\n        data['inst']  = data['inst'].uint8()\n    if opt.export_onnx:\n        print (\"Exporting to ONNX: \", opt.export_onnx)\n        assert opt.export_onnx.endswith(\"onnx\"), \"Export model file should end with .onnx\"\n        torch.onnx.export(model, [data['label'], data['inst']],\n                          opt.export_onnx, verbose=True)\n        exit(0)\n    minibatch = 1 \n    if opt.engine:\n        generated = run_trt_engine(opt.engine, minibatch, [data['label'], data['inst']])\n    elif 
opt.onnx:\n        generated = run_onnx(opt.onnx, opt.data_type, minibatch, [data['label'], data['inst']])\n    else:        \n        generated = model.inference(data['label'], data['inst'], data['image'])\n        \n    visuals = OrderedDict([('input_label', util.tensor2label(data['label'][0], opt.label_nc)),\n                           ('synthesized_image', util.tensor2im(generated.data[0]))])\n    img_path = data['path']\n    print('process image... %s' % img_path)\n    visualizer.save_images(webpage, visuals, img_path)\n\nwebpage.save()\n"
  },
  {
    "path": "train.py",
    "content": "import time\nimport os\nimport numpy as np\nimport torch\nfrom torch.autograd import Variable\nfrom collections import OrderedDict\nfrom subprocess import call\nimport fractions\ndef lcm(a,b): return abs(a * b)/fractions.gcd(a,b) if a and b else 0\n\nfrom options.train_options import TrainOptions\nfrom data.data_loader import CreateDataLoader\nfrom models.models import create_model\nimport util.util as util\nfrom util.visualizer import Visualizer\n\nopt = TrainOptions().parse()\niter_path = os.path.join(opt.checkpoints_dir, opt.name, 'iter.txt')\nif opt.continue_train:\n    try:\n        start_epoch, epoch_iter = np.loadtxt(iter_path , delimiter=',', dtype=int)\n    except:\n        start_epoch, epoch_iter = 1, 0\n    print('Resuming from epoch %d at iteration %d' % (start_epoch, epoch_iter))        \nelse:    \n    start_epoch, epoch_iter = 1, 0\n\nopt.print_freq = lcm(opt.print_freq, opt.batchSize)    \nif opt.debug:\n    opt.display_freq = 1\n    opt.print_freq = 1\n    opt.niter = 1\n    opt.niter_decay = 0\n    opt.max_dataset_size = 10\n\ndata_loader = CreateDataLoader(opt)\ndataset = data_loader.load_data()\ndataset_size = len(data_loader)\nprint('#training images = %d' % dataset_size)\n\nmodel = create_model(opt)\nvisualizer = Visualizer(opt)\nif opt.fp16:    \n    from apex import amp\n    model, [optimizer_G, optimizer_D] = amp.initialize(model, [model.optimizer_G, model.optimizer_D], opt_level='O1')             \n    model = torch.nn.DataParallel(model, device_ids=opt.gpu_ids)\nelse:\n    optimizer_G, optimizer_D = model.module.optimizer_G, model.module.optimizer_D\n\ntotal_steps = (start_epoch-1) * dataset_size + epoch_iter\n\ndisplay_delta = total_steps % opt.display_freq\nprint_delta = total_steps % opt.print_freq\nsave_delta = total_steps % opt.save_latest_freq\n\nfor epoch in range(start_epoch, opt.niter + opt.niter_decay + 1):\n    epoch_start_time = time.time()\n    if epoch != start_epoch:\n        epoch_iter = epoch_iter % 
dataset_size\n    for i, data in enumerate(dataset, start=epoch_iter):\n        if total_steps % opt.print_freq == print_delta:\n            iter_start_time = time.time()\n        total_steps += opt.batchSize\n        epoch_iter += opt.batchSize\n\n        # whether to collect output images\n        save_fake = total_steps % opt.display_freq == display_delta\n\n        ############## Forward Pass ######################\n        losses, generated = model(Variable(data['label']), Variable(data['inst']), \n            Variable(data['image']), Variable(data['feat']), infer=save_fake)\n\n        # sum per device losses\n        losses = [ torch.mean(x) if not isinstance(x, int) else x for x in losses ]\n        loss_dict = dict(zip(model.module.loss_names, losses))\n\n        # calculate final loss scalar\n        loss_D = (loss_dict['D_fake'] + loss_dict['D_real']) * 0.5\n        loss_G = loss_dict['G_GAN'] + loss_dict.get('G_GAN_Feat',0) + loss_dict.get('G_VGG',0)\n\n        ############### Backward Pass ####################\n        # update generator weights\n        optimizer_G.zero_grad()\n        if opt.fp16:                                \n            with amp.scale_loss(loss_G, optimizer_G) as scaled_loss: scaled_loss.backward()                \n        else:\n            loss_G.backward()          \n        optimizer_G.step()\n\n        # update discriminator weights\n        optimizer_D.zero_grad()\n        if opt.fp16:                                \n            with amp.scale_loss(loss_D, optimizer_D) as scaled_loss: scaled_loss.backward()                \n        else:\n            loss_D.backward()        \n        optimizer_D.step()        \n\n        ############## Display results and errors ##########\n        ### print out errors\n        if total_steps % opt.print_freq == print_delta:\n            errors = {k: v.data.item() if not isinstance(v, int) else v for k, v in loss_dict.items()}            \n            t = (time.time() - iter_start_time) / 
opt.print_freq\n            visualizer.print_current_errors(epoch, epoch_iter, errors, t)\n            visualizer.plot_current_errors(errors, total_steps)\n            #call([\"nvidia-smi\", \"--format=csv\", \"--query-gpu=memory.used,memory.free\"]) \n\n        ### display output images\n        if save_fake:\n            visuals = OrderedDict([('input_label', util.tensor2label(data['label'][0], opt.label_nc)),\n                                   ('synthesized_image', util.tensor2im(generated.data[0])),\n                                   ('real_image', util.tensor2im(data['image'][0]))])\n            visualizer.display_current_results(visuals, epoch, total_steps)\n\n        ### save latest model\n        if total_steps % opt.save_latest_freq == save_delta:\n            print('saving the latest model (epoch %d, total_steps %d)' % (epoch, total_steps))\n            model.module.save('latest')            \n            np.savetxt(iter_path, (epoch, epoch_iter), delimiter=',', fmt='%d')\n\n        if epoch_iter >= dataset_size:\n            break\n       \n    # end of epoch \n    iter_end_time = time.time()\n    print('End of epoch %d / %d \\t Time Taken: %d sec' %\n          (epoch, opt.niter + opt.niter_decay, time.time() - epoch_start_time))\n\n    ### save model for this epoch\n    if epoch % opt.save_epoch_freq == 0:\n        print('saving the model at the end of epoch %d, iters %d' % (epoch, total_steps))        \n        model.module.save('latest')\n        model.module.save(epoch)\n        np.savetxt(iter_path, (epoch+1, 0), delimiter=',', fmt='%d')\n\n    ### instead of only training the local enhancer, train the entire network after certain iterations\n    if (opt.niter_fix_global != 0) and (epoch == opt.niter_fix_global):\n        model.module.update_fixed_params()\n\n    ### linearly decay learning rate after certain iterations\n    if epoch > opt.niter:\n        model.module.update_learning_rate()\n"
  },
  {
    "path": "util/__init__.py",
    "content": ""
  },
  {
    "path": "util/html.py",
    "content": "import dominate\nfrom dominate.tags import *\nimport os\n\n\nclass HTML:\n    def __init__(self, web_dir, title, refresh=0):\n        self.title = title\n        self.web_dir = web_dir\n        self.img_dir = os.path.join(self.web_dir, 'images')\n        if not os.path.exists(self.web_dir):\n            os.makedirs(self.web_dir)\n        if not os.path.exists(self.img_dir):\n            os.makedirs(self.img_dir)\n\n        self.doc = dominate.document(title=title)\n        if refresh > 0:\n            with self.doc.head:\n                meta(http_equiv=\"refresh\", content=str(refresh))\n\n    def get_image_dir(self):\n        return self.img_dir\n\n    def add_header(self, str):\n        with self.doc:\n            h3(str)\n\n    def add_table(self, border=1):\n        self.t = table(border=border, style=\"table-layout: fixed;\")\n        self.doc.add(self.t)\n\n    def add_images(self, ims, txts, links, width=512):\n        self.add_table()\n        with self.t:\n            with tr():\n                for im, txt, link in zip(ims, txts, links):\n                    with td(style=\"word-wrap: break-word;\", halign=\"center\", valign=\"top\"):\n                        with p():\n                            with a(href=os.path.join('images', link)):\n                                img(style=\"width:%dpx\" % (width), src=os.path.join('images', im))\n                            br()\n                            p(txt)\n\n    def save(self):\n        html_file = '%s/index.html' % self.web_dir\n        f = open(html_file, 'wt')\n        f.write(self.doc.render())\n        f.close()\n\n\nif __name__ == '__main__':\n    html = HTML('web/', 'test_html')\n    html.add_header('hello world')\n\n    ims = []\n    txts = []\n    links = []\n    for n in range(4):\n        ims.append('image_%d.jpg' % n)\n        txts.append('text_%d' % n)\n        links.append('image_%d.jpg' % n)\n    html.add_images(ims, txts, links)\n    html.save()\n"
  },
  {
    "path": "util/image_pool.py",
    "content": "import random\nimport torch\nfrom torch.autograd import Variable\nclass ImagePool():\n    def __init__(self, pool_size):\n        self.pool_size = pool_size\n        if self.pool_size > 0:\n            self.num_imgs = 0\n            self.images = []\n\n    def query(self, images):\n        if self.pool_size == 0:\n            return images\n        return_images = []\n        for image in images.data:\n            image = torch.unsqueeze(image, 0)\n            if self.num_imgs < self.pool_size:\n                self.num_imgs = self.num_imgs + 1\n                self.images.append(image)\n                return_images.append(image)\n            else:\n                p = random.uniform(0, 1)\n                if p > 0.5:\n                    random_id = random.randint(0, self.pool_size-1)\n                    tmp = self.images[random_id].clone()\n                    self.images[random_id] = image\n                    return_images.append(tmp)\n                else:\n                    return_images.append(image)\n        return_images = Variable(torch.cat(return_images, 0))\n        return return_images\n"
  },
  {
    "path": "util/util.py",
    "content": "from __future__ import print_function\nimport torch\nimport numpy as np\nfrom PIL import Image\nimport numpy as np\nimport os\n\n# Converts a Tensor into a Numpy array\n# |imtype|: the desired type of the converted numpy array\ndef tensor2im(image_tensor, imtype=np.uint8, normalize=True):\n    if isinstance(image_tensor, list):\n        image_numpy = []\n        for i in range(len(image_tensor)):\n            image_numpy.append(tensor2im(image_tensor[i], imtype, normalize))\n        return image_numpy\n    image_numpy = image_tensor.cpu().float().numpy()\n    if normalize:\n        image_numpy = (np.transpose(image_numpy, (1, 2, 0)) + 1) / 2.0 * 255.0\n    else:\n        image_numpy = np.transpose(image_numpy, (1, 2, 0)) * 255.0      \n    image_numpy = np.clip(image_numpy, 0, 255)\n    if image_numpy.shape[2] == 1 or image_numpy.shape[2] > 3:        \n        image_numpy = image_numpy[:,:,0]\n    return image_numpy.astype(imtype)\n\n# Converts a one-hot tensor into a colorful label map\ndef tensor2label(label_tensor, n_label, imtype=np.uint8):\n    if n_label == 0:\n        return tensor2im(label_tensor, imtype)\n    label_tensor = label_tensor.cpu().float()    \n    if label_tensor.size()[0] > 1:\n        label_tensor = label_tensor.max(0, keepdim=True)[1]\n    label_tensor = Colorize(n_label)(label_tensor)\n    label_numpy = np.transpose(label_tensor.numpy(), (1, 2, 0))\n    return label_numpy.astype(imtype)\n\ndef save_image(image_numpy, image_path):\n    image_pil = Image.fromarray(image_numpy)\n    image_pil.save(image_path)\n\ndef mkdirs(paths):\n    if isinstance(paths, list) and not isinstance(paths, str):\n        for path in paths:\n            mkdir(path)\n    else:\n        mkdir(paths)\n\ndef mkdir(path):\n    if not os.path.exists(path):\n        os.makedirs(path)\n\n###############################################################################\n# Code from\n# https://github.com/ycszen/pytorch-seg/blob/master/transform.py\n# Modified 
so it complies with the Cityscapes label map colors\n###############################################################################\ndef uint82bin(n, count=8):\n    \"\"\"returns the binary of integer n, count refers to amount of bits\"\"\"\n    return ''.join([str((n >> y) & 1) for y in range(count-1, -1, -1)])\n\ndef labelcolormap(N):\n    if N == 35: # cityscape\n        cmap = np.array([(  0,  0,  0), (  0,  0,  0), (  0,  0,  0), (  0,  0,  0), (  0,  0,  0), (111, 74,  0), ( 81,  0, 81),\n                     (128, 64,128), (244, 35,232), (250,170,160), (230,150,140), ( 70, 70, 70), (102,102,156), (190,153,153),\n                     (180,165,180), (150,100,100), (150,120, 90), (153,153,153), (153,153,153), (250,170, 30), (220,220,  0),\n                     (107,142, 35), (152,251,152), ( 70,130,180), (220, 20, 60), (255,  0,  0), (  0,  0,142), (  0,  0, 70),\n                     (  0, 60,100), (  0,  0, 90), (  0,  0,110), (  0, 80,100), (  0,  0,230), (119, 11, 32), (  0,  0,142)], \n                     dtype=np.uint8)\n    else:\n        cmap = np.zeros((N, 3), dtype=np.uint8)\n        for i in range(N):\n            r, g, b = 0, 0, 0\n            id = i\n            for j in range(7):\n                str_id = uint82bin(id)\n                r = r ^ (np.uint8(str_id[-1]) << (7-j))\n                g = g ^ (np.uint8(str_id[-2]) << (7-j))\n                b = b ^ (np.uint8(str_id[-3]) << (7-j))\n                id = id >> 3\n            cmap[i, 0] = r\n            cmap[i, 1] = g\n            cmap[i, 2] = b\n    return cmap\n\nclass Colorize(object):\n    def __init__(self, n=35):\n        self.cmap = labelcolormap(n)\n        self.cmap = torch.from_numpy(self.cmap[:n])\n\n    def __call__(self, gray_image):\n        size = gray_image.size()\n        color_image = torch.ByteTensor(3, size[1], size[2]).fill_(0)\n\n        for label in range(0, len(self.cmap)):\n            mask = (label == gray_image[0]).cpu()\n            color_image[0][mask] = 
self.cmap[label][0]\n            color_image[1][mask] = self.cmap[label][1]\n            color_image[2][mask] = self.cmap[label][2]\n\n        return color_image\n"
  },
  {
    "path": "util/visualizer.py",
    "content": "import numpy as np\nimport os\nimport ntpath\nimport time\nfrom . import util\nfrom . import html\nimport scipy.misc\ntry:\n    from StringIO import StringIO  # Python 2.7\nexcept ImportError:\n    from io import BytesIO         # Python 3.x\n\nclass Visualizer():\n    def __init__(self, opt):\n        # self.opt = opt\n        self.tf_log = opt.tf_log\n        self.use_html = opt.isTrain and not opt.no_html\n        self.win_size = opt.display_winsize\n        self.name = opt.name\n        if self.tf_log:\n            import tensorflow as tf\n            self.tf = tf\n            self.log_dir = os.path.join(opt.checkpoints_dir, opt.name, 'logs')\n            self.writer = tf.summary.FileWriter(self.log_dir)\n\n        if self.use_html:\n            self.web_dir = os.path.join(opt.checkpoints_dir, opt.name, 'web')\n            self.img_dir = os.path.join(self.web_dir, 'images')\n            print('create web directory %s...' % self.web_dir)\n            util.mkdirs([self.web_dir, self.img_dir])\n        self.log_name = os.path.join(opt.checkpoints_dir, opt.name, 'loss_log.txt')\n        with open(self.log_name, \"a\") as log_file:\n            now = time.strftime(\"%c\")\n            log_file.write('================ Training Loss (%s) ================\\n' % now)\n\n    # |visuals|: dictionary of images to display or save\n    def display_current_results(self, visuals, epoch, step):\n        if self.tf_log: # show images in tensorboard output\n            img_summaries = []\n            for label, image_numpy in visuals.items():\n                # Write the image to a string\n                try:\n                    s = StringIO()\n                except:\n                    s = BytesIO()\n                scipy.misc.toimage(image_numpy).save(s, format=\"jpeg\")\n                # Create an Image object\n                img_sum = self.tf.Summary.Image(encoded_image_string=s.getvalue(), height=image_numpy.shape[0], width=image_numpy.shape[1])\n       
         # Create a Summary value\n                img_summaries.append(self.tf.Summary.Value(tag=label, image=img_sum))\n\n            # Create and write Summary\n            summary = self.tf.Summary(value=img_summaries)\n            self.writer.add_summary(summary, step)\n\n        if self.use_html: # save images to a html file\n            for label, image_numpy in visuals.items():\n                if isinstance(image_numpy, list):\n                    for i in range(len(image_numpy)):\n                        img_path = os.path.join(self.img_dir, 'epoch%.3d_%s_%d.jpg' % (epoch, label, i))\n                        util.save_image(image_numpy[i], img_path)\n                else:\n                    img_path = os.path.join(self.img_dir, 'epoch%.3d_%s.jpg' % (epoch, label))\n                    util.save_image(image_numpy, img_path)\n\n            # update website\n            webpage = html.HTML(self.web_dir, 'Experiment name = %s' % self.name, refresh=30)\n            for n in range(epoch, 0, -1):\n                webpage.add_header('epoch [%d]' % n)\n                ims = []\n                txts = []\n                links = []\n\n                for label, image_numpy in visuals.items():\n                    if isinstance(image_numpy, list):\n                        for i in range(len(image_numpy)):\n                            img_path = 'epoch%.3d_%s_%d.jpg' % (n, label, i)\n                            ims.append(img_path)\n                            txts.append(label+str(i))\n                            links.append(img_path)\n                    else:\n                        img_path = 'epoch%.3d_%s.jpg' % (n, label)\n                        ims.append(img_path)\n                        txts.append(label)\n                        links.append(img_path)\n                if len(ims) < 10:\n                    webpage.add_images(ims, txts, links, width=self.win_size)\n                else:\n                    num = int(round(len(ims)/2.0))\n              
      webpage.add_images(ims[:num], txts[:num], links[:num], width=self.win_size)\n                    webpage.add_images(ims[num:], txts[num:], links[num:], width=self.win_size)\n            webpage.save()\n\n    # errors: dictionary of error labels and values\n    def plot_current_errors(self, errors, step):\n        if self.tf_log:\n            for tag, value in errors.items():\n                summary = self.tf.Summary(value=[self.tf.Summary.Value(tag=tag, simple_value=value)])\n                self.writer.add_summary(summary, step)\n\n    # errors: same format as |errors| of plotCurrentErrors\n    def print_current_errors(self, epoch, i, errors, t):\n        message = '(epoch: %d, iters: %d, time: %.3f) ' % (epoch, i, t)\n        for k, v in errors.items():\n            if v != 0:\n                message += '%s: %.3f ' % (k, v)\n\n        print(message)\n        with open(self.log_name, \"a\") as log_file:\n            log_file.write('%s\\n' % message)\n\n    # save image to the disk\n    def save_images(self, webpage, visuals, image_path):\n        image_dir = webpage.get_image_dir()\n        short_path = ntpath.basename(image_path[0])\n        name = os.path.splitext(short_path)[0]\n\n        webpage.add_header(name)\n        ims = []\n        txts = []\n        links = []\n\n        for label, image_numpy in visuals.items():\n            image_name = '%s_%s.jpg' % (name, label)\n            save_path = os.path.join(image_dir, image_name)\n            util.save_image(image_numpy, save_path)\n\n            ims.append(image_name)\n            txts.append(label)\n            links.append(image_name)\n        webpage.add_images(ims, txts, links, width=self.win_size)\n"
  }
]