[
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nenv/\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\n*.egg-info/\n.installed.cfg\n*.egg\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*,cover\n.hypothesis/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# IPython Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# dotenv\n.env\n\n# virtualenv\nvenv/\nENV/\n\n# Spyder project settings\n.spyderproject\n\n# Rope project settings\n.ropeproject\n\n# Mac stuff\n.DS_Store\n\n# Visual Studio Code stuff\n.vscode/"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2017 Visipedia\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# TensorFlow Classification\nThis repo contains training, testing and classifcation code for image classification using [TensorFlow](https://www.tensorflow.org/). Whole image classification as well as multi instance bounding box classification is supported. \n\nCheckout the [Wiki](https://github.com/visipedia/tf_classification/wiki) for more detailed tutorials. \n\n---\n\n## Requirements\nTensorFlow 1.0+ is required. The code is tested with TensorFlow 1.3 and Python 2.7 on Ubuntu 16.04 and Mac OSX 10.11. Check out the [requirements.txt](requirements.txt) file for a list of python dependencies. \n\n---\n\n## Prepare the Data\nThe models require the image data to be in a specific format. You can use the Visipedia [tfrecords repo](https://github.com/visipedia/tfrecords) to produce the files. \n\nFor the commands below, I'll assume that you have created a `DATASET_DIR` environment variable that points to the directory that contains your tfrecords:\n```\n$ export DATASET_DIR=/home/ubuntu/tf_datasets/cub\n```\n\n---\n\n## Directory Structure\nI have found that its useful to have the following directory and file setup:\n* experiment/\n  * logdir/\n    * train_summaries/\n    * val_summaries/\n    * test_summaries/\n    * results/\n    * finetune/\n      * train_summaries/\n      * val_summaries/\n  * cmds.txt\n  * config_train.yaml\n  * config_test.yaml\n  * config_export.yaml\n\nThe purpose of each directory and file will be explained below. \n\nThe `cmds.txt` is useful to save the different training and testing commands. There are quite a few command-line arguments to some of the scripts, so its convienent to compose the commands in an editor. \n\nFor the commands below, I'll assume that you have created a `EXPERIMENT_DIR` environment variable that points to your experiment directory:\n```\n$ export EXPERIMENT_DIR=/home/ubuntu/tf_experiments/cub\n```\n\n---\n\n## Configuration\nThere are example configuration files in the [config directory](config/). 
At the very least you'll need a `config_train.yaml` file, and you'll probably want a `config_test.yaml` file. It is convenient to copy the example configuration files into your `experiment` directory. See the configuration [README](config/README.md) for more details.\n\n### Choose a Network Architecture\nThis repo currently supports the Google Inception, ResNet and MobileNet flavors of networks. See the nets [README](nets/README.md) for more information on the different Inception versions. At the moment, `inception_v3` probably offers the best tradeoff in terms of size and performance, although it's always worth experimenting with a few different architectures. The [README](nets/README.md) also contains links where you can download checkpoint files for the models. In most cases you should start your training from these checkpoint files rather than training from scratch. \n\nYou can specify the name of the chosen network in the configuration yaml file. Alternatively you can pass it in as a command-line argument to most of the scripts. \n\nFor the commands below, I'll assume that you have created an environment variable that points to the pretrained checkpoint file that you downloaded:\n```\n$ export PRETRAINED_MODEL=/home/ubuntu/tf_models/inception_v3.ckpt\n```\n\n---\n\n## Data Visualization\nNow that you have a configuration file for training, it is a good idea to visualize the inputs to the network and ensure that they look good. This allows you to debug any problems with your tfrecords and lets you play with different augmentation techniques. Visualize your data by running:\n```\n$ CUDA_VISIBLE_DEVICES=1 python visualize_train_inputs.py \\\n--tfrecords $DATASET_DIR/train* \\\n--config $EXPERIMENT_DIR/config_train.yaml\n```\n\nIf you are in a virtualenv and Matplotlib is complaining, then you may need to modify your environment. 
See this [FAQ](http://matplotlib.org/faq/virtualenv_faq.html) and [this document](http://matplotlib.org/faq/osx_framework.html#osxframework-faq) for fixing this issue. I use a virtualenv on my Mac OSX 10.11 machine and I needed to do the `PYTHONHOME` [work around](http://matplotlib.org/faq/osx_framework.html#pythonhome-function) for Matplotlib to work properly. In this case the command looks like:\n```\n$ CUDA_VISIBLE_DEVICES=1 frameworkpython visualize_train_inputs.py \\\n--tfrecords $DATASET_DIR/train* \\\n--config $EXPERIMENT_DIR/config_train.yaml\n```\n\n---\n\n## Training and Validating\nIt's recommended to start from a pretrained network when training a network on your own data. However, this isn't necessary and you can train from scratch if you have enough data. The following warmup section assumes you are starting from a pretrained network. See the nets [README](nets/README.md) to find links to pretrained checkpoint files.\n\n### Finetune A Pretrained Network\nFinetuning a pretrained network essentially uses the pretrained network as a generic feature extractor and learns a new final layer that will output predictions for your target classes (rather than the original classes that the pretrained network was trained on). To do this, we will specify the pretrained model as the starting point, and only allow the logits layers to be modified. We can put the trained models in the `experiment/logdir/finetune` directory. \n\n```\n$ CUDA_VISIBLE_DEVICES=0 python train.py \\\n--tfrecords $DATASET_DIR/train* \\\n--logdir $EXPERIMENT_DIR/logdir/finetune \\\n--config $EXPERIMENT_DIR/config_train.yaml \\\n--pretrained_model $PRETRAINED_MODEL \\\n--trainable_scopes InceptionV3/Logits InceptionV3/AuxLogits \\\n--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits \\\n--learning_rate_decay_type fixed \\\n--lr 0.01 \n```\n\n#### Monitoring Progress\nWe'll want to monitor performance of the model on a validation set. 
Once the model performance starts to plateau we can assume that the final layer is warmed up and we can switch to full training. We can monitor the validation performance by running:\n```\n$ CUDA_VISIBLE_DEVICES=1 python test.py \\\n--tfrecords $DATASET_DIR/val* \\\n--save_dir $EXPERIMENT_DIR/logdir/finetune/val_summaries \\\n--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batches 100 \\\n--eval_interval_secs 300\n```\n\nYou may also want to monitor the accuracy on the train set. Simply pass in the train tfrecords to the `test.py` script and change the output directory:\n```\n$ CUDA_VISIBLE_DEVICES=1 python test.py \\\n--tfrecords $DATASET_DIR/train* \\\n--save_dir $EXPERIMENT_DIR/logdir/finetune/train_summaries \\\n--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batches 100 \\\n--eval_interval_secs 300\n```\n\nKeeping the train summaries and val summaries in separate directories will keep the TensorBoard UI clean. To monitor the training process you can fire up TensorBoard:\n```\n$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006\n```\n\n### Training the Entire Network\nThe benefit of finetuning a network is that the training is very fast, as only the last layer is modified. However, to get the best performance you'll typically want to modify more (or all) of the layers of the network. Starting from a pretrained network (which can happen to be a finetuned network), this full training step essentially adapts the network to operating on the domain of your specific dataset. We'll store the generated files in the `experiment/logdir` directory. 
You can do the finetuning process as a warmup and then start the full train:\n```\n$ CUDA_VISIBLE_DEVICES=0 python train.py \\\n--tfrecords $DATASET_DIR/train* \\\n--logdir $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_train.yaml \\\n--pretrained_model $EXPERIMENT_DIR/logdir/finetune\n```\n\nOr you can just start the full train from a pretrained model:\n```\n$ CUDA_VISIBLE_DEVICES=0 python train.py \\\n--tfrecords $DATASET_DIR/train* \\\n--logdir $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_train.yaml \\\n--pretrained_model $PRETRAINED_MODEL \\\n--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits\n```\n\nOr if you have enough data, you may not want to even use the pretrained model. Rather you can train from scratch:\n```\n$ CUDA_VISIBLE_DEVICES=0 python train.py \\\n--tfrecords $DATASET_DIR/train* \\\n--logdir $EXPERIMENT_DIR/logdir/ \\\n--config $EXPERIMENT_DIR/config_train.yaml\n```\n\n#### Monitoring Progress\n\nTo watch the validation performance we can run:\n```\n$ CUDA_VISIBLE_DEVICES=1 python test.py \\\n--tfrecords $DATASET_DIR/val* \\\n--save_dir $EXPERIMENT_DIR/logdir/val_summaries \\\n--checkpoint_path $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batches 100 \\\n--eval_interval_secs 300\n```\n\nSimilarly for the train data:\n```\n$ CUDA_VISIBLE_DEVICES=1 python test.py \\\n--tfrecords $DATASET_DIR/train* \\\n--save_dir $EXPERIMENT_DIR/logdir/train_summaries \\\n--checkpoint_path $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batches 100 \\\n--eval_interval_secs 300\n```\n\nThe command for TensorBoard doesn't need to change:\n```\n$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006\n```\nYou will be able to see the finetune and the full train data plotted on the same plots. 
\n\n---\n\n## Test\nOnce performance on the validation data has plateaued (or some other criterion has been met), you can test the model on a held-out set of images to see how well it generalizes to new data:\n```\n$ CUDA_VISIBLE_DEVICES=1 python test.py \\\n--tfrecords $DATASET_DIR/test* \\\n--save_dir $EXPERIMENT_DIR/logdir/test_summaries \\\n--checkpoint_path $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batch_size 32 \\\n--batches 100\n```\n\nIf you are happy with the performance of the model, then you are ready to classify new images and export the model for production use. Otherwise it's back to the drawing board to figure out how to increase performance. \n\n---\n\n## Classifying \nIf you want to classify data offline using the trained model then you can run:\n```\nCUDA_VISIBLE_DEVICES=1 python classify.py \\\n--tfrecords $DATASET_DIR/new/* \\\n--checkpoint_path $EXPERIMENT_DIR/logdir \\\n--save_path $EXPERIMENT_DIR/logdir/results/classification_results.npz \\\n--config $EXPERIMENT_DIR/config_test.yaml \\\n--batch_size 32 \\\n--batches 1000 \\\n--save_logits\n```\n\nThe output of the script is an uncompressed NumPy .npz file saved at `--save_path`. The file will contain at least 2 arrays: `ids`, which contains the record ids, and `labels`, which contains the predicted class label for each id. If `--save_logits` is specified, then the raw logits (before going through the softmax) will also be saved in a `logits` array. \n\n---\n\n## Export & Compress\nTo export a model for easy use on a mobile device you can use:\n```\npython export.py \\\n--checkpoint_path model.ckpt-399739 \\\n--export_dir ./export \\\n--export_version 1 \\\n--config config_export.yaml \\\n--class_names class-codes.txt\n```\nThe input node is called `images` and the output node is called `Predictions`. Check out [this](https://github.com/visipedia/tf_classification/wiki/Exporting-an-Optimized-Model) wiki article for more tips. 
\n\nIf you are going to use the model with [TensorFlow Serving](https://www.tensorflow.org/deploy/tfserve) then you can use the following:\n```\npython export.py \\\n--checkpoint_path model.ckpt-399739 \\\n--export_dir ./export \\\n--export_version 1 \\\n--config config_export.yaml \\\n--serving \\\n--add_preprocess \\\n--class_names class-codes.txt\n```\nCheck out the resources in the [tfserving](tfserving/) directory for more help with deploying on TensorFlow Serving.\n"
  },
  {
    "path": "classify.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport os\nimport time\n\nimport numpy as np\nimport tensorflow as tf\nimport tensorflow.contrib.slim as slim\n\nfrom config.parse_config import parse_config_file\nfrom nets import nets_factory\nfrom preprocessing import inputs\n\ndef classify(tfrecords, checkpoint_path, save_path, max_iterations, save_logits, cfg, read_images=False):\n    \"\"\"\n    Args:\n        tfrecords (list)\n        checkpoint_path (str)\n        save_dir (str)\n        max_iterations (int)\n        save_logits (bool)\n        cfg (EasyDict)\n    \"\"\"\n    tf.logging.set_verbosity(tf.logging.DEBUG)\n\n    graph = tf.Graph()\n\n    with graph.as_default():\n\n        global_step = slim.get_or_create_global_step()\n\n        with tf.device('/cpu:0'):\n            batch_dict = inputs.input_nodes(\n                tfrecords=tfrecords,\n                cfg=cfg.IMAGE_PROCESSING,\n                num_epochs=1,\n                batch_size=cfg.BATCH_SIZE,\n                num_threads=cfg.NUM_INPUT_THREADS,\n                shuffle_batch =cfg.SHUFFLE_QUEUE,\n                random_seed=cfg.RANDOM_SEED,\n                capacity=cfg.QUEUE_CAPACITY,\n                min_after_dequeue=cfg.QUEUE_MIN,\n                add_summaries=False,\n                input_type='classification',\n                read_filenames=read_images\n            )\n\n        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()\n\n        with slim.arg_scope(arg_scope):\n            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](\n                inputs=batch_dict['inputs'],\n                num_classes=cfg.NUM_CLASSES,\n                is_training=False\n            )\n\n            predicted_labels = tf.argmax(end_points['Predictions'], 1)\n\n        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:\n            variable_averages = 
tf.train.ExponentialMovingAverage(\n                cfg.MOVING_AVERAGE_DECAY, global_step)\n            variables_to_restore = variable_averages.variables_to_restore(\n                slim.get_model_variables())\n            variables_to_restore[global_step.op.name] = global_step\n        else:\n            variables_to_restore = slim.get_variables_to_restore()\n            variables_to_restore.append(global_step)\n\n        saver = tf.train.Saver(variables_to_restore, reshape=True)\n\n        # Result buffers, sized for at most `max_iterations` batches\n        num_batches = max_iterations\n        num_images = num_batches * cfg.BATCH_SIZE\n        label_array = np.empty(num_images, dtype=np.int32)\n        # Use the builtin `object`; the `np.object` alias was removed in newer NumPy releases\n        id_array = np.empty(num_images, dtype=object)\n        fetches = [predicted_labels, batch_dict['ids']]\n        if save_logits:\n            fetches.append(logits)\n            logits_array = np.empty((num_images, cfg.NUM_CLASSES), dtype=np.float32)\n\n        if os.path.isdir(checkpoint_path):\n            checkpoint_dir = checkpoint_path\n            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)\n\n            if checkpoint_path is None:\n                raise ValueError(\"Unable to find a model checkpoint in the \" \\\n                                 \"directory %s\" % (checkpoint_dir,))\n\n        tf.logging.info('Classifying records using %s' % checkpoint_path)\n\n        coord = tf.train.Coordinator()\n\n        sess_config = tf.ConfigProto(\n                log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,\n                allow_soft_placement = True,\n                gpu_options = tf.GPUOptions(\n                    per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION\n                ),\n                intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,\n                inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 
'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None\n            )\n        sess = tf.Session(graph=graph, config=sess_config)\n\n        with sess.as_default():\n\n            tf.global_variables_initializer().run()\n            tf.local_variables_initializer().run()\n            threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n\n            try:\n\n                # Restore from checkpoint\n                saver.restore(sess, checkpoint_path)\n\n                print_str = ', '.join([\n                  'Step: %d',\n                  'Time/image (ms): %.1f'\n                ])\n\n                step = 0\n                while not coord.should_stop():\n\n                    t = time.time()\n                    outputs = sess.run(fetches)\n                    dt = time.time()-t\n\n                    idx1 = cfg.BATCH_SIZE * step\n                    idx2 = idx1 + cfg.BATCH_SIZE\n                    label_array[idx1:idx2] = outputs[0]\n                    id_array[idx1:idx2] = outputs[1]\n                    if save_logits:\n                        logits_array[idx1:idx2] = outputs[2]\n\n                    step += 1\n                    print(print_str % (step, (dt / cfg.BATCH_SIZE) * 1000))\n\n                    if max_iterations > 0 and step == max_iterations:\n                        break\n\n            except tf.errors.OutOfRangeError as e:\n                pass\n\n        coord.request_stop()\n        coord.join(threads)\n\n        # save the results\n        if save_logits:\n            np.savez(save_path, labels=label_array, ids=id_array, logits=logits_array)\n        else:\n            np.savez(save_path, labels=label_array, ids=id_array)\n\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(description='Classify images, optionally saving the logits.')\n\n    parser.add_argument('--tfrecords', dest='tfrecords',\n                        help='Paths to tfrecords.', type=str,\n                        nargs='+', 
required=True)\n\n    parser.add_argument('--checkpoint_path', dest='checkpoint_path',\n                          help='Path to a specific model to test against. If a directory, then the newest checkpoint file will be used.', type=str,\n                          required=True, default=None)\n\n    parser.add_argument('--save_path', dest='save_path',\n                          help='Path of the file to save the classification results to.', type=str,\n                          required=True, default=None)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file',\n                        required=True, type=str)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='The number of images in a batch.',\n                        required=True, type=int, default=None)\n\n    parser.add_argument('--batches', dest='batches',\n                        help='Maximum number of iterations (batches) to run.',\n                        required=True, type=int, default=0)\n\n    parser.add_argument('--save_logits', dest='save_logits',\n                        help='Should the logits be saved?',\n                        action='store_true', default=False)\n\n    parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the architecture to use.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--read_images', dest='read_images',\n                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',\n                        action='store_true', default=False)\n\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n    args = parse_args()\n\n    cfg = parse_config_file(args.config_file)\n\n    if args.batch_size is not None:\n        cfg.BATCH_SIZE = args.batch_size\n\n    if 
args.model_name is not None:\n        cfg.MODEL_NAME = args.model_name\n\n    classify(\n        tfrecords=args.tfrecords,\n        checkpoint_path=args.checkpoint_path,\n        save_path=args.save_path,\n        max_iterations=args.batches,\n        save_logits=args.save_logits,\n        cfg=cfg,\n        read_images=args.read_images\n    )\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "config/README.md",
    "content": "This directory contains example configuration scripts for training, testing, classifying and exporting models. I find it easy to copy these configuration files to my experiment directory and make the necessary changes. \n\n## Training Configuration\nSee the [example training config file](config_train.yaml). \n\nThe training configuration script contains the most configurations. The other scripts mainly contain subsets of the training configuration. The `Learning Rate Parameters`, `Regularization`, and `Optimization` configurations provided experimenters fine-grained control over the learning process. Non-researchers will probably find most of the default settings adequate. I will not go into detail for these configuration parameters, but there are comments for these parameters in the [example training config file](config_train.yaml).\n\nThe configuration sections that you will want to pay attention to are the `Dataset Info` section and the `Image Processing and Augmentation` section. You'll most likely be modifying these for each experiment. Once you determine good settings for the `Queues` and `Saving Models and Summaries` you'll probably reuse these values across experiments.\n\n### Dataset Info\n| Config Name | Type | Description |\n:----:|:----:|------------|\nNUM_CLASSES | int | This is how you specify how many classes are in your dataset. |\nNUM_TRAIN_EXAMPLES | int | This is the number of images (or bounding boxes) in your training tfrecords. This value, along with the `BATCH_SIZE` is used to compute the number of iterations in an epoch (i.e. the number of batches it takes to go through the whole training set) |\nNUM_TRAIN_ITERATIONS | int | The maximum number of iterations to execute before stopping. If you are manually monitoring the training, then you can set this to a large number (e.g. 1000000) |\nBATCH_SIZE | int | The number of images to process in one iteration. This number is constrained by the amount of GPU memory you have. 
The larger the batch size, the more GPU memory you need. You typically want the largest batch size that will fit on your GPU. |\nMODEL_NAME | str | The architecture to use. It's important to keep this configuration parameter constant in all of your configuration files. |\n\n### Image Processing and Augmentation\nDeep neural networks are notoriously data hungry. One technique for increasing the amount of data that you can pass through the network is to augment your training data. Augmentations can be as simple as randomly flipping the images horizontally, or as complex as extracting crops and perturbing the pixel values. You will typically only want to augment data for the training phase. \n\n`IMAGE_PROCESSING` contains the parameters for controlling how to extract data from the images:\n\n| Config Name | Type | Description |\n:----:|:----:|------------|\nINPUT_SIZE | int | All images will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. You'll want to set this to the same value that the pretrained model used. See the nets [README](../nets/README.md) for the input size of each model architecture. |\nREGION_TYPE | str | Which region should be used when creating an example? Possible values are `image` and `bbox`. |\nMAINTAIN_ASPECT_RATIO | bool | When we resize an extracted region, should we maintain the aspect ratio? Or just squish it? |\nRESIZE_FAST | bool | If true, then slower resize operations will be avoided and only [bilinear resizing](https://en.wikipedia.org/wiki/Bilinear_interpolation) will be used. Otherwise, a random choice between [bilinear](https://en.wikipedia.org/wiki/Bilinear_interpolation), [nearest neighbor](https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation), [bicubic](https://en.wikipedia.org/wiki/Bicubic_interpolation) and area interpolation will be used. |\nDO_RANDOM_FLIP_LEFT_RIGHT | bool | If true, then each region has a 50% chance of being flipped. | \nDO_COLOR_DISTORTION | float | Value between 0 and 1. 
0 means never distort the color, and 1 means always distort the color. |\nCOLOR_DISTORT_FAST | bool | It's possible to distort the brightness, saturation, hue and contrast of an image. If true, then slower modifications (hue and contrast) are avoided. |\n\n#### Region Extraction\n\nCurrently there are two different region extraction protocols: \n* `image`: The entire image is extracted and passed to the next phase of augmentation. \n* `bbox`: Each bounding box in the tfrecord is used to crop out an image region. These regions are passed on to the next phase of augmentation. If there are `n` bounding boxes in a tfrecord, then `n` regions will be extracted from the image. \n\nFor bounding boxes, we can specify whether we want to enlarge the box. This can be used as another form of augmentation (loose bounding boxes vs tight bounding boxes).\n\n| Config Name | Type | Description |\n:----:|:----:|------------|\nDO_EXPANSION | float | Value between 0 and 1. 0 means never expand the box. 1 means always expand the box. |\nEXPANSION_CFG | | Contains the parameters controlling the expansion of the bounding box. | \nEXPANSION_CFG.<br />WIDTH_EXPANSION_FACTOR | float | Scaling factor for the width of the box. | \nEXPANSION_CFG.<br />HEIGHT_EXPANSION_FACTOR | float | Scaling factor for the height of the box. | \n\n\n#### Random Cropping\n\nEach region that is extracted from an image can then be randomly cropped. Again, this is a form of data augmentation. We are trying to make the network robust to changes in the data that do not affect the class label. \n\n`RANDOM_CROP_CFG` contains parameters for cropping out a rectangular patch from each region. \n\n| Config Name | Type | Description |\n:----:|:----:|------------|\nDO_RANDOM_CROP | float | Value between 0 and 1. 0 means never crop a region. 1 means always take a crop. |\nRANDOM_CROP_CFG | | This contains parameters that control the types of crops that are possible. 
|\nRANDOM_CROP_CFG.<br />MIN_AREA | float | Value between 0 and 1. This controls how much of the region is required to be in the crop, essentially controlling how small a crop can be. |\nRANDOM_CROP_CFG.<br />MAX_AREA | float | Value between 0 and 1. This controls the maximum size of the crop. |\nRANDOM_CROP_CFG.<br />MIN_ASPECT_RATIO | float | The minimum [aspect ratio](https://en.wikipedia.org/wiki/Aspect_ratio_(image)) of the crop. Don't forget that this crop will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. |\nRANDOM_CROP_CFG.<br />MAX_ASPECT_RATIO | float | The maximum [aspect ratio](https://en.wikipedia.org/wiki/Aspect_ratio_(image)) of the crop. Don't forget that this crop will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. |\nRANDOM_CROP_CFG.<br />MAX_ATTEMPTS | int | The number of crop attempts to try before returning the whole region. |\n\n### Queues\nThis section of the config file contains parameters for controlling the queueing of data to feed the network. These settings depend on the number of cores in your machine and the amount of memory available. Please see the comments in the example config file for more information. \n\n### Saving Models and Summaries \nThis section of the config file contains parameters for controlling how often a model checkpoint should be created and how often TensorBoard summary files should be generated. Please see the comments in the example config file for more information. \n\n## Testing Configuration\nSee the [example testing config file](config_test.yaml). \n\nThe `Learning Rate Parameters`, `Optimization`, and `Saving Models and Summaries` parameters are not necessary for testing. The remaining parameters from the training config carry over to testing. 
In addition there are a few new configurations:\n\n| Config Name | Type | Description |\n:----:|:----:|------------|\nPRECISION_AT_K_METRIC | array of ints | You can track top-k metrics using this array. Top-1 (i.e. accuracy) will always be plotted |\nNUM_TEST_EXAMPLES | int | The number of images (or bounding boxes) in the tfrecords. This can be ignored if you use the `--batches` command line flag. | \n\nTypically in a testing situation you'll want to turn off the augmentations to the extracted image regions. This way you are passing \"real\" data to the network. See the `Image Processing and Augmentation` section of the [example testing config file](config_test.yaml) to see how to extract regions without augmentations.\n\n## Classification Configuration\nSee the [example classification config file](config_classify.yaml).\n\nThe classification configuration contains even fewer necessary fields than the testing configuration. The `Metrics` section is removed and you'll need to pass batch size and total batch information through command-line arguments. \n\n## Export Configuration\nSee the [example export config file](config_export.yaml).\n\nThe export configuration is the smallest configuration file. See the [example](config_export.yaml) for which fields are required. \n"
  },
  {
    "path": "config/__init__.py",
    "content": ""
  },
  {
    "path": "config/config_classify.yaml",
    "content": "# Classification specific configuration\n\nRANDOM_SEED : 1.0\n\nSESSION_CONFIG : {\n  # If true, then the device location of each variable will be printed\n  LOG_DEVICE_PLACEMENT : false,\n\n  # How much GPU memory we are allowed to pre-allocate\n  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,\n\n  # Set the number of accessible cpu threads. Leave as null to use everything.\n  # Set to 1 to help with debugging (makes the print statements legible)\n  INTRA_OP_PARALLELISM_THREADS : null,\n  INTER_OP_PARALLELISM_THREADS : null\n}\n\n#################################################\n# Dataset Info\n# The number of classes we are classifying\nNUM_CLASSES : 200\n\n# The model architecture to use.\nMODEL_NAME : 'inception_v3'\n\n# END: Dataset Info\n#################################################\n# Image Processing and Augmentation\n# There are 5 steps to image processing:\n# 1) Extract regions from the image\n# 2) Extract a crops from each region\n# 3) Resize the crops for the network architecture\n# 4) Flip the crops\n# 5) Modify the colors of the crops\nIMAGE_PROCESSING : {\n    # All images will be resized to the [INPUT_SIZE, INPUT_SIZE, 3]\n    INPUT_SIZE : 299,\n\n    # 1) First we extract regions from the image\n    # What type of region should be extracted, either 'image' or 'bbox'\n    REGION_TYPE : 'image',\n\n    # Specific whole image region extraction configuration\n    WHOLE_IMAGE_CFG : {},\n\n    # Specific bounding box region extraction configuration\n    BBOX_CFG : {\n        # We can centrally expand a bbox (i.e. 
turn a tight crop into a loose crop)\n        # The fraction of time to expand the bounding box, 0 is never, 1 is always\n        DO_EXPANSION : 1,\n        EXPANSION_CFG : {\n            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)\n            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)\n        }\n    },\n\n    # 2) Then we take a random crop from the region\n    # The fraction of time to take a random crop, 0 is never, 1 is always\n    DO_RANDOM_CROP : 0,\n    RANDOM_CROP_CFG : {\n        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included\n        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included\n        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop\n        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop\n        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region\n    },\n\n    # Alternatively we can take a central crop from the image\n    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always\n    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop\n\n    # 3) We need to resize the extracted regions to feed into the network.\n    MAINTAIN_ASPECT_RATIO : false,\n    # Avoid slower resize operations (bi-cubic, etc.)\n    RESIZE_FAST : true,\n\n    # 4) We can flip the regions\n    # Randomly flip the image left right, 50% chance of flipping\n    DO_RANDOM_FLIP_LEFT_RIGHT : false,\n\n    # 5) We can distort the colors of the regions\n    # The fraction of time to distort the color, 0 is never, 1 is always\n    DO_COLOR_DISTORTION : 0,\n    # Avoids slower ops (random_hue and random_contrast)\n    COLOR_DISTORT_FAST : false\n}\n\n# END: Image Processing and Augmentation\n#################################################\n# Queues\n#\n# Number of threads to populate the batch queue\nNUM_INPUT_THREADS : 2\n# Should the data be 
shuffled?\nSHUFFLE_QUEUE : false\n# Capacity of the queue producing batched examples\nQUEUE_CAPACITY : 1000\n# Minimum size of the queue to ensure good shuffling\nQUEUE_MIN :  200\n\n# END: Queues\n#################################################\n# Regularization\n#\n# The decay to use for the moving average. If 0, then moving average is not computed\n# When restoring models, this value is needed to determine whether to restore moving\n# average variables or not.\nMOVING_AVERAGE_DECAY : 0.9999\n\n# End: Regularization\n#################################################"
  },
  {
    "path": "config/config_export.yaml",
    "content": "# Export specific configuration\n\nRANDOM_SEED : 1.0\n\nSESSION_CONFIG : {\n  # If true, then the device location of each variable will be printed\n  LOG_DEVICE_PLACEMENT : false,\n\n  # How much GPU memory we are allowed to pre-allocate\n  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,\n\n  # Set the number of accessible cpu threads. Leave as null to use everything.\n  # Set to 1 to help with debugging (makes the print statements legible)\n  INTRA_OP_PARALLELISM_THREADS : null,\n  INTER_OP_PARALLELISM_THREADS : null\n}\n\n#################################################\n# Dataset Info\n# The number of classes we are classifying\nNUM_CLASSES : 200\n\n# The model architecture to use.\nMODEL_NAME : 'inception_v3'\n\n# END: Dataset Info\n#################################################\n# Image Processing and Augmentation\n\nIMAGE_PROCESSING : {\n    # Images are assumed to be raveled, and have length  INPUT_SIZE * INPUT_SIZE * 3\n    INPUT_SIZE : 299\n}\n\n# END: Image Processing and Augmentation\n#################################################\n# Regularization\n#\n# The decay to use for the moving average. If 0, then moving average is not computed\n# When restoring models, this value is needed to determine whether to restore moving\n# average variables or not.\nMOVING_AVERAGE_DECAY : 0.9999\n\n# End: Regularization\n#################################################"
  },
  {
    "path": "config/config_test.yaml",
    "content": "# Testing specific configuration\n\nRANDOM_SEED : 1.0\n\nSESSION_CONFIG : {\n  # If true, then the device location of each variable will be printed\n  LOG_DEVICE_PLACEMENT : false,\n\n  # How much GPU memory we are allowed to pre-allocate\n  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,\n\n  # Set the number of accessible cpu threads. Leave as null to use everything.\n  # Set to 1 to help with debugging (makes the print statements legible)\n  INTRA_OP_PARALLELISM_THREADS : null,\n  INTER_OP_PARALLELISM_THREADS : null\n}\n\n#################################################\n# Metrics\n#\n# Top-k precision information. Each entry is a different k value.\nACCURACY_AT_K_METRIC : [3, 5]\n\n# END: Metrics\n#################################################\n# Dataset Info\n# The number of classes we are classifying\nNUM_CLASSES : 200\n\n# Number of test examples in the tfrecords. This is needed to compute the total number of\n# batches to pass through the network.\nNUM_TEST_EXAMPLES : 5794\n\n# The number of images to pass through the network on each iteration\nBATCH_SIZE : 32\n\n# The model architecture to use.\nMODEL_NAME : 'inception_v3'\n\n# END: Dataset Info\n#################################################\n# Image Processing and Augmentation\n# There are 5 steps to image processing:\n# 1) Extract regions from the image\n# 2) Extract a crops from each region\n# 3) Resize the crops for the network architecture\n# 4) Flip the crops\n# 5) Modify the colors of the crops\nIMAGE_PROCESSING : {\n    # All images will be resized to the [INPUT_SIZE, INPUT_SIZE, 3]\n    INPUT_SIZE : 299,\n\n    # 1) First we extract regions from the image\n    # What type of region should be extracted, either 'image' or 'bbox'\n    REGION_TYPE : 'image',\n\n    # Specific whole image region extraction configuration\n    WHOLE_IMAGE_CFG : {},\n\n    # Specific bounding box region extraction configuration\n    BBOX_CFG : {\n        # We can centrally expand a bbox (i.e. 
turn a tight crop into a loose crop)\n        # The fraction of time to expand the bounding box, 0 is never, 1 is always\n        DO_EXPANSION : 1,\n        EXPANSION_CFG : {\n            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)\n            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)\n        }\n    },\n\n    # 2) Then we take a random crop from the region\n    # The fraction of time to take a random crop, 0 is never, 1 is always\n    DO_RANDOM_CROP : 0,\n    RANDOM_CROP_CFG : {\n        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included\n        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included\n        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop\n        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop\n        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region\n    },\n\n    # Alternatively we can take a central crop from the image\n    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always\n    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop\n\n    # 3) We need to resize the extracted regions to feed into the network.\n    MAINTAIN_ASPECT_RATIO : false,\n    # Avoid slower resize operations (bi-cubic, etc.)\n    RESIZE_FAST : true,\n\n    # 4) We can flip the regions\n    # Randomly flip the image left right, 50% chance of flipping\n    DO_RANDOM_FLIP_LEFT_RIGHT : false,\n\n    # 5) We can distort the colors of the regions\n    # The fraction of time to distort the color, 0 is never, 1 is always\n    DO_COLOR_DISTORTION : 0,\n    # Avoids slower ops (random_hue and random_contrast)\n    COLOR_DISTORT_FAST : false\n}\n\n# END: Image Processing and Augmentation\n#################################################\n# Queues\n#\n# Number of threads to populate the batch queue\nNUM_INPUT_THREADS : 2\n# Should the data be 
shuffled?\nSHUFFLE_QUEUE : false\n# Capacity of the queue producing batched examples\nQUEUE_CAPACITY : 1000\n# Minimum size of the queue to ensure good shuffling\nQUEUE_MIN :  200\n\n# END: Queues\n#################################################\n# Regularization\n#\n# The decay to use for the moving average. If 0, then moving average is not computed\n# When restoring models, this value is needed to determine whether to restore moving\n# average variables or not.\nMOVING_AVERAGE_DECAY : 0.9999\n\n# End: Regularization\n#################################################"
  },
  {
    "path": "config/config_train.yaml",
    "content": "# Training specific configuration\n\nRANDOM_SEED : 1.0\n\nSESSION_CONFIG : {\n  # If true, then the device location of each variable will be printed\n  LOG_DEVICE_PLACEMENT : false,\n\n  # How much GPU memory we are allowed to pre-allocate\n  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,\n\n  # Set the number of accessible cpu threads. Leave as null to use everything.\n  # Set to 1 to help with debugging (makes the print statements legible)\n  INTRA_OP_PARALLELISM_THREADS : null,\n  INTER_OP_PARALLELISM_THREADS : null\n}\n\n#################################################\n# Dataset Info\n#\n# The number of classes we are classifying\nNUM_CLASSES : 200\n\n# Number of training examples in the tfrecords. This is needed to compute the number of\n# batches in an epoch\nNUM_TRAIN_EXAMPLES : 5994\n\n# Maximum number of iterations to run before stopping\nNUM_TRAIN_ITERATIONS : 20000\n\n# The number of images to pass through the network in a single iteration\nBATCH_SIZE : 32\n\n# Which model architecture to use.\nMODEL_NAME : 'inception_v3'\n\n# END: Dataset Info\n#################################################\n# Image Processing and Augmentation\n# There are 5 steps to image processing:\n# 1) Extract regions from the image\n# 2) Extract a crops from each region\n# 3) Resize the crops for the network architecture\n# 4) Flip the crops\n# 5) Modify the colors of the crops\nIMAGE_PROCESSING : {\n    # All images will be resized to the [INPUT_SIZE, INPUT_SIZE, 3]\n    INPUT_SIZE : 299,\n\n    # 1) First we extract regions from the image\n    # What type of region should be extracted, either 'image' or 'bbox'\n    REGION_TYPE : 'image',\n\n    # Specific whole image region extraction configuration\n    WHOLE_IMAGE_CFG : {},\n\n    # Specific bounding box region extraction configuration\n    BBOX_CFG : {\n        # We can centrally expand a bbox (i.e. 
turn a tight crop into a loose crop)\n        # The fraction of time to expand the bounding box, 0 is never, 1 is always\n        DO_EXPANSION : 1,\n        EXPANSION_CFG : {\n            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)\n            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)\n        }\n    },\n\n    # 2) Then we take a random crop from the region\n    # The fraction of time to take a random crop, 0 is never, 1 is always\n    DO_RANDOM_CROP : 1,\n    RANDOM_CROP_CFG : {\n        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included\n        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included\n        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop\n        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop\n        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region\n    },\n\n    # Alternatively we can take a central crop from the image\n    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always\n    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop\n\n    # 3) We need to resize the extracted regions to feed into the network.\n    MAINTAIN_ASPECT_RATIO : false,\n    # Avoid slower resize operations (bi-cubic, etc.)\n    RESIZE_FAST : false,\n\n    # 4) We can flip the regions\n    # Randomly flip the image left right, 50% chance of flipping\n    DO_RANDOM_FLIP_LEFT_RIGHT : true,\n\n    # 5) We can distort the colors of the regions\n    # The fraction of time to distort the color, 0 is never, 1 is always\n    DO_COLOR_DISTORTION : 0.3,\n    # Avoids slower ops (random_hue and random_contrast)\n    COLOR_DISTORT_FAST : false\n}\n\n# END: Image Processing and Augmentation\n#################################################\n# Queues\n#\n# Number of threads to populate the batch queue\nNUM_INPUT_THREADS : 4\n# Should the data be 
shuffled?\nSHUFFLE_QUEUE : true\n# Capacity of the queue producing batched examples\nQUEUE_CAPACITY : 1000\n# Minimum size of the queue to ensure good shuffling\nQUEUE_MIN :  200\n\n# END: Queues\n#################################################\n# Saving Models and Summaries\n#\n# How often, in seconds, to save summaries.\nSAVE_SUMMARY_SECS : 30\n\n# How often, in seconds, to save the model\nSAVE_INTERVAL_SECS : 1800\n\n# The maximum number of recent checkpoint files to keep.\nMAX_TO_KEEP : 3\n\n# In addition to keeping the most recent `max_to_keep` checkpoint files,\n# you might want to keep one checkpoint file for every N hours of training\n# The default value of 10,000 hours effectively disables the feature.\nKEEP_CHECKPOINT_EVERY_N_HOURS : 10000\n\n# The frequency, in terms of global steps, at which the loss and global step are logged.\nLOG_EVERY_N_STEPS : 10\n\n# END: Saving Models and Summaries\n#################################################\n# Learning Rate Parameters\nLEARNING_RATE_DECAY_TYPE : 'exponential' # One of \"fixed\", \"exponential\", or \"polynomial\"\n\nINITIAL_LEARNING_RATE : 0.01\n\n# The minimal end learning rate used by a polynomial decay learning rate.\nEND_LEARNING_RATE : 0.0001\n\n# The amount of label smoothing.\nLABEL_SMOOTHING : 0.1\n\n# How much to decay the learning rate\nLEARNING_RATE_DECAY_FACTOR : 0.94\n# Number of epochs between decaying the learning rate\nNUM_EPOCHS_PER_DELAY : 4\n\nLEARNING_RATE_STAIRCASE : true\n\n# END: Learning Rate Parameters\n#################################################\n# Regularization\n#\n# The decay to use for the moving average. If 0, then moving average is not computed\nMOVING_AVERAGE_DECAY : 0.9999\n\n# The weight decay on the model weights\nWEIGHT_DECAY : 0.00004\n\nBATCHNORM_MOVING_AVERAGE_DECAY : 0.9997\nBATCHNORM_EPSILON : 0.001\n\nDROPOUT_KEEP_PROB : 0.5\n\nCLIP_GRADIENT_NORM : 0 # If 0, no clipping is performed. 
Otherwise acts as a threshold to clip the gradients.\n\n# End: Regularization\n#################################################\n# Optimization\n#\n# The name of the optimizer, one of \"adadelta\", \"adagrad\", \"adam\", \"ftrl\", \"momentum\", \"sgd\" or \"rmsprop\"\nOPTIMIZER : 'rmsprop'\nOPTIMIZER_EPSILON : 1.0\n\n# The decay rate for adadelta.\nADADELTA_RHO: 0.95\n\n# Starting value for the AdaGrad accumulators.\nADAGRAD_INITIAL_ACCUMULATOR_VALUE: 0.1\n\n# The exponential decay rate for the 1st moment estimates.\nADAM_BETA1 : 0.9\n# The exponential decay rate for the 2nd moment estimates.\nADAM_BETA2 : 0.99\n\n# The learning rate power.\nFTRL_LEARNING_RATE_POWER : -0.5\n# Starting value for the FTRL accumulators.\nFTRL_INITIAL_ACCUMULATOR_VALUE : 0.1\n# The FTRL l1 regularization strength.\nFTRL_L1 : 0.0\n# The FTRL l2 regularization strength.\nFTRL_L2 : 0.0\n\n# The momentum for the MomentumOptimizer and RMSPropOptimizer\nMOMENTUM : 0.9\n\n# Decay term for RMSProp.\nRMSPROP_DECAY : 0.9\n\n# END: Optimization\n#################################################"
  },
  {
    "path": "config/parse_config.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport yaml\nfrom easydict import EasyDict as easydict\n\ndef parse_config_file(path_to_config):\n\n    with open(path_to_config) as f:\n        cfg = yaml.load(f)\n\n    return easydict(cfg)"
  },
  {
    "path": "export.py",
    "content": "\"\"\"\nExport a trained model for application use.\n\nExample for use with TensorFlow Serving:\npython export.py \\\n--checkpoint_path model.ckpt-399739 \\\n--export_dir export \\\n--export_version 1 \\\n--config config_export.yaml \\\n--serving \\\n--add_preprocess \\\n--class_names class-codes.txt\n\nExample for use with TensorFlow Mobile:\npython export.py \\\n--checkpoint_path model.ckpt-399739 \\\n--export_dir export \\\n--export_version 1 \\\n--config config_export.yaml \\\n--class_names class-codes.txt\n\nAuthor: Grant Van Horn\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport os\n\nimport tensorflow as tf\nfrom tensorflow.python.framework import dtypes\nfrom tensorflow.python.framework import graph_util\nfrom tensorflow.python.saved_model import builder as saved_model_builder\nfrom tensorflow.python.saved_model import signature_constants\nfrom tensorflow.python.saved_model import signature_def_utils\nfrom tensorflow.python.saved_model import tag_constants\nfrom tensorflow.python.saved_model import utils\nfrom tensorflow.python.tools import optimize_for_inference_lib\nslim = tf.contrib.slim\n\nfrom config.parse_config import parse_config_file\nfrom nets import nets_factory\n\n\ndef export(checkpoint_path,\n           export_dir, export_version, export_for_serving, export_tflite, export_coreml,\n           add_preprocess_step,\n           output_classes, class_names,\n           batch_size, raveled_input,\n           cfg):\n    \"\"\"Export a model for use with TensorFlow Serving or for more conveinent use on mobile devices, etc.\n    Arguments:\n      checkpoint_path (str): Path to the specific model checkpoint file to export.\n      export_dir (str): Path to a directory to store the export files.\n      export_version (int): The version number of this export. 
If `export_for_serving` is True, then this version\n        number must not exist in the `export_dir`.\n      export_for_serving (bool): Export a model for use with TensorFlow Serving.\n      export_tflite (bool): Export a model for tensorflow lite.\n      export_coreml (bool): Export a model for coreml.\n      add_preprocess_step (bool): If True, then an input path for handling image byte strings will be added to the graph.\n      output_classes (bool): If True, then the class indices (or `class_names` if provided) will be output along with the scores.\n      class_names (list): A list of semantic class identifiers to embed within the model that correspond to the prediction\n        indices. Set to None to not embed.\n      batch_size (int or None): Specify a fixed batch size, or use None to keep it flexible. For tflite export you'll need a fixed batch size.\n      raveled_input (bool): If True, then the input is considered to be a raveled vector that will be reshaped to a fixed height and width. 
Otherwise it will be treated as the proper shape.\n      cfg (dict): Configuration dictionary.\n    \"\"\"\n\n    if not os.path.exists(export_dir):\n        print(\"Making export directory: %s\" % (export_dir,))\n        os.makedirs(export_dir)\n\n    graph = tf.Graph()\n\n    array_input_node_name = \"images\"\n    bytes_input_node_name = \"image_bytes\"\n\n    output_node_name = \"Predictions\"\n    class_names_node_name = \"names\"\n\n    input_height = cfg.IMAGE_PROCESSING.INPUT_SIZE\n    input_width = cfg.IMAGE_PROCESSING.INPUT_SIZE\n    input_depth = 3\n\n    with graph.as_default():\n\n        global_step = slim.get_or_create_global_step()\n\n        # We want to store the preprocessing operation in the graph\n        if add_preprocess_step:\n\n            # The TensorFlow map_fn() function passes one argument only,\n            # so I have put this method here to take advantage of scope\n            # (to access input_height, etc.)\n            def preprocess_image(image_buffer):\n                \"\"\"Preprocess image bytes to 3D float Tensor.\"\"\"\n\n                # Decode image bytes\n                image = tf.image.decode_image(image_buffer)\n                image = tf.image.convert_image_dtype(image, dtype=tf.float32)\n\n                # make sure the image is of rank 3\n                image = tf.cond(\n                    tf.equal(tf.rank(image), 2),\n                    lambda: tf.expand_dims(image, 2),\n                    lambda: image\n                )\n\n                num_channels = tf.shape(image)[2]\n\n                # if we decoded 1 channel (grayscale), then convert to a RGB image\n                image = tf.cond(\n                    tf.equal(num_channels, 1),\n                    lambda: tf.image.grayscale_to_rgb(image),\n                    lambda: image\n                )\n\n                # if we decoded 2 channels (grayscale + alpha), then strip off the last dim and convert to rgb\n                image = tf.cond(\n          
          tf.equal(num_channels, 2),\n                    lambda: tf.image.grayscale_to_rgb(\n                        tf.expand_dims(image[:, :, 0], 2)),\n                    lambda: image\n                )\n\n                # if we decoded 4 or more channels (rgb + alpha), then take the first three channels\n                image = tf.cond(\n                    tf.greater(num_channels, 3),\n                    lambda: image[:, :, :3],\n                    lambda: image\n                )\n\n                # Resize the image to the input height and width for the network.\n                image = tf.expand_dims(image, 0)\n                image = tf.image.resize_bilinear(image,\n                                                 [input_height, input_width],\n                                                 align_corners=False)\n                image = tf.squeeze(image, [0])\n                # Finally, rescale to [-1,1] instead of [0, 1)\n                image = tf.subtract(image, 0.5)\n                image = tf.multiply(image, 2.0)\n                return image\n\n            image_bytes_placeholder = tf.placeholder(\n                tf.string, name=bytes_input_node_name)\n            preped_images = tf.map_fn(\n                preprocess_image, image_bytes_placeholder, dtype=tf.float32)\n            # Explicit name (we can't name the map_fn)\n            input_placeholder = tf.identity(\n                preped_images, name=array_input_node_name)\n\n        # We assume the client has preprocessed the data for us\n        else:\n            # Is the input coming in as a raveled vector? 
Or is it a tensor?\n            if raveled_input:\n                input_placeholder = tf.placeholder(tf.float32, shape=[batch_size, input_height * input_width * input_depth], name=array_input_node_name)\n            else:\n                input_placeholder = tf.placeholder(tf.float32, shape=[batch_size, input_height, input_width, input_depth], name=array_input_node_name)\n\n        # Reshape the images to proper tensors if they are coming in as vectors.\n        if raveled_input:\n            images = tf.reshape(input_placeholder,\n                                [-1, input_height, input_width, input_depth])\n        else:\n            images = input_placeholder\n\n        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()\n\n        with slim.arg_scope(arg_scope):\n            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](\n                inputs=images,\n                num_classes=cfg.NUM_CLASSES,\n                is_training=False\n            )\n\n        class_scores = end_points['Predictions']\n        if output_classes:\n            if class_names == None:\n                class_names = tf.range(class_scores.get_shape().as_list()[1])\n            predicted_classes = tf.tile(tf.expand_dims(class_names, 0), [\n                                        tf.shape(class_scores)[0], 1], name=class_names_node_name)\n\n        # GVH: I would like to use tf.identity here, but the function tensorflow.python.framework.graph_util.remove_training_nodes\n        # called in (optimize_for_inference_lib.optimize_for_inference) removes the identity function.\n        # Sticking with an add 0 operation for now.\n        # We are doing this so that we can rename the output to `output_node_name` (i.e. 
something consistent)\n        output_node = tf.add(\n            end_points['Predictions'], 0., name=output_node_name)\n        output_node_name = output_node.op.name\n\n        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:\n            variable_averages = tf.train.ExponentialMovingAverage(\n                cfg.MOVING_AVERAGE_DECAY, global_step)\n            variables_to_restore = variable_averages.variables_to_restore(\n                slim.get_model_variables())\n        else:\n            variables_to_restore = slim.get_variables_to_restore()\n\n        saver = tf.train.Saver(variables_to_restore, reshape=True)\n\n        if os.path.isdir(checkpoint_path):\n            checkpoint_dir = checkpoint_path\n            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)\n\n            if checkpoint_path is None:\n                raise ValueError(\"Unable to find a model checkpoint in the \"\n                                 \"directory %s\" % (checkpoint_dir,))\n\n        tf.logging.info('Exporting model: %s' % checkpoint_path)\n\n        sess_config = tf.ConfigProto(\n            log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,\n            allow_soft_placement=True,\n            gpu_options=tf.GPUOptions(\n                per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION\n            )\n        )\n        sess = tf.Session(graph=graph, config=sess_config)\n\n        if export_for_serving:\n\n            with tf.Session(graph=graph) as sess:\n\n                tf.global_variables_initializer().run()\n\n                saver.restore(sess, checkpoint_path)\n\n                save_path = os.path.join(export_dir, \"%d\" % (export_version,))\n\n                builder = saved_model_builder.SavedModelBuilder(save_path)\n\n                # Build the signature_def_map.\n                signature_def_map = {}\n                signature_def_outputs = {\n                    'scores': 
utils.build_tensor_info(class_scores)}\n                if output_classes:\n                    signature_def_outputs['classes'] = utils.build_tensor_info(\n                        predicted_classes)\n\n                # image bytes input\n                if add_preprocess_step:\n                    image_bytes_tensor_info = utils.build_tensor_info(\n                        image_bytes_placeholder)\n                    image_bytes_prediction_signature = signature_def_utils.build_signature_def(\n                        inputs={'images': image_bytes_tensor_info},\n                        outputs=signature_def_outputs,\n                        method_name=signature_constants.PREDICT_METHOD_NAME\n                    )\n                    signature_def_map['predict_image_bytes'] = image_bytes_prediction_signature\n\n                # image array input\n                image_array_tensor_info = utils.build_tensor_info(\n                    input_placeholder)\n                image_array_prediction_signature = signature_def_utils.build_signature_def(\n                    inputs={'images': image_array_tensor_info},\n                    outputs=signature_def_outputs,\n                    method_name=signature_constants.PREDICT_METHOD_NAME\n                )\n                signature_def_map['predict_image_array'] = image_array_prediction_signature\n                signature_def_map[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = image_array_prediction_signature\n\n                legacy_init_op = tf.group(\n                    tf.tables_initializer(), name='legacy_init_op')\n\n                builder.add_meta_graph_and_variables(\n                    sess, [tag_constants.SERVING],\n                    signature_def_map=signature_def_map,\n                    legacy_init_op=legacy_init_op\n                )\n\n                builder.save()\n\n                print(\"Saved optimized model for TensorFlow Serving.\")\n\n        else:\n            with 
sess.as_default():\n\n                tf.global_variables_initializer().run()\n\n                saver.restore(sess, checkpoint_path)\n\n                input_graph_def = graph.as_graph_def()\n                input_node_names = [array_input_node_name]\n                if add_preprocess_step:\n                    input_node_names.append(bytes_input_node_name)\n                output_node_names = [output_node_name]\n                if output_classes:\n                    output_node_names.append(class_names_node_name)\n\n                constant_graph_def = graph_util.convert_variables_to_constants(\n                    sess=sess,\n                    input_graph_def=input_graph_def,\n                    output_node_names=output_node_names,\n                    variable_names_whitelist=None,\n                    variable_names_blacklist=None\n                )\n\n                if add_preprocess_step:\n                    optimized_graph_def = constant_graph_def\n                else:\n                    optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(\n                        input_graph_def=constant_graph_def,\n                        input_node_names=input_node_names,\n                        output_node_names=output_node_names,\n                        placeholder_type_enum=dtypes.float32.as_datatype_enum\n                    )\n\n                save_dir = os.path.join(export_dir, str(export_version))\n                if not os.path.exists(save_dir):\n                    print(\"Making version directory in export directory: %s\" %\n                          (save_dir,))\n                    os.makedirs(save_dir)\n                save_path = os.path.join(save_dir, 'optimized_model.pb')\n                # SerializeToString() returns bytes, so the file must be opened in binary mode\n                with open(save_path, 'wb') as f:\n                    f.write(optimized_graph_def.SerializeToString())\n\n                print(\"Saved optimized model for mobile devices at: %s.\" %\n                      (save_path,))\n                
print(\"Input node names: %s\" % (input_node_names,))\n                print(\"Output node name: %s\" % (output_node_names,))\n\n                if export_tflite:\n\n                    # Patch the tensorflow lite conversion module\n                    # See here: https://github.com/tensorflow/tensorflow/issues/15410\n                    import tempfile\n                    import subprocess\n                    tf.contrib.lite.tempfile = tempfile\n                    tf.contrib.lite.subprocess = subprocess\n\n                    assert batch_size != None, \"We need a fixed batch size for the tensorflow lite export. (e.g. set --batch_size=1)\"\n\n                    tflite_model = tf.contrib.lite.toco_convert(\n                        optimized_graph_def, [input_placeholder], [output_node])\n                    tflite_save_path = os.path.join(\n                        save_dir, 'optimized_model.tflite')\n                    with open(tflite_save_path, 'wb') as f:\n                        f.write(tflite_model)\n\n                    print()\n                    print(\"Saved optimized model for tensorflow lite: %s.\" %\n                          (tflite_save_path,))\n                    print(\"Input node names: %s\" % (input_node_names,))\n                    print(\"Output node name: %s\" % (output_node_name,))\n\n    # We have to get out of the graph scope.\n    if export_coreml:\n        try:\n            import tfcoreml as tf_converter\n        except:\n            raise ValueError(\"Can't import tfcoreml, so we can't create a coreml model.\")\n\n        assert batch_size != None, \"We need a fixed batch size for the coreml export. (e.g. set --batch_size=1)\"\n        assert raveled_input == False, \"The input cannot be raveled. 
CoreML does not support `reshape()`.\"\n\n        coreml_save_path = os.path.join(save_dir, 'optimized_model.mlmodel')\n        tf_converter.convert(tf_model_path=save_path,\n                             mlmodel_path=coreml_save_path,\n                             output_feature_names=[output_node_name + \":0\"],\n                             input_name_shape_dict={'images:0': [\n                                 batch_size, input_height, input_width, input_depth]}\n                             )\n\n        print()\n        print(\"Saved optimized model for coreml: %s.\" % (coreml_save_path,))\n        print(\"Input node names: %s\" % (input_node_names,))\n        print(\"Output node name: %s\" % (output_node_name,))\n\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(\n        description='Export a trained model for application use')\n\n    parser.add_argument('--checkpoint_path', dest='checkpoint_path',\n                        help='Path to the specific model you want to export.',\n                        required=True, type=str)\n\n    parser.add_argument('--export_dir', dest='export_dir',\n                        help='Path to a directory where the exported model will be saved.',\n                        required=True, type=str)\n\n    parser.add_argument('--export_version', dest='export_version',\n                        help='Version number of the model.',\n                        required=True, type=int)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file',\n                        required=True, type=str)\n\n    parser.add_argument('--serving', dest='serving',\n                        help='Export for TensorFlow Serving usage. 
Otherwise, a constant graph will be generated.',\n                        action='store_true', default=False)\n\n    parser.add_argument('--export_tflite', dest='export_tflite',\n                        help='If True, then a tensorflow lite file will be produced along with the normal tensorflow model export (This is ignored if --serving is present).',\n                        action='store_true', default=False)\n\n    parser.add_argument('--export_coreml', dest='export_coreml',\n                        help='If True, then a coreml file will be produced along with the normal tensorflow model export (This is ignored if --serving is present).',\n                        action='store_true', default=False)\n\n    parser.add_argument('--add_preprocess', dest='add_preprocess',\n                        help='Add the image decoding and preprocessing nodes to the graph so that image bytes can be passed in.',\n                        action='store_true', default=False)\n\n    parser.add_argument('--output_classes', dest='output_classes',\n                        help='If True, then class indices (or names if `class_names` is provided) are output along with the scores.',\n                        action='store_true', default=False)\n\n    parser.add_argument('--class_names', dest='class_names_path',\n                        help='Path to the class names corresponding to each entry in the predictions output. This file should have one line for each index.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='Use this to specify a fixed batch size. Leave as None to have a flexible batch size. 
This must be specified to create tflite and coreml exports.',\n                        required=False, type=int, default=None)\n\n    parser.add_argument('--raveled_input', dest='raveled_input',\n                        help='If True, then the input is considered to be a vector that will be reshaped to the proper tensor form. This cannot be used with coreml.',\n                        action='store_true', default=False)\n\n    args = parser.parse_args()\n\n    return args\n\n\nif __name__ == '__main__':\n\n    args = parse_args()\n    cfg = parse_config_file(args.config_file)\n\n    # Read one class name per line; line i names prediction index i.\n    if args.class_names_path is not None:\n        class_names = []\n        with open(args.class_names_path) as f:\n            for line in f:\n                class_names.append(line.strip())\n    else:\n        class_names = None\n\n    export(checkpoint_path=args.checkpoint_path,\n           export_dir=args.export_dir,\n           export_version=args.export_version,\n           export_for_serving=args.serving,\n           export_tflite=args.export_tflite,\n           export_coreml=args.export_coreml,\n           add_preprocess_step=args.add_preprocess,\n           output_classes=args.output_classes,\n           class_names=class_names,\n           batch_size=args.batch_size,\n           raveled_input=args.raveled_input,\n           cfg=cfg\n    )\n"
  },
  {
    "path": "extract.py",
    "content": "\"\"\"\nExtract features.\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport os\nimport time\n\nimport numpy as np\nimport tensorflow as tf\nimport tensorflow.contrib.slim as slim\n\nfrom config.parse_config import parse_config_file\nfrom nets import nets_factory\nfrom preprocessing import inputs\n\ndef extract_features(tfrecords, checkpoint_path, num_iterations, feature_keys, cfg, read_images=False):\n    \"\"\"\n    Extract and return the features\n    \"\"\"\n\n    tf.logging.set_verbosity(tf.logging.INFO)\n\n    graph = tf.Graph()\n\n    with graph.as_default():\n\n        global_step = slim.get_or_create_global_step()\n\n        with tf.device('/cpu:0'):\n            batch_dict = inputs.input_nodes(\n                tfrecords=tfrecords,\n                cfg=cfg.IMAGE_PROCESSING,\n                num_epochs=1,\n                batch_size=cfg.BATCH_SIZE,\n                num_threads=cfg.NUM_INPUT_THREADS,\n                shuffle_batch =cfg.SHUFFLE_QUEUE,\n                random_seed=cfg.RANDOM_SEED,\n                capacity=cfg.QUEUE_CAPACITY,\n                min_after_dequeue=cfg.QUEUE_MIN,\n                add_summaries=False,\n                input_type='classification',\n                read_filenames=read_images\n            )\n\n        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()\n\n        with slim.arg_scope(arg_scope):\n            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](\n                inputs=batch_dict['inputs'],\n                num_classes=cfg.NUM_CLASSES,\n                is_training=False\n            )\n\n            predicted_labels = tf.argmax(end_points['Predictions'], 1)\n\n        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:\n            variable_averages = tf.train.ExponentialMovingAverage(\n                cfg.MOVING_AVERAGE_DECAY, global_step)\n            
variables_to_restore = variable_averages.variables_to_restore(\n                slim.get_model_variables())\n            variables_to_restore[global_step.op.name] = global_step\n        else:\n            variables_to_restore = slim.get_variables_to_restore()\n            variables_to_restore.append(global_step)\n\n\n        saver = tf.train.Saver(variables_to_restore, reshape=True)\n\n        num_batches = num_iterations\n        num_items = num_batches * cfg.BATCH_SIZE\n\n        fetches = []\n        feature_stores = []\n        for feature_key in feature_keys:\n            feature = tf.reshape(end_points[feature_key], [cfg.BATCH_SIZE, -1])\n            num_elements = feature.get_shape().as_list()[1]\n            feature_stores.append(np.empty([num_items, num_elements], dtype=np.float32))\n            fetches.append(feature)\n\n        fetches.append(batch_dict['ids'])\n        feature_stores.append(np.empty(num_items, dtype=np.object))\n\n        if os.path.isdir(checkpoint_path):\n            checkpoint_dir = checkpoint_path\n            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)\n\n            if checkpoint_path is None:\n                raise ValueError(\"Unable to find a model checkpoint in the \" \\\n                                 \"directory %s\" % (checkpoint_dir,))\n\n        tf.logging.info('Classifying records using %s' % checkpoint_path)\n\n        coord = tf.train.Coordinator()\n\n        sess_config = tf.ConfigProto(\n                log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,\n                allow_soft_placement = True,\n                gpu_options = tf.GPUOptions(\n                    per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION\n                )\n            )\n        sess = tf.Session(graph=graph, config=sess_config)\n\n        with sess.as_default():\n\n            tf.global_variables_initializer().run()\n            tf.local_variables_initializer().run()\n          
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n\n            try:\n\n                # Restore from checkpoint\n                saver.restore(sess, checkpoint_path)\n\n                print_str = ', '.join([\n                  'Step: %d',\n                  'Time/image (ms): %.1f'\n                ])\n\n                step = 0\n                while not coord.should_stop():\n\n                    t = time.time()\n                    outputs = sess.run(fetches)\n                    dt = time.time()-t\n\n                    idx1 = cfg.BATCH_SIZE * step\n                    idx2 = idx1 + cfg.BATCH_SIZE\n\n                    for i in range(len(outputs)):\n                        feature_stores[i][idx1:idx2] = outputs[i]\n\n                    step += 1\n                    print(print_str % (step, (dt / cfg.BATCH_SIZE) * 1000))\n\n                    if num_iterations > 0 and step == num_iterations:\n                        break\n\n            except tf.errors.OutOfRangeError:\n                # The input queue is exhausted (num_epochs=1), so we are done.\n                pass\n\n        coord.request_stop()\n        coord.join(threads)\n\n        feature_dict = {feature_key : feature for feature_key, feature in zip(feature_keys, feature_stores[:-1])}\n        feature_dict['ids'] = feature_stores[-1]\n\n        return feature_dict\n\ndef extract_and_save(tfrecords, checkpoint_path, save_path, num_iterations, feature_keys, cfg, read_images=False):\n    \"\"\"Extract and save the features\n    Args:\n        tfrecords (list)\n        checkpoint_path (str)\n        save_path (str)\n        num_iterations (int)\n        feature_keys (list)\n        cfg (EasyDict)\n        read_images (bool)\n    \"\"\"\n\n    feature_dict = extract_features(tfrecords, checkpoint_path, num_iterations, feature_keys, cfg, read_images=read_images)\n\n    # save the results\n    np.savez(save_path, **feature_dict)\n\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(description='Extract features from images and save them to a file.')\n\n    
parser.add_argument('--tfrecords', dest='tfrecords',\n                        help='Paths to tfrecords.', type=str,\n                        nargs='+', required=True)\n\n    parser.add_argument('--checkpoint_path', dest='checkpoint_path',\n                          help='Path to the model checkpoint to extract features with. If a directory, then the newest checkpoint file will be used.', type=str,\n                          required=True)\n\n    parser.add_argument('--save_path', dest='save_path',\n                          help='Path of the file where the extracted features will be saved.', type=str,\n                          required=True)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file',\n                        required=True, type=str)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='The number of images in a batch.',\n                        required=True, type=int)\n\n    parser.add_argument('--batches', dest='batches',\n                        help='Number of batches to process. Storage for batches * batch_size feature rows is preallocated.',\n                        required=True, type=int)\n\n    parser.add_argument('--features', dest='features',\n                        help='The features to extract. 
These are keys into the end_points dictionary returned by the model architecture.',\n                        type=str, nargs='+', required=True)\n\n    parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the architecture to use.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--read_images', dest='read_images',\n                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',\n                        action='store_true', default=False)\n\n\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n    args = parse_args()\n\n    cfg = parse_config_file(args.config_file)\n\n    # Command line values override the configuration file.\n    if args.batch_size is not None:\n        cfg.BATCH_SIZE = args.batch_size\n\n    if args.model_name is not None:\n        cfg.MODEL_NAME = args.model_name\n\n    extract_and_save(\n        tfrecords=args.tfrecords,\n        checkpoint_path=args.checkpoint_path,\n        save_path=args.save_path,\n        num_iterations=args.batches,\n        feature_keys=args.features,\n        cfg=cfg,\n        read_images=args.read_images\n    )\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "nets/README.md",
    "content": "# Models\n\nThis directory contains the available classification models. All of these models were copied from the [TensorFlow Models repo](https://github.com/tensorflow/models/tree/master/slim/nets) and updated to TensorFlow r1.0.\n\nThe table below lists relevant information for each model. To use one of these models (e.g. when using the training scripts), simply set the `--model_name` flag to the appropriate name. The number of parameters and the number of flops were computed using the `profile` function in [net_profile.py](net_profile.py). I assumed a batch size of 1 and 1000 classes for all models. All available checkpoint files are from models trained on the [ILSVRC-2012-CLS](http://www.image-net.org/challenges/LSVRC/2012/) dataset. Top-1 and Top-5 numbers correspond to performance on that dataset. When fine-tuning from one of these checkpoints, it is recommended to use the same image size as the default image size for that model.\n\n| Model | Name | TF-Slim File | Checkpoint | Top-1 Accuracy | Top-5 Accuracy | Default Image Size | Num Params | Num Flops |\n:----:|:----:|:------------:|:----------:|:-------:|:--------:|:--------:|:--------:|:--------:|\n[Inception V1](http://arxiv.org/abs/1409.4842v1) | inception_v1 | [Code](inception_v1.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz) | 69.8 | 89.6 | 224px | 6,617,624 | 3.00b |\n[Inception V2](http://arxiv.org/abs/1502.03167) | inception_v2 | [Code](inception_v2.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v2_2016_08_28.tar.gz) | 73.9 | 91.8 | 224px | 11,178,336 | 3.87b |\n[Inception V3](http://arxiv.org/abs/1512.00567) | inception_v3 | [Code](inception_v3.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz) | 78.0 | 93.9 | 299px | 27,143,152 | 11.44b |\n[Inception V4](http://arxiv.org/abs/1602.07261) | inception_v4 | [Code](inception_v4.py) | 
[Checkpoint](http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz) | 80.2 | 95.2 | 299px | 46,006,800 | 24.52b |\n[Inception-ResNet-v2](http://arxiv.org/abs/1602.07261) | inception_resnet_v2 | [Code](inception_resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz) | 80.4 | 95.3 | 299px | 59,179,952 | 26.34b |\n[ResNet V2 50](https://arxiv.org/abs/1603.05027) | resnet_v2_50 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz) | 75.6 | 92.8 | 299px | 25,568,360 | 13.08b |\n[ResNet V2 101](https://arxiv.org/abs/1603.05027) | resnet_v2_101 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_101_2017_04_14.tar.gz) | 77.0 | 93.7 | 299px | 44,577,896 | 26.77b |\n[ResNet V2 152](https://arxiv.org/abs/1603.05027) | resnet_v2_152 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_152_2017_04_14.tar.gz) | 77.8 | 94.1 | 299px | 60,236,904 | 40.45b |\n[MobileNet-v1](https://arxiv.org/abs/1704.04861) | mobilenet_v1 | [Code](mobilenet_v1.py) | [Checkpoint](http://download.tensorflow.org/models/mobilenet_v1_1.0_224_2017_06_14.tar.gz) | 70.7 | 89.5 | 224px | 4,231,976 | 1.14b |\n\n# Finetuning\n\nWhen you finetune one of the above models, you'll start the training procedure using something like:\n```\npython train.py \\\n--tfrecords $DATASET_DIR/train* \\\n--logdir $EXPERIMENT_DIR/logdir \\\n--config $EXPERIMENT_DIR/config_train.yaml \\\n--pretrained_model $PRETRAINED_MODEL \\\n--checkpoint_exclude_scopes <model specific scopes>\n```\n\nThe `--checkpoint_exclude_scopes` argument allows you to prevent restoring variables that have different sizes, which are typically your logit variables (which have a different size due to the number of classes in your application being different than the number of classes in ImageNet). 
The table below provides the proper value for `--checkpoint_exclude_scopes` for each model.\n\n| Model | Name | TF-Slim File | Default Image Size | Exclude Scopes |\n:----:|:----:|:------------:|:----------:|:-------:|\n[Inception V1](http://arxiv.org/abs/1409.4842v1) | inception_v1 | [Code](inception_v1.py) | 224px | InceptionV1/Logits |\n[Inception V2](http://arxiv.org/abs/1502.03167) | inception_v2 | [Code](inception_v2.py) | 224px | InceptionV2/Logits |\n[Inception V3](http://arxiv.org/abs/1512.00567) | inception_v3 | [Code](inception_v3.py) | 299px | InceptionV3/Logits InceptionV3/AuxLogits |\n[Inception V4](http://arxiv.org/abs/1602.07261) | inception_v4 | [Code](inception_v4.py) | 299px | InceptionV4/Logits InceptionV4/AuxLogits |\n[Inception-ResNet-v2](http://arxiv.org/abs/1602.07261) | inception_resnet_v2 | [Code](inception_resnet_v2.py) | 299px | InceptionResnetV2/Logits InceptionResnetV2/AuxLogits |\n[ResNet V2 50](https://arxiv.org/abs/1603.05027) | resnet_v2_50 | [Code](resnet_v2.py) | 224px | resnet_v2_50/logits |\n[ResNet V2 101](https://arxiv.org/abs/1603.05027) | resnet_v2_101 | [Code](resnet_v2.py) | 224px | resnet_v2_101/logits |\n[ResNet V2 152](https://arxiv.org/abs/1603.05027) | resnet_v2_152 | [Code](resnet_v2.py) | 224px | resnet_v2_152/logits |\n[MobileNet-v1](https://arxiv.org/abs/1704.04861) | mobilenet_v1 | [Code](mobilenet_v1.py) | 224px | MobilenetV1/Logits |\n"
  },
  {
    "path": "nets/__init__.py",
    "content": "\n"
  },
  {
    "path": "nets/inception.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Brings all inception models under one namespace.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\n# pylint: disable=unused-import\nfrom nets.inception_resnet_v2 import inception_resnet_v2\nfrom nets.inception_resnet_v2 import inception_resnet_v2_arg_scope\nfrom nets.inception_v1 import inception_v1\nfrom nets.inception_v1 import inception_v1_arg_scope\nfrom nets.inception_v1 import inception_v1_base\nfrom nets.inception_v2 import inception_v2\nfrom nets.inception_v2 import inception_v2_arg_scope\nfrom nets.inception_v2 import inception_v2_base\nfrom nets.inception_v3 import inception_v3\nfrom nets.inception_v3 import inception_v3_arg_scope\nfrom nets.inception_v3 import inception_v3_base\nfrom nets.inception_v4 import inception_v4\nfrom nets.inception_v4 import inception_v4_arg_scope\nfrom nets.inception_v4 import inception_v4_base\n# pylint: enable=unused-import\n"
  },
  {
    "path": "nets/inception_resnet_v2.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the definition of the Inception Resnet V2 architecture.\n\nAs described in http://arxiv.org/abs/1602.07261.\n\n  Inception-v4, Inception-ResNet and the Impact of Residual Connections\n    on Learning\n  Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\n\nimport tensorflow as tf\n\nslim = tf.contrib.slim\n\n\ndef block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):\n  \"\"\"Builds the 35x35 resnet block.\"\"\"\n  with tf.variable_scope(scope, 'Block35', [net], reuse=reuse):\n    with tf.variable_scope('Branch_0'):\n      tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1')\n    with tf.variable_scope('Branch_1'):\n      tower_conv1_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')\n      tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3')\n    with tf.variable_scope('Branch_2'):\n      tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')\n      tower_conv2_1 = slim.conv2d(tower_conv2_0, 48, 3, scope='Conv2d_0b_3x3')\n      tower_conv2_2 = slim.conv2d(tower_conv2_1, 64, 3, scope='Conv2d_0c_3x3')\n    mixed = tf.concat(axis=3, values=[tower_conv, 
tower_conv1_1, tower_conv2_2])\n    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,\n                     activation_fn=None, scope='Conv2d_1x1')\n    net += scale * up\n    if activation_fn:\n      net = activation_fn(net)\n  return net\n\n\ndef block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):\n  \"\"\"Builds the 17x17 resnet block.\"\"\"\n  with tf.variable_scope(scope, 'Block17', [net], reuse=reuse):\n    with tf.variable_scope('Branch_0'):\n      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')\n    with tf.variable_scope('Branch_1'):\n      tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1')\n      tower_conv1_1 = slim.conv2d(tower_conv1_0, 160, [1, 7],\n                                  scope='Conv2d_0b_1x7')\n      tower_conv1_2 = slim.conv2d(tower_conv1_1, 192, [7, 1],\n                                  scope='Conv2d_0c_7x1')\n    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])\n    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,\n                     activation_fn=None, scope='Conv2d_1x1')\n    net += scale * up\n    if activation_fn:\n      net = activation_fn(net)\n  return net\n\n\ndef block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):\n  \"\"\"Builds the 8x8 resnet block.\"\"\"\n  with tf.variable_scope(scope, 'Block8', [net], reuse=reuse):\n    with tf.variable_scope('Branch_0'):\n      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')\n    with tf.variable_scope('Branch_1'):\n      tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1')\n      tower_conv1_1 = slim.conv2d(tower_conv1_0, 224, [1, 3],\n                                  scope='Conv2d_0b_1x3')\n      tower_conv1_2 = slim.conv2d(tower_conv1_1, 256, [3, 1],\n                                  scope='Conv2d_0c_3x1')\n    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])\n    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,\n    
                 activation_fn=None, scope='Conv2d_1x1')\n    net += scale * up\n    if activation_fn:\n      net = activation_fn(net)\n  return net\n\n\ndef inception_resnet_v2(inputs, num_classes=1001, is_training=True,\n                        dropout_keep_prob=0.8,\n                        reuse=None,\n                        scope='InceptionResnetV2'):\n  \"\"\"Creates the Inception Resnet V2 model.\n\n  Args:\n    inputs: a 4-D tensor of size [batch_size, height, width, 3].\n    num_classes: number of predicted classes.\n    is_training: whether is training or not.\n    dropout_keep_prob: float, the fraction to keep before final layer.\n    reuse: whether or not the network and its variables should be reused. To be\n      able to reuse 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    logits: the logits outputs of the model.\n    end_points: the set of end_points from the inception model.\n  \"\"\"\n  end_points = {}\n\n  with tf.variable_scope(scope, 'InceptionResnetV2', [inputs], reuse=reuse):\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                          stride=1, padding='SAME'):\n\n        # 149 x 149 x 32\n        net = slim.conv2d(inputs, 32, 3, stride=2, padding='VALID',\n                          scope='Conv2d_1a_3x3')\n        end_points['Conv2d_1a_3x3'] = net\n        # 147 x 147 x 32\n        net = slim.conv2d(net, 32, 3, padding='VALID',\n                          scope='Conv2d_2a_3x3')\n        end_points['Conv2d_2a_3x3'] = net\n        # 147 x 147 x 64\n        net = slim.conv2d(net, 64, 3, scope='Conv2d_2b_3x3')\n        end_points['Conv2d_2b_3x3'] = net\n        # 73 x 73 x 64\n        net = slim.max_pool2d(net, 3, stride=2, padding='VALID',\n                              scope='MaxPool_3a_3x3')\n        end_points['MaxPool_3a_3x3'] = net\n        # 73 x 73 x 80\n     
   net = slim.conv2d(net, 80, 1, padding='VALID',\n                          scope='Conv2d_3b_1x1')\n        end_points['Conv2d_3b_1x1'] = net\n        # 71 x 71 x 192\n        net = slim.conv2d(net, 192, 3, padding='VALID',\n                          scope='Conv2d_4a_3x3')\n        end_points['Conv2d_4a_3x3'] = net\n        # 35 x 35 x 192\n        net = slim.max_pool2d(net, 3, stride=2, padding='VALID',\n                              scope='MaxPool_5a_3x3')\n        end_points['MaxPool_5a_3x3'] = net\n\n        # 35 x 35 x 320\n        with tf.variable_scope('Mixed_5b'):\n          with tf.variable_scope('Branch_0'):\n            tower_conv = slim.conv2d(net, 96, 1, scope='Conv2d_1x1')\n          with tf.variable_scope('Branch_1'):\n            tower_conv1_0 = slim.conv2d(net, 48, 1, scope='Conv2d_0a_1x1')\n            tower_conv1_1 = slim.conv2d(tower_conv1_0, 64, 5,\n                                        scope='Conv2d_0b_5x5')\n          with tf.variable_scope('Branch_2'):\n            tower_conv2_0 = slim.conv2d(net, 64, 1, scope='Conv2d_0a_1x1')\n            tower_conv2_1 = slim.conv2d(tower_conv2_0, 96, 3,\n                                        scope='Conv2d_0b_3x3')\n            tower_conv2_2 = slim.conv2d(tower_conv2_1, 96, 3,\n                                        scope='Conv2d_0c_3x3')\n          with tf.variable_scope('Branch_3'):\n            tower_pool = slim.avg_pool2d(net, 3, stride=1, padding='SAME',\n                                         scope='AvgPool_0a_3x3')\n            tower_pool_1 = slim.conv2d(tower_pool, 64, 1,\n                                       scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[tower_conv, tower_conv1_1,\n                              tower_conv2_2, tower_pool_1])\n\n        end_points['Mixed_5b'] = net\n        net = slim.repeat(net, 10, block35, scale=0.17)\n\n        # 17 x 17 x 1024\n        with tf.variable_scope('Mixed_6a'):\n          with tf.variable_scope('Branch_0'):\n            
tower_conv = slim.conv2d(net, 384, 3, stride=2, padding='VALID',\n                                     scope='Conv2d_1a_3x3')\n          with tf.variable_scope('Branch_1'):\n            tower_conv1_0 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')\n            tower_conv1_1 = slim.conv2d(tower_conv1_0, 256, 3,\n                                        scope='Conv2d_0b_3x3')\n            tower_conv1_2 = slim.conv2d(tower_conv1_1, 384, 3,\n                                        stride=2, padding='VALID',\n                                        scope='Conv2d_1a_3x3')\n          with tf.variable_scope('Branch_2'):\n            tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',\n                                         scope='MaxPool_1a_3x3')\n          net = tf.concat(axis=3, values=[tower_conv, tower_conv1_2, tower_pool])\n\n        end_points['Mixed_6a'] = net\n        net = slim.repeat(net, 20, block17, scale=0.10)\n\n        # Auxiliary tower\n        with tf.variable_scope('AuxLogits'):\n          # Originally, kernel_size = 5\n          # However, if we change the input size then we need to change the kernel size\n          # We want to pool the feature map to be 5x5xC\n          # With VALID padding and stride 3, the output size is (H - k) / 3 + 1,\n          # so an output of 5 means our kernel is H - 12\n          kernel_size = [net.get_shape().as_list()[1] - 12] * 2\n          aux = slim.avg_pool2d(net, kernel_size, stride=3, padding='VALID',\n                                scope='Conv2d_1a_3x3')\n          aux = slim.conv2d(aux, 128, 1, scope='Conv2d_1b_1x1')\n          aux = slim.conv2d(aux, 768, aux.get_shape()[1:3],\n                            padding='VALID', scope='Conv2d_2a_5x5')\n          aux = slim.flatten(aux)\n          aux = slim.fully_connected(aux, num_classes, activation_fn=None,\n                                     scope='Logits')\n          end_points['AuxLogits'] = aux\n\n        with tf.variable_scope('Mixed_7a'):\n          with tf.variable_scope('Branch_0'):\n            
tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')\n            tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2,\n                                       padding='VALID', scope='Conv2d_1a_3x3')\n          with tf.variable_scope('Branch_1'):\n            tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')\n            tower_conv1_1 = slim.conv2d(tower_conv1, 288, 3, stride=2,\n                                        padding='VALID', scope='Conv2d_1a_3x3')\n          with tf.variable_scope('Branch_2'):\n            tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')\n            tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3,\n                                        scope='Conv2d_0b_3x3')\n            tower_conv2_2 = slim.conv2d(tower_conv2_1, 320, 3, stride=2,\n                                        padding='VALID', scope='Conv2d_1a_3x3')\n          with tf.variable_scope('Branch_3'):\n            tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',\n                                         scope='MaxPool_1a_3x3')\n          net = tf.concat(axis=3, values=[tower_conv_1, tower_conv1_1,\n                              tower_conv2_2, tower_pool])\n\n        end_points['Mixed_7a'] = net\n\n        net = slim.repeat(net, 9, block8, scale=0.20)\n        net = block8(net, activation_fn=None)\n\n        net = slim.conv2d(net, 1536, 1, scope='Conv2d_7b_1x1')\n        end_points['Conv2d_7b_1x1'] = net\n\n        with tf.variable_scope('Logits'):\n          end_points['PrePool'] = net\n          net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',\n                                scope='AvgPool_1a_8x8')\n          net = slim.flatten(net)\n\n          net = slim.dropout(net, dropout_keep_prob, is_training=is_training,\n                             scope='Dropout')\n\n          end_points['PreLogitsFlatten'] = net\n          logits = slim.fully_connected(net, num_classes, activation_fn=None,\n                          
              scope='Logits')\n          end_points['Logits'] = logits\n          end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions')\n\n    return logits, end_points\ninception_resnet_v2.default_image_size = 299\n\n\ndef inception_resnet_v2_arg_scope(weight_decay=0.00004,\n                                  batch_norm_decay=0.9997,\n                                  batch_norm_epsilon=0.001):\n  \"\"\"Returns the scope with the default parameters for inception_resnet_v2.\n\n  Args:\n    weight_decay: the weight decay for weight variables.\n    batch_norm_decay: decay for the moving average of batch_norm momentums.\n    batch_norm_epsilon: small float added to variance to avoid dividing by zero.\n\n  Returns:\n    an arg_scope with the parameters needed for inception_resnet_v2.\n  \"\"\"\n  # Set weight_decay for weights in conv2d and fully_connected layers.\n  with slim.arg_scope([slim.conv2d, slim.fully_connected],\n                      weights_regularizer=slim.l2_regularizer(weight_decay),\n                      biases_regularizer=slim.l2_regularizer(weight_decay)):\n\n    batch_norm_params = {\n        'decay': batch_norm_decay,\n        'epsilon': batch_norm_epsilon,\n    }\n    # Set activation_fn and parameters for batch_norm.\n    with slim.arg_scope([slim.conv2d], activation_fn=tf.nn.relu,\n                        normalizer_fn=slim.batch_norm,\n                        normalizer_params=batch_norm_params) as scope:\n      return scope\n"
  },
  {
    "path": "nets/inception_resnet_v2_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for slim.inception_resnet_v2.\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception\n\n\nclass InceptionTest(tf.test.TestCase):\n\n  def testBuildLogits(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session():\n      inputs = tf.random_uniform((batch_size, height, width, 3))\n      logits, _ = inception.inception_resnet_v2(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n\n  def testBuildEndPoints(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session():\n      inputs = tf.random_uniform((batch_size, height, width, 3))\n      _, end_points = inception.inception_resnet_v2(inputs, num_classes)\n      self.assertTrue('Logits' in end_points)\n      logits = end_points['Logits']\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      self.assertTrue('AuxLogits' in end_points)\n      aux_logits = 
end_points['AuxLogits']\n      self.assertListEqual(aux_logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['PrePool']\n      self.assertListEqual(pre_pool.get_shape().as_list(),\n                           [batch_size, 8, 8, 1536])\n\n  def testVariablesSetDevice(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session():\n      inputs = tf.random_uniform((batch_size, height, width, 3))\n      # Force all Variables to reside on the device.\n      with tf.variable_scope('on_cpu'), tf.device('/cpu:0'):\n        inception.inception_resnet_v2(inputs, num_classes)\n      with tf.variable_scope('on_gpu'), tf.device('/gpu:0'):\n        inception.inception_resnet_v2(inputs, num_classes)\n      for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_cpu'):\n        self.assertDeviceEqual(v.device, '/cpu:0')\n      for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_gpu'):\n        self.assertDeviceEqual(v.device, '/gpu:0')\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 150, 150\n    num_classes = 1000\n    with self.test_session():\n      inputs = tf.random_uniform((batch_size, height, width, 3))\n      logits, end_points = inception.inception_resnet_v2(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['PrePool']\n      self.assertListEqual(pre_pool.get_shape().as_list(),\n                           [batch_size, 3, 3, 1536])\n\n  def testUnknownBatchSize(self):\n    batch_size = 1\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n      logits, _ = inception.inception_resnet_v2(inputs, num_classes)\n      
self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [None, num_classes])\n      images = tf.random_uniform((batch_size, height, width, 3))\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEquals(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session() as sess:\n      eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n      logits, _ = inception.inception_resnet_v2(eval_inputs,\n                                                num_classes,\n                                                is_training=False)\n      predictions = tf.argmax(logits, 1)\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 150, 150\n    num_classes = 1000\n    with self.test_session() as sess:\n      train_inputs = tf.random_uniform((train_batch_size, height, width, 3))\n      inception.inception_resnet_v2(train_inputs, num_classes)\n      eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n      logits, _ = inception.inception_resnet_v2(eval_inputs,\n                                                num_classes,\n                                                is_training=False,\n                                                reuse=True)\n      predictions = tf.argmax(logits, 1)\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (eval_batch_size,))\n\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/inception_utils.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains common code shared by all inception models.\n\nUsage of arg scope:\n  with slim.arg_scope(inception_arg_scope()):\n    logits, end_points = inception.inception_v3(images, num_classes,\n                                                is_training=is_training)\n\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nslim = tf.contrib.slim\n\n\ndef inception_arg_scope(weight_decay=0.00004,\n                        use_batch_norm=True,\n                        batch_norm_decay=0.9997,\n                        batch_norm_epsilon=0.001):\n  \"\"\"Defines the default arg scope for inception models.\n\n  Args:\n    weight_decay: The weight decay to use for regularizing the model.\n    use_batch_norm: If `True`, batch_norm is applied after each convolution.\n    batch_norm_decay: Decay for batch norm moving average.\n    batch_norm_epsilon: Small float added to variance to avoid dividing by zero\n      in batch norm.\n\n  Returns:\n    An `arg_scope` to use for the inception models.\n  \"\"\"\n  batch_norm_params = {\n      # Decay for the moving averages.\n      'decay': batch_norm_decay,\n      # epsilon to prevent 0s in variance.\n      'epsilon': 
batch_norm_epsilon,\n      # collection containing update_ops.\n      'updates_collections': tf.GraphKeys.UPDATE_OPS,\n  }\n  if use_batch_norm:\n    normalizer_fn = slim.batch_norm\n    normalizer_params = batch_norm_params\n  else:\n    normalizer_fn = None\n    normalizer_params = {}\n  # Set weight_decay for weights in Conv and FC layers.\n  with slim.arg_scope([slim.conv2d, slim.fully_connected],\n                      weights_regularizer=slim.l2_regularizer(weight_decay)):\n    with slim.arg_scope(\n        [slim.conv2d],\n        weights_initializer=slim.variance_scaling_initializer(),\n        activation_fn=tf.nn.relu,\n        normalizer_fn=normalizer_fn,\n        normalizer_params=normalizer_params) as sc:\n      return sc\n"
  },
  {
    "path": "nets/inception_v1.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the definition for inception v1 classification network.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception_utils\n\nslim = tf.contrib.slim\ntrunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)\n\n\ndef inception_v1_base(inputs,\n                      final_endpoint='Mixed_5c',\n                      scope='InceptionV1'):\n  \"\"\"Defines the Inception V1 base architecture.\n\n  This architecture is defined in:\n    Going deeper with convolutions\n    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,\n    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.\n    http://arxiv.org/pdf/1409.4842v1.pdf.\n\n  Args:\n    inputs: a tensor of size [batch_size, height, width, channels].\n    final_endpoint: specifies the endpoint to construct the network up to. 
It\n      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',\n      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',\n      'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',\n      'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b', 'Mixed_5c']\n    scope: Optional variable_scope.\n\n  Returns:\n    A dictionary from components of the network to the corresponding activation.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values.\n  \"\"\"\n  end_points = {}\n  with tf.variable_scope(scope, 'InceptionV1', [inputs]):\n    with slim.arg_scope(\n        [slim.conv2d, slim.fully_connected],\n        weights_initializer=trunc_normal(0.01)):\n      with slim.arg_scope([slim.conv2d, slim.max_pool2d],\n                          stride=1, padding='SAME'):\n        end_point = 'Conv2d_1a_7x7'\n        net = slim.conv2d(inputs, 64, [7, 7], stride=2, scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n        end_point = 'MaxPool_2a_3x3'\n        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n        end_point = 'Conv2d_2b_1x1'\n        net = slim.conv2d(net, 64, [1, 1], scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n        end_point = 'Conv2d_2c_3x3'\n        net = slim.conv2d(net, 192, [3, 3], scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n        end_point = 'MaxPool_3a_3x3'\n        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_3b'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            
branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 128, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 32, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_3c'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 192, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'MaxPool_4a_3x3'\n        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)\n        
end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_4b'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 208, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 48, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_4c'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n     
   end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_4d'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 256, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_4e'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 144, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 288, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n 
       end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_4f'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'MaxPool_5a_2x2'\n        net = slim.max_pool2d(net, [2, 2], stride=2, scope=end_point)\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_5b'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0a_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = 
slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n\n        end_point = 'Mixed_5c'\n        with tf.variable_scope(end_point):\n          with tf.variable_scope('Branch_0'):\n            branch_0 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')\n          with tf.variable_scope('Branch_1'):\n            branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')\n            branch_1 = slim.conv2d(branch_1, 384, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_2'):\n            branch_2 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')\n            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')\n          with tf.variable_scope('Branch_3'):\n            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')\n          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if final_endpoint == end_point: return net, end_points\n    raise ValueError('Unknown final endpoint %s' % final_endpoint)\n\n\ndef inception_v1(inputs,\n                 num_classes=1000,\n                 is_training=True,\n                 dropout_keep_prob=0.8,\n                 prediction_fn=slim.softmax,\n                 spatial_squeeze=True,\n                 reuse=None,\n                 scope='InceptionV1'):\n  \"\"\"Defines the Inception V1 architecture.\n\n  This architecture is defined in:\n\n    Going deeper with convolutions\n    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,\n    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.\n    
http://arxiv.org/pdf/1409.4842v1.pdf.\n\n  The default image size used to train this network is 224x224.\n\n  Args:\n    inputs: a tensor of size [batch_size, height, width, channels].\n    num_classes: number of predicted classes.\n    is_training: whether the model is being trained.\n    dropout_keep_prob: the fraction of activation values that are retained.\n    prediction_fn: a function to get predictions out of logits.\n    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is\n        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.\n    reuse: whether or not the network and its variables should be reused. To be\n      able to reuse, 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    logits: the pre-softmax activations, a tensor of size\n      [batch_size, num_classes]\n    end_points: a dictionary from components of the network to the corresponding\n      activation.\n  \"\"\"\n  # Final pooling and prediction\n  with tf.variable_scope(scope, 'InceptionV1', [inputs, num_classes],\n                         reuse=reuse) as scope:\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      net, end_points = inception_v1_base(inputs, scope=scope)\n      with tf.variable_scope('Logits'):\n        net = slim.avg_pool2d(net, [7, 7], stride=1, scope='MaxPool_0a_7x7')\n        net = slim.dropout(net,\n                           dropout_keep_prob, scope='Dropout_0b')\n        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,\n                             normalizer_fn=None, scope='Conv2d_0c_1x1')\n        if spatial_squeeze:\n          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')\n\n        end_points['Logits'] = logits\n        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')\n  return logits, end_points\ninception_v1.default_image_size = 224\n\ninception_v1_arg_scope = 
inception_utils.inception_arg_scope\n"
  },
  {
    "path": "nets/inception_v1_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for nets.inception_v1.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nimport tensorflow as tf\n\nfrom nets import inception\n\nslim = tf.contrib.slim\n\n\nclass InceptionV1Test(tf.test.TestCase):\n\n  def testBuildClassificationNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v1(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('Predictions' in end_points)\n    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testBuildBaseNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    mixed_5c, end_points = inception.inception_v1_base(inputs)\n    self.assertTrue(mixed_5c.op.name.startswith('InceptionV1/Mixed_5c'))\n    self.assertListEqual(mixed_5c.get_shape().as_list(),\n             
            [batch_size, 7, 7, 1024])\n    expected_endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',\n                          'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b',\n                          'Mixed_3c', 'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c',\n                          'Mixed_4d', 'Mixed_4e', 'Mixed_4f', 'MaxPool_5a_2x2',\n                          'Mixed_5b', 'Mixed_5c']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n\n  def testBuildOnlyUptoFinalEndpoint(self):\n    batch_size = 5\n    height, width = 224, 224\n    endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',\n                 'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',\n                 'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d',\n                 'Mixed_4e', 'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b',\n                 'Mixed_5c']\n    for index, endpoint in enumerate(endpoints):\n      with tf.Graph().as_default():\n        inputs = tf.random_uniform((batch_size, height, width, 3))\n        out_tensor, end_points = inception.inception_v1_base(\n            inputs, final_endpoint=endpoint)\n        self.assertTrue(out_tensor.op.name.startswith(\n            'InceptionV1/' + endpoint))\n        self.assertItemsEqual(endpoints[:index+1], end_points)\n\n  def testBuildAndCheckAllEndPointsUptoMixed5c(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v1_base(inputs,\n                                                final_endpoint='Mixed_5c')\n    endpoints_shapes = {'Conv2d_1a_7x7': [5, 112, 112, 64],\n                        'MaxPool_2a_3x3': [5, 56, 56, 64],\n                        'Conv2d_2b_1x1': [5, 56, 56, 64],\n                        'Conv2d_2c_3x3': [5, 56, 56, 192],\n                        'MaxPool_3a_3x3': [5, 28, 28, 192],\n                        'Mixed_3b': [5, 28, 28, 256],\n                        
'Mixed_3c': [5, 28, 28, 480],\n                        'MaxPool_4a_3x3': [5, 14, 14, 480],\n                        'Mixed_4b': [5, 14, 14, 512],\n                        'Mixed_4c': [5, 14, 14, 512],\n                        'Mixed_4d': [5, 14, 14, 512],\n                        'Mixed_4e': [5, 14, 14, 528],\n                        'Mixed_4f': [5, 14, 14, 832],\n                        'MaxPool_5a_2x2': [5, 7, 7, 832],\n                        'Mixed_5b': [5, 7, 7, 832],\n                        'Mixed_5c': [5, 7, 7, 1024]}\n\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name in endpoints_shapes:\n      expected_shape = endpoints_shapes[endpoint_name]\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testModelHasExpectedNumberOfParameters(self):\n    batch_size = 5\n    height, width = 224, 224\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope(inception.inception_v1_arg_scope()):\n      inception.inception_v1_base(inputs)\n    total_params, _ = slim.model_analyzer.analyze_vars(\n        slim.get_model_variables())\n    self.assertAlmostEqual(5607184, total_params)\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 112, 112\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    mixed_5c, _ = inception.inception_v1_base(inputs)\n    self.assertTrue(mixed_5c.op.name.startswith('InceptionV1/Mixed_5c'))\n    self.assertListEqual(mixed_5c.get_shape().as_list(),\n                         [batch_size, 4, 4, 1024])\n\n  def testUnknownImageShape(self):\n    tf.reset_default_graph()\n    batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))\n    with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, 
shape=(batch_size, None, None, 3))\n      logits, end_points = inception.inception_v1(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['Mixed_5c']\n      feed_dict = {inputs: input_np}\n      tf.global_variables_initializer().run()\n      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)\n      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])\n\n  def testUnknownBatchSize(self):\n    batch_size = 1\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n    logits, _ = inception.inception_v1(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [None, num_classes])\n    images = tf.random_uniform((batch_size, height, width, 3))\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEquals(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n\n    eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, _ = inception.inception_v1(eval_inputs, num_classes,\n                                       is_training=False)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n\n    train_inputs = tf.random_uniform((train_batch_size, height, width, 
3))\n    inception.inception_v1(train_inputs, num_classes)\n    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n    logits, _ = inception.inception_v1(eval_inputs, num_classes, reuse=True)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (eval_batch_size,))\n\n  def testLogitsNotSqueezed(self):\n    num_classes = 25\n    images = tf.random_uniform([1, 224, 224, 3])\n    logits, _ = inception.inception_v1(images,\n                                       num_classes=num_classes,\n                                       spatial_squeeze=False)\n\n    with self.test_session() as sess:\n      tf.global_variables_initializer().run()\n      logits_out = sess.run(logits)\n      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])\n\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/inception_v2.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the definition for inception v2 classification network.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception_utils\n\nslim = tf.contrib.slim\ntrunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)\n\n\ndef inception_v2_base(inputs,\n                      final_endpoint='Mixed_5c',\n                      min_depth=16,\n                      depth_multiplier=1.0,\n                      scope=None):\n  \"\"\"Inception v2 (6a2).\n\n  Constructs an Inception v2 network from inputs to the given final endpoint.\n  This method can construct the network up to the layer inception(5b) as\n  described in http://arxiv.org/abs/1502.03167.\n\n  Args:\n    inputs: a tensor of shape [batch_size, height, width, channels].\n    final_endpoint: specifies the endpoint to construct the network up to. 
It\n      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',\n      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c', 'Mixed_4a',\n      'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e', 'Mixed_5a', 'Mixed_5b',\n      'Mixed_5c'].\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    scope: Optional variable_scope.\n\n  Returns:\n    tensor_out: output tensor corresponding to the final_endpoint.\n    end_points: a set of activations for external use, for example summaries or\n                losses.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values,\n                or depth_multiplier <= 0\n  \"\"\"\n\n  # end_points will collect relevant activations for external use, for example\n  # summaries or losses.\n  end_points = {}\n\n  # Used to find thinned depths for each layer.\n  if depth_multiplier <= 0:\n    raise ValueError('depth_multiplier is not greater than zero.')\n  depth = lambda d: max(int(d * depth_multiplier), min_depth)\n\n  with tf.variable_scope(scope, 'InceptionV2', [inputs]):\n    with slim.arg_scope(\n        [slim.conv2d, slim.max_pool2d, slim.avg_pool2d, slim.separable_conv2d],\n        stride=1, padding='SAME'):\n\n      # Note that sizes in the comments below assume an input spatial size of\n      # 224x224; however, the inputs can be of any size greater than 32x32.\n\n      # 224 x 224 x 3\n      end_point = 'Conv2d_1a_7x7'\n      # depthwise_multiplier here is different from depth_multiplier.\n      # depthwise_multiplier determines the output channels of the initial\n      # 
depthwise conv (see docs for tf.nn.separable_conv2d), while\n      # depth_multiplier controls the number of channels of the subsequent 1x1\n      # convolution. Must have\n      #   in_channels * depthwise_multiplier <= out_channels\n      # so that the separable convolution is not overparameterized.\n      depthwise_multiplier = min(int(depth(64) / 3), 8)\n      net = slim.separable_conv2d(\n          inputs, depth(64), [7, 7], depth_multiplier=depthwise_multiplier,\n          stride=2, weights_initializer=trunc_normal(1.0),\n          scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 112 x 112 x 64\n      end_point = 'MaxPool_2a_3x3'\n      net = slim.max_pool2d(net, [3, 3], scope=end_point, stride=2)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 56 x 56 x 64\n      end_point = 'Conv2d_2b_1x1'\n      net = slim.conv2d(net, depth(64), [1, 1], scope=end_point,\n                        weights_initializer=trunc_normal(0.1))\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 56 x 56 x 64\n      end_point = 'Conv2d_2c_3x3'\n      net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 56 x 56 x 192\n      end_point = 'MaxPool_3a_3x3'\n      net = slim.max_pool2d(net, [3, 3], scope=end_point, stride=2)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 28 x 28 x 192\n      # Inception module.\n      end_point = 'Mixed_3b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(64), [1, 1],\n              
weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(64), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(32), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 28 x 28 x 256\n      end_point = 'Mixed_3c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 
scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 28 x 28 x 320\n      end_point = 'Mixed_4a'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(\n              net, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_0 = slim.conv2d(branch_0, depth(160), [3, 3], stride=2,\n                                 scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(\n              branch_1, depth(96), [3, 3], scope='Conv2d_0b_3x3')\n          branch_1 = slim.conv2d(\n              branch_1, depth(96), [3, 3], stride=2, scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.max_pool2d(\n              net, [3, 3], stride=2, scope='MaxPool_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 14 x 14 x 576\n      end_point = 'Mixed_4b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(224), [1, 1], 
scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(64), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(\n              branch_1, depth(96), [3, 3], scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(96), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 14 x 14 x 576\n      end_point = 'Mixed_4c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(96), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(128), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(96), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n    
          scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 14 x 14 x 576\n      end_point = 'Mixed_4d'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(160), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(160), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(160), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(96), [1, 1],\n              
weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n\n      # 14 x 14 x 576\n      end_point = 'Mixed_4e'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(96), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(192), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(160), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(192), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(96), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 14 x 14 x 576\n      end_point = 'Mixed_5a'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(\n              net, depth(128), [1, 1],\n              
weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_0 = slim.conv2d(branch_0, depth(192), [3, 3], stride=2,\n                                 scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(192), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(256), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_1 = slim.conv2d(branch_1, depth(256), [3, 3], stride=2,\n                                 scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.max_pool2d(net, [3, 3], stride=2,\n                                     scope='MaxPool_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n      # 7 x 7 x 1024\n      end_point = 'Mixed_5b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(352), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(192), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(320), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(160), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],\n                               
  scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n\n      # 7 x 7 x 1024\n      end_point = 'Mixed_5c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(352), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(\n              net, depth(192), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(320), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(\n              net, depth(192), [1, 1],\n              weights_initializer=trunc_normal(0.09),\n              scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(128), [1, 1],\n              weights_initializer=trunc_normal(0.1),\n              scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n        end_points[end_point] = net\n        if end_point == final_endpoint: return net, end_points\n    raise 
ValueError('Unknown final endpoint %s' % final_endpoint)\n\n\ndef inception_v2(inputs,\n                 num_classes=1000,\n                 is_training=True,\n                 dropout_keep_prob=0.8,\n                 min_depth=16,\n                 depth_multiplier=1.0,\n                 prediction_fn=slim.softmax,\n                 spatial_squeeze=True,\n                 reuse=None,\n                 scope='InceptionV2'):\n  \"\"\"Inception v2 model for classification.\n\n  Constructs an Inception v2 network for classification as described in\n  http://arxiv.org/abs/1502.03167.\n\n  The default image size used to train this network is 224x224.\n\n  Args:\n    inputs: a tensor of shape [batch_size, height, width, channels].\n    num_classes: number of predicted classes.\n    is_training: whether the model is being built for training.\n    dropout_keep_prob: the fraction of activation values that are retained.\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    prediction_fn: a function to get predictions out of logits.\n    spatial_squeeze: if True, logits has shape [B, C]; if False, logits has\n        shape [B, 1, 1, C], where B is batch_size and C is number of classes.\n    reuse: whether or not the network and its variables should be reused. 
To be\n      able to reuse 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    logits: the pre-softmax activations, a tensor of size\n      [batch_size, num_classes]\n    end_points: a dictionary from components of the network to the corresponding\n      activation.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values,\n                or depth_multiplier <= 0\n  \"\"\"\n  if depth_multiplier <= 0:\n    raise ValueError('depth_multiplier is not greater than zero.')\n\n  # Final pooling and prediction\n  with tf.variable_scope(scope, 'InceptionV2', [inputs, num_classes],\n                         reuse=reuse) as scope:\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      net, end_points = inception_v2_base(\n          inputs, scope=scope, min_depth=min_depth,\n          depth_multiplier=depth_multiplier)\n      with tf.variable_scope('Logits'):\n        kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])\n        net = slim.avg_pool2d(net, kernel_size, padding='VALID',\n                              scope='AvgPool_1a_{}x{}'.format(*kernel_size))\n        # 1 x 1 x 1024\n        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')\n        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,\n                             normalizer_fn=None, scope='Conv2d_1c_1x1')\n        if spatial_squeeze:\n          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')\n      end_points['Logits'] = logits\n      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')\n  return logits, end_points\ninception_v2.default_image_size = 224\n\n\ndef _reduced_kernel_size_for_small_input(input_tensor, kernel_size):\n  \"\"\"Define kernel size which is automatically reduced for small input.\n\n  If the shape of the input images is unknown at graph construction time this\n  function assumes that the 
input images are large enough.\n\n  Args:\n    input_tensor: input tensor of size [batch_size, height, width, channels].\n    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]\n\n  Returns:\n    the kernel size, reduced if needed to fit the input:\n    [kernel_height, kernel_width].\n\n  TODO(jrru): Make this function work with unknown shapes. Theoretically, this\n  can be done with the code below. Problems are two-fold: (1) If the shape was\n  known, it will be lost. (2) inception.slim.ops._two_element_tuple cannot\n  handle tensors that define the kernel size.\n      shape = tf.shape(input_tensor)\n      return tf.stack([tf.minimum(shape[1], kernel_size[0]),\n                       tf.minimum(shape[2], kernel_size[1])])\n\n  \"\"\"\n  shape = input_tensor.get_shape().as_list()\n  if shape[1] is None or shape[2] is None:\n    kernel_size_out = kernel_size\n  else:\n    kernel_size_out = [min(shape[1], kernel_size[0]),\n                       min(shape[2], kernel_size[1])]\n  return kernel_size_out\n\n\ninception_v2_arg_scope = inception_utils.inception_arg_scope\n"
  },
  {
    "path": "nets/inception_v2_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for nets.inception_v2.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nimport tensorflow as tf\n\nfrom nets import inception\n\nslim = tf.contrib.slim\n\n\nclass InceptionV2Test(tf.test.TestCase):\n\n  def testBuildClassificationNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v2(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('Predictions' in end_points)\n    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testBuildBaseNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    mixed_5c, end_points = inception.inception_v2_base(inputs)\n    self.assertTrue(mixed_5c.op.name.startswith('InceptionV2/Mixed_5c'))\n    self.assertListEqual(mixed_5c.get_shape().as_list(),\n             
            [batch_size, 7, 7, 1024])\n    expected_endpoints = ['Mixed_3b', 'Mixed_3c', 'Mixed_4a', 'Mixed_4b',\n                          'Mixed_4c', 'Mixed_4d', 'Mixed_4e', 'Mixed_5a',\n                          'Mixed_5b', 'Mixed_5c', 'Conv2d_1a_7x7',\n                          'MaxPool_2a_3x3', 'Conv2d_2b_1x1', 'Conv2d_2c_3x3',\n                          'MaxPool_3a_3x3']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n\n  def testBuildOnlyUptoFinalEndpoint(self):\n    batch_size = 5\n    height, width = 224, 224\n    endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',\n                 'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',\n                 'Mixed_4a', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',\n                 'Mixed_5a', 'Mixed_5b', 'Mixed_5c']\n    for index, endpoint in enumerate(endpoints):\n      with tf.Graph().as_default():\n        inputs = tf.random_uniform((batch_size, height, width, 3))\n        out_tensor, end_points = inception.inception_v2_base(\n            inputs, final_endpoint=endpoint)\n        self.assertTrue(out_tensor.op.name.startswith(\n            'InceptionV2/' + endpoint))\n        self.assertItemsEqual(endpoints[:index+1], end_points)\n\n  def testBuildAndCheckAllEndPointsUptoMixed5c(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v2_base(inputs,\n                                                final_endpoint='Mixed_5c')\n    endpoints_shapes = {'Mixed_3b': [batch_size, 28, 28, 256],\n                        'Mixed_3c': [batch_size, 28, 28, 320],\n                        'Mixed_4a': [batch_size, 14, 14, 576],\n                        'Mixed_4b': [batch_size, 14, 14, 576],\n                        'Mixed_4c': [batch_size, 14, 14, 576],\n                        'Mixed_4d': [batch_size, 14, 14, 576],\n                        'Mixed_4e': [batch_size, 14, 14, 
576],\n                        'Mixed_5a': [batch_size, 7, 7, 1024],\n                        'Mixed_5b': [batch_size, 7, 7, 1024],\n                        'Mixed_5c': [batch_size, 7, 7, 1024],\n                        'Conv2d_1a_7x7': [batch_size, 112, 112, 64],\n                        'MaxPool_2a_3x3': [batch_size, 56, 56, 64],\n                        'Conv2d_2b_1x1': [batch_size, 56, 56, 64],\n                        'Conv2d_2c_3x3': [batch_size, 56, 56, 192],\n                        'MaxPool_3a_3x3': [batch_size, 28, 28, 192]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name in endpoints_shapes:\n      expected_shape = endpoints_shapes[endpoint_name]\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testModelHasExpectedNumberOfParameters(self):\n    batch_size = 5\n    height, width = 224, 224\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope(inception.inception_v2_arg_scope()):\n      inception.inception_v2_base(inputs)\n    total_params, _ = slim.model_analyzer.analyze_vars(\n        slim.get_model_variables())\n    self.assertAlmostEqual(10173112, total_params)\n\n  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v2(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys()\n                     if key.startswith('Mixed') or key.startswith('Conv')]\n\n    _, end_points_with_multiplier = inception.inception_v2(\n        inputs, num_classes, scope='depth_multiplied_net',\n        depth_multiplier=0.5)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = 
end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(0.5 * original_depth, new_depth)\n\n  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v2(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys()\n                     if key.startswith('Mixed') or key.startswith('Conv')]\n\n    _, end_points_with_multiplier = inception.inception_v2(\n        inputs, num_classes, scope='depth_multiplied_net',\n        depth_multiplier=2.0)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(2.0 * original_depth, new_depth)\n\n  def testRaiseValueErrorWithInvalidDepthMultiplier(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with self.assertRaises(ValueError):\n      _ = inception.inception_v2(inputs, num_classes, depth_multiplier=-0.1)\n    with self.assertRaises(ValueError):\n      _ = inception.inception_v2(inputs, num_classes, depth_multiplier=0.0)\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 112, 112\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v2(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    pre_pool = end_points['Mixed_5c']\n    self.assertListEqual(pre_pool.get_shape().as_list(),\n                         [batch_size, 4, 4, 1024])\n\n  def testUnknownImageShape(self):\n    tf.reset_default_graph()\n    
batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))\n    with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))\n      logits, end_points = inception.inception_v2(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['Mixed_5c']\n      feed_dict = {inputs: input_np}\n      tf.global_variables_initializer().run()\n      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)\n      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])\n\n  def testUnknowBatchSize(self):\n    batch_size = 1\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n    logits, _ = inception.inception_v2(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [None, num_classes])\n    images = tf.random_uniform((batch_size, height, width, 3))\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEquals(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n\n    eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, _ = inception.inception_v2(eval_inputs, num_classes,\n                                       is_training=False)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, 
(batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 150, 150\n    num_classes = 1000\n\n    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))\n    inception.inception_v2(train_inputs, num_classes)\n    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n    logits, _ = inception.inception_v2(eval_inputs, num_classes, reuse=True)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (eval_batch_size,))\n\n  def testLogitsNotSqueezed(self):\n    num_classes = 25\n    images = tf.random_uniform([1, 224, 224, 3])\n    logits, _ = inception.inception_v2(images,\n                                       num_classes=num_classes,\n                                       spatial_squeeze=False)\n\n    with self.test_session() as sess:\n      tf.global_variables_initializer().run()\n      logits_out = sess.run(logits)\n      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])\n\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/inception_v3.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the definition for inception v3 classification network.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception_utils\n\nslim = tf.contrib.slim\ntrunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)\n\n\ndef inception_v3_base(inputs,\n                      final_endpoint='Mixed_7c',\n                      min_depth=16,\n                      depth_multiplier=1.0,\n                      scope=None):\n  \"\"\"Inception model from http://arxiv.org/abs/1512.00567.\n\n  Constructs an Inception v3 network from inputs to the given final endpoint.\n  This method can construct the network up to the final inception block\n  Mixed_7c.\n\n  Note that the names of the layers in the paper do not correspond to the names\n  of the endpoints registered by this function although they build the same\n  network.\n\n  Here is a mapping from the old_names to the new names:\n  Old name          | New name\n  =======================================\n  conv0             | Conv2d_1a_3x3\n  conv1             | Conv2d_2a_3x3\n  conv2             | Conv2d_2b_3x3\n  pool1             | MaxPool_3a_3x3\n  conv3             | 
Conv2d_3b_1x1\n  conv4             | Conv2d_4a_3x3\n  pool2             | MaxPool_5a_3x3\n  mixed_35x35x256a  | Mixed_5b\n  mixed_35x35x288a  | Mixed_5c\n  mixed_35x35x288b  | Mixed_5d\n  mixed_17x17x768a  | Mixed_6a\n  mixed_17x17x768b  | Mixed_6b\n  mixed_17x17x768c  | Mixed_6c\n  mixed_17x17x768d  | Mixed_6d\n  mixed_17x17x768e  | Mixed_6e\n  mixed_8x8x1280a   | Mixed_7a\n  mixed_8x8x2048a   | Mixed_7b\n  mixed_8x8x2048b   | Mixed_7c\n\n  Args:\n    inputs: a tensor of size [batch_size, height, width, channels].\n    final_endpoint: specifies the endpoint to construct the network up to. It\n      can be one of ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',\n      'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3', 'MaxPool_5a_3x3',\n      'Mixed_5b', 'Mixed_5c', 'Mixed_5d', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c',\n      'Mixed_6d', 'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c'].\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. 
Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    scope: Optional variable_scope.\n\n  Returns:\n    tensor_out: output tensor corresponding to the final_endpoint.\n    end_points: a set of activations for external use, for example summaries or\n                losses.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values,\n                or depth_multiplier <= 0\n  \"\"\"\n  # end_points will collect relevant activations for external use, for example\n  # summaries or losses.\n  end_points = {}\n\n  if depth_multiplier <= 0:\n    raise ValueError('depth_multiplier is not greater than zero.')\n  depth = lambda d: max(int(d * depth_multiplier), min_depth)\n\n  with tf.variable_scope(scope, 'InceptionV3', [inputs]):\n    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                        stride=1, padding='VALID'):\n      # 299 x 299 x 3\n      end_point = 'Conv2d_1a_3x3'\n      net = slim.conv2d(inputs, depth(32), [3, 3], stride=2, scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 149 x 149 x 32\n      end_point = 'Conv2d_2a_3x3'\n      net = slim.conv2d(net, depth(32), [3, 3], scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 147 x 147 x 32\n      end_point = 'Conv2d_2b_3x3'\n      net = slim.conv2d(net, depth(64), [3, 3], padding='SAME', scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 147 x 147 x 64\n      end_point = 'MaxPool_3a_3x3'\n      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 73 x 73 x 64\n      end_point = 'Conv2d_3b_1x1'\n      net = slim.conv2d(net, depth(80), [1, 
1], scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 73 x 73 x 80.\n      end_point = 'Conv2d_4a_3x3'\n      net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 71 x 71 x 192.\n      end_point = 'MaxPool_5a_3x3'\n      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # 35 x 35 x 192.\n\n    # Inception blocks\n    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                        stride=1, padding='SAME'):\n      # mixed: 35 x 35 x 256.\n      end_point = 'Mixed_5b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],\n                                 scope='Conv2d_0b_5x5')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(32), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, 
end_points\n\n      # mixed_1: 35 x 35 x 288.\n      end_point = 'Mixed_5c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0b_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],\n                                 scope='Conv_1_0c_5x5')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(64), [1, 1],\n                                 scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_2: 35 x 35 x 288.\n      end_point = 'Mixed_5d'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],\n                                 scope='Conv2d_0b_5x5')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                      
           scope='Conv2d_0b_3x3')\n          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],\n                                 scope='Conv2d_0c_3x3')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_3: 17 x 17 x 768.\n      end_point = 'Mixed_6a'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(384), [3, 3], stride=2,\n                                 padding='VALID', scope='Conv2d_1a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3],\n                                 scope='Conv2d_0b_3x3')\n          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3], stride=2,\n                                 padding='VALID', scope='Conv2d_1a_1x1')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',\n                                     scope='MaxPool_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed4: 17 x 17 x 768.\n      end_point = 'Mixed_6b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = 
slim.conv2d(branch_1, depth(128), [1, 7],\n                                 scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],\n                                 scope='Conv2d_0c_7x1')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],\n                                 scope='Conv2d_0b_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(128), [1, 7],\n                                 scope='Conv2d_0c_1x7')\n          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],\n                                 scope='Conv2d_0d_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],\n                                 scope='Conv2d_0e_1x7')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_5: 17 x 17 x 768.\n      end_point = 'Mixed_6c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(160), [1, 7],\n                                 scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],\n                                 scope='Conv2d_0c_7x1')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')\n          
branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],\n                                 scope='Conv2d_0b_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(160), [1, 7],\n                                 scope='Conv2d_0c_1x7')\n          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],\n                                 scope='Conv2d_0d_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],\n                                 scope='Conv2d_0e_1x7')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # mixed_6: 17 x 17 x 768.\n      end_point = 'Mixed_6d'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(160), [1, 7],\n                                 scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],\n                                 scope='Conv2d_0c_7x1')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],\n                                 scope='Conv2d_0b_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(160), [1, 7],\n                                 scope='Conv2d_0c_1x7')\n          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],\n                                 scope='Conv2d_0d_7x1')\n          
branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],\n                                 scope='Conv2d_0e_1x7')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_7: 17 x 17 x 768.\n      end_point = 'Mixed_6e'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(192), [1, 7],\n                                 scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],\n                                 scope='Conv2d_0c_7x1')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [7, 1],\n                                 scope='Conv2d_0b_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],\n                                 scope='Conv2d_0c_1x7')\n          branch_2 = slim.conv2d(branch_2, depth(192), [7, 1],\n                                 scope='Conv2d_0d_7x1')\n          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],\n                                 scope='Conv2d_0e_1x7')\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],\n                                 scope='Conv2d_0b_1x1')\n        net = 
tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_8: 8 x 8 x 1280.\n      end_point = 'Mixed_7a'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n          branch_0 = slim.conv2d(branch_0, depth(320), [3, 3], stride=2,\n                                 padding='VALID', scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, depth(192), [1, 7],\n                                 scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],\n                                 scope='Conv2d_0c_7x1')\n          branch_1 = slim.conv2d(branch_1, depth(192), [3, 3], stride=2,\n                                 padding='VALID', scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',\n                                     scope='MaxPool_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n      # mixed_9: 8 x 8 x 2048.\n      end_point = 'Mixed_7b'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = tf.concat(axis=3, values=[\n              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),\n              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0b_3x1')])\n        with 
tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(\n              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')\n          branch_2 = tf.concat(axis=3, values=[\n              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),\n              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = slim.conv2d(\n              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n\n      # mixed_10: 8 x 8 x 2048.\n      end_point = 'Mixed_7c'\n      with tf.variable_scope(end_point):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = tf.concat(axis=3, values=[\n              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),\n              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0c_3x1')])\n        with tf.variable_scope('Branch_2'):\n          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')\n          branch_2 = slim.conv2d(\n              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')\n          branch_2 = tf.concat(axis=3, values=[\n              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),\n              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])\n        with tf.variable_scope('Branch_3'):\n          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')\n          branch_3 = 
slim.conv2d(\n              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')\n        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n      end_points[end_point] = net\n      if end_point == final_endpoint: return net, end_points\n    raise ValueError('Unknown final endpoint %s' % final_endpoint)\n\n\ndef inception_v3(inputs,\n                 num_classes=1000,\n                 is_training=True,\n                 dropout_keep_prob=0.8,\n                 min_depth=16,\n                 depth_multiplier=1.0,\n                 prediction_fn=slim.softmax,\n                 spatial_squeeze=True,\n                 reuse=None,\n                 scope='InceptionV3'):\n  \"\"\"Inception model from http://arxiv.org/abs/1512.00567.\n\n  \"Rethinking the Inception Architecture for Computer Vision\"\n\n  Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens,\n  Zbigniew Wojna.\n\n  With the default arguments this method constructs the exact model defined in\n  the paper. However, one can experiment with variations of the inception_v3\n  network by changing arguments dropout_keep_prob, min_depth and\n  depth_multiplier.\n\n  The default image size used to train this network is 299x299.\n\n  Args:\n    inputs: a tensor of size [batch_size, height, width, channels].\n    num_classes: number of predicted classes.\n    is_training: whether is training or not.\n    dropout_keep_prob: the percentage of activation values that are retained.\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. 
Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    prediction_fn: a function to get predictions out of logits.\n    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is\n        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.\n    reuse: whether or not the network and its variables should be reused. To be\n      able to reuse, 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    logits: the pre-softmax activations, a tensor of size\n      [batch_size, num_classes]\n    end_points: a dictionary from components of the network to the corresponding\n      activation.\n\n  Raises:\n    ValueError: if 'depth_multiplier' is less than or equal to zero.\n  \"\"\"\n  if depth_multiplier <= 0:\n    raise ValueError('depth_multiplier is not greater than zero.')\n  depth = lambda d: max(int(d * depth_multiplier), min_depth)\n\n  with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes],\n                         reuse=reuse) as scope:\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      net, end_points = inception_v3_base(\n          inputs, scope=scope, min_depth=min_depth,\n          depth_multiplier=depth_multiplier)\n\n      # Auxiliary Head logits\n      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                          stride=1, padding='SAME'):\n        aux_logits = end_points['Mixed_6e']\n        with tf.variable_scope('AuxLogits'):\n          # We want to pool the feature map to be 5x5xC.\n          # With VALID padding and stride 3, (H - k) / 3 + 1 = 5, so k = H - 12.\n          kernel_size = [aux_logits.get_shape().as_list()[1] - 12] * 2\n          aux_logits = slim.avg_pool2d(\n              aux_logits, kernel_size, stride=3, padding='VALID',\n              scope='AvgPool_1a_5x5')\n          aux_logits = 
slim.conv2d(aux_logits, depth(128), [1, 1],\n                                   scope='Conv2d_1b_1x1')\n\n          # Shape of feature map before the final layer.\n          kernel_size = _reduced_kernel_size_for_small_input(aux_logits, [5, 5])\n          aux_logits = slim.conv2d(\n              aux_logits, depth(768), kernel_size,\n              weights_initializer=trunc_normal(0.01),\n              padding='VALID', scope='Conv2d_2a_{}x{}'.format(*kernel_size))\n          aux_logits = slim.conv2d(\n              aux_logits, num_classes, [1, 1], activation_fn=None,\n              normalizer_fn=None, weights_initializer=trunc_normal(0.001),\n              scope='Conv2d_2b_1x1')\n          if spatial_squeeze:\n            aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')\n          end_points['AuxLogits'] = aux_logits\n\n      # Final pooling and prediction\n      with tf.variable_scope('Logits'):\n        #kernel_size = _reduced_kernel_size_for_small_input(net, [8, 8])\n        kernel_size = _kernel_to_1x1_for_specific_input(net)\n        net = slim.avg_pool2d(net, kernel_size, padding='VALID',\n                              scope='AvgPool_1a_{}x{}'.format(*kernel_size))\n        # 1 x 1 x 2048\n        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')\n        end_points['PreLogits'] = net\n        # 2048\n        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,\n                             normalizer_fn=None, scope='Conv2d_1c_1x1')\n        if spatial_squeeze:\n          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')\n        # 1000\n      end_points['Logits'] = logits\n      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')\n  return logits, end_points\ninception_v3.default_image_size = 299\n\n\ndef _reduced_kernel_size_for_small_input(input_tensor, kernel_size):\n  \"\"\"Define kernel size which is automatically reduced for small input.\n\n  If the shape of the input 
images is unknown at graph construction time this\n  function assumes that the input images are large enough.\n\n  Args:\n    input_tensor: input tensor of size [batch_size, height, width, channels].\n    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]\n\n  Returns:\n    a list [kernel_height, kernel_width] with the (possibly reduced) kernel\n    size.\n\n  TODO(jrru): Make this function work with unknown shapes. Theoretically, this\n  can be done with the code below. Problems are two-fold: (1) If the shape was\n  known, it will be lost. (2) inception.slim.ops._two_element_tuple cannot\n  handle tensors that define the kernel size.\n      shape = tf.shape(input_tensor)\n      return tf.stack([tf.minimum(shape[1], kernel_size[0]),\n                       tf.minimum(shape[2], kernel_size[1])])\n\n  \"\"\"\n  shape = input_tensor.get_shape().as_list()\n  if shape[1] is None or shape[2] is None:\n    kernel_size_out = kernel_size\n  else:\n    kernel_size_out = [min(shape[1], kernel_size[0]),\n                       min(shape[2], kernel_size[1])]\n  return kernel_size_out\n\n\ndef _kernel_to_1x1_for_specific_input(input_tensor):\n  \"\"\"Return a kernel size that will reduce the input_tensor to a vector.\n\n  We want any input tensor of shape [B, H, W, C] to be transformed into\n  [B, 1, 1, C]. We assume a known input shape.\n  \"\"\"\n  shape = input_tensor.get_shape().as_list()\n  return [shape[1], shape[2]]\n\n\ninception_v3_arg_scope = inception_utils.inception_arg_scope\n"
  },
  {
    "path": "nets/inception_v3_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for nets.inception_v3.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nimport tensorflow as tf\n\nfrom nets import inception\n\nslim = tf.contrib.slim\n\n\nclass InceptionV3Test(tf.test.TestCase):\n\n  def testBuildClassificationNetwork(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v3(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('Predictions' in end_points)\n    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testBuildBaseNetwork(self):\n    batch_size = 5\n    height, width = 299, 299\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    final_endpoint, end_points = inception.inception_v3_base(inputs)\n    self.assertTrue(final_endpoint.op.name.startswith(\n        'InceptionV3/Mixed_7c'))\n    
self.assertListEqual(final_endpoint.get_shape().as_list(),\n                         [batch_size, 8, 8, 2048])\n    expected_endpoints = ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',\n                          'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3',\n                          'MaxPool_5a_3x3', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',\n                          'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',\n                          'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n\n  def testBuildOnlyUptoFinalEndpoint(self):\n    batch_size = 5\n    height, width = 299, 299\n    endpoints = ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',\n                 'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3',\n                 'MaxPool_5a_3x3', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',\n                 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',\n                 'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c']\n\n    for index, endpoint in enumerate(endpoints):\n      with tf.Graph().as_default():\n        inputs = tf.random_uniform((batch_size, height, width, 3))\n        out_tensor, end_points = inception.inception_v3_base(\n            inputs, final_endpoint=endpoint)\n        self.assertTrue(out_tensor.op.name.startswith(\n            'InceptionV3/' + endpoint))\n        self.assertItemsEqual(endpoints[:index+1], end_points)\n\n  def testBuildAndCheckAllEndPointsUptoMixed7c(self):\n    batch_size = 5\n    height, width = 299, 299\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v3_base(\n        inputs, final_endpoint='Mixed_7c')\n    endpoints_shapes = {'Conv2d_1a_3x3': [batch_size, 149, 149, 32],\n                        'Conv2d_2a_3x3': [batch_size, 147, 147, 32],\n                        'Conv2d_2b_3x3': [batch_size, 147, 147, 64],\n                        'MaxPool_3a_3x3': [batch_size, 73, 73, 64],\n                     
   'Conv2d_3b_1x1': [batch_size, 73, 73, 80],\n                        'Conv2d_4a_3x3': [batch_size, 71, 71, 192],\n                        'MaxPool_5a_3x3': [batch_size, 35, 35, 192],\n                        'Mixed_5b': [batch_size, 35, 35, 256],\n                        'Mixed_5c': [batch_size, 35, 35, 288],\n                        'Mixed_5d': [batch_size, 35, 35, 288],\n                        'Mixed_6a': [batch_size, 17, 17, 768],\n                        'Mixed_6b': [batch_size, 17, 17, 768],\n                        'Mixed_6c': [batch_size, 17, 17, 768],\n                        'Mixed_6d': [batch_size, 17, 17, 768],\n                        'Mixed_6e': [batch_size, 17, 17, 768],\n                        'Mixed_7a': [batch_size, 8, 8, 1280],\n                        'Mixed_7b': [batch_size, 8, 8, 2048],\n                        'Mixed_7c': [batch_size, 8, 8, 2048]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name in endpoints_shapes:\n      expected_shape = endpoints_shapes[endpoint_name]\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testModelHasExpectedNumberOfParameters(self):\n    batch_size = 5\n    height, width = 299, 299\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope(inception.inception_v3_arg_scope()):\n      inception.inception_v3_base(inputs)\n    total_params, _ = slim.model_analyzer.analyze_vars(\n        slim.get_model_variables())\n    self.assertAlmostEqual(21802784, total_params)\n\n  def testBuildEndPoints(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v3(inputs, num_classes)\n    self.assertTrue('Logits' in end_points)\n    logits = end_points['Logits']\n    
self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('AuxLogits' in end_points)\n    aux_logits = end_points['AuxLogits']\n    self.assertListEqual(aux_logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('Mixed_7c' in end_points)\n    pre_pool = end_points['Mixed_7c']\n    self.assertListEqual(pre_pool.get_shape().as_list(),\n                         [batch_size, 8, 8, 2048])\n    self.assertTrue('PreLogits' in end_points)\n    pre_logits = end_points['PreLogits']\n    self.assertListEqual(pre_logits.get_shape().as_list(),\n                         [batch_size, 1, 1, 2048])\n\n  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v3(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys()\n                     if key.startswith('Mixed') or key.startswith('Conv')]\n\n    _, end_points_with_multiplier = inception.inception_v3(\n        inputs, num_classes, scope='depth_multiplied_net',\n        depth_multiplier=0.5)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(0.5 * original_depth, new_depth)\n\n  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v3(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys()\n                     if key.startswith('Mixed') or key.startswith('Conv')]\n\n    _, end_points_with_multiplier = inception.inception_v3(\n        inputs, num_classes, 
scope='depth_multiplied_net',\n        depth_multiplier=2.0)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(2.0 * original_depth, new_depth)\n\n  def testRaiseValueErrorWithInvalidDepthMultiplier(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with self.assertRaises(ValueError):\n      _ = inception.inception_v3(inputs, num_classes, depth_multiplier=-0.1)\n    with self.assertRaises(ValueError):\n      _ = inception.inception_v3(inputs, num_classes, depth_multiplier=0.0)\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 150, 150\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v3(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    pre_pool = end_points['Mixed_7c']\n    self.assertListEqual(pre_pool.get_shape().as_list(),\n                         [batch_size, 3, 3, 2048])\n\n  def testUnknownImageShape(self):\n    tf.reset_default_graph()\n    batch_size = 2\n    height, width = 299, 299\n    num_classes = 1000\n    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))\n    with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))\n      logits, end_points = inception.inception_v3(inputs, num_classes)\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['Mixed_7c']\n      feed_dict = {inputs: input_np}\n      tf.global_variables_initializer().run()\n      pre_pool_out = sess.run(pre_pool, 
feed_dict=feed_dict)\n      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 8, 8, 2048])\n\n  def testUnknownBatchSize(self):\n    batch_size = 1\n    height, width = 299, 299\n    num_classes = 1000\n\n    inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n    logits, _ = inception.inception_v3(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [None, num_classes])\n    images = tf.random_uniform((batch_size, height, width, 3))\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEqual(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 299, 299\n    num_classes = 1000\n\n    eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, _ = inception.inception_v3(eval_inputs, num_classes,\n                                       is_training=False)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEqual(output.shape, (batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 150, 150\n    num_classes = 1000\n\n    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))\n    inception.inception_v3(train_inputs, num_classes)\n    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n    logits, _ = inception.inception_v3(eval_inputs, num_classes,\n                                       is_training=False, reuse=True)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      
self.assertEqual(output.shape, (eval_batch_size,))\n\n  def testLogitsNotSqueezed(self):\n    num_classes = 25\n    images = tf.random_uniform([1, 299, 299, 3])\n    logits, _ = inception.inception_v3(images,\n                                       num_classes=num_classes,\n                                       spatial_squeeze=False)\n\n    with self.test_session() as sess:\n      tf.global_variables_initializer().run()\n      logits_out = sess.run(logits)\n      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])\n\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/inception_v4.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the definition of the Inception V4 architecture.\n\nAs described in http://arxiv.org/abs/1602.07261.\n\n  Inception-v4, Inception-ResNet and the Impact of Residual Connections\n    on Learning\n  Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception_utils\n\nslim = tf.contrib.slim\n\n\ndef block_inception_a(inputs, scope=None, reuse=None):\n  \"\"\"Builds Inception-A block for Inception v4 network.\"\"\"\n  # By default use stride=1 and SAME padding\n  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],\n                      stride=1, padding='SAME'):\n    with tf.variable_scope(scope, 'BlockInceptionA', [inputs], reuse=reuse):\n      with tf.variable_scope('Branch_0'):\n        branch_0 = slim.conv2d(inputs, 96, [1, 1], scope='Conv2d_0a_1x1')\n      with tf.variable_scope('Branch_1'):\n        branch_1 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1')\n        branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3')\n      with tf.variable_scope('Branch_2'):\n        branch_2 = slim.conv2d(inputs, 64, [1, 1], 
scope='Conv2d_0a_1x1')\n        branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')\n        branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')\n      with tf.variable_scope('Branch_3'):\n        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')\n        branch_3 = slim.conv2d(branch_3, 96, [1, 1], scope='Conv2d_0b_1x1')\n      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n\n\ndef block_reduction_a(inputs, scope=None, reuse=None):\n  \"\"\"Builds Reduction-A block for Inception v4 network.\"\"\"\n  # By default use stride=1 and SAME padding\n  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],\n                      stride=1, padding='SAME'):\n    with tf.variable_scope(scope, 'BlockReductionA', [inputs], reuse=reuse):\n      with tf.variable_scope('Branch_0'):\n        branch_0 = slim.conv2d(inputs, 384, [3, 3], stride=2, padding='VALID',\n                               scope='Conv2d_1a_3x3')\n      with tf.variable_scope('Branch_1'):\n        branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')\n        branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')\n        branch_1 = slim.conv2d(branch_1, 256, [3, 3], stride=2,\n                               padding='VALID', scope='Conv2d_1a_3x3')\n      with tf.variable_scope('Branch_2'):\n        branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID',\n                                   scope='MaxPool_1a_3x3')\n      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n\n\ndef block_inception_b(inputs, scope=None, reuse=None):\n  \"\"\"Builds Inception-B block for Inception v4 network.\"\"\"\n  # By default use stride=1 and SAME padding\n  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],\n                      stride=1, padding='SAME'):\n    with tf.variable_scope(scope, 'BlockInceptionB', [inputs], reuse=reuse):\n      with 
tf.variable_scope('Branch_0'):\n        branch_0 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')\n      with tf.variable_scope('Branch_1'):\n        branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')\n        branch_1 = slim.conv2d(branch_1, 224, [1, 7], scope='Conv2d_0b_1x7')\n        branch_1 = slim.conv2d(branch_1, 256, [7, 1], scope='Conv2d_0c_7x1')\n      with tf.variable_scope('Branch_2'):\n        branch_2 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')\n        branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1')\n        branch_2 = slim.conv2d(branch_2, 224, [1, 7], scope='Conv2d_0c_1x7')\n        branch_2 = slim.conv2d(branch_2, 224, [7, 1], scope='Conv2d_0d_7x1')\n        branch_2 = slim.conv2d(branch_2, 256, [1, 7], scope='Conv2d_0e_1x7')\n      with tf.variable_scope('Branch_3'):\n        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')\n        branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')\n      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n\n\ndef block_reduction_b(inputs, scope=None, reuse=None):\n  \"\"\"Builds Reduction-B block for Inception v4 network.\"\"\"\n  # By default use stride=1 and SAME padding\n  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],\n                      stride=1, padding='SAME'):\n    with tf.variable_scope(scope, 'BlockReductionB', [inputs], reuse=reuse):\n      with tf.variable_scope('Branch_0'):\n        branch_0 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')\n        branch_0 = slim.conv2d(branch_0, 192, [3, 3], stride=2,\n                               padding='VALID', scope='Conv2d_1a_3x3')\n      with tf.variable_scope('Branch_1'):\n        branch_1 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1')\n        branch_1 = slim.conv2d(branch_1, 256, [1, 7], scope='Conv2d_0b_1x7')\n        branch_1 = slim.conv2d(branch_1, 320, [7, 1], scope='Conv2d_0c_7x1')\n     
   branch_1 = slim.conv2d(branch_1, 320, [3, 3], stride=2,\n                               padding='VALID', scope='Conv2d_1a_3x3')\n      with tf.variable_scope('Branch_2'):\n        branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID',\n                                   scope='MaxPool_1a_3x3')\n      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2])\n\n\ndef block_inception_c(inputs, scope=None, reuse=None):\n  \"\"\"Builds Inception-C block for Inception v4 network.\"\"\"\n  # By default use stride=1 and SAME padding\n  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],\n                      stride=1, padding='SAME'):\n    with tf.variable_scope(scope, 'BlockInceptionC', [inputs], reuse=reuse):\n      with tf.variable_scope('Branch_0'):\n        branch_0 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1')\n      with tf.variable_scope('Branch_1'):\n        branch_1 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')\n        branch_1 = tf.concat(axis=3, values=[\n            slim.conv2d(branch_1, 256, [1, 3], scope='Conv2d_0b_1x3'),\n            slim.conv2d(branch_1, 256, [3, 1], scope='Conv2d_0c_3x1')])\n      with tf.variable_scope('Branch_2'):\n        branch_2 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')\n        branch_2 = slim.conv2d(branch_2, 448, [3, 1], scope='Conv2d_0b_3x1')\n        branch_2 = slim.conv2d(branch_2, 512, [1, 3], scope='Conv2d_0c_1x3')\n        branch_2 = tf.concat(axis=3, values=[\n            slim.conv2d(branch_2, 256, [1, 3], scope='Conv2d_0d_1x3'),\n            slim.conv2d(branch_2, 256, [3, 1], scope='Conv2d_0e_3x1')])\n      with tf.variable_scope('Branch_3'):\n        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')\n        branch_3 = slim.conv2d(branch_3, 256, [1, 1], scope='Conv2d_0b_1x1')\n      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])\n\n\ndef inception_v4_base(inputs, final_endpoint='Mixed_7d', 
scope=None):\n  \"\"\"Creates the Inception V4 network up to the given final endpoint.\n\n  Args:\n    inputs: a 4-D tensor of size [batch_size, height, width, 3].\n    final_endpoint: specifies the endpoint to construct the network up to.\n      It can be one of [ 'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',\n      'Mixed_3a', 'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',\n      'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d', 'Mixed_6e',\n      'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c',\n      'Mixed_7d']\n    scope: Optional variable_scope.\n\n  Returns:\n    net: the output tensor of the network at final_endpoint.\n    end_points: a dict mapping each endpoint name built so far to its tensor.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values.\n  \"\"\"\n  end_points = {}\n\n  def add_and_check_final(name, net):\n    end_points[name] = net\n    return name == final_endpoint\n\n  with tf.variable_scope(scope, 'InceptionV4', [inputs]):\n    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                        stride=1, padding='SAME'):\n      # 299 x 299 x 3\n      net = slim.conv2d(inputs, 32, [3, 3], stride=2,\n                        padding='VALID', scope='Conv2d_1a_3x3')\n      if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points\n      # 149 x 149 x 32\n      net = slim.conv2d(net, 32, [3, 3], padding='VALID',\n                        scope='Conv2d_2a_3x3')\n      if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points\n      # 147 x 147 x 32\n      net = slim.conv2d(net, 64, [3, 3], scope='Conv2d_2b_3x3')\n      if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points\n      # 147 x 147 x 64\n      with tf.variable_scope('Mixed_3a'):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',\n                                     scope='MaxPool_0a_3x3')\n        
with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, 96, [3, 3], stride=2, padding='VALID',\n                                 scope='Conv2d_0a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1])\n        if add_and_check_final('Mixed_3a', net): return net, end_points\n\n      # 73 x 73 x 160\n      with tf.variable_scope('Mixed_4a'):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')\n          branch_0 = slim.conv2d(branch_0, 96, [3, 3], padding='VALID',\n                                 scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')\n          branch_1 = slim.conv2d(branch_1, 64, [1, 7], scope='Conv2d_0b_1x7')\n          branch_1 = slim.conv2d(branch_1, 64, [7, 1], scope='Conv2d_0c_7x1')\n          branch_1 = slim.conv2d(branch_1, 96, [3, 3], padding='VALID',\n                                 scope='Conv2d_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1])\n        if add_and_check_final('Mixed_4a', net): return net, end_points\n\n      # 71 x 71 x 192\n      with tf.variable_scope('Mixed_5a'):\n        with tf.variable_scope('Branch_0'):\n          branch_0 = slim.conv2d(net, 192, [3, 3], stride=2, padding='VALID',\n                                 scope='Conv2d_1a_3x3')\n        with tf.variable_scope('Branch_1'):\n          branch_1 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',\n                                     scope='MaxPool_1a_3x3')\n        net = tf.concat(axis=3, values=[branch_0, branch_1])\n        if add_and_check_final('Mixed_5a', net): return net, end_points\n\n      # 35 x 35 x 384\n      # 4 x Inception-A blocks\n      for idx in range(4):\n        block_scope = 'Mixed_5' + chr(ord('b') + idx)\n        net = block_inception_a(net, block_scope)\n        if add_and_check_final(block_scope, net): return net, 
end_points\n\n      # 35 x 35 x 384\n      # Reduction-A block\n      net = block_reduction_a(net, 'Mixed_6a')\n      if add_and_check_final('Mixed_6a', net): return net, end_points\n\n      # 17 x 17 x 1024\n      # 7 x Inception-B blocks\n      for idx in range(7):\n        block_scope = 'Mixed_6' + chr(ord('b') + idx)\n        net = block_inception_b(net, block_scope)\n        if add_and_check_final(block_scope, net): return net, end_points\n\n      # 17 x 17 x 1024\n      # Reduction-B block\n      net = block_reduction_b(net, 'Mixed_7a')\n      if add_and_check_final('Mixed_7a', net): return net, end_points\n\n      # 8 x 8 x 1536\n      # 3 x Inception-C blocks\n      for idx in range(3):\n        block_scope = 'Mixed_7' + chr(ord('b') + idx)\n        net = block_inception_c(net, block_scope)\n        if add_and_check_final(block_scope, net): return net, end_points\n  raise ValueError('Unknown final endpoint %s' % final_endpoint)\n\n\ndef inception_v4(inputs, num_classes=1001, is_training=True,\n                 dropout_keep_prob=0.8,\n                 reuse=None,\n                 scope='InceptionV4',\n                 create_aux_logits=True):\n  \"\"\"Creates the Inception V4 model.\n\n  Args:\n    inputs: a 4-D tensor of size [batch_size, height, width, 3].\n    num_classes: number of predicted classes.\n    is_training: whether the model is being trained.\n    dropout_keep_prob: float, the fraction to keep before final layer.\n    reuse: whether or not the network and its variables should be reused. 
To be\n      able to reuse 'scope' must be given.\n    scope: Optional variable_scope.\n    create_aux_logits: Whether to include the auxiliary logits.\n\n  Returns:\n    logits: the logits outputs of the model.\n    end_points: the set of end_points from the inception model.\n  \"\"\"\n  end_points = {}\n  with tf.variable_scope(scope, 'InceptionV4', [inputs], reuse=reuse) as scope:\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      net, end_points = inception_v4_base(inputs, scope=scope)\n\n      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],\n                          stride=1, padding='SAME'):\n        # Auxiliary Head logits\n        if create_aux_logits:\n          with tf.variable_scope('AuxLogits'):\n            # 17 x 17 x 1024\n            aux_logits = end_points['Mixed_6h']\n            # Originally, kernel_size = 5\n            # However, if we change the input size then we need to change the kernel size\n            # We want to pool the feature map to be 5x5xC\n            # With padding = 0, and stride 3, this means our kernel is H - 12\n            kernel_size = [aux_logits.get_shape().as_list()[1] - 12] * 2\n            aux_logits = slim.avg_pool2d(aux_logits, kernel_size, stride=3,\n                                         padding='VALID',\n                                         scope='AvgPool_1a_5x5')\n            aux_logits = slim.conv2d(aux_logits, 128, [1, 1],\n                                     scope='Conv2d_1b_1x1')\n            aux_logits = slim.conv2d(aux_logits, 768,\n                                     aux_logits.get_shape()[1:3],\n                                     padding='VALID', scope='Conv2d_2a')\n            aux_logits = slim.flatten(aux_logits)\n            aux_logits = slim.fully_connected(aux_logits, num_classes,\n                                              activation_fn=None,\n                                              
scope='Aux_logits')\n            end_points['AuxLogits'] = aux_logits\n\n        # Final pooling and prediction\n        with tf.variable_scope('Logits'):\n          # 8 x 8 x 1536\n          net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',\n                                scope='AvgPool_1a')\n          # 1 x 1 x 1536\n          net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b')\n          net = slim.flatten(net, scope='PreLogitsFlatten')\n          end_points['PreLogitsFlatten'] = net\n          # 1536\n          logits = slim.fully_connected(net, num_classes, activation_fn=None,\n                                        scope='Logits')\n          end_points['Logits'] = logits\n          end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions')\n    return logits, end_points\ninception_v4.default_image_size = 299\n\n\ninception_v4_arg_scope = inception_utils.inception_arg_scope\n"
  },
  {
    "path": "nets/inception_v4_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for slim.inception_v4.\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import inception\n\n\nclass InceptionTest(tf.test.TestCase):\n\n  def testBuildLogits(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v4(inputs, num_classes)\n    auxlogits = end_points['AuxLogits']\n    predictions = end_points['Predictions']\n    self.assertTrue(auxlogits.op.name.startswith('InceptionV4/AuxLogits'))\n    self.assertListEqual(auxlogits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue(predictions.op.name.startswith(\n        'InceptionV4/Logits/Predictions'))\n    self.assertListEqual(predictions.get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testBuildWithoutAuxLogits(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    
inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, endpoints = inception.inception_v4(inputs, num_classes,\n                                               create_aux_logits=False)\n    self.assertFalse('AuxLogits' in endpoints)\n    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testAllEndPointsShapes(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = inception.inception_v4(inputs, num_classes)\n    endpoints_shapes = {'Conv2d_1a_3x3': [batch_size, 149, 149, 32],\n                        'Conv2d_2a_3x3': [batch_size, 147, 147, 32],\n                        'Conv2d_2b_3x3': [batch_size, 147, 147, 64],\n                        'Mixed_3a': [batch_size, 73, 73, 160],\n                        'Mixed_4a': [batch_size, 71, 71, 192],\n                        'Mixed_5a': [batch_size, 35, 35, 384],\n                        # 4 x Inception-A blocks\n                        'Mixed_5b': [batch_size, 35, 35, 384],\n                        'Mixed_5c': [batch_size, 35, 35, 384],\n                        'Mixed_5d': [batch_size, 35, 35, 384],\n                        'Mixed_5e': [batch_size, 35, 35, 384],\n                        # Reduction-A block\n                        'Mixed_6a': [batch_size, 17, 17, 1024],\n                        # 7 x Inception-B blocks\n                        'Mixed_6b': [batch_size, 17, 17, 1024],\n                        'Mixed_6c': [batch_size, 17, 17, 1024],\n                        'Mixed_6d': [batch_size, 17, 17, 1024],\n                        'Mixed_6e': [batch_size, 17, 17, 1024],\n                        'Mixed_6f': [batch_size, 17, 17, 1024],\n                        'Mixed_6g': [batch_size, 17, 17, 1024],\n                        'Mixed_6h': [batch_size, 17, 17, 1024],\n      
                  # Reduction-B block\n                        'Mixed_7a': [batch_size, 8, 8, 1536],\n                        # 3 x Inception-C blocks\n                        'Mixed_7b': [batch_size, 8, 8, 1536],\n                        'Mixed_7c': [batch_size, 8, 8, 1536],\n                        'Mixed_7d': [batch_size, 8, 8, 1536],\n                        # Logits and predictions\n                        'AuxLogits': [batch_size, num_classes],\n                        'PreLogitsFlatten': [batch_size, 1536],\n                        'Logits': [batch_size, num_classes],\n                        'Predictions': [batch_size, num_classes]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name in endpoints_shapes:\n      expected_shape = endpoints_shapes[endpoint_name]\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testBuildBaseNetwork(self):\n    batch_size = 5\n    height, width = 299, 299\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    net, end_points = inception.inception_v4_base(inputs)\n    self.assertTrue(net.op.name.startswith(\n        'InceptionV4/Mixed_7d'))\n    self.assertListEqual(net.get_shape().as_list(), [batch_size, 8, 8, 1536])\n    expected_endpoints = [\n        'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 'Mixed_3a',\n        'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',\n        'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',\n        'Mixed_6e', 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a',\n        'Mixed_7b', 'Mixed_7c', 'Mixed_7d']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n    for name, op in end_points.items():\n      self.assertTrue(op.name.startswith('InceptionV4/' + name))\n\n  def testBuildOnlyUpToFinalEndpoint(self):\n    batch_size = 5\n    height, width = 299, 299\n    
all_endpoints = [\n        'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 'Mixed_3a',\n        'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',\n        'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',\n        'Mixed_6e', 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a',\n        'Mixed_7b', 'Mixed_7c', 'Mixed_7d']\n    for index, endpoint in enumerate(all_endpoints):\n      with tf.Graph().as_default():\n        inputs = tf.random_uniform((batch_size, height, width, 3))\n        out_tensor, end_points = inception.inception_v4_base(\n            inputs, final_endpoint=endpoint)\n        self.assertTrue(out_tensor.op.name.startswith(\n            'InceptionV4/' + endpoint))\n        self.assertItemsEqual(all_endpoints[:index+1], end_points)\n\n  def testVariablesSetDevice(self):\n    batch_size = 5\n    height, width = 299, 299\n    num_classes = 1000\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    # Force all Variables to reside on the device.\n    with tf.variable_scope('on_cpu'), tf.device('/cpu:0'):\n      inception.inception_v4(inputs, num_classes)\n    with tf.variable_scope('on_gpu'), tf.device('/gpu:0'):\n      inception.inception_v4(inputs, num_classes)\n    for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_cpu'):\n      self.assertDeviceEqual(v.device, '/cpu:0')\n    for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_gpu'):\n      self.assertDeviceEqual(v.device, '/gpu:0')\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 150, 150\n    num_classes = 1000\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = inception.inception_v4(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    pre_pool = end_points['Mixed_7d']\n    
self.assertListEqual(pre_pool.get_shape().as_list(),\n                         [batch_size, 3, 3, 1536])\n\n  def testUnknownBatchSize(self):\n    batch_size = 1\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n      logits, _ = inception.inception_v4(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [None, num_classes])\n      images = tf.random_uniform((batch_size, height, width, 3))\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEquals(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 299, 299\n    num_classes = 1000\n    with self.test_session() as sess:\n      eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n      logits, _ = inception.inception_v4(eval_inputs,\n                                         num_classes,\n                                         is_training=False)\n      predictions = tf.argmax(logits, 1)\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 150, 150\n    num_classes = 1000\n    with self.test_session() as sess:\n      train_inputs = tf.random_uniform((train_batch_size, height, width, 3))\n      inception.inception_v4(train_inputs, num_classes)\n      eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n      logits, _ = inception.inception_v4(eval_inputs,\n                                         num_classes,\n                                         is_training=False,\n                                         reuse=True)\n  
    predictions = tf.argmax(logits, 1)\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (eval_batch_size,))\n\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/mobilenet_v1.py",
    "content": "# Copyright 2017 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# =============================================================================\n\"\"\"MobileNet v1.\n\nMobileNet is a general architecture and can be used for multiple use cases.\nDepending on the use case, it can use different input layer size and different\nhead (for example: embeddings, localization and classification).\n\nAs described in https://arxiv.org/abs/1704.04861.\n\n  MobileNets: Efficient Convolutional Neural Networks for\n    Mobile Vision Applications\n  Andrew G. 
Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang,\n    Tobias Weyand, Marco Andreetto, Hartwig Adam\n\n100% Mobilenet V1 (base) with input size 224x224:\n\nSee mobilenet_v1()\n\nLayer                                                     params           macs\n--------------------------------------------------------------------------------\nMobilenetV1/Conv2d_0/Conv2D:                                 864      10,838,016\nMobilenetV1/Conv2d_1_depthwise/depthwise:                    288       3,612,672\nMobilenetV1/Conv2d_1_pointwise/Conv2D:                     2,048      25,690,112\nMobilenetV1/Conv2d_2_depthwise/depthwise:                    576       1,806,336\nMobilenetV1/Conv2d_2_pointwise/Conv2D:                     8,192      25,690,112\nMobilenetV1/Conv2d_3_depthwise/depthwise:                  1,152       3,612,672\nMobilenetV1/Conv2d_3_pointwise/Conv2D:                    16,384      51,380,224\nMobilenetV1/Conv2d_4_depthwise/depthwise:                  1,152         903,168\nMobilenetV1/Conv2d_4_pointwise/Conv2D:                    32,768      25,690,112\nMobilenetV1/Conv2d_5_depthwise/depthwise:                  2,304       1,806,336\nMobilenetV1/Conv2d_5_pointwise/Conv2D:                    65,536      51,380,224\nMobilenetV1/Conv2d_6_depthwise/depthwise:                  2,304         451,584\nMobilenetV1/Conv2d_6_pointwise/Conv2D:                   131,072      25,690,112\nMobilenetV1/Conv2d_7_depthwise/depthwise:                  4,608         903,168\nMobilenetV1/Conv2d_7_pointwise/Conv2D:                   262,144      51,380,224\nMobilenetV1/Conv2d_8_depthwise/depthwise:                  4,608         903,168\nMobilenetV1/Conv2d_8_pointwise/Conv2D:                   262,144      51,380,224\nMobilenetV1/Conv2d_9_depthwise/depthwise:                  4,608         903,168\nMobilenetV1/Conv2d_9_pointwise/Conv2D:                   262,144      51,380,224\nMobilenetV1/Conv2d_10_depthwise/depthwise:                 4,608         
903,168\nMobilenetV1/Conv2d_10_pointwise/Conv2D:                  262,144      51,380,224\nMobilenetV1/Conv2d_11_depthwise/depthwise:                 4,608         903,168\nMobilenetV1/Conv2d_11_pointwise/Conv2D:                  262,144      51,380,224\nMobilenetV1/Conv2d_12_depthwise/depthwise:                 4,608         225,792\nMobilenetV1/Conv2d_12_pointwise/Conv2D:                  524,288      25,690,112\nMobilenetV1/Conv2d_13_depthwise/depthwise:                 9,216         451,584\nMobilenetV1/Conv2d_13_pointwise/Conv2D:                1,048,576      51,380,224\n--------------------------------------------------------------------------------\nTotal:                                                 3,185,088     567,716,352\n\n\n75% Mobilenet V1 (base) with input size 128x128:\n\nSee mobilenet_v1_075()\n\nLayer                                                     params           macs\n--------------------------------------------------------------------------------\nMobilenetV1/Conv2d_0/Conv2D:                                 648       2,654,208\nMobilenetV1/Conv2d_1_depthwise/depthwise:                    216         884,736\nMobilenetV1/Conv2d_1_pointwise/Conv2D:                     1,152       4,718,592\nMobilenetV1/Conv2d_2_depthwise/depthwise:                    432         442,368\nMobilenetV1/Conv2d_2_pointwise/Conv2D:                     4,608       4,718,592\nMobilenetV1/Conv2d_3_depthwise/depthwise:                    864         884,736\nMobilenetV1/Conv2d_3_pointwise/Conv2D:                     9,216       9,437,184\nMobilenetV1/Conv2d_4_depthwise/depthwise:                    864         221,184\nMobilenetV1/Conv2d_4_pointwise/Conv2D:                    18,432       4,718,592\nMobilenetV1/Conv2d_5_depthwise/depthwise:                  1,728         442,368\nMobilenetV1/Conv2d_5_pointwise/Conv2D:                    36,864       9,437,184\nMobilenetV1/Conv2d_6_depthwise/depthwise:                  1,728         
110,592\nMobilenetV1/Conv2d_6_pointwise/Conv2D:                    73,728       4,718,592\nMobilenetV1/Conv2d_7_depthwise/depthwise:                  3,456         221,184\nMobilenetV1/Conv2d_7_pointwise/Conv2D:                   147,456       9,437,184\nMobilenetV1/Conv2d_8_depthwise/depthwise:                  3,456         221,184\nMobilenetV1/Conv2d_8_pointwise/Conv2D:                   147,456       9,437,184\nMobilenetV1/Conv2d_9_depthwise/depthwise:                  3,456         221,184\nMobilenetV1/Conv2d_9_pointwise/Conv2D:                   147,456       9,437,184\nMobilenetV1/Conv2d_10_depthwise/depthwise:                 3,456         221,184\nMobilenetV1/Conv2d_10_pointwise/Conv2D:                  147,456       9,437,184\nMobilenetV1/Conv2d_11_depthwise/depthwise:                 3,456         221,184\nMobilenetV1/Conv2d_11_pointwise/Conv2D:                  147,456       9,437,184\nMobilenetV1/Conv2d_12_depthwise/depthwise:                 3,456          55,296\nMobilenetV1/Conv2d_12_pointwise/Conv2D:                  294,912       4,718,592\nMobilenetV1/Conv2d_13_depthwise/depthwise:                 6,912         110,592\nMobilenetV1/Conv2d_13_pointwise/Conv2D:                  589,824       9,437,184\n--------------------------------------------------------------------------------\nTotal:                                                 1,800,144     106,002,432\n\n\"\"\"\n\n# Tensorflow mandates these.\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom collections import namedtuple\nimport functools\n\nimport tensorflow as tf\n\nslim = tf.contrib.slim\n\n# Conv and DepthSepConv namedtuple define layers of the MobileNet architecture\n# Conv defines 3x3 convolution layers\n# DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution.\n# stride is the stride of the convolution\n# depth is the number of channels or filters in a layer\nConv = namedtuple('Conv', 
['kernel', 'stride', 'depth'])\nDepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth'])\n\n# _CONV_DEFS specifies the MobileNet body\n_CONV_DEFS = [\n    Conv(kernel=[3, 3], stride=2, depth=32),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=64),\n    DepthSepConv(kernel=[3, 3], stride=2, depth=128),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=128),\n    DepthSepConv(kernel=[3, 3], stride=2, depth=256),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=256),\n    DepthSepConv(kernel=[3, 3], stride=2, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=512),\n    DepthSepConv(kernel=[3, 3], stride=2, depth=1024),\n    DepthSepConv(kernel=[3, 3], stride=1, depth=1024)\n]\n\n\ndef mobilenet_v1_base(inputs,\n                      final_endpoint='Conv2d_13_pointwise',\n                      min_depth=8,\n                      depth_multiplier=1.0,\n                      conv_defs=None,\n                      output_stride=None,\n                      scope=None):\n  \"\"\"Mobilenet v1.\n\n  Constructs a Mobilenet v1 network from inputs to the given final endpoint.\n\n  Args:\n    inputs: a tensor of shape [batch_size, height, width, channels].\n    final_endpoint: specifies the endpoint to construct the network up to. 
It\n      can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise',\n      'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5_pointwise',\n      'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise',\n      'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise',\n      'Conv2d_12_pointwise', 'Conv2d_13_pointwise'].\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    conv_defs: A list of Conv and DepthSepConv namedtuples specifying the net\n      architecture.\n    output_stride: An integer that specifies the requested ratio of input to\n      output spatial resolution. If not None, then we invoke atrous convolution\n      if necessary to prevent the network from reducing the spatial resolution\n      of the activation maps. 
Allowed values are 8 (accurate fully convolutional\n      mode), 16 (fast fully convolutional mode), 32 (classification mode).\n    scope: Optional variable_scope.\n\n  Returns:\n    tensor_out: output tensor corresponding to the final_endpoint.\n    end_points: a set of activations for external use, for example summaries or\n                losses.\n\n  Raises:\n    ValueError: if final_endpoint is not set to one of the predefined values,\n                or depth_multiplier <= 0, or the target output_stride is not\n                allowed.\n  \"\"\"\n  depth = lambda d: max(int(d * depth_multiplier), min_depth)\n  end_points = {}\n\n  # Used to find thinned depths for each layer.\n  if depth_multiplier <= 0:\n    raise ValueError('depth_multiplier is not greater than zero.')\n\n  if conv_defs is None:\n    conv_defs = _CONV_DEFS\n\n  if output_stride is not None and output_stride not in [8, 16, 32]:\n    raise ValueError('Only allowed output_stride values are 8, 16, 32.')\n\n  with tf.variable_scope(scope, 'MobilenetV1', [inputs]):\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding='SAME'):\n      # The current_stride variable keeps track of the output stride of the\n      # activations, i.e., the running product of convolution strides up to the\n      # current network layer. 
This allows us to invoke atrous convolution\n      # whenever applying the next convolution would result in the activations\n      # having output stride larger than the target output_stride.\n      current_stride = 1\n\n      # The atrous convolution rate parameter.\n      rate = 1\n\n      net = inputs\n      for i, conv_def in enumerate(conv_defs):\n        end_point_base = 'Conv2d_%d' % i\n\n        if output_stride is not None and current_stride == output_stride:\n          # If we have reached the target output_stride, then we need to employ\n          # atrous convolution with stride=1 and multiply the atrous rate by the\n          # current unit's stride for use in subsequent layers.\n          layer_stride = 1\n          layer_rate = rate\n          rate *= conv_def.stride\n        else:\n          layer_stride = conv_def.stride\n          layer_rate = 1\n          current_stride *= conv_def.stride\n\n        if isinstance(conv_def, Conv):\n          end_point = end_point_base\n          net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel,\n                            stride=conv_def.stride,\n                            normalizer_fn=slim.batch_norm,\n                            scope=end_point)\n          end_points[end_point] = net\n          if end_point == final_endpoint:\n            return net, end_points\n\n        elif isinstance(conv_def, DepthSepConv):\n          end_point = end_point_base + '_depthwise'\n\n          # By passing filters=None\n          # separable_conv2d produces only a depthwise convolution layer\n          net = slim.separable_conv2d(net, None, conv_def.kernel,\n                                      depth_multiplier=1,\n                                      stride=layer_stride,\n                                      rate=layer_rate,\n                                      normalizer_fn=slim.batch_norm,\n                                      scope=end_point)\n\n          end_points[end_point] = net\n          if 
end_point == final_endpoint:\n            return net, end_points\n\n          end_point = end_point_base + '_pointwise'\n\n          net = slim.conv2d(net, depth(conv_def.depth), [1, 1],\n                            stride=1,\n                            normalizer_fn=slim.batch_norm,\n                            scope=end_point)\n\n          end_points[end_point] = net\n          if end_point == final_endpoint:\n            return net, end_points\n        else:\n          raise ValueError('Unknown convolution type %s for layer %d'\n                           % (type(conv_def).__name__, i))\n  raise ValueError('Unknown final endpoint %s' % final_endpoint)\n\n\ndef mobilenet_v1(inputs,\n                 num_classes=1000,\n                 dropout_keep_prob=0.999,\n                 is_training=True,\n                 min_depth=8,\n                 depth_multiplier=1.0,\n                 conv_defs=None,\n                 prediction_fn=tf.contrib.layers.softmax,\n                 spatial_squeeze=True,\n                 reuse=None,\n                 scope='MobilenetV1'):\n  \"\"\"Mobilenet v1 model for classification.\n\n  Args:\n    inputs: a tensor of shape [batch_size, height, width, channels].\n    num_classes: number of predicted classes.\n    dropout_keep_prob: the fraction of activation values that are retained.\n    is_training: whether the model is being trained.\n    min_depth: Minimum depth value (number of channels) for all convolution ops.\n      Enforced when depth_multiplier < 1, and not an active constraint when\n      depth_multiplier >= 1.\n    depth_multiplier: Float multiplier for the depth (number of channels)\n      for all convolution ops. The value must be greater than zero. 
Typical\n      usage will be to set this value in (0, 1) to reduce the number of\n      parameters or computation cost of the model.\n    conv_defs: A list of Conv and DepthSepConv namedtuples specifying the net\n      architecture.\n    prediction_fn: a function to get predictions out of logits.\n    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is\n        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.\n    reuse: whether or not the network and its variables should be reused. To be\n      able to reuse, 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    logits: the pre-softmax activations, a tensor of size\n      [batch_size, num_classes]\n    end_points: a dictionary from components of the network to the corresponding\n      activation.\n\n  Raises:\n    ValueError: Input rank is invalid.\n  \"\"\"\n  input_shape = inputs.get_shape().as_list()\n  if len(input_shape) != 4:\n    raise ValueError('Invalid input tensor rank, expected 4, was: %d' %\n                     len(input_shape))\n\n  with tf.variable_scope(scope, 'MobilenetV1', [inputs, num_classes],\n                         reuse=reuse) as scope:\n    with slim.arg_scope([slim.batch_norm, slim.dropout],\n                        is_training=is_training):\n      net, end_points = mobilenet_v1_base(inputs, scope=scope,\n                                          min_depth=min_depth,\n                                          depth_multiplier=depth_multiplier,\n                                          conv_defs=conv_defs)\n      with tf.variable_scope('Logits'):\n        #kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])\n        kernel_size = net.get_shape()[1:3]\n        net = slim.avg_pool2d(net, kernel_size, padding='VALID',\n                              scope='AvgPool_1a')\n        end_points['AvgPool_1a'] = net\n        # 1 x 1 x 1024\n        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')\n        logits = 
slim.conv2d(net, num_classes, [1, 1], activation_fn=None,\n                             normalizer_fn=None, scope='Conv2d_1c_1x1')\n        if spatial_squeeze:\n          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')\n      end_points['Logits'] = logits\n      if prediction_fn:\n        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')\n  return logits, end_points\n\nmobilenet_v1.default_image_size = 224\n\n\ndef wrapped_partial(func, *args, **kwargs):\n  partial_func = functools.partial(func, *args, **kwargs)\n  functools.update_wrapper(partial_func, func)\n  return partial_func\n\n\nmobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75)\nmobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50)\nmobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25)\n\n\ndef _reduced_kernel_size_for_small_input(input_tensor, kernel_size):\n  \"\"\"Define kernel size which is automatically reduced for small input.\n\n  If the shape of the input images is unknown at graph construction time this\n  function assumes that the input images are large enough.\n\n  Args:\n    input_tensor: input tensor of size [batch_size, height, width, channels].\n    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]\n\n  Returns:\n    a tensor with the kernel size.\n  \"\"\"\n  shape = input_tensor.get_shape().as_list()\n  if shape[1] is None or shape[2] is None:\n    kernel_size_out = kernel_size\n  else:\n    kernel_size_out = [min(shape[1], kernel_size[0]),\n                       min(shape[2], kernel_size[1])]\n  return kernel_size_out\n\n\ndef mobilenet_v1_arg_scope(is_training=True,\n                           weight_decay=0.00004,\n                           stddev=0.09,\n                           batch_norm_decay=0.9997,\n                           batch_norm_epsilon=0.001,\n                           regularize_depthwise=False):\n  \"\"\"Defines the default MobilenetV1 arg scope.\n\n  
Args:\n    is_training: Whether or not we're training the model.\n    weight_decay: The weight decay to use for regularizing the model.\n    stddev: The standard deviation of the truncated normal weight initializer.\n    batch_norm_decay: Decay for the batch norm moving averages.\n    batch_norm_epsilon: Small float added to variance to avoid dividing by zero\n      in batch norm.\n    regularize_depthwise: Whether or not to apply regularization to the\n      depthwise convolution weights.\n\n  Returns:\n    An `arg_scope` to use for the mobilenet v1 model.\n  \"\"\"\n  batch_norm_params = {\n      'is_training': is_training,\n      'center': True,\n      'scale': True,\n      'decay': batch_norm_decay,\n      'epsilon': batch_norm_epsilon,\n  }\n\n  # Set weight_decay for weights in Conv and DepthSepConv layers.\n  weights_init = tf.truncated_normal_initializer(stddev=stddev)\n  regularizer = tf.contrib.layers.l2_regularizer(weight_decay)\n  if regularize_depthwise:\n    depthwise_regularizer = regularizer\n  else:\n    depthwise_regularizer = None\n  with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                      weights_initializer=weights_init,\n                      activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm):\n    with slim.arg_scope([slim.batch_norm], **batch_norm_params):\n      with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer):\n        with slim.arg_scope([slim.separable_conv2d],\n                            weights_regularizer=depthwise_regularizer) as sc:\n          return sc"
  },
  {
    "path": "nets/mobilenet_v1_test.py",
    "content": "# Copyright 2017 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# =============================================================================\n\"\"\"Tests for MobileNet v1.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nimport tensorflow as tf\n\nfrom nets import mobilenet_v1\n\nslim = tf.contrib.slim\n\n\nclass MobilenetV1Test(tf.test.TestCase):\n\n  def testBuildClassificationNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    self.assertTrue('Predictions' in end_points)\n    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),\n                         [batch_size, num_classes])\n\n  def testBuildBaseNetwork(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    net, end_points = mobilenet_v1.mobilenet_v1_base(inputs)\n    self.assertTrue(net.op.name.startswith('MobilenetV1/Conv2d_13'))\n    self.assertListEqual(net.get_shape().as_list(),\n                        
 [batch_size, 7, 7, 1024])\n    expected_endpoints = ['Conv2d_0',\n                          'Conv2d_1_depthwise', 'Conv2d_1_pointwise',\n                          'Conv2d_2_depthwise', 'Conv2d_2_pointwise',\n                          'Conv2d_3_depthwise', 'Conv2d_3_pointwise',\n                          'Conv2d_4_depthwise', 'Conv2d_4_pointwise',\n                          'Conv2d_5_depthwise', 'Conv2d_5_pointwise',\n                          'Conv2d_6_depthwise', 'Conv2d_6_pointwise',\n                          'Conv2d_7_depthwise', 'Conv2d_7_pointwise',\n                          'Conv2d_8_depthwise', 'Conv2d_8_pointwise',\n                          'Conv2d_9_depthwise', 'Conv2d_9_pointwise',\n                          'Conv2d_10_depthwise', 'Conv2d_10_pointwise',\n                          'Conv2d_11_depthwise', 'Conv2d_11_pointwise',\n                          'Conv2d_12_depthwise', 'Conv2d_12_pointwise',\n                          'Conv2d_13_depthwise', 'Conv2d_13_pointwise']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n\n  def testBuildOnlyUptoFinalEndpoint(self):\n    batch_size = 5\n    height, width = 224, 224\n    endpoints = ['Conv2d_0',\n                 'Conv2d_1_depthwise', 'Conv2d_1_pointwise',\n                 'Conv2d_2_depthwise', 'Conv2d_2_pointwise',\n                 'Conv2d_3_depthwise', 'Conv2d_3_pointwise',\n                 'Conv2d_4_depthwise', 'Conv2d_4_pointwise',\n                 'Conv2d_5_depthwise', 'Conv2d_5_pointwise',\n                 'Conv2d_6_depthwise', 'Conv2d_6_pointwise',\n                 'Conv2d_7_depthwise', 'Conv2d_7_pointwise',\n                 'Conv2d_8_depthwise', 'Conv2d_8_pointwise',\n                 'Conv2d_9_depthwise', 'Conv2d_9_pointwise',\n                 'Conv2d_10_depthwise', 'Conv2d_10_pointwise',\n                 'Conv2d_11_depthwise', 'Conv2d_11_pointwise',\n                 'Conv2d_12_depthwise', 'Conv2d_12_pointwise',\n                 'Conv2d_13_depthwise', 
'Conv2d_13_pointwise']\n    for index, endpoint in enumerate(endpoints):\n      with tf.Graph().as_default():\n        inputs = tf.random_uniform((batch_size, height, width, 3))\n        out_tensor, end_points = mobilenet_v1.mobilenet_v1_base(\n            inputs, final_endpoint=endpoint)\n        self.assertTrue(out_tensor.op.name.startswith(\n            'MobilenetV1/' + endpoint))\n        self.assertItemsEqual(endpoints[:index+1], end_points)\n\n  def testBuildCustomNetworkUsingConvDefs(self):\n    batch_size = 5\n    height, width = 224, 224\n    conv_defs = [\n        mobilenet_v1.Conv(kernel=[3, 3], stride=2, depth=32),\n        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=64),\n        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=2, depth=128),\n        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512)\n    ]\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    net, end_points = mobilenet_v1.mobilenet_v1_base(\n        inputs, final_endpoint='Conv2d_3_pointwise', conv_defs=conv_defs)\n    self.assertTrue(net.op.name.startswith('MobilenetV1/Conv2d_3'))\n    self.assertListEqual(net.get_shape().as_list(),\n                         [batch_size, 56, 56, 512])\n    expected_endpoints = ['Conv2d_0',\n                          'Conv2d_1_depthwise', 'Conv2d_1_pointwise',\n                          'Conv2d_2_depthwise', 'Conv2d_2_pointwise',\n                          'Conv2d_3_depthwise', 'Conv2d_3_pointwise']\n    self.assertItemsEqual(end_points.keys(), expected_endpoints)\n\n  def testBuildAndCheckAllEndPointsUptoConv2d_13(self):\n    batch_size = 5\n    height, width = 224, 224\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                        normalizer_fn=slim.batch_norm):\n      _, end_points = mobilenet_v1.mobilenet_v1_base(\n          inputs, final_endpoint='Conv2d_13_pointwise')\n    endpoints_shapes = {'Conv2d_0': 
[batch_size, 112, 112, 32],\n                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],\n                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],\n                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],\n                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],\n                        'Conv2d_3_depthwise': [batch_size, 56, 56, 128],\n                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],\n                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],\n                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_6_depthwise': [batch_size, 14, 14, 256],\n                        'Conv2d_6_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_7_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_7_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_8_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_8_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_9_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_9_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_10_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_10_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_11_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_11_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_12_depthwise': [batch_size, 7, 7, 512],\n                        'Conv2d_12_pointwise': [batch_size, 7, 7, 1024],\n                        'Conv2d_13_depthwise': [batch_size, 7, 7, 1024],\n                        'Conv2d_13_pointwise': [batch_size, 7, 7, 1024]}\n    
self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name, expected_shape in endpoints_shapes.iteritems():\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testOutputStride16BuildAndCheckAllEndPointsUptoConv2d_13(self):\n    batch_size = 5\n    height, width = 224, 224\n    output_stride = 16\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                        normalizer_fn=slim.batch_norm):\n      _, end_points = mobilenet_v1.mobilenet_v1_base(\n          inputs, output_stride=output_stride,\n          final_endpoint='Conv2d_13_pointwise')\n    endpoints_shapes = {'Conv2d_0': [batch_size, 112, 112, 32],\n                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],\n                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],\n                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],\n                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],\n                        'Conv2d_3_depthwise': [batch_size, 56, 56, 128],\n                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],\n                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],\n                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_6_depthwise': [batch_size, 14, 14, 256],\n                        'Conv2d_6_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_7_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_7_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_8_depthwise': [batch_size, 14, 14, 512],\n                        
'Conv2d_8_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_9_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_9_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_10_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_10_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_11_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_11_pointwise': [batch_size, 14, 14, 512],\n                        'Conv2d_12_depthwise': [batch_size, 14, 14, 512],\n                        'Conv2d_12_pointwise': [batch_size, 14, 14, 1024],\n                        'Conv2d_13_depthwise': [batch_size, 14, 14, 1024],\n                        'Conv2d_13_pointwise': [batch_size, 14, 14, 1024]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name, expected_shape in endpoints_shapes.iteritems():\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testOutputStride8BuildAndCheckAllEndPointsUptoConv2d_13(self):\n    batch_size = 5\n    height, width = 224, 224\n    output_stride = 8\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                        normalizer_fn=slim.batch_norm):\n      _, end_points = mobilenet_v1.mobilenet_v1_base(\n          inputs, output_stride=output_stride,\n          final_endpoint='Conv2d_13_pointwise')\n    endpoints_shapes = {'Conv2d_0': [batch_size, 112, 112, 32],\n                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],\n                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],\n                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],\n                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],\n                        
'Conv2d_3_depthwise': [batch_size, 56, 56, 128],\n                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],\n                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],\n                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],\n                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],\n                        'Conv2d_6_depthwise': [batch_size, 28, 28, 256],\n                        'Conv2d_6_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_7_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_7_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_8_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_8_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_9_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_9_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_10_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_10_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_11_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_11_pointwise': [batch_size, 28, 28, 512],\n                        'Conv2d_12_depthwise': [batch_size, 28, 28, 512],\n                        'Conv2d_12_pointwise': [batch_size, 28, 28, 1024],\n                        'Conv2d_13_depthwise': [batch_size, 28, 28, 1024],\n                        'Conv2d_13_pointwise': [batch_size, 28, 28, 1024]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name, expected_shape in endpoints_shapes.iteritems():\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testBuildAndCheckAllEndPointsApproximateFaceNet(self):\n    
batch_size = 5\n    height, width = 128, 128\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                        normalizer_fn=slim.batch_norm):\n      _, end_points = mobilenet_v1.mobilenet_v1_base(\n          inputs, final_endpoint='Conv2d_13_pointwise', depth_multiplier=0.75)\n    # For the Conv2d_0 layer FaceNet has depth=16\n    endpoints_shapes = {'Conv2d_0': [batch_size, 64, 64, 24],\n                        'Conv2d_1_depthwise': [batch_size, 64, 64, 24],\n                        'Conv2d_1_pointwise': [batch_size, 64, 64, 48],\n                        'Conv2d_2_depthwise': [batch_size, 32, 32, 48],\n                        'Conv2d_2_pointwise': [batch_size, 32, 32, 96],\n                        'Conv2d_3_depthwise': [batch_size, 32, 32, 96],\n                        'Conv2d_3_pointwise': [batch_size, 32, 32, 96],\n                        'Conv2d_4_depthwise': [batch_size, 16, 16, 96],\n                        'Conv2d_4_pointwise': [batch_size, 16, 16, 192],\n                        'Conv2d_5_depthwise': [batch_size, 16, 16, 192],\n                        'Conv2d_5_pointwise': [batch_size, 16, 16, 192],\n                        'Conv2d_6_depthwise': [batch_size, 8, 8, 192],\n                        'Conv2d_6_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_7_depthwise': [batch_size, 8, 8, 384],\n                        'Conv2d_7_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_8_depthwise': [batch_size, 8, 8, 384],\n                        'Conv2d_8_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_9_depthwise': [batch_size, 8, 8, 384],\n                        'Conv2d_9_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_10_depthwise': [batch_size, 8, 8, 384],\n                        'Conv2d_10_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_11_depthwise': [batch_size, 8, 
8, 384],\n                        'Conv2d_11_pointwise': [batch_size, 8, 8, 384],\n                        'Conv2d_12_depthwise': [batch_size, 4, 4, 384],\n                        'Conv2d_12_pointwise': [batch_size, 4, 4, 768],\n                        'Conv2d_13_depthwise': [batch_size, 4, 4, 768],\n                        'Conv2d_13_pointwise': [batch_size, 4, 4, 768]}\n    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())\n    for endpoint_name, expected_shape in endpoints_shapes.iteritems():\n      self.assertTrue(endpoint_name in end_points)\n      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),\n                           expected_shape)\n\n  def testModelHasExpectedNumberOfParameters(self):\n    batch_size = 5\n    height, width = 224, 224\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],\n                        normalizer_fn=slim.batch_norm):\n      mobilenet_v1.mobilenet_v1_base(inputs)\n      total_params, _ = slim.model_analyzer.analyze_vars(\n          slim.get_model_variables())\n      self.assertAlmostEqual(3217920L, total_params)\n\n  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys() if key.startswith('Conv')]\n\n    _, end_points_with_multiplier = mobilenet_v1.mobilenet_v1(\n        inputs, num_classes, scope='depth_multiplied_net',\n        depth_multiplier=0.5)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(0.5 * original_depth, new_depth)\n\n  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):\n    
batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    _, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n\n    endpoint_keys = [key for key in end_points.keys()\n                     if key.startswith('Mixed') or key.startswith('Conv')]\n\n    _, end_points_with_multiplier = mobilenet_v1.mobilenet_v1(\n        inputs, num_classes, scope='depth_multiplied_net',\n        depth_multiplier=2.0)\n\n    for key in endpoint_keys:\n      original_depth = end_points[key].get_shape().as_list()[3]\n      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]\n      self.assertEqual(2.0 * original_depth, new_depth)\n\n  def testRaiseValueErrorWithInvalidDepthMultiplier(self):\n    batch_size = 5\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    with self.assertRaises(ValueError):\n      _ = mobilenet_v1.mobilenet_v1(\n          inputs, num_classes, depth_multiplier=-0.1)\n    with self.assertRaises(ValueError):\n      _ = mobilenet_v1.mobilenet_v1(\n          inputs, num_classes, depth_multiplier=0.0)\n\n  def testHalfSizeImages(self):\n    batch_size = 5\n    height, width = 112, 112\n    num_classes = 1000\n\n    inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [batch_size, num_classes])\n    pre_pool = end_points['Conv2d_13_pointwise']\n    self.assertListEqual(pre_pool.get_shape().as_list(),\n                         [batch_size, 4, 4, 1024])\n\n  def testUnknownImageShape(self):\n    tf.reset_default_graph()\n    batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))\n    
with self.test_session() as sess:\n      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))\n      logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n      self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))\n      self.assertListEqual(logits.get_shape().as_list(),\n                           [batch_size, num_classes])\n      pre_pool = end_points['Conv2d_13_pointwise']\n      feed_dict = {inputs: input_np}\n      tf.global_variables_initializer().run()\n      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)\n      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])\n\n  def testUnknowBatchSize(self):\n    batch_size = 1\n    height, width = 224, 224\n    num_classes = 1000\n\n    inputs = tf.placeholder(tf.float32, (None, height, width, 3))\n    logits, _ = mobilenet_v1.mobilenet_v1(inputs, num_classes)\n    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [None, num_classes])\n    images = tf.random_uniform((batch_size, height, width, 3))\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: images.eval()})\n      self.assertEquals(output.shape, (batch_size, num_classes))\n\n  def testEvaluation(self):\n    batch_size = 2\n    height, width = 224, 224\n    num_classes = 1000\n\n    eval_inputs = tf.random_uniform((batch_size, height, width, 3))\n    logits, _ = mobilenet_v1.mobilenet_v1(eval_inputs, num_classes,\n                                          is_training=False)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (batch_size,))\n\n  def testTrainEvalWithReuse(self):\n    train_batch_size = 5\n    eval_batch_size = 2\n    height, width = 150, 150\n 
   num_classes = 1000\n\n    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))\n    mobilenet_v1.mobilenet_v1(train_inputs, num_classes)\n    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))\n    logits, _ = mobilenet_v1.mobilenet_v1(eval_inputs, num_classes,\n                                          reuse=True)\n    predictions = tf.argmax(logits, 1)\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(predictions)\n      self.assertEquals(output.shape, (eval_batch_size,))\n\n  def testLogitsNotSqueezed(self):\n    num_classes = 25\n    images = tf.random_uniform([1, 224, 224, 3])\n    logits, _ = mobilenet_v1.mobilenet_v1(images,\n                                          num_classes=num_classes,\n                                          spatial_squeeze=False)\n\n    with self.test_session() as sess:\n      tf.global_variables_initializer().run()\n      logits_out = sess.run(logits)\n      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])\n\n\nif __name__ == '__main__':\n  tf.test.main()"
  },
  {
    "path": "nets/net_profile.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\n\nimport tensorflow as tf\n\nfrom nets import nets_factory\n\ndef profile(model_name, num_classes, image_size, batch_size):\n\n    graph = tf.Graph()\n    sess = tf.Session(graph=graph)\n\n    with graph.as_default(), sess.as_default():\n\n        network_fn = nets_factory.get_network_fn(model_name, num_classes=num_classes)\n        inputs = tf.random_uniform((batch_size, image_size, image_size, 3))\n        logits, _ = network_fn(inputs)\n\n        print(\"Profiling model %s\" % model_name)\n\n        # Print trainable variable parameter statistics to stdout.\n        param_stats = tf.contrib.tfprof.model_analyzer.print_model_analysis(\n            tf.get_default_graph(),\n            tfprof_options=tf.contrib.tfprof.model_analyzer.\n                TRAINABLE_VARS_PARAMS_STAT_OPTIONS)\n\n        # param_stats is a tensorflow.tfprof.TFProfNode proto. It organizes the statistics\n        # of each graph node in a tree structure. 
Let's print the root below.\n        print('total_params: %d\\n' % param_stats.total_parameters)\n\n        print()\n\n        # Print to stdout an analysis of the number of floating point operations in the\n        # model broken down by individual operations.\n        tf.contrib.tfprof.model_analyzer.print_model_analysis(\n            tf.get_default_graph(),\n            tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)\n\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(\n        description='Profile the parameter count and floating point operations of a network.')\n\n    parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the architecture to profile.', type=str,\n                        required=False, default='inception_v3')\n\n    parser.add_argument('--num_classes', dest='num_classes',\n                        help='The number of classes.', type=int,\n                        required=False, default=1000)\n\n    parser.add_argument('--image_size', dest='image_size',\n                        help='The size of the input image.', type=int,\n                        required=False, default=299)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='The number of images in a batch.', type=int,\n                        required=False, default=1)\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n    args = parse_args()\n\n    profile(args.model_name, args.num_classes, args.image_size, args.batch_size)\n\nif __name__ == '__main__':\n    main()"
  },
  {
    "path": "nets/nets_factory.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains a factory for building various models.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\nimport functools\n\nimport tensorflow as tf\n\nfrom nets import inception\nfrom nets import mobilenet_v1\nfrom nets import resnet_v2\n\nslim = tf.contrib.slim\n\nnetworks_map = {\n                'inception_v1': inception.inception_v1,\n                'inception_v2': inception.inception_v2,\n                'inception_v3': inception.inception_v3,\n                'inception_v4': inception.inception_v4,\n                'inception_resnet_v2': inception.inception_resnet_v2,\n                'resnet_v2_50': resnet_v2.resnet_v2_50,\n                'resnet_v2_101': resnet_v2.resnet_v2_101,\n                'resnet_v2_152': resnet_v2.resnet_v2_152,\n                'resnet_v2_200': resnet_v2.resnet_v2_200,\n                'mobilenet_v1': mobilenet_v1.mobilenet_v1,\n                'mobilenet_v1_075': mobilenet_v1.mobilenet_v1_075,\n                'mobilenet_v1_050': mobilenet_v1.mobilenet_v1_050,\n                'mobilenet_v1_025': mobilenet_v1.mobilenet_v1_025,\n               }\n\narg_scopes_map = {'inception_v1': inception.inception_v3_arg_scope,\n                  'inception_v2': 
inception.inception_v3_arg_scope,\n                  'inception_v3': inception.inception_v3_arg_scope,\n                  'inception_v4': inception.inception_v4_arg_scope,\n                  'inception_resnet_v2': inception.inception_resnet_v2_arg_scope,\n                  'resnet_v2_50': resnet_v2.resnet_arg_scope,\n                  'resnet_v2_101': resnet_v2.resnet_arg_scope,\n                  'resnet_v2_152': resnet_v2.resnet_arg_scope,\n                  'resnet_v2_200': resnet_v2.resnet_arg_scope,\n                  'mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope,\n                  'mobilenet_v1_075': mobilenet_v1.mobilenet_v1_arg_scope,\n                  'mobilenet_v1_050': mobilenet_v1.mobilenet_v1_arg_scope,\n                  'mobilenet_v1_025': mobilenet_v1.mobilenet_v1_arg_scope,\n                 }\n\n\ndef get_network_fn(name, num_classes, weight_decay=0.0, is_training=False):\n  \"\"\"Returns a network_fn such as `logits, end_points = network_fn(images)`.\n\n  Args:\n    name: The name of the network.\n    num_classes: The number of classes to use for classification.\n    weight_decay: The l2 coefficient for the model weights.\n    is_training: `True` if the model is being used for training and `False`\n      otherwise.\n\n  Returns:\n    network_fn: A function that applies the model to a batch of images. It has\n      the following signature:\n        logits, end_points = network_fn(images)\n  Raises:\n    ValueError: If network `name` is not recognized.\n  \"\"\"\n  if name not in networks_map:\n    raise ValueError('Name of network unknown %s' % name)\n  arg_scope = arg_scopes_map[name](weight_decay=weight_decay)\n  func = networks_map[name]\n  @functools.wraps(func)\n  def network_fn(images):\n    with slim.arg_scope(arg_scope):\n      return func(images, num_classes, is_training=is_training)\n  if hasattr(func, 'default_image_size'):\n    network_fn.default_image_size = func.default_image_size\n\n  return network_fn\n"
  },
  {
    "path": "nets/nets_factory_test.py",
    "content": "# Copyright 2016 Google Inc. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\n\"\"\"Tests for nets.nets_factory.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\n\nimport tensorflow as tf\n\nfrom nets import nets_factory\n\n\nclass NetworksTest(tf.test.TestCase):\n\n  def testGetNetworkFn(self):\n    batch_size = 5\n    num_classes = 1000\n    for net in nets_factory.networks_map:\n      with self.test_session():\n        net_fn = nets_factory.get_network_fn(net, num_classes)\n        # Most networks use 224 as their default_image_size\n        image_size = getattr(net_fn, 'default_image_size', 224)\n        inputs = tf.random_uniform((batch_size, image_size, image_size, 3))\n        logits, end_points = net_fn(inputs)\n        self.assertTrue(isinstance(logits, tf.Tensor))\n        self.assertTrue(isinstance(end_points, dict))\n        self.assertEqual(logits.get_shape().as_list()[0], batch_size)\n        self.assertEqual(logits.get_shape().as_list()[-1], num_classes)\n\nif __name__ == '__main__':\n  tf.test.main()\n"
  },
  {
    "path": "nets/resnet_utils.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains building blocks for various versions of Residual Networks.\n\nResidual networks (ResNets) were proposed in:\n  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n  Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015\n\nMore variants were introduced in:\n  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n  Identity Mappings in Deep Residual Networks. arXiv: 1603.05027, 2016\n\nWe can obtain different ResNet variants by changing the network depth, width,\nand form of residual unit. This module implements the infrastructure for\nbuilding them. Concrete ResNet units and full ResNet networks are implemented in\nthe accompanying resnet_v1.py and resnet_v2.py modules.\n\nCompared to https://github.com/KaimingHe/deep-residual-networks, in the current\nimplementation we subsample the output activations in the last residual unit of\neach block, instead of subsampling the input activations in the first residual\nunit of each block. 
The two implementations give identical results but our\nimplementation is more memory efficient.\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport collections\nimport tensorflow as tf\n\nslim = tf.contrib.slim\n\n\nclass Block(collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])):\n  \"\"\"A named tuple describing a ResNet block.\n\n  Its parts are:\n    scope: The scope of the `Block`.\n    unit_fn: The ResNet unit function which takes as input a `Tensor` and\n      returns another `Tensor` with the output of the ResNet unit.\n    args: A list of length equal to the number of units in the `Block`. The list\n      contains one (depth, depth_bottleneck, stride) tuple for each unit in the\n      block to serve as argument to unit_fn.\n  \"\"\"\n\n\ndef subsample(inputs, factor, scope=None):\n  \"\"\"Subsamples the input along the spatial dimensions.\n\n  Args:\n    inputs: A `Tensor` of size [batch, height_in, width_in, channels].\n    factor: The subsampling factor.\n    scope: Optional variable_scope.\n\n  Returns:\n    output: A `Tensor` of size [batch, height_out, width_out, channels] with the\n      input, either intact (if factor == 1) or subsampled (if factor > 1).\n  \"\"\"\n  if factor == 1:\n    return inputs\n  else:\n    return slim.max_pool2d(inputs, [1, 1], stride=factor, scope=scope)\n\n\ndef conv2d_same(inputs, num_outputs, kernel_size, stride, rate=1, scope=None):\n  \"\"\"Strided 2-D convolution with 'SAME' padding.\n\n  When stride > 1, then we do explicit zero-padding, followed by conv2d with\n  'VALID' padding.\n\n  Note that\n\n     net = conv2d_same(inputs, num_outputs, 3, stride=stride)\n\n  is equivalent to\n\n     net = slim.conv2d(inputs, num_outputs, 3, stride=1, padding='SAME')\n     net = subsample(net, factor=stride)\n\n  whereas\n\n     net = slim.conv2d(inputs, num_outputs, 3, stride=stride, padding='SAME')\n\n  is different when the input's 
height or width is even, which is why we add the\n  current function. For more details, see ResnetUtilsTest.testConv2DSameEven().\n\n  Args:\n    inputs: A 4-D tensor of size [batch, height_in, width_in, channels].\n    num_outputs: An integer, the number of output filters.\n    kernel_size: An int with the kernel_size of the filters.\n    stride: An integer, the output stride.\n    rate: An integer, rate for atrous convolution.\n    scope: Scope.\n\n  Returns:\n    output: A 4-D tensor of size [batch, height_out, width_out, channels] with\n      the convolution output.\n  \"\"\"\n  if stride == 1:\n    return slim.conv2d(inputs, num_outputs, kernel_size, stride=1, rate=rate,\n                       padding='SAME', scope=scope)\n  else:\n    kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)\n    pad_total = kernel_size_effective - 1\n    pad_beg = pad_total // 2\n    pad_end = pad_total - pad_beg\n    inputs = tf.pad(inputs,\n                    [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])\n    return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride,\n                       rate=rate, padding='VALID', scope=scope)\n\n\n@slim.add_arg_scope\ndef stack_blocks_dense(net, blocks, output_stride=None,\n                       outputs_collections=None):\n  \"\"\"Stacks ResNet `Blocks` and controls output feature density.\n\n  First, this function creates scopes for the ResNet in the form of\n  'block_name/unit_1', 'block_name/unit_2', etc.\n\n  Second, this function allows the user to explicitly control the ResNet\n  output_stride, which is the ratio of the input to output spatial resolution.\n  This is useful for dense prediction tasks such as semantic segmentation or\n  object detection.\n\n  Most ResNets consist of 4 ResNet blocks and subsample the activations by a\n  factor of 2 when transitioning between consecutive ResNet blocks. This results\n  to a nominal ResNet output_stride equal to 8. 
If we set the output_stride to\n  half the nominal network stride (e.g., output_stride=4), then we compute\n  responses twice.\n\n  Control of the output feature density is implemented by atrous convolution.\n\n  Args:\n    net: A `Tensor` of size [batch, height, width, channels].\n    blocks: A list of length equal to the number of ResNet `Blocks`. Each\n      element is a ResNet `Block` object describing the units in the `Block`.\n    output_stride: If `None`, then the output will be computed at the nominal\n      network stride. If output_stride is not `None`, it specifies the requested\n      ratio of input to output spatial resolution, which needs to be equal to\n      the product of unit strides from the start up to some level of the ResNet.\n      For example, if the ResNet employs units with strides 1, 2, 1, 3, 4, 1,\n      then valid values for the output_stride are 1, 2, 6, 24 or None (which\n      is equivalent to output_stride=24).\n    outputs_collections: Collection to add the ResNet block outputs.\n\n  Returns:\n    net: Output tensor with stride equal to the specified output_stride.\n\n  Raises:\n    ValueError: If the target output_stride is not valid.\n  \"\"\"\n  # The current_stride variable keeps track of the effective stride of the\n  # activations. 
This allows us to invoke atrous convolution whenever applying\n  # the next residual unit would result in the activations having stride larger\n  # than the target output_stride.\n  current_stride = 1\n\n  # The atrous convolution rate parameter.\n  rate = 1\n\n  for block in blocks:\n    with tf.variable_scope(block.scope, 'block', [net]) as sc:\n      for i, unit in enumerate(block.args):\n        if output_stride is not None and current_stride > output_stride:\n          raise ValueError('The target output_stride cannot be reached.')\n\n        with tf.variable_scope('unit_%d' % (i + 1), values=[net]):\n          # If we have reached the target output_stride, then we need to employ\n          # atrous convolution with stride=1 and multiply the atrous rate by the\n          # current unit's stride for use in subsequent layers.\n          if output_stride is not None and current_stride == output_stride:\n            net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))\n            rate *= unit.get('stride', 1)\n\n          else:\n            net = block.unit_fn(net, rate=1, **unit)\n            current_stride *= unit.get('stride', 1)\n      net = slim.utils.collect_named_outputs(outputs_collections, sc.name, net)\n\n  if output_stride is not None and current_stride != output_stride:\n    raise ValueError('The target output_stride cannot be reached.')\n\n  return net\n\n\ndef resnet_arg_scope(weight_decay=0.0001,\n                     batch_norm_decay=0.997,\n                     batch_norm_epsilon=1e-5,\n                     batch_norm_scale=True,\n                     activation_fn=tf.nn.relu,\n                     use_batch_norm=True):\n  \"\"\"Defines the default ResNet arg scope.\n\n  TODO(gpapan): The batch-normalization related default values above are\n    appropriate for use in conjunction with the reference ResNet models\n    released at https://github.com/KaimingHe/deep-residual-networks. 
When\n    training ResNets from scratch, they might need to be tuned.\n\n  Args:\n    weight_decay: The weight decay to use for regularizing the model.\n    batch_norm_decay: The moving average decay when estimating layer activation\n      statistics in batch normalization.\n    batch_norm_epsilon: Small constant to prevent division by zero when\n      normalizing activations by their variance in batch normalization.\n    batch_norm_scale: If True, uses an explicit `gamma` multiplier to scale the\n      activations in the batch normalization layer.\n    activation_fn: The activation function which is used in ResNet.\n    use_batch_norm: Whether or not to use batch normalization.\n\n  Returns:\n    An `arg_scope` to use for the resnet models.\n  \"\"\"\n  batch_norm_params = {\n      'decay': batch_norm_decay,\n      'epsilon': batch_norm_epsilon,\n      'scale': batch_norm_scale,\n      'updates_collections': tf.GraphKeys.UPDATE_OPS,\n  }\n\n  with slim.arg_scope(\n      [slim.conv2d],\n      weights_regularizer=slim.l2_regularizer(weight_decay),\n      weights_initializer=slim.variance_scaling_initializer(),\n      activation_fn=activation_fn,\n      normalizer_fn=slim.batch_norm if use_batch_norm else None,\n      normalizer_params=batch_norm_params):\n    with slim.arg_scope([slim.batch_norm], **batch_norm_params):\n      # The following implies padding='SAME' for pool1, which makes feature\n      # alignment easier for dense prediction tasks. This is also used in\n      # https://github.com/facebook/fb.resnet.torch. However the accompanying\n      # code of 'Deep Residual Learning for Image Recognition' uses\n      # padding='VALID' for pool1. You can switch to that choice by setting\n      # slim.arg_scope([slim.max_pool2d], padding='VALID').\n      with slim.arg_scope([slim.max_pool2d], padding='SAME') as arg_sc:\n        return arg_sc"
  },
  {
    "path": "nets/resnet_v2.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains definitions for the preactivation form of Residual Networks.\n\nResidual networks (ResNets) were originally proposed in:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Deep Residual Learning for Image Recognition. arXiv:1512.03385\n\nThe full preactivation 'v2' ResNet variant implemented in this module was\nintroduced by:\n[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. 
arXiv:1603.05027\n\nThe key difference of the full preactivation 'v2' variant compared to the\n'v1' variant in [1] is the use of batch normalization before every weight layer.\n\nTypical use:\n\n   from tensorflow.contrib.slim.nets import resnet_v2\n\nResNet-101 for image classification into 1000 classes:\n\n   # inputs has shape [batch, 224, 224, 3]\n   with slim.arg_scope(resnet_v2.resnet_arg_scope()):\n      net, end_points = resnet_v2.resnet_v2_101(inputs, 1000, is_training=False)\n\nResNet-101 for semantic segmentation into 21 classes:\n\n   # inputs has shape [batch, 513, 513, 3]\n   with slim.arg_scope(resnet_v2.resnet_arg_scope()):\n      net, end_points = resnet_v2.resnet_v2_101(inputs,\n                                                21,\n                                                is_training=False,\n                                                global_pool=False,\n                                                output_stride=16)\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\nfrom nets import resnet_utils\n\nslim = tf.contrib.slim\nresnet_arg_scope = resnet_utils.resnet_arg_scope\n\n\n@slim.add_arg_scope\ndef bottleneck(inputs, depth, depth_bottleneck, stride, rate=1,\n               outputs_collections=None, scope=None):\n  \"\"\"Bottleneck residual unit variant with BN before convolutions.\n\n  This is the full preactivation residual unit variant proposed in [2]. See\n  Fig. 1(b) of [2] for its definition. 
Note that we use here the bottleneck\n  variant which has an extra bottleneck layer.\n\n  When putting together two consecutive ResNet blocks that use this unit, one\n  should use stride = 2 in the last unit of the first block.\n\n  Args:\n    inputs: A tensor of size [batch, height, width, channels].\n    depth: The depth of the ResNet unit output.\n    depth_bottleneck: The depth of the bottleneck layers.\n    stride: The ResNet unit's stride. Determines the amount of downsampling of\n      the unit's output compared to its input.\n    rate: An integer, rate for atrous convolution.\n    outputs_collections: Collection to add the ResNet unit output.\n    scope: Optional variable_scope.\n\n  Returns:\n    The ResNet unit's output.\n  \"\"\"\n  with tf.variable_scope(scope, 'bottleneck_v2', [inputs]) as sc:\n    depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank=4)\n    preact = slim.batch_norm(inputs, activation_fn=tf.nn.relu, scope='preact')\n    if depth == depth_in:\n      shortcut = resnet_utils.subsample(inputs, stride, 'shortcut')\n    else:\n      shortcut = slim.conv2d(preact, depth, [1, 1], stride=stride,\n                             normalizer_fn=None, activation_fn=None,\n                             scope='shortcut')\n\n    residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride=1,\n                           scope='conv1')\n    residual = resnet_utils.conv2d_same(residual, depth_bottleneck, 3, stride,\n                                        rate=rate, scope='conv2')\n    residual = slim.conv2d(residual, depth, [1, 1], stride=1,\n                           normalizer_fn=None, activation_fn=None,\n                           scope='conv3')\n\n    output = shortcut + residual\n\n    return slim.utils.collect_named_outputs(outputs_collections,\n                                            sc.original_name_scope,\n                                            output)\n\n\ndef resnet_v2(inputs,\n              blocks,\n              
num_classes=None,\n              is_training=True,\n              global_pool=True,\n              output_stride=None,\n              include_root_block=True,\n              spatial_squeeze=True,\n              dropout_keep_prob=1.,\n              reuse=None,\n              scope=None):\n  \"\"\"Generator for v2 (preactivation) ResNet models.\n\n  This function generates a family of ResNet v2 models. See the resnet_v2_*()\n  methods for specific model instantiations, obtained by selecting different\n  block instantiations that produce ResNets of various depths.\n\n  Training for image classification on Imagenet is usually done with [224, 224]\n  inputs, resulting in [7, 7] feature maps at the output of the last ResNet\n  block for the ResNets defined in [1] that have nominal stride equal to 32.\n  However, for dense prediction tasks we advise that one uses inputs with\n  spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In\n  this case the feature maps at the ResNet output will have spatial shape\n  [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1]\n  and corners exactly aligned with the input image corners, which greatly\n  facilitates alignment of the features to the image. Using as input [225, 225]\n  images results in [8, 8] feature maps at the output of the last ResNet block.\n\n  For dense prediction tasks, the ResNet needs to run in fully-convolutional\n  (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all\n  have nominal stride equal to 32 and a good choice in FCN mode is to use\n  output_stride=16 in order to increase the density of the computed features at\n  small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915.\n\n  Args:\n    inputs: A tensor of size [batch, height_in, width_in, channels].\n    blocks: A list of length equal to the number of ResNet blocks. 
Each element\n      is a resnet_utils.Block object describing the units in the block.\n    num_classes: Number of predicted classes for classification tasks. If None\n      we return the features before the logit layer.\n    is_training: Whether the batch normalization layers run in training mode.\n    global_pool: If True, we perform global average pooling before computing the\n      logits. Set to True for image classification, False for dense prediction.\n    output_stride: If None, then the output will be computed at the nominal\n      network stride. If output_stride is not None, it specifies the requested\n      ratio of input to output spatial resolution.\n    include_root_block: If True, include the initial convolution followed by\n      max-pooling; if False, exclude it. If excluded, `inputs` should be the\n      results of an activation-less convolution.\n    spatial_squeeze: If True, logits is of shape [B, C]; if False, logits is\n        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.\n        To use this parameter, the input images must be smaller than 300x300\n        pixels, in which case the output logit layer does not contain spatial\n        information and can be removed.\n    dropout_keep_prob: The probability of keeping activations in the dropout\n      layer applied before the logits layer.\n    reuse: Whether or not the network and its variables should be reused. To be\n      able to reuse, 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].\n      If global_pool is False, then height_out and width_out are reduced by a\n      factor of output_stride compared to the respective height_in and width_in,\n      else both height_out and width_out equal one. If num_classes is None, then\n      net is the output of the last ResNet block, potentially after global\n      average pooling. 
If num_classes is not None, net contains the pre-softmax\n      activations.\n    end_points: A dictionary from components of the network to the corresponding\n      activation.\n\n  Raises:\n    ValueError: If the target output_stride is not valid.\n  \"\"\"\n  with tf.variable_scope(scope, 'resnet_v2', [inputs], reuse=reuse) as sc:\n    end_points_collection = sc.name + '_end_points'\n    with slim.arg_scope([slim.conv2d, bottleneck,\n                         resnet_utils.stack_blocks_dense],\n                        outputs_collections=end_points_collection):\n      with slim.arg_scope([slim.batch_norm], is_training=is_training):\n        net = inputs\n        if include_root_block:\n          if output_stride is not None:\n            if output_stride % 4 != 0:\n              raise ValueError('The output_stride needs to be a multiple of 4.')\n            # Integer division keeps output_stride an int under\n            # `from __future__ import division`.\n            output_stride //= 4\n          # We do not include batch normalization or activation functions in\n          # conv1 because the first ResNet unit will perform these. Cf.\n          # Appendix of [2].\n          with slim.arg_scope([slim.conv2d],\n                              activation_fn=None, normalizer_fn=None):\n            net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')\n          net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')\n        net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)\n        # This is needed because the pre-activation variant does not have batch\n        # normalization or activation functions in the residual unit output. 
See\n        # Appendix of [2].\n        net = slim.batch_norm(net, activation_fn=tf.nn.relu, scope='postnorm')\n        if global_pool:\n          # Global average pooling.\n          net = tf.reduce_mean(net, [1, 2], name='pool5', keep_dims=True)\n        if num_classes is not None:\n          net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')\n          net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,\n                            normalizer_fn=None, scope='logits')\n          if spatial_squeeze:\n            net = tf.squeeze(net, [1, 2], name='SpatialSqueeze')\n        # Convert end_points_collection into a dictionary of end_points.\n        end_points = slim.utils.convert_collection_to_dict(\n            end_points_collection)\n        if num_classes is not None:\n          end_points['predictions'] = slim.softmax(net, scope='predictions')\n        return net, end_points\nresnet_v2.default_image_size = 224\n\n\ndef resnet_v2_block(scope, base_depth, num_units, stride):\n  \"\"\"Helper function for creating a resnet_v2 bottleneck block.\n\n  Args:\n    scope: The scope of the block.\n    base_depth: The depth of the bottleneck layer for each unit.\n    num_units: The number of units in the block.\n    stride: The stride of the block, implemented as a stride in the last unit.\n      All other units have stride=1.\n\n  Returns:\n    A resnet_v2 bottleneck block.\n  \"\"\"\n  return resnet_utils.Block(scope, bottleneck, [{\n      'depth': base_depth * 4,\n      'depth_bottleneck': base_depth,\n      'stride': 1\n  }] * (num_units - 1) + [{\n      'depth': base_depth * 4,\n      'depth_bottleneck': base_depth,\n      'stride': stride\n  }])\n\n\ndef resnet_v2_50(inputs,\n                 num_classes=None,\n                 is_training=True,\n                 global_pool=True,\n                 output_stride=None,\n                 spatial_squeeze=True,\n                 dropout_keep_prob=1.,\n  
               reuse=None,\n                 scope='resnet_v2_50'):\n  \"\"\"ResNet-50 model of [1]. See resnet_v2() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v2_block('block2', base_depth=128, num_units=4, stride=2),\n      resnet_v2_block('block3', base_depth=256, num_units=6, stride=2),\n      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,\n                   global_pool=global_pool, output_stride=output_stride,\n                   include_root_block=True, spatial_squeeze=spatial_squeeze,\n                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)\nresnet_v2_50.default_image_size = resnet_v2.default_image_size\n\n\ndef resnet_v2_101(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  spatial_squeeze=True,\n                  dropout_keep_prob=1.,\n                  reuse=None,\n                  scope='resnet_v2_101'):\n  \"\"\"ResNet-101 model of [1]. 
See resnet_v2() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v2_block('block2', base_depth=128, num_units=4, stride=2),\n      resnet_v2_block('block3', base_depth=256, num_units=23, stride=2),\n      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,\n                   global_pool=global_pool, output_stride=output_stride,\n                   include_root_block=True, spatial_squeeze=spatial_squeeze,\n                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)\nresnet_v2_101.default_image_size = resnet_v2.default_image_size\n\n\ndef resnet_v2_152(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  spatial_squeeze=True,\n                  dropout_keep_prob=1.,\n                  reuse=None,\n                  scope='resnet_v2_152'):\n  \"\"\"ResNet-152 model of [1]. 
See resnet_v2() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v2_block('block2', base_depth=128, num_units=8, stride=2),\n      resnet_v2_block('block3', base_depth=256, num_units=36, stride=2),\n      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,\n                   global_pool=global_pool, output_stride=output_stride,\n                   include_root_block=True, spatial_squeeze=spatial_squeeze,\n                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)\nresnet_v2_152.default_image_size = resnet_v2.default_image_size\n\n\ndef resnet_v2_200(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  spatial_squeeze=True,\n                  dropout_keep_prob=1.,\n                  reuse=None,\n                  scope='resnet_v2_200'):\n  \"\"\"ResNet-200 model of [2]. See resnet_v2() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v2_block('block2', base_depth=128, num_units=24, stride=2),\n      resnet_v2_block('block3', base_depth=256, num_units=36, stride=2),\n      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,\n                   global_pool=global_pool, output_stride=output_stride,\n                   include_root_block=True, spatial_squeeze=spatial_squeeze,\n                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)\nresnet_v2_200.default_image_size = resnet_v2.default_image_size"
  },
  {
    "path": "nets/resnet_v2_test.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Tests for slim.nets.resnet_v2.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nimport tensorflow as tf\n\nfrom nets import resnet_utils\nfrom nets import resnet_v2\n\nslim = tf.contrib.slim\n\n\ndef create_test_input(batch_size, height, width, channels):\n  \"\"\"Create test input tensor.\n\n  Args:\n    batch_size: The number of images per batch or `None` if unknown.\n    height: The height of each image or `None` if unknown.\n    width: The width of each image or `None` if unknown.\n    channels: The number of channels per image or `None` if unknown.\n\n  Returns:\n    Either a placeholder `Tensor` of dimension\n      [batch_size, height, width, channels] if any of the inputs are `None` or a\n    constant `Tensor` with the mesh grid values along the spatial dimensions.\n  \"\"\"\n  if None in [batch_size, height, width, channels]:\n    return tf.placeholder(tf.float32, (batch_size, height, width, channels))\n  else:\n    return tf.to_float(\n        np.tile(\n            np.reshape(\n                np.reshape(np.arange(height), [height, 1]) +\n                np.reshape(np.arange(width), [1, width]),\n                [1, height, width, 1]),\n            
[batch_size, 1, 1, channels]))\n\n\nclass ResnetUtilsTest(tf.test.TestCase):\n\n  def testSubsampleThreeByThree(self):\n    x = tf.reshape(tf.to_float(tf.range(9)), [1, 3, 3, 1])\n    x = resnet_utils.subsample(x, 2)\n    expected = tf.reshape(tf.constant([0, 2, 6, 8]), [1, 2, 2, 1])\n    with self.test_session():\n      self.assertAllClose(x.eval(), expected.eval())\n\n  def testSubsampleFourByFour(self):\n    x = tf.reshape(tf.to_float(tf.range(16)), [1, 4, 4, 1])\n    x = resnet_utils.subsample(x, 2)\n    expected = tf.reshape(tf.constant([0, 2, 8, 10]), [1, 2, 2, 1])\n    with self.test_session():\n      self.assertAllClose(x.eval(), expected.eval())\n\n  def testConv2DSameEven(self):\n    n, n2 = 4, 2\n\n    # Input image.\n    x = create_test_input(1, n, n, 1)\n\n    # Convolution kernel.\n    w = create_test_input(1, 3, 3, 1)\n    w = tf.reshape(w, [3, 3, 1, 1])\n\n    tf.get_variable('Conv/weights', initializer=w)\n    tf.get_variable('Conv/biases', initializer=tf.zeros([1]))\n    tf.get_variable_scope().reuse_variables()\n\n    y1 = slim.conv2d(x, 1, [3, 3], stride=1, scope='Conv')\n    y1_expected = tf.to_float([[14, 28, 43, 26],\n                               [28, 48, 66, 37],\n                               [43, 66, 84, 46],\n                               [26, 37, 46, 22]])\n    y1_expected = tf.reshape(y1_expected, [1, n, n, 1])\n\n    y2 = resnet_utils.subsample(y1, 2)\n    y2_expected = tf.to_float([[14, 43],\n                               [43, 84]])\n    y2_expected = tf.reshape(y2_expected, [1, n2, n2, 1])\n\n    y3 = resnet_utils.conv2d_same(x, 1, 3, stride=2, scope='Conv')\n    y3_expected = y2_expected\n\n    y4 = slim.conv2d(x, 1, [3, 3], stride=2, scope='Conv')\n    y4_expected = tf.to_float([[48, 37],\n                               [37, 22]])\n    y4_expected = tf.reshape(y4_expected, [1, n2, n2, 1])\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      self.assertAllClose(y1.eval(), 
y1_expected.eval())\n      self.assertAllClose(y2.eval(), y2_expected.eval())\n      self.assertAllClose(y3.eval(), y3_expected.eval())\n      self.assertAllClose(y4.eval(), y4_expected.eval())\n\n  def testConv2DSameOdd(self):\n    n, n2 = 5, 3\n\n    # Input image.\n    x = create_test_input(1, n, n, 1)\n\n    # Convolution kernel.\n    w = create_test_input(1, 3, 3, 1)\n    w = tf.reshape(w, [3, 3, 1, 1])\n\n    tf.get_variable('Conv/weights', initializer=w)\n    tf.get_variable('Conv/biases', initializer=tf.zeros([1]))\n    tf.get_variable_scope().reuse_variables()\n\n    y1 = slim.conv2d(x, 1, [3, 3], stride=1, scope='Conv')\n    y1_expected = tf.to_float([[14, 28, 43, 58, 34],\n                               [28, 48, 66, 84, 46],\n                               [43, 66, 84, 102, 55],\n                               [58, 84, 102, 120, 64],\n                               [34, 46, 55, 64, 30]])\n    y1_expected = tf.reshape(y1_expected, [1, n, n, 1])\n\n    y2 = resnet_utils.subsample(y1, 2)\n    y2_expected = tf.to_float([[14, 43, 34],\n                               [43, 84, 55],\n                               [34, 55, 30]])\n    y2_expected = tf.reshape(y2_expected, [1, n2, n2, 1])\n\n    y3 = resnet_utils.conv2d_same(x, 1, 3, stride=2, scope='Conv')\n    y3_expected = y2_expected\n\n    y4 = slim.conv2d(x, 1, [3, 3], stride=2, scope='Conv')\n    y4_expected = y2_expected\n\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      self.assertAllClose(y1.eval(), y1_expected.eval())\n      self.assertAllClose(y2.eval(), y2_expected.eval())\n      self.assertAllClose(y3.eval(), y3_expected.eval())\n      self.assertAllClose(y4.eval(), y4_expected.eval())\n\n  def _resnet_plain(self, inputs, blocks, output_stride=None, scope=None):\n    \"\"\"A plain ResNet without extra layers before or after the ResNet blocks.\"\"\"\n    with tf.variable_scope(scope, values=[inputs]):\n      with slim.arg_scope([slim.conv2d], 
outputs_collections='end_points'):\n        net = resnet_utils.stack_blocks_dense(inputs, blocks, output_stride)\n        end_points = slim.utils.convert_collection_to_dict('end_points')\n        return net, end_points\n\n  def testEndPointsV2(self):\n    \"\"\"Test the end points of a tiny v2 bottleneck network.\"\"\"\n    blocks = [\n        resnet_v2.resnet_v2_block(\n            'block1', base_depth=1, num_units=2, stride=2),\n        resnet_v2.resnet_v2_block(\n            'block2', base_depth=2, num_units=2, stride=1),\n    ]\n    inputs = create_test_input(2, 32, 16, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      _, end_points = self._resnet_plain(inputs, blocks, scope='tiny')\n    expected = [\n        'tiny/block1/unit_1/bottleneck_v2/shortcut',\n        'tiny/block1/unit_1/bottleneck_v2/conv1',\n        'tiny/block1/unit_1/bottleneck_v2/conv2',\n        'tiny/block1/unit_1/bottleneck_v2/conv3',\n        'tiny/block1/unit_2/bottleneck_v2/conv1',\n        'tiny/block1/unit_2/bottleneck_v2/conv2',\n        'tiny/block1/unit_2/bottleneck_v2/conv3',\n        'tiny/block2/unit_1/bottleneck_v2/shortcut',\n        'tiny/block2/unit_1/bottleneck_v2/conv1',\n        'tiny/block2/unit_1/bottleneck_v2/conv2',\n        'tiny/block2/unit_1/bottleneck_v2/conv3',\n        'tiny/block2/unit_2/bottleneck_v2/conv1',\n        'tiny/block2/unit_2/bottleneck_v2/conv2',\n        'tiny/block2/unit_2/bottleneck_v2/conv3']\n    self.assertItemsEqual(expected, end_points)\n\n  def _stack_blocks_nondense(self, net, blocks):\n    \"\"\"A simplified ResNet Block stacker without output stride control.\"\"\"\n    for block in blocks:\n      with tf.variable_scope(block.scope, 'block', [net]):\n        for i, unit in enumerate(block.args):\n          with tf.variable_scope('unit_%d' % (i + 1), values=[net]):\n            net = block.unit_fn(net, rate=1, **unit)\n    return net\n\n  def testAtrousValuesBottleneck(self):\n    \"\"\"Verify the values of dense feature 
extraction by atrous convolution.\n\n    Make sure that dense feature extraction by stack_blocks_dense() followed by\n    subsampling gives identical results to feature extraction at the nominal\n    network output stride using the simple self._stack_blocks_nondense() above.\n    \"\"\"\n    block = resnet_v2.resnet_v2_block\n    blocks = [\n        block('block1', base_depth=1, num_units=2, stride=2),\n        block('block2', base_depth=2, num_units=2, stride=2),\n        block('block3', base_depth=4, num_units=2, stride=2),\n        block('block4', base_depth=8, num_units=2, stride=1),\n    ]\n    nominal_stride = 8\n\n    # Test both odd and even input dimensions.\n    height = 30\n    width = 31\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      with slim.arg_scope([slim.batch_norm], is_training=False):\n        for output_stride in [1, 2, 4, 8, None]:\n          with tf.Graph().as_default():\n            with self.test_session() as sess:\n              tf.set_random_seed(0)\n              inputs = create_test_input(1, height, width, 3)\n              # Dense feature extraction followed by subsampling.\n              output = resnet_utils.stack_blocks_dense(inputs,\n                                                       blocks,\n                                                       output_stride)\n              if output_stride is None:\n                factor = 1\n              else:\n                factor = nominal_stride // output_stride\n\n              output = resnet_utils.subsample(output, factor)\n              # Make the two networks use the same weights.\n              tf.get_variable_scope().reuse_variables()\n              # Feature extraction at the nominal network rate.\n              expected = self._stack_blocks_nondense(inputs, blocks)\n              sess.run(tf.global_variables_initializer())\n              output, expected = sess.run([output, expected])\n              self.assertAllClose(output, expected, atol=1e-4, 
rtol=1e-4)\n\n\nclass ResnetCompleteNetworkTest(tf.test.TestCase):\n  \"\"\"Tests with complete small ResNet v2 networks.\"\"\"\n\n  def _resnet_small(self,\n                    inputs,\n                    num_classes=None,\n                    is_training=True,\n                    global_pool=True,\n                    output_stride=None,\n                    include_root_block=True,\n                    spatial_squeeze=True,\n                    reuse=None,\n                    scope='resnet_v2_small'):\n    \"\"\"A shallow and thin ResNet v2 for faster tests.\"\"\"\n    block = resnet_v2.resnet_v2_block\n    blocks = [\n        block('block1', base_depth=1, num_units=3, stride=2),\n        block('block2', base_depth=2, num_units=3, stride=2),\n        block('block3', base_depth=4, num_units=3, stride=2),\n        block('block4', base_depth=8, num_units=2, stride=1),\n    ]\n    return resnet_v2.resnet_v2(inputs, blocks, num_classes,\n                               is_training=is_training,\n                               global_pool=global_pool,\n                               output_stride=output_stride,\n                               include_root_block=include_root_block,\n                               spatial_squeeze=spatial_squeeze,\n                               reuse=reuse,\n                               scope=scope)\n\n  def testClassificationEndPoints(self):\n    global_pool = True\n    num_classes = 10\n    inputs = create_test_input(2, 224, 224, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      logits, end_points = self._resnet_small(inputs, num_classes,\n                                              global_pool=global_pool,\n                                              spatial_squeeze=False,\n                                              scope='resnet')\n    self.assertTrue(logits.op.name.startswith('resnet/logits'))\n    self.assertListEqual(logits.get_shape().as_list(), [2, 1, 1, num_classes])\n    
self.assertTrue('predictions' in end_points)\n    self.assertListEqual(end_points['predictions'].get_shape().as_list(),\n                         [2, 1, 1, num_classes])\n\n  def testClassificationShapes(self):\n    global_pool = True\n    num_classes = 10\n    inputs = create_test_input(2, 224, 224, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      _, end_points = self._resnet_small(inputs, num_classes,\n                                         global_pool=global_pool,\n                                         scope='resnet')\n      endpoint_to_shape = {\n          'resnet/block1': [2, 28, 28, 4],\n          'resnet/block2': [2, 14, 14, 8],\n          'resnet/block3': [2, 7, 7, 16],\n          'resnet/block4': [2, 7, 7, 32]}\n      for endpoint in endpoint_to_shape:\n        shape = endpoint_to_shape[endpoint]\n        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)\n\n  def testFullyConvolutionalEndpointShapes(self):\n    global_pool = False\n    num_classes = 10\n    inputs = create_test_input(2, 321, 321, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      _, end_points = self._resnet_small(inputs, num_classes,\n                                         global_pool=global_pool,\n                                         spatial_squeeze=False,\n                                         scope='resnet')\n      endpoint_to_shape = {\n          'resnet/block1': [2, 41, 41, 4],\n          'resnet/block2': [2, 21, 21, 8],\n          'resnet/block3': [2, 11, 11, 16],\n          'resnet/block4': [2, 11, 11, 32]}\n      for endpoint in endpoint_to_shape:\n        shape = endpoint_to_shape[endpoint]\n        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)\n\n  def testRootlessFullyConvolutionalEndpointShapes(self):\n    global_pool = False\n    num_classes = 10\n    inputs = create_test_input(2, 128, 128, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      _, end_points = 
self._resnet_small(inputs, num_classes,\n                                         global_pool=global_pool,\n                                         include_root_block=False,\n                                         spatial_squeeze=False,\n                                         scope='resnet')\n      endpoint_to_shape = {\n          'resnet/block1': [2, 64, 64, 4],\n          'resnet/block2': [2, 32, 32, 8],\n          'resnet/block3': [2, 16, 16, 16],\n          'resnet/block4': [2, 16, 16, 32]}\n      for endpoint in endpoint_to_shape:\n        shape = endpoint_to_shape[endpoint]\n        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)\n\n  def testAtrousFullyConvolutionalEndpointShapes(self):\n    global_pool = False\n    num_classes = 10\n    output_stride = 8\n    inputs = create_test_input(2, 321, 321, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      _, end_points = self._resnet_small(inputs,\n                                         num_classes,\n                                         global_pool=global_pool,\n                                         output_stride=output_stride,\n                                         spatial_squeeze=False,\n                                         scope='resnet')\n      endpoint_to_shape = {\n          'resnet/block1': [2, 41, 41, 4],\n          'resnet/block2': [2, 41, 41, 8],\n          'resnet/block3': [2, 41, 41, 16],\n          'resnet/block4': [2, 41, 41, 32]}\n      for endpoint in endpoint_to_shape:\n        shape = endpoint_to_shape[endpoint]\n        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)\n\n  def testAtrousFullyConvolutionalValues(self):\n    \"\"\"Verify dense feature extraction with atrous convolution.\"\"\"\n    nominal_stride = 32\n    for output_stride in [4, 8, 16, 32, None]:\n      with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n        with tf.Graph().as_default():\n          with self.test_session() as sess:\n      
      tf.set_random_seed(0)\n            inputs = create_test_input(2, 81, 81, 3)\n            # Dense feature extraction followed by subsampling.\n            output, _ = self._resnet_small(inputs, None,\n                                           is_training=False,\n                                           global_pool=False,\n                                           output_stride=output_stride)\n            if output_stride is None:\n              factor = 1\n            else:\n              factor = nominal_stride // output_stride\n            output = resnet_utils.subsample(output, factor)\n            # Make the two networks use the same weights.\n            tf.get_variable_scope().reuse_variables()\n            # Feature extraction at the nominal network rate.\n            expected, _ = self._resnet_small(inputs, None,\n                                             is_training=False,\n                                             global_pool=False)\n            sess.run(tf.global_variables_initializer())\n            self.assertAllClose(output.eval(), expected.eval(),\n                                atol=1e-4, rtol=1e-4)\n\n  def testUnknownBatchSize(self):\n    batch = 2\n    height, width = 65, 65\n    global_pool = True\n    num_classes = 10\n    inputs = create_test_input(None, height, width, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      logits, _ = self._resnet_small(inputs, num_classes,\n                                     global_pool=global_pool,\n                                     spatial_squeeze=False,\n                                     scope='resnet')\n    self.assertTrue(logits.op.name.startswith('resnet/logits'))\n    self.assertListEqual(logits.get_shape().as_list(),\n                         [None, 1, 1, num_classes])\n    images = create_test_input(batch, height, width, 3)\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(logits, {inputs: 
images.eval()})\n      self.assertEqual(output.shape, (batch, 1, 1, num_classes))\n\n  def testFullyConvolutionalUnknownHeightWidth(self):\n    batch = 2\n    height, width = 65, 65\n    global_pool = False\n    inputs = create_test_input(batch, None, None, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      output, _ = self._resnet_small(inputs, None,\n                                     global_pool=global_pool)\n    self.assertListEqual(output.get_shape().as_list(),\n                         [batch, None, None, 32])\n    images = create_test_input(batch, height, width, 3)\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(output, {inputs: images.eval()})\n      self.assertEqual(output.shape, (batch, 3, 3, 32))\n\n  def testAtrousFullyConvolutionalUnknownHeightWidth(self):\n    batch = 2\n    height, width = 65, 65\n    global_pool = False\n    output_stride = 8\n    inputs = create_test_input(batch, None, None, 3)\n    with slim.arg_scope(resnet_utils.resnet_arg_scope()):\n      output, _ = self._resnet_small(inputs,\n                                     None,\n                                     global_pool=global_pool,\n                                     output_stride=output_stride)\n    self.assertListEqual(output.get_shape().as_list(),\n                         [batch, None, None, 32])\n    images = create_test_input(batch, height, width, 3)\n    with self.test_session() as sess:\n      sess.run(tf.global_variables_initializer())\n      output = sess.run(output, {inputs: images.eval()})\n      self.assertEqual(output.shape, (batch, 9, 9, 32))\n\n\nif __name__ == '__main__':\n  tf.test.main()"
  },
  {
    "path": "preprocessing/__init__.py",
    "content": ""
  },
  {
    "path": "preprocessing/decode_example.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport tensorflow as tf\n\ndef decode_serialized_example(serialized_example, features_to_fetch, decode_image=True):\n    \"\"\"\n    Args:\n        serialized_example : A tfrecord example\n        features_to_fetch : a list of tuples (feature key, name for feature)\n    Returns:\n        dictionary : maps name to parsed example\n    \"\"\"\n\n    feature_map = {}\n    for feature_key, feature_name in features_to_fetch:\n        feature_map[feature_key] = {\n            'image/height': tf.FixedLenFeature([], tf.int64),\n            'image/width': tf.FixedLenFeature([], tf.int64),\n            'image/colorspace': tf.FixedLenFeature([], tf.string),\n            'image/channels': tf.FixedLenFeature([], tf.int64),\n            'image/format': tf.FixedLenFeature([], tf.string),\n            'image/filename': tf.FixedLenFeature([], tf.string),\n            'image/id': tf.FixedLenFeature([], tf.string),\n            'image/encoded': tf.FixedLenFeature([], tf.string),\n            'image/extra': tf.FixedLenFeature([], tf.string),\n            'image/class/label': tf.FixedLenFeature([], tf.int64),\n            'image/class/text': tf.FixedLenFeature([], tf.string),\n            'image/class/conf':  tf.FixedLenFeature([], tf.float32),\n            'image/object/bbox/xmin': tf.VarLenFeature(dtype=tf.float32),\n            'image/object/bbox/xmax': tf.VarLenFeature(dtype=tf.float32),\n            'image/object/bbox/ymin': tf.VarLenFeature(dtype=tf.float32),\n            'image/object/bbox/ymax': tf.VarLenFeature(dtype=tf.float32),\n            'image/object/bbox/label': tf.VarLenFeature(dtype=tf.int64),\n            'image/object/bbox/text': tf.VarLenFeature(dtype=tf.string),\n            'image/object/bbox/conf': tf.VarLenFeature(dtype=tf.float32),\n            'image/object/bbox/score' : tf.VarLenFeature(dtype=tf.float32),\n            
'image/object/parts/x' : tf.VarLenFeature(dtype=tf.float32),\n            'image/object/parts/y' : tf.VarLenFeature(dtype=tf.float32),\n            'image/object/parts/v' : tf.VarLenFeature(dtype=tf.int64),\n            'image/object/parts/score' : tf.VarLenFeature(dtype=tf.float32),\n            'image/object/count' : tf.FixedLenFeature([], tf.int64),\n            'image/object/area' : tf.VarLenFeature(dtype=tf.float32),\n            'image/object/id' : tf.VarLenFeature(dtype=tf.string)\n        }[feature_key]\n\n    features = tf.parse_single_example(\n      serialized_example,\n      features = feature_map\n    )\n\n    # return a dictionary of the features\n    parsed_features = {}\n\n    for feature_key, feature_name in features_to_fetch:\n        if feature_key == 'image/height':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/width':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/colorspace':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/channels':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/format':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/filename':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/id':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/encoded':\n            if decode_image:\n                parsed_features[feature_name] = tf.image.decode_jpeg(features[feature_key], channels=3)\n            else:\n                parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/extra':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/class/label':\n            
parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/class/text':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/class/conf':\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 'image/object/bbox/xmin':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/xmax':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/ymin':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/ymax':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/label':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/text':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/conf':\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/bbox/score' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/parts/x' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/parts/y' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/parts/v' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/parts/score' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/count' :\n            parsed_features[feature_name] = features[feature_key]\n        elif feature_key == 
'image/object/area' :\n            parsed_features[feature_name] = features[feature_key].values\n        elif feature_key == 'image/object/id' :\n            parsed_features[feature_name] = features[feature_key].values\n\n    return parsed_features"
  },
  {
    "path": "preprocessing/inputs.py",
    "content": "# Some of this code came from the https://github.com/tensorflow/models/tree/master/slim\n# directory, so lets keep the Google license around for now.\n#\n# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\n\"\"\"Provides utilities to preprocess images for the Inception networks.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom easydict import EasyDict\nimport tensorflow as tf\nfrom tensorflow.python.ops import control_flow_ops\n\nfrom preprocessing.decode_example import decode_serialized_example\n\n\n\ndef apply_with_random_selector(x, func, num_cases):\n  \"\"\"Computes func(x, sel), with sel sampled from [0...num_cases-1].\n  Args:\n    x: input Tensor.\n    func: Python function to apply.\n    num_cases: Python int32, number of cases to sample sel from.\n  Returns:\n    The result of func(x, sel), where func receives the value of the\n    selector as a python integer, but sel is sampled dynamically.\n  \"\"\"\n  sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)\n  # Pass the real x only to one of the func calls.\n  return control_flow_ops.merge([\n      func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)\n      for case in range(num_cases)])[0]\n\n\ndef distort_color(image, color_ordering=0, 
fast_mode=True, scope=None):\n  \"\"\"Distort the color of a Tensor image.\n  Each color distortion is non-commutative and thus ordering of the color ops\n  matters. Ideally we would randomly permute the ordering of the color ops.\n  Rather than adding that level of complication, we select a distinct ordering\n  of color ops for each preprocessing thread.\n  Args:\n    image: 3-D Tensor containing single image in [0, 1].\n    color_ordering: Python int, a type of distortion (valid values: 0-3).\n    fast_mode: Avoids slower ops (random_hue and random_contrast).\n    scope: Optional scope for name_scope.\n  Returns:\n    3-D Tensor color-distorted image on range [0, 1]\n  Raises:\n    ValueError: if fast_mode is False and color_ordering is not in [0, 3]\n  \"\"\"\n  with tf.name_scope(scope, 'distort_color', [image]):\n    if fast_mode:\n      if color_ordering == 0:\n        image = tf.image.random_brightness(image, max_delta=32. / 255.)\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n      else:\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n        image = tf.image.random_brightness(image, max_delta=32. / 255.)\n    else:\n      if color_ordering == 0:\n        image = tf.image.random_brightness(image, max_delta=32. / 255.)\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n        image = tf.image.random_hue(image, max_delta=0.2)\n        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)\n      elif color_ordering == 1:\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n        image = tf.image.random_brightness(image, max_delta=32. 
/ 255.)\n        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)\n        image = tf.image.random_hue(image, max_delta=0.2)\n      elif color_ordering == 2:\n        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)\n        image = tf.image.random_hue(image, max_delta=0.2)\n        image = tf.image.random_brightness(image, max_delta=32. / 255.)\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n      elif color_ordering == 3:\n        image = tf.image.random_hue(image, max_delta=0.2)\n        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)\n        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)\n        image = tf.image.random_brightness(image, max_delta=32. / 255.)\n      else:\n        raise ValueError('color_ordering must be in [0, 3]')\n\n    # The random_* ops do not necessarily clamp.\n    return tf.clip_by_value(image, 0.0, 1.0)\n\ndef distorted_bounding_box_crop(image,\n                                bbox,\n                                min_object_covered=0.1,\n                                aspect_ratio_range=(0.75, 1.33),\n                                area_range=(0.05, 1.0),\n                                max_attempts=100,\n                                scope=None):\n  \"\"\"Generates cropped_image using one of the bboxes, randomly distorted.\n  See `tf.image.sample_distorted_bounding_box` for more documentation.\n  Args:\n    image: 3-D Tensor of image (it will be converted to floats in [0, 1]).\n    bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]\n      where each coordinate is [0, 1) and the coordinates are arranged\n      as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole\n      image.\n    min_object_covered: An optional `float`. Defaults to `0.1`. The cropped\n      area of the image must contain at least this fraction of any bounding box\n      supplied.\n    aspect_ratio_range: An optional list of `floats`. 
The cropped area of the\n      image must have an aspect ratio = width / height within this range.\n    area_range: An optional list of `floats`. The cropped area of the image\n      must contain a fraction of the supplied image within this range.\n    max_attempts: An optional `int`. Number of attempts at generating a cropped\n      region of the image of the specified constraints. After `max_attempts`\n      failures, return the entire image.\n    scope: Optional scope for name_scope.\n  Returns:\n    A tuple of a 3-D Tensor cropped_image and the distorted bbox.\n  \"\"\"\n  with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):\n    # Each bounding box has shape [1, num_boxes, box coords] and\n    # the coordinates are ordered [ymin, xmin, ymax, xmax].\n\n    # A large fraction of image datasets contain a human-annotated bounding\n    # box delineating the region of the image containing the object of interest.\n    # We choose to create a new bounding box for the object which is a randomly\n    # distorted version of the human-annotated bounding box that obeys an\n    # allowed range of aspect ratios, sizes and overlap with the human-annotated\n    # bounding box. 
If no box is supplied, then we assume the bounding box is\n    # the entire image.\n    sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(\n        tf.shape(image),\n        bounding_boxes=bbox,\n        min_object_covered=min_object_covered,\n        aspect_ratio_range=aspect_ratio_range,\n        area_range=area_range,\n        max_attempts=max_attempts,\n        use_image_if_no_bounding_boxes=True)\n    bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box\n\n    # Crop the image to the specified bounding box.\n    cropped_image = tf.slice(image, bbox_begin, bbox_size)\n    return tf.tuple([cropped_image, distort_bbox])\n\ndef _largest_size_at_most(height, width, largest_side):\n  \"\"\"Computes new shape with the largest side equal to `largest_side`.\n  Computes new shape with the largest side equal to `largest_side` while\n  preserving the original aspect ratio.\n  Args:\n    height: an int32 scalar tensor indicating the current height.\n    width: an int32 scalar tensor indicating the current width.\n    largest_side: A python integer or scalar `Tensor` indicating the size of\n      the largest side after resize.\n  Returns:\n    new_height: an int32 scalar tensor indicating the new height.\n    new_width: an int32 scalar tensor indicating the new width.\n  \"\"\"\n  largest_side = tf.convert_to_tensor(largest_side, dtype=tf.int32)\n\n  height = tf.to_float(height)\n  width = tf.to_float(width)\n  largest_side = tf.to_float(largest_side)\n\n  scale = tf.cond(tf.greater(height, width),\n                  lambda: largest_side / height,\n                  lambda: largest_side / width)\n  new_height = tf.to_int32(height * scale)\n  new_width = tf.to_int32(width * scale)\n  return new_height, new_width\n\nclass DistortedInputs():\n\n    def __init__(self, cfg, add_summaries):\n        self.cfg = cfg\n        self.add_summaries = add_summaries\n\n    def apply(self, original_image, bboxes, distorted_inputs, image_summaries, 
current_index):\n\n        cfg = self.cfg\n        add_summaries = self.add_summaries\n\n        image_shape = tf.shape(original_image)\n        image_height = tf.cast(image_shape[0], dtype=tf.float32) # cast so that we can multiply them by the bbox coords\n        image_width = tf.cast(image_shape[1], dtype=tf.float32)\n\n        # First thing we need to do is crop out the bbox region from the image\n        bbox = bboxes[current_index]\n        xmin = tf.cast(bbox[0] * image_width, tf.int32)\n        ymin = tf.cast(bbox[1] * image_height, tf.int32)\n        xmax = tf.cast(bbox[2] * image_width, tf.int32)\n        ymax = tf.cast(bbox[3] * image_height, tf.int32)\n        bbox_width = xmax - xmin\n        bbox_height = ymax - ymin\n\n        image = tf.image.crop_to_bounding_box(\n            image=original_image,\n            offset_height=ymin,\n            offset_width=xmin,\n            target_height=bbox_height,\n            target_width=bbox_width\n        )\n        image_height = bbox_height\n        image_width = bbox_width\n\n        # Convert the pixel values to be in the range [0,1]\n        if image.dtype != tf.float32:\n          image = tf.image.convert_image_dtype(image, dtype=tf.float32)\n\n        # Add a summary of the original data\n        if add_summaries:\n            new_height, new_width = _largest_size_at_most(image_height, image_width, cfg.INPUT_SIZE)\n            resized_original_image = tf.image.resize_bilinear(tf.expand_dims(image, 0), [new_height, new_width])\n            resized_original_image = tf.squeeze(resized_original_image)\n            resized_original_image = tf.image.pad_to_bounding_box(resized_original_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)\n\n            # If there are multiple boxes for an image, we only want to write to the TensorArray once.\n            #image_summaries = image_summaries.write(0, tf.expand_dims(resized_original_image, 0))\n            image_summaries = tf.cond(tf.equal(current_index, 0),\n         
       lambda: image_summaries.write(0, tf.expand_dims(resized_original_image, 0)),\n                lambda: image_summaries.identity()\n            )\n\n        # Extract a distorted bbox\n        if cfg.DO_RANDOM_CROP > 0:\n            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)\n            do_crop = tf.less(r, cfg.DO_RANDOM_CROP)\n            rc_cfg = cfg.RANDOM_CROP_CFG\n            bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])\n            distorted_image, distorted_bbox = tf.cond(do_crop,\n                    lambda: distorted_bounding_box_crop(image, bbox,\n                                                           aspect_ratio_range=(rc_cfg.MIN_ASPECT_RATIO, rc_cfg.MAX_ASPECT_RATIO),\n                                                           area_range=(rc_cfg.MIN_AREA, rc_cfg.MAX_AREA),\n                                                           max_attempts=rc_cfg.MAX_ATTEMPTS),\n                    lambda: tf.tuple([image, bbox])\n                )\n        else:\n            distorted_image = tf.identity(image)\n            distorted_bbox = tf.constant([[[0.0, 0.0, 1.0, 1.0]]]) # ymin, xmin, ymax, xmax\n\n        if cfg.DO_CENTRAL_CROP > 0:\n            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)\n            do_crop = tf.less(r, cfg.DO_CENTRAL_CROP)\n            distorted_image = tf.cond(do_crop,\n                lambda: tf.image.central_crop(distorted_image, cfg.CENTRAL_CROP_FRACTION),\n                lambda: tf.identity(distorted_image)\n            )\n\n        distorted_image.set_shape([None, None, 3])\n\n        # Add a summary\n        if add_summaries:\n            image_with_bbox = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), distorted_bbox)\n            new_height, new_width = _largest_size_at_most(image_height, image_width, cfg.INPUT_SIZE)\n            resized_image_with_bbox = tf.image.resize_bilinear(image_with_bbox, [new_height, new_width])\n            
resized_image_with_bbox = tf.squeeze(resized_image_with_bbox)\n            resized_image_with_bbox = tf.image.pad_to_bounding_box(resized_image_with_bbox, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)\n            #image_summaries = image_summaries.write(1, tf.expand_dims(resized_image_with_bbox, 0))\n            image_summaries = tf.cond(tf.equal(current_index, 0),\n                lambda: image_summaries.write(1, tf.expand_dims(resized_image_with_bbox, 0)),\n                lambda: image_summaries.identity()\n            )\n\n        # Resize the distorted image to the correct dimensions for the network\n        if cfg.MAINTAIN_ASPECT_RATIO:\n            shape = tf.shape(distorted_image)\n            height = shape[0]\n            width = shape[1]\n            new_height, new_width = _largest_size_at_most(height, width, cfg.INPUT_SIZE)\n        else:\n            new_height = cfg.INPUT_SIZE\n            new_width = cfg.INPUT_SIZE\n\n        num_resize_cases = 1 if cfg.RESIZE_FAST else 4\n        distorted_image = apply_with_random_selector(\n            distorted_image,\n            lambda x, method: tf.image.resize_images(x, [new_height, new_width], method=method),\n            num_cases=num_resize_cases)\n\n        distorted_image = tf.image.pad_to_bounding_box(distorted_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)\n\n        if add_summaries:\n            #image_summaries = image_summaries.write(2, tf.expand_dims(distorted_image, 0))\n            image_summaries = tf.cond(tf.equal(current_index, 0),\n                lambda: image_summaries.write(2, tf.expand_dims(distorted_image, 0)),\n                lambda: image_summaries.identity()\n            )\n\n        # Randomly flip the image:\n        if cfg.DO_RANDOM_FLIP_LEFT_RIGHT > 0:\n          r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)\n          do_flip = tf.less(r, 0.5)\n          distorted_image = tf.cond(do_flip, lambda: tf.image.flip_left_right(distorted_image), lambda: 
tf.identity(distorted_image))\n\n        # TODO: Can this be changed so that we don't always distort the colors?\n        # Distort the colors\n        if cfg.DO_COLOR_DISTORTION > 0:\n            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)\n            do_color_distortion = tf.less(r, cfg.DO_COLOR_DISTORTION)\n            num_color_cases = 1 if cfg.COLOR_DISTORT_FAST else 4\n            distorted_color_image = apply_with_random_selector(\n              distorted_image,\n              lambda x, ordering: distort_color(x, ordering, fast_mode=cfg.COLOR_DISTORT_FAST),\n              num_cases=num_color_cases)\n            distorted_image = tf.cond(do_color_distortion, lambda: tf.identity(distorted_color_image), lambda: tf.identity(distorted_image))\n\n        distorted_image.set_shape([cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])\n\n        # Add a summary\n        if add_summaries:\n            #image_summaries = image_summaries.write(3, tf.expand_dims(distorted_image, 0))\n            image_summaries = tf.cond(tf.equal(current_index, 0),\n                lambda: image_summaries.write(3, tf.expand_dims(distorted_image, 0)),\n                lambda: image_summaries.identity()\n            )\n\n        # Add the distorted image to the TensorArray\n        distorted_inputs = distorted_inputs.write(current_index, tf.expand_dims(distorted_image, 0))\n\n        return [original_image, bboxes, distorted_inputs, image_summaries, current_index + 1]\n\ndef check_normalized_box_values(xmin, ymin, xmax, ymax, maximum_normalized_coordinate=1.01, prefix=\"\"):\n    \"\"\" Make sure the normalized coordinates do not exceed `maximum_normalized_coordinate`.\n    \"\"\"\n\n    xmin_maximum = tf.reduce_max(xmin)\n    xmin_assert = tf.Assert(\n        tf.greater_equal(maximum_normalized_coordinate, xmin_maximum),\n        ['%s, maximum xmin coordinate value is larger '\n         'than %f: ' % (prefix, maximum_normalized_coordinate), xmin_maximum])\n    with tf.control_dependencies([xmin_assert]):\n        xmin = 
tf.identity(xmin)\n\n    ymin_maximum = tf.reduce_max(ymin)\n    ymin_assert = tf.Assert(\n        tf.greater_equal(maximum_normalized_coordinate, ymin_maximum),\n        ['%s, maximum ymin coordinate value is larger '\n        'than %f: ' % (prefix, maximum_normalized_coordinate), ymin_maximum])\n    with tf.control_dependencies([ymin_assert]):\n        ymin = tf.identity(ymin)\n\n    xmax_maximum = tf.reduce_max(xmax)\n    xmax_assert = tf.Assert(\n        tf.greater_equal(maximum_normalized_coordinate, xmax_maximum),\n        ['%s, maximum xmax coordinate value is larger '\n        'than %f: ' % (prefix, maximum_normalized_coordinate), xmax_maximum])\n    with tf.control_dependencies([xmax_assert]):\n        xmax = tf.identity(xmax)\n\n    ymax_maximum = tf.reduce_max(ymax)\n    ymax_assert = tf.Assert(\n        tf.greater_equal(maximum_normalized_coordinate, ymax_maximum),\n        ['%s, maximum ymax coordinate value is larger '\n        'than %f: ' % (prefix, maximum_normalized_coordinate), ymax_maximum])\n    with tf.control_dependencies([ymax_assert]):\n        ymax = tf.identity(ymax)\n\n    return xmin, ymin, xmax, ymax\n\ndef expand_bboxes(xmin, xmax, ymin, ymax, cfg):\n    \"\"\"\n    Expand the bboxes. Each side is moved outward by half of the box width\n    (height) scaled by cfg.WIDTH_EXPANSION_FACTOR (cfg.HEIGHT_EXPANSION_FACTOR),\n    and the result is clipped to [0, 1].\n    \"\"\"\n\n    w = xmax - xmin\n    h = ymax - ymin\n\n    w = w * cfg.WIDTH_EXPANSION_FACTOR\n    h = h * cfg.HEIGHT_EXPANSION_FACTOR\n\n    half_w = w / 2.\n    half_h = h / 2.\n\n    xmin = tf.clip_by_value(xmin - half_w, 0, 1)\n    xmax = tf.clip_by_value(xmax + half_w, 0, 1)\n    ymin = tf.clip_by_value(ymin - half_h, 0, 1)\n    ymax = tf.clip_by_value(ymax + half_h, 0, 1)\n\n    return tf.tuple([xmin, xmax, ymin, ymax])\n\ndef get_region_data(serialized_example, cfg, fetch_ids=True, fetch_labels=True, fetch_text_labels=True, read_filename=False):\n    \"\"\"\n    Return a dict with the image, an array of bounding boxes, and (optionally)\n    arrays of ids, labels and text labels.\n    \"\"\"\n\n    feature_dict = {}\n\n    if cfg.REGION_TYPE == 'bbox':\n\n        bbox_cfg = cfg.BBOX_CFG\n\n        features_to_extract = [('image/object/bbox/xmin', 'xmin'),\n              
                 ('image/object/bbox/xmax', 'xmax'),\n                               ('image/object/bbox/ymin', 'ymin'),\n                               ('image/object/bbox/ymax', 'ymax')]\n\n        if read_filename:\n            features_to_extract.append(('image/filename', 'filename'))\n        else:\n            features_to_extract.append(('image/encoded', 'image'))\n\n        if fetch_ids:\n            features_to_extract.append(('image/object/id', 'id'))\n\n        if fetch_labels:\n            features_to_extract.append(('image/object/bbox/label', 'label'))\n\n        if fetch_text_labels:\n            features_to_extract.append(('image/object/bbox/text', 'text'))\n\n        features = decode_serialized_example(serialized_example, features_to_extract)\n\n        if read_filename:\n            image_buffer = tf.read_file(features['filename'])\n            image = tf.image.decode_jpeg(image_buffer, channels=3)\n        else:\n            image = features['image']\n\n        feature_dict['image'] = image\n\n        xmin = tf.expand_dims(features['xmin'], 0)\n        ymin = tf.expand_dims(features['ymin'], 0)\n        xmax = tf.expand_dims(features['xmax'], 0)\n        ymax = tf.expand_dims(features['ymax'], 0)\n\n        xmin, ymin, xmax, ymax = check_normalized_box_values(xmin, ymin, xmax, ymax, prefix=\"From tfrecords \")\n\n        if 'DO_EXPANSION' in bbox_cfg and bbox_cfg.DO_EXPANSION > 0:\n            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)\n            do_expansion = tf.less(r, bbox_cfg.DO_EXPANSION)\n            xmin, xmax, ymin, ymax = tf.cond(do_expansion,\n                lambda: expand_bboxes(xmin, xmax, ymin, ymax, bbox_cfg.EXPANSION_CFG),\n                lambda: tf.tuple([xmin, xmax, ymin, ymax])\n            )\n\n            xmin, ymin, xmax, ymax = check_normalized_box_values(xmin, ymin, xmax, ymax, prefix=\"After expansion \")\n\n        # combine the 
bounding boxes\n        bboxes = tf.concat(values=[xmin, ymin, xmax, ymax], axis=0)\n        # order the bboxes so that they have the shape: [num_bboxes, bbox_coords]\n        bboxes = tf.transpose(bboxes, [1, 0])\n\n        feature_dict['bboxes'] = bboxes\n\n        if fetch_ids:\n            ids = features['id']\n            feature_dict['ids'] = ids\n\n        if fetch_labels:\n            labels = features['label']\n            feature_dict['labels'] = labels\n\n        if fetch_text_labels:\n            text = features['text']\n            feature_dict['text'] = text\n\n    elif cfg.REGION_TYPE == 'image':\n\n        features_to_extract = []\n\n        if read_filename:\n            features_to_extract.append(('image/filename', 'filename'))\n        else:\n            features_to_extract.append(('image/encoded', 'image'))\n\n        if fetch_ids:\n            features_to_extract.append(('image/id', 'id'))\n\n        if fetch_labels:\n            features_to_extract.append(('image/class/label', 'label'))\n\n        if fetch_text_labels:\n            features_to_extract.append(('image/class/text', 'text'))\n\n        features = decode_serialized_example(serialized_example, features_to_extract)\n\n        if read_filename:\n            image_buffer = tf.read_file(features['filename'])\n            image = tf.image.decode_jpeg(image_buffer, channels=3)\n        else:\n            image = features['image']\n\n        feature_dict['image'] = image\n\n        bboxes = tf.constant([[0.0, 0.0, 1.0, 1.0]])\n        feature_dict['bboxes'] = bboxes\n\n        if fetch_ids:\n            ids = [features['id']]\n            feature_dict['ids'] = ids\n\n        if fetch_labels:\n            labels = [features['label']]\n            feature_dict['labels'] = labels\n\n        if fetch_text_labels:\n            text = [features['text']]\n            feature_dict['text'] = text\n\n    else:\n        raise ValueError(\"Unknown REGION_TYPE: %s\" % (cfg.REGION_TYPE,))\n\n    return 
feature_dict\n\ndef bbox_crop_loop_cond(original_image, bboxes, distorted_inputs, image_summaries, current_index):\n    num_bboxes = tf.shape(bboxes)[0]\n    return current_index < num_bboxes\n\ndef get_distorted_inputs(original_image, bboxes, cfg, add_summaries):\n\n    distorter = DistortedInputs(cfg, add_summaries)\n    num_bboxes = tf.shape(bboxes)[0]\n    distorted_inputs = tf.TensorArray(\n        dtype=tf.float32,\n        size=num_bboxes,\n        element_shape=tf.TensorShape([1, cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])\n    )\n\n    if add_summaries:\n        image_summaries = tf.TensorArray(\n            dtype=tf.float32,\n            size=4,\n            element_shape=tf.TensorShape([1, cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])\n        )\n    else:\n        image_summaries = tf.constant([])\n\n    current_index = tf.constant(0, dtype=tf.int32)\n\n    loop_vars = [original_image, bboxes, distorted_inputs, image_summaries, current_index]\n    original_image, bboxes, distorted_inputs, image_summaries, current_index = tf.while_loop(\n        cond=bbox_crop_loop_cond,\n        body=distorter.apply,\n        loop_vars=loop_vars,\n        parallel_iterations=10, back_prop=False, swap_memory=False\n    )\n\n    distorted_inputs = distorted_inputs.concat()\n\n    if add_summaries:\n        tf.summary.image('0.original_image', image_summaries.read(0))\n        tf.summary.image('1.image_with_random_crop', image_summaries.read(1))\n        tf.summary.image('2.cropped_resized_image', image_summaries.read(2))\n        tf.summary.image('3.final_distorted_image', image_summaries.read(3))\n\n\n    return distorted_inputs\n\ndef create_training_batch(serialized_example, cfg, add_summaries, read_filenames=False):\n\n    features = get_region_data(serialized_example, cfg, fetch_ids=False,\n                               fetch_labels=True, fetch_text_labels=False, read_filename=read_filenames)\n\n    original_image = features['image']\n    bboxes = features['bboxes']\n    labels = 
features['labels']\n\n    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)\n\n    distorted_inputs = tf.subtract(distorted_inputs, 0.5)\n    distorted_inputs = tf.multiply(distorted_inputs, 2.0)\n\n    names = ('inputs', 'labels')\n    tensors = [distorted_inputs, labels]\n    return [names, tensors]\n\ndef create_visualization_batch(serialized_example, cfg, add_summaries, fetch_text_labels=False, read_filenames=False):\n\n    features = get_region_data(serialized_example, cfg, fetch_ids=True,\n                               fetch_labels=True, fetch_text_labels=fetch_text_labels, read_filename=read_filenames)\n\n    original_image = features['image']\n    ids = features['ids']\n    bboxes = features['bboxes']\n    labels = features['labels']\n    if fetch_text_labels:\n        text_labels = features['text']\n\n    cpy_original_image = tf.identity(original_image)\n\n    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)\n\n    original_image = cpy_original_image\n\n    # Resize the original image\n    if original_image.dtype != tf.float32:\n      original_image = tf.image.convert_image_dtype(original_image, dtype=tf.float32)\n    shape = tf.shape(original_image)\n    height = shape[0]\n    width = shape[1]\n    new_height, new_width = _largest_size_at_most(height, width, cfg.INPUT_SIZE)\n    original_image = tf.image.resize_images(original_image, [new_height, new_width], method=0)\n    original_image = tf.image.pad_to_bounding_box(original_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)\n    original_image = tf.image.convert_image_dtype(original_image, dtype=tf.uint8)\n\n    # make a copy of the original image for each bounding box\n    num_bboxes = tf.shape(bboxes)[0]\n    expanded_original_image = tf.expand_dims(original_image, 0)\n    concatenated_original_images = tf.tile(expanded_original_image, [num_bboxes, 1, 1, 1])\n\n    names = ['original_inputs', 'inputs', 'ids', 'labels']\n    tensors = 
[concatenated_original_images, distorted_inputs, ids, labels]\n\n    if fetch_text_labels:\n        names.append('text_labels')\n        tensors.append(text_labels)\n\n    return [names, tensors]\n\ndef create_classification_batch(serialized_example, cfg, add_summaries, read_filenames=False):\n\n    features = get_region_data(serialized_example, cfg, fetch_ids=True,\n                               fetch_labels=False, fetch_text_labels=False, read_filename=read_filenames)\n\n    original_image = features['image']\n    bboxes = features['bboxes']\n    ids = features['ids']\n\n    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)\n\n    distorted_inputs = tf.subtract(distorted_inputs, 0.5)\n    distorted_inputs = tf.multiply(distorted_inputs, 2.0)\n\n    names = ('inputs', 'ids')\n    tensors = [distorted_inputs, ids]\n    return [names, tensors]\n\ndef input_nodes(tfrecords, cfg, num_epochs=None, batch_size=32, num_threads=2,\n                shuffle_batch = True, random_seed=1, capacity = 1000, min_after_dequeue = 96,\n                add_summaries=True, input_type='train', fetch_text_labels=False,\n                read_filenames=False):\n    \"\"\"\n    Args:\n        tfrecords:\n        cfg:\n        num_epochs: number of times to read the tfrecords\n        batch_size:\n        num_threads:\n        shuffle_batch:\n        capacity:\n        min_after_dequeue:\n        add_summaries: Add tensorboard summaries of the images\n        input_type: 'train', 'visualize', 'test', 'classification'\n    \"\"\"\n    with tf.name_scope('inputs'):\n\n        # A producer to generate tfrecord file paths\n        filename_queue = tf.train.string_input_producer(\n          tfrecords,\n          num_epochs=num_epochs\n        )\n\n        # Construct a Reader to read examples from the tfrecords file\n        reader = tf.TFRecordReader()\n        _, serialized_example = reader.read(filename_queue)\n\n        if input_type=='train' or 
input_type=='test':\n            batch_keys, data_to_batch = create_training_batch(serialized_example, cfg, add_summaries, read_filenames)\n        elif input_type=='visualize':\n            batch_keys, data_to_batch = create_visualization_batch(serialized_example, cfg, add_summaries, fetch_text_labels, read_filenames)\n        elif input_type=='classification':\n            batch_keys, data_to_batch = create_classification_batch(serialized_example, cfg, add_summaries, read_filenames)\n        else:\n            raise ValueError(\"Unknown input type: %s. Options are `train`, `test`, \" \\\n                             \"`visualize`, and `classification`.\" % (input_type,))\n\n        if shuffle_batch:\n            batch = tf.train.shuffle_batch(\n                data_to_batch,\n                batch_size=batch_size,\n                num_threads=num_threads,\n                capacity= capacity,\n                min_after_dequeue= min_after_dequeue,\n                seed = random_seed,\n                enqueue_many=True\n            )\n\n        else:\n            batch = tf.train.batch(\n                data_to_batch,\n                batch_size=batch_size,\n                num_threads=num_threads,\n                capacity= capacity,\n                enqueue_many=True\n            )\n\n        batch_dict = {k : v for k, v in zip(batch_keys, batch)}\n\n        return batch_dict"
  },
  {
    "path": "requirements.txt",
    "content": "easydict>=1.6\nmatplotlib>=2.0.0\nnumpy>=1.12.0\nPyYAML>=3.11\ntensorflow>=1.0.0"
  },
  {
    "path": "test.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport os\n\nimport numpy as np\nimport tensorflow as tf\nimport tensorflow.contrib.slim as slim\n\nfrom config.parse_config import parse_config_file\nfrom nets import nets_factory\nfrom preprocessing import inputs\n\ndef test(tfrecords, checkpoint_path, save_dir, max_iterations, eval_interval_secs, cfg, read_images=False):\n    \"\"\"\n    Args:\n        tfrecords (list)\n        checkpoint_path (str)\n        savedir (str)\n        max_iterations (int)\n        cfg (EasyDict)\n    \"\"\"\n    tf.logging.set_verbosity(tf.logging.DEBUG)\n\n    graph = tf.Graph()\n\n    with graph.as_default():\n\n        global_step = slim.get_or_create_global_step()\n\n        with tf.device('/cpu:0'):\n            batch_dict = inputs.input_nodes(\n                tfrecords=tfrecords,\n                cfg=cfg.IMAGE_PROCESSING,\n                num_epochs=1,\n                batch_size=cfg.BATCH_SIZE,\n                num_threads=cfg.NUM_INPUT_THREADS,\n                shuffle_batch =cfg.SHUFFLE_QUEUE,\n                random_seed=cfg.RANDOM_SEED,\n                capacity=cfg.QUEUE_CAPACITY,\n                min_after_dequeue=cfg.QUEUE_MIN,\n                add_summaries=False,\n                input_type='test',\n                read_filenames=read_images\n            )\n\n            batched_one_hot_labels = slim.one_hot_encoding(batch_dict['labels'],\n                                                        num_classes=cfg.NUM_CLASSES)\n\n        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()\n\n        with slim.arg_scope(arg_scope):\n            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](\n                inputs=batch_dict['inputs'],\n                num_classes=cfg.NUM_CLASSES,\n                is_training=False\n            )\n\n            predictions = end_points['Predictions']\n            #labels 
= tf.squeeze(batch_dict['labels'])\n            labels = batch_dict['labels']\n\n            # Add the loss summary\n            loss = tf.losses.softmax_cross_entropy(\n                logits=logits, onehot_labels=batched_one_hot_labels, label_smoothing=0., weights=1.0)\n\n        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:\n            variable_averages = tf.train.ExponentialMovingAverage(\n                cfg.MOVING_AVERAGE_DECAY, global_step)\n            variables_to_restore = variable_averages.variables_to_restore(\n                slim.get_model_variables())\n            variables_to_restore[global_step.op.name] = global_step\n        else:\n            variables_to_restore = slim.get_variables_to_restore()\n            variables_to_restore.append(global_step)\n\n\n        # Define the metrics:\n        metric_map = {\n            'Accuracy': tf.metrics.accuracy(labels=labels, predictions=tf.argmax(predictions, 1)),#slim.metrics.streaming_accuracy(labels=labels, predictions=tf.argmax(predictions, 1)),\n            loss.op.name : slim.metrics.streaming_mean(loss)\n        }\n        if len(cfg.ACCURACY_AT_K_METRIC) > 0:\n            bool_labels = tf.ones([cfg.BATCH_SIZE], dtype=tf.bool)\n            for k in cfg.ACCURACY_AT_K_METRIC:\n                if k <= 1 or k > cfg.NUM_CLASSES:\n                    continue\n                in_top_k = tf.nn.in_top_k(predictions=predictions, targets=labels, k=k)\n                metric_map['Accuracy_at_%s' % k] = tf.metrics.accuracy(labels=bool_labels, predictions=in_top_k)#slim.metrics.streaming_accuracy(labels=bool_labels, predictions=in_top_k)\n\n        names_to_values, names_to_updates = slim.metrics.aggregate_metric_map(metric_map)\n\n        # Print the summaries to screen.\n        print_global_step = True\n        # `items()` works on both Python 2 and Python 3 (`iteritems()` is Python 2 only).\n        for name, value in names_to_values.items():\n            summary_name = 'eval/%s' % name\n            op = tf.summary.scalar(summary_name, value, collections=[])\n            
if print_global_step:\n                op=tf.Print(op, [global_step], \"Model Step \")\n                print_global_step = False\n            op = tf.Print(op, [value], summary_name)\n            tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)\n\n        if max_iterations > 0:\n            num_batches = max_iterations\n        else:\n            # This ensures that we make a single pass over all of the data.\n            # We could use ceil if the batch queue is allowed to pad the last batch\n            num_batches = np.floor(cfg.NUM_TEST_EXAMPLES / float(cfg.BATCH_SIZE))\n\n\n        sess_config = tf.ConfigProto(\n            log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,\n            allow_soft_placement = True,\n            gpu_options = tf.GPUOptions(\n                per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION\n            ),\n            intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,\n            inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None\n        )\n\n        if eval_interval_secs > 0:\n\n            if not os.path.isdir(checkpoint_path):\n                raise ValueError(\"checkpoint_path should be a path to a directory when \" \\\n                                 \"evaluating in a loop.\")\n\n            slim.evaluation.evaluation_loop(\n                master='',\n                checkpoint_dir=checkpoint_path,\n                logdir=save_dir,\n                num_evals=num_batches,\n                initial_op=None,\n                initial_op_feed_dict=None,\n                eval_op=names_to_updates.values(),\n                eval_op_feed_dict=None,\n                final_op=None,\n                final_op_feed_dict=None,\n                summary_op=tf.summary.merge_all(),\n                
summary_op_feed_dict=None,\n                variables_to_restore=variables_to_restore,\n                eval_interval_secs=eval_interval_secs,\n                max_number_of_evaluations=None,\n                session_config=sess_config,\n                timeout=None\n            )\n\n        else:\n            if os.path.isdir(checkpoint_path):\n                checkpoint_dir = checkpoint_path\n                checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)\n\n                if checkpoint_path is None:\n                    raise ValueError(\"Unable to find a model checkpoint in the \" \\\n                                     \"directory %s\" % (checkpoint_dir,))\n\n            tf.logging.info('Evaluating %s' % checkpoint_path)\n\n            slim.evaluation.evaluate_once(\n                master='',\n                checkpoint_path=checkpoint_path,\n                logdir=save_dir,\n                num_evals=num_batches,\n                eval_op=names_to_updates.values(),\n                variables_to_restore=variables_to_restore,\n                session_config=sess_config\n            )\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(description='Test the person classifier')\n\n    parser.add_argument('--tfrecords', dest='tfrecords',\n                        help='Paths to tfrecords.', type=str,\n                        nargs='+', required=True)\n\n    parser.add_argument('--checkpoint_path', dest='checkpoint_path',\n                          help='Path to a specific model to test against. 
If a directory, then the newest checkpoint file will be used.', type=str,\n                          required=True, default=None)\n\n    parser.add_argument('--save_dir', dest='savedir',\n                          help='Path to directory to store summary files.', type=str,\n                          required=True)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file.',\n                        required=True, type=str)\n\n    parser.add_argument('--eval_interval_secs', dest='eval_interval_secs',\n                        help='Go into an evaluation loop, waiting this many seconds between evaluations. Default is to evaluate once.',\n                        required=False, type=int, default=0)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='The number of images in a batch.',\n                        required=False, type=int, default=None)\n\n    parser.add_argument('--batches', dest='batches',\n                        help='Maximum number of iterations to run. 
Default is all records (modulo the batch size).',\n                        required=False, type=int, default=0)\n\n    parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the architecture to use.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--read_images', dest='read_images',\n                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',\n                        action='store_true', default=False)\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n\n    args = parse_args()\n\n    cfg = parse_config_file(args.config_file)\n\n    # Command line arguments override the values from the config file.\n    if args.batch_size is not None:\n        cfg.BATCH_SIZE = args.batch_size\n\n    if args.model_name is not None:\n        cfg.MODEL_NAME = args.model_name\n\n    test(\n        tfrecords=args.tfrecords,\n        checkpoint_path=args.checkpoint_path,\n        save_dir=args.savedir,\n        max_iterations=args.batches,\n        eval_interval_secs=args.eval_interval_secs,\n        cfg=cfg,\n        read_images=args.read_images\n    )\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "tfserving/README.md",
    "content": "# TensorFlow Serving Utilities\n\nThis directory contains utility code for interacting with a [TensorFlow Serving](https://www.tensorflow.org/serving/) instance. I'll walk through the basic steps of using TensorFlow Serving below.\n\n## Export a Trained Model\nWhen your training process has finished you will be left with a training checkpoint file created by the [tf.train.Saver](https://www.tensorflow.org/api_docs/python/tf/train/Saver) class. We need to convert this checkpoint file for use with TensorFlow Serving. You'll need to create a yaml configuration file for the export (essentially specifying the number of classes, input size, and a few other things). An example:\n\n```yaml\n# Export specific configuration\n\nRANDOM_SEED : 1.0\n\nSESSION_CONFIG : {\n  # If true, then the device location of each variable will be printed\n  LOG_DEVICE_PLACEMENT : false,\n\n  # How much GPU memory we are allowed to pre-allocate\n  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9\n}\n\n#################################################\n# Dataset Info\n# The number of classes we are classifying\nNUM_CLASSES : 200\n\n# The model architecture to use.\nMODEL_NAME : 'inception_v3'\n\n# END: Dataset Info\n#################################################\n# Image Processing and Augmentation \n\nIMAGE_PROCESSING : {\n    # Images are assumed to be raveled, and have length  INPUT_SIZE * INPUT_SIZE * 3\n    INPUT_SIZE : 299\n}\n\n# END: Image Processing and Augmentation\n#################################################\n# Regularization \n#\n# The decay to use for the moving average. 
If 0, then moving average is not computed\n# When restoring models, this value is needed to determine whether to restore moving\n# average variables or not.\nMOVING_AVERAGE_DECAY : 0.9999\n\n# End: Regularization\n#################################################\n```\n\nTo export the model, we'll use the [export.py](export.py) script:\n```\npython export.py \\\n--checkpoint_path model.ckpt-399739 \\\n--export_dir export \\\n--export_version 1 \\\n--config config_export.yaml \\\n--serving \\\n--add_preprocess \\\n--class_names class-codes.txt\n```\nThis will create a directory called `1` in the `export_dir` directory and will contain the files that TensorFlow Serving requires. We've passed in semantic identifiers for the classes using the `--class_names` argument. This will allow clients to receive semantically meaningful identifiers along with the prediction results. This removes the requirement of clients having to map from score indices to identifiers themselves. The class-codes.txt file contains one identifier per line, with each line corresponding to one index in the scores array. For example:\n```txt\ncar\npedestrian\nlight post\ntrash can\nbench\n```\n\n## Server Machine\nSpin up an Ubuntu 16.04 instance on your favorite cloud provider, or use your personal machine. 
You'll need to add the TensorFlow Serving distribution URI as a package source prior to installing (notes [here](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/setup.md#installing-using-apt-get)):\n```\necho \"deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list\n\ncurl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -\n\nsudo apt-get update && sudo apt-get install tensorflow-model-server\n```\nYou can also install from [source](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/setup.md#installation).\n\nCreate a models directory, such as `/home/ubuntu/serving/models`, and copy your `1` directory (that was created with the export.py script) to this directory. Alternatively, you can just specify `/home/ubuntu/serving/models` as your `--export_dir` when calling the export.py script.\n\nNow you can start the server:\n```\ntensorflow_model_server --port=9000 --model_name=inception --model_base_path=/home/ubuntu/serving/models\n```\nNote the `--model_name` value; the client will need it when querying the server.\n\n## Client Machine\nTo query the server from a client machine you'll need to install the `tensorflow-serving-api` pip package along with the `tensorflow` package. I use `numpy` for some operations so I'll install that too:\n```\npip install numpy tensorflow tensorflow-serving-api\n```\n\nWe can now query the server using the [client.py](client.py) file:\n```\npython client.py \\\n--images IMG_0932_sm.jpg \\\n--num_results 10 \\\n--model_name inception \\\n--host localhost \\\n--port 9000 \\\n--timeout 10\n```\nThis command will send the `IMG_0932_sm.jpg` file to the TensorFlow Serving instance at `localhost:9000` and print the top 10 class predictions. 
\n\nRather than sending the raw image bytes to the TensorFlow Serving instance, we can send the prepared image array. This image array will be fed directly into the network, so it must be the proper size and have had any transformations already applied. The [inputs.py](inputs.py) file has a convenience function to prepare an image for inception style networks. For example:\n```python\nfrom scipy.misc import imread\n\nimport inputs\nimport tfserver\n\nimage = imread('IMG_0898.jpg')\n\npreped_image = inputs.prepare_image(image)\nimage_data = [preped_image]\n\npredictions = tfserver.predict(image_data)\nresults = tfserver.process_classification_prediction(predictions, max_classes=10)\n\nprint(results)\n```\n"
  },
  {
    "path": "tfserving/__init__.py",
    "content": ""
  },
  {
    "path": "tfserving/client.py",
    "content": "\"\"\"\nA simple client to query a TensorFlow Serving instance.\n\nExample:\n$ python client.py \\\n--images IMG_0932_sm.jpg \\\n--num_results 10 \\\n--model_name inception \\\n--host localhost \\\n--port 9000 \\\n--timeout 10\n\nAuthor: Grant Van Horn\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport time\n\nimport tfserver\n\ndef parse_args():\n\n  parser = argparse.ArgumentParser(description='Command line classification client. Sorts and prints the classification results.')\n\n  parser.add_argument('--images', dest='image_paths',\n                        help='Path to one or more images to classify (jpeg or png).',\n                        type=str, nargs='+', required=True)\n\n  parser.add_argument('--num_results', dest='num_results',\n                      help='The number of results to print. Set to 0 to print all classes.',\n                      required=False, type=int, default=0)\n\n  parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the model to query.',\n                        required=False, type=str, default='inception')\n\n  parser.add_argument('--host', dest='host',\n                        help='Machine host where the TensorFlow Serving model is.',\n                        required=False, type=str, default='localhost')\n\n  parser.add_argument('--port', dest='port',\n                      help='Port that the TensorFlow Server is listening on.',\n                      required=False, type=int, default=9000)\n\n  parser.add_argument('--timeout', dest='timeout',\n                      help='Amount of time to wait before failing.',\n                      required=False, type=int, default=10)\n\n  args = parser.parse_args()\n\n  return args\n\ndef main():\n\n  args = parse_args()\n\n  # Read in the image bytes\n  image_data = []\n  for fp in args.image_paths:\n    with open(fp) as f:\n      
data = f.read()\n    image_data.append(data)\n\n  # Get the predictions\n  t = time.time()\n  predictions = tfserver.predict(image_data, model_name=args.model_name,\n    host=args.host, port=args.port, timeout=args.timeout\n  )\n  dt = time.time() - t\n  print(\"Prediction call took %0.4f seconds\" % (dt,))\n\n  # Process the results\n  results = tfserver.process_classification_prediction(predictions, max_classes=args.num_results)\n\n  # Print the results\n  for i, fp in enumerate(args.image_paths):\n    print(\"Results for image: %s\" % (fp,))\n    for name, score in results[i]:\n      print(\"%s: %0.3f\" % (name, score))\n    print()\n\nif __name__ == '__main__':\n  main()"
  },
  {
    "path": "tfserving/inputs.py",
    "content": "\"\"\"\nNumpy and scipy image preparation.\n\nAuthor: Grant Van Horn\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numpy as np\nfrom scipy.misc import imresize\n\ndef prepare_image(image, input_height=299, input_width=299):\n  \"\"\" Prepare an image to be passed through a network.\n  Arguments:\n    image (numpy.ndarray): An uint8 RGB image\n  Returns:\n    list: the image resized, centered and raveled\n  \"\"\"\n\n  # We assume an uint8 RGB image\n  assert image.dtype == np.uint8\n  assert image.ndim == 3\n  assert image.shape[2] == 3\n\n  resized_image = imresize(image, (input_height, input_width, 3))\n  float_image = resized_image.astype(np.float32)\n  centered_image = ((float_image / 255.) - 0.5) * 2.0\n\n  return centered_image.ravel().tolist()\n"
  },
  {
    "path": "tfserving/tfserver.py",
    "content": "\"\"\"\nTensorFlow Serving caller code.\n\nRequirements:\npip install numpy tensorflow tensorflow-serving-api\n\nAuthor: Grant Van Horn\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom grpc.beta import implementations\nimport numpy as np\nimport tensorflow as tf\nfrom tensorflow_serving.apis import predict_pb2\nfrom tensorflow_serving.apis import prediction_service_pb2\n\ndef predict(image_data,\n            model_name='inception',\n            host='localhost',\n            port=9000,\n            timeout=10):\n  \"\"\"\n  Arguments:\n    image_data (list): A list of image data. The image data should either be the image bytes or\n      float arrays.\n    model_name (str): The name of the model to query (specified when you started the Server)\n    model_signature_name (str): The name of the signature to query (specified when you created the exported model)\n    host (str): The machine host identifier that the classifier is running on.\n    port (int): The port that the classifier is listening on.\n    timeout (int): Time in seconds before timing out.\n\n  Returns:\n    PredictResponse protocol buffer. 
See here: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/predict.proto\n  \"\"\"\n\n  if len(image_data) <= 0:\n    return None\n\n  channel = implementations.insecure_channel(host, int(port))\n  stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)\n  request = predict_pb2.PredictRequest()\n  request.model_spec.name = model_name\n\n  if type(image_data[0]) == str:\n    request.model_spec.signature_name = 'predict_image_bytes'\n    request.inputs['images'].CopyFrom(\n        tf.contrib.util.make_tensor_proto(image_data, shape=[len(image_data)]))\n  else:\n    # Use the first array for the per-image length so that a batch of one image works.\n    request.model_spec.signature_name = 'predict_image_array'\n    request.inputs['images'].CopyFrom(\n        tf.contrib.util.make_tensor_proto(image_data, shape=[len(image_data), len(image_data[0])]))\n\n  result = stub.Predict(request, timeout)\n  return result\n\ndef process_classification_prediction(predictions, max_classes=10):\n  \"\"\"\n  Arguments:\n    predictions (PredictResponse protocol buffer): TensorFlow Serving prediction response.\n    max_classes (int): Maximum number of results to return. 
Set to 0 for all results.\n  Returns:\n    list of lists: A list of (name, score) tuples, one for each prediction.\n  \"\"\"\n\n  # Determine how many outputs there are\n  dims = predictions.outputs['classes'].tensor_shape.dim\n  num_inputs = dims[0].size\n  num_classes = dims[1].size\n\n  all_class_names = np.array(predictions.outputs['classes'].string_val).reshape(num_inputs, num_classes)\n  all_scores = np.array(predictions.outputs['scores'].float_val).reshape(num_inputs, num_classes)\n\n  results = []\n  for i in range(num_inputs):\n\n    scores = all_scores[i]\n    class_names = all_class_names[i]\n\n    # Sort the classes by descending score\n    idxs = np.argsort(scores)[::-1]\n    scores = scores[idxs]\n    class_names = class_names[idxs]\n\n    num_to_return = min(num_classes, max_classes)\n    if num_to_return <= 0:\n      num_to_return = scores.shape[-1]\n\n    # Use `j` so the comprehension does not shadow the outer loop index `i`\n    names_scores = [(class_names[j], scores[j]) for j in range(num_to_return)]\n    results.append(names_scores)\n\n  return results"
  },
  {
    "path": "train.py",
    "content": "# Some of this code came from the https://github.com/tensorflow/models/tree/master/slim\n# directory, so lets keep the Google license around for now.\n#\n# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\nimport copy\nimport os\n\nimport numpy as np\nimport tensorflow as tf\nimport tensorflow.contrib.slim as slim\n\nfrom config.parse_config import parse_config_file\nfrom nets import nets_factory\nfrom preprocessing.inputs import input_nodes\n\n\ndef _configure_learning_rate(global_step, cfg):\n    \"\"\"Configures the learning rate.\n    Args:\n        num_samples_per_epoch: The number of samples in each epoch of training.\n        global_step: The global_step tensor.\n    Returns:\n        A `Tensor` representing the learning rate.\n    Raises:\n        ValueError: if cfg.LEARNING_RATE_DECAY_TYPE is not recognized.\n    \"\"\"\n\n\n    decay_steps = int(cfg.NUM_TRAIN_EXAMPLES / cfg.BATCH_SIZE * cfg.NUM_EPOCHS_PER_DELAY)\n\n    if cfg.LEARNING_RATE_DECAY_TYPE == 'exponential':\n        return tf.train.exponential_decay(cfg.INITIAL_LEARNING_RATE,\n                                          global_step,\n                                          decay_steps,\n                        
                  cfg.LEARNING_RATE_DECAY_FACTOR,\n                                          staircase=cfg.LEARNING_RATE_STAIRCASE,\n                                          name='exponential_decay_learning_rate')\n\n    elif cfg.LEARNING_RATE_DECAY_TYPE == 'fixed':\n        return tf.constant(cfg.INITIAL_LEARNING_RATE, name='fixed_learning_rate')\n\n    elif cfg.LEARNING_RATE_DECAY_TYPE == 'polynomial':\n        return tf.train.polynomial_decay(cfg.INITIAL_LEARNING_RATE,\n                                         global_step,\n                                         decay_steps,\n                                         cfg.END_LEARNING_RATE,\n                                         power=1.0,\n                                         cycle=False,\n                                         name='polynomial_decay_learning_rate')\n    else:\n        raise ValueError('learning_rate_decay_type [%s] was not recognized' %\n                         cfg.LEARNING_RATE_DECAY_TYPE)\n\n\ndef _configure_optimizer(learning_rate, cfg):\n    \"\"\"Configures the optimizer used for training.\n    Args:\n        learning_rate: A scalar or `Tensor` learning rate.\n        cfg: The parsed configuration object.\n    Returns:\n        An instance of an optimizer.\n    Raises:\n        ValueError: if cfg.OPTIMIZER is not recognized.\n    \"\"\"\n    if cfg.OPTIMIZER == 'adadelta':\n        optimizer = tf.train.AdadeltaOptimizer(\n            learning_rate,\n            rho=cfg.ADADELTA_RHO,\n            epsilon=cfg.OPTIMIZER_EPSILON)\n    elif cfg.OPTIMIZER == 'adagrad':\n        optimizer = tf.train.AdagradOptimizer(\n            learning_rate,\n            initial_accumulator_value=cfg.ADAGRAD_INITIAL_ACCUMULATOR_VALUE)\n    elif cfg.OPTIMIZER == 'adam':\n        optimizer = tf.train.AdamOptimizer(\n            learning_rate,\n            beta1=cfg.ADAM_BETA1,\n            beta2=cfg.ADAM_BETA2,\n            epsilon=cfg.OPTIMIZER_EPSILON)\n    elif cfg.OPTIMIZER == 'ftrl':\n        optimizer = tf.train.FtrlOptimizer(\n            
learning_rate,\n            learning_rate_power=cfg.FTRL_LEARNING_RATE_POWER,\n            initial_accumulator_value=cfg.FTRL_INITIAL_ACCUMULATOR_VALUE,\n            l1_regularization_strength=cfg.FTRL_L1,\n            l2_regularization_strength=cfg.FTRL_L2)\n    elif cfg.OPTIMIZER == 'momentum':\n        optimizer = tf.train.MomentumOptimizer(\n            learning_rate,\n            momentum=cfg.MOMENTUM,\n            name='Momentum')\n    elif cfg.OPTIMIZER == 'rmsprop':\n        optimizer = tf.train.RMSPropOptimizer(\n            learning_rate,\n            decay=cfg.RMSPROP_DECAY,\n            momentum=cfg.MOMENTUM,\n            epsilon=cfg.OPTIMIZER_EPSILON)\n    elif cfg.OPTIMIZER == 'sgd':\n        optimizer = tf.train.GradientDescentOptimizer(learning_rate)\n    else:\n        raise ValueError('Optimizer [%s] was not recognized' % cfg.OPTIMIZER)\n    return optimizer\n\ndef get_trainable_variables(trainable_scopes):\n    \"\"\"Returns a list of variables to train.\n    Args:\n        trainable_scopes: A list of scope names, or None to train all trainable variables.\n    Returns:\n        A list of variables to train by the optimizer.\n    \"\"\"\n\n    if trainable_scopes is None:\n        return tf.trainable_variables()\n\n    trainable_scopes = [scope.strip() for scope in trainable_scopes]\n\n    variables_to_train = []\n    for scope in trainable_scopes:\n        variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)\n        variables_to_train.extend(variables)\n    return variables_to_train\n\n\ndef get_init_function(logdir, pretrained_model_path, checkpoint_exclude_scopes, restore_variables_with_moving_averages=False, restore_moving_averages=False, ema=None):\n    \"\"\"\n    Args:\n        logdir : location of where we will be storing checkpoint files.\n        pretrained_model_path : a path to a specific model, or a directory with a checkpoint file. 
The latest model will be used.\n        checkpoint_exclude_scopes : A list of scope prefixes. Model variables whose names start with one of these prefixes will not be restored.\n        restore_variables_with_moving_averages : If True, then each variable is restored from its moving average value in the checkpoint.\n        restore_moving_averages : If True, then the moving average variables themselves will also be restored.\n        ema : The exponential moving average object\n    Returns:\n        A function that restores the variables in a session, or None if there is nothing to restore.\n    \"\"\"\n\n\n    if pretrained_model_path is None:\n        return None\n\n    # If a checkpoint already exists in the logdir, training resumes from it,\n    # so the pretrained model is ignored. Let the user know.\n    if tf.train.latest_checkpoint(logdir):\n        tf.logging.info(\n            'Ignoring --pretrained_model because a checkpoint already exists in %s'\n            % logdir)\n        return None\n\n    exclusions = []\n    if checkpoint_exclude_scopes:\n        exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]\n\n    variables_to_restore = []\n    for var in slim.get_model_variables():\n        excluded = False\n        for exclusion in exclusions:\n          if var.op.name.startswith(exclusion):\n            excluded = True\n            break\n        if not excluded:\n          variables_to_restore.append(var)\n\n    #for variable in variables_to_restore:\n    #    print(variable.name)\n\n    if os.path.isdir(pretrained_model_path):\n        checkpoint_path = tf.train.latest_checkpoint(pretrained_model_path)\n        if checkpoint_path is None:\n            raise ValueError(\n                \"No model checkpoint file found in directory %s\" % (pretrained_model_path))\n\n    else:\n        checkpoint_path = pretrained_model_path\n\n    tf.logging.info('Restoring variables from %s' % checkpoint_path)\n\n    if ema is not None:\n        # # Restore each variable with its moving average value\n        # if restore_variables_with_moving_averages:\n\n        #     # Also restore the moving average 
variables\n        #     if restore_moving_averages:\n        #         variables_to_restore_with_ma = variables_to_restore + [ema.average(var) for var in variables_to_restore]\n        #         normal_saver = tf.train.Saver(variables_to_restore_with_ma, reshape=False)\n        #     else:\n        #         normal_saver = tf.train.Saver(variables_to_restore, reshape=False)\n        #     ema_saver = tf.train.Saver({\n        #         ema.average_name(var) : ema.average(var)\n        #         for var in variables_to_restore\n        #     }, reshape=False)\n\n        #     def callback(session):\n        #         normal_saver.restore(session, checkpoint_path)\n        #         ema_saver.restore(session, checkpoint_path)\n        #     return callback\n\n        # elif restore_moving_averages:\n        #     variables_to_restore += [ema.average(var) for var in variables_to_restore]\n\n        # Load in the moving average value for a variable, rather than the variable itself\n        if restore_variables_with_moving_averages:\n\n            variables_to_restore = {\n                ema.average_name(var) : var\n                for var in variables_to_restore\n            }\n\n        # Do we want to restore the moving average variables? 
Otherwise they will be reinitialized\n        if restore_moving_averages:\n\n            # If we are already using the moving averages to restore the variables, then we will need\n            # two Saver() objects (since the names in the dictionaries will clash)\n            if restore_variables_with_moving_averages:\n\n                normal_saver = tf.train.Saver(variables_to_restore, reshape=False)\n                ema_saver = tf.train.Saver({\n                    ema.average_name(var) : ema.average(var)\n                    for var in variables_to_restore.values()\n                }, reshape=False)\n\n                def callback(session):\n                    normal_saver.restore(session, checkpoint_path)\n                    ema_saver.restore(session, checkpoint_path)\n                return callback\n\n            else:\n                # GVH: Need to check for dict\n                variables_to_restore += [ema.average(var) for var in variables_to_restore]\n\n    return slim.assign_from_checkpoint_fn(\n        checkpoint_path,\n        variables_to_restore,\n        ignore_missing_vars=False)\n\n\ndef train(tfrecords, logdir, cfg, pretrained_model_path=None, trainable_scopes=None, checkpoint_exclude_scopes=None, restore_variables_with_moving_averages=False, restore_moving_averages=False, read_images=False):\n    \"\"\"\n    Args:\n        tfrecords (list) : paths to the tfrecord files\n        logdir (str) : directory to store summary files and checkpoint files\n        cfg (EasyDict) : the parsed configuration\n        pretrained_model_path (str) : path to a pretrained model, or a directory containing a checkpoint\n        trainable_scopes (list) : if given, only variables within these scopes are trained\n        checkpoint_exclude_scopes (list) : variables within these scopes are not restored from the checkpoint\n        restore_variables_with_moving_averages (bool) : restore variables with their moving average values\n        restore_moving_averages (bool) : also restore the moving average variables\n        read_images (bool) : read images from the file system rather than from the tfrecord\n    \"\"\"\n    tf.logging.set_verbosity(tf.logging.INFO)\n\n    graph = tf.Graph()\n\n    # Build the training graph. The input pipeline below is pinned to the CPU.\n    with graph.as_default():\n\n        # Create the global step variable, which counts the number of training steps.\n        global_step = slim.get_or_create_global_step()\n\n        with tf.device('/cpu:0'):\n            batch_dict = input_nodes(\n                tfrecords=tfrecords,\n                cfg=cfg.IMAGE_PROCESSING,\n          
      num_epochs=None,\n                batch_size=cfg.BATCH_SIZE,\n                num_threads=cfg.NUM_INPUT_THREADS,\n                shuffle_batch =cfg.SHUFFLE_QUEUE,\n                random_seed=cfg.RANDOM_SEED,\n                capacity=cfg.QUEUE_CAPACITY,\n                min_after_dequeue=cfg.QUEUE_MIN,\n                add_summaries=True,\n                input_type='train',\n                read_filenames=read_images\n            )\n\n            batched_one_hot_labels = slim.one_hot_encoding(batch_dict['labels'],\n                                                        num_classes=cfg.NUM_CLASSES)\n\n        # GVH: Doesn't seem to help to the poor queueing performance...\n        # batch_queue = slim.prefetch_queue.prefetch_queue(\n        #                   [batch_dict['inputs'], batched_one_hot_labels], capacity=2)\n        # inputs, labels = batch_queue.dequeue()\n\n        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME](\n            weight_decay=cfg.WEIGHT_DECAY,\n            batch_norm_decay=cfg.BATCHNORM_MOVING_AVERAGE_DECAY,\n            batch_norm_epsilon=cfg.BATCHNORM_EPSILON\n        )\n\n        with slim.arg_scope(arg_scope):\n            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](\n                inputs=batch_dict['inputs'],\n                num_classes=cfg.NUM_CLASSES,\n                dropout_keep_prob=cfg.DROPOUT_KEEP_PROB,\n                is_training=True\n            )\n\n            # Add the losses\n            if 'AuxLogits' in end_points:\n                tf.losses.softmax_cross_entropy(\n                    logits=end_points['AuxLogits'], onehot_labels=batched_one_hot_labels,\n                    label_smoothing=cfg.LABEL_SMOOTHING, weights=0.4, scope='aux_loss')\n\n            tf.losses.softmax_cross_entropy(\n                logits=logits, onehot_labels=batched_one_hot_labels, label_smoothing=cfg.LABEL_SMOOTHING, weights=1.0)\n\n\n\n        summaries = 
set(tf.get_collection(tf.GraphKeys.SUMMARIES))\n\n        # Summarize the losses\n        for loss in tf.get_collection(tf.GraphKeys.LOSSES):\n            summaries.add(tf.summary.scalar(name='losses/%s' % loss.op.name, tensor=loss))\n\n        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)\n        if regularization_losses:\n            regularization_loss = tf.add_n(regularization_losses, name='regularization_loss')\n            summaries.add(tf.summary.scalar(name='losses/regularization_loss', tensor=regularization_loss))\n\n        total_loss = tf.losses.get_total_loss()\n        summaries.add(tf.summary.scalar(name='losses/total_loss', tensor=total_loss))\n\n\n        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:\n            moving_average_variables = slim.get_model_variables()\n            ema = tf.train.ExponentialMovingAverage(\n                decay=cfg.MOVING_AVERAGE_DECAY,\n                num_updates=global_step\n            )\n        elif restore_variables_with_moving_averages or restore_moving_averages:\n            # Perhaps we are finetuning the last layer of a pretrained model?\n            # So we just need something to load in the moving averages, for use in get_init_function()\n            moving_average_variables = None\n            ema = tf.train.ExponentialMovingAverage(\n                decay=1,\n                num_updates=global_step\n            )\n        else:\n            moving_average_variables = None\n            ema = None\n\n\n        # Calculate the learning rate schedule.\n        lr = _configure_learning_rate(global_step, cfg)\n\n        # Create an optimizer that performs gradient descent.\n        optimizer = _configure_optimizer(lr, cfg)\n\n        summaries.add(tf.summary.scalar(tensor=lr,\n                                        name='learning_rate'))\n\n        # Add the moving average update ops to the graph\n        if ema != None and moving_average_variables != 
None:\n            tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, ema.apply(moving_average_variables))\n\n        trainable_vars = get_trainable_variables(trainable_scopes)\n        train_op = slim.learning.create_train_op(total_loss=total_loss,\n                                                 optimizer=optimizer,\n                                                 global_step=global_step,\n                                                 variables_to_train=trainable_vars,\n                                                 clip_gradient_norm=cfg.CLIP_GRADIENT_NORM)\n\n        # Merge all of the summaries\n        summaries |= set(tf.get_collection(tf.GraphKeys.SUMMARIES))\n        summary_op = tf.summary.merge(inputs=list(summaries), name='summary_op')\n\n        sess_config = tf.ConfigProto(\n          log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,\n          allow_soft_placement = True,\n          gpu_options = tf.GPUOptions(\n              per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION\n          ),\n          intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,\n          inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None\n        )\n\n        saver = tf.train.Saver(\n          # Save all variables\n          max_to_keep = cfg.MAX_TO_KEEP,\n          keep_checkpoint_every_n_hours = cfg.KEEP_CHECKPOINT_EVERY_N_HOURS\n        )\n\n        # Run training.\n        slim.learning.train(\n            train_op=train_op,\n            logdir=logdir,\n            init_fn=get_init_function(logdir, pretrained_model_path, checkpoint_exclude_scopes, restore_variables_with_moving_averages=restore_variables_with_moving_averages, restore_moving_averages=restore_moving_averages, ema=ema),\n            number_of_steps=cfg.NUM_TRAIN_ITERATIONS,\n         
   save_summaries_secs=cfg.SAVE_SUMMARY_SECS,\n            save_interval_secs=cfg.SAVE_INTERVAL_SECS,\n            saver=saver,\n            session_config=sess_config,\n            summary_op = summary_op,\n            log_every_n_steps = cfg.LOG_EVERY_N_STEPS\n        )\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(description='Train the classification system')\n\n    parser.add_argument('--tfrecords', dest='tfrecords',\n                        help='Paths to tfrecord files.', type=str,\n                        nargs='+', required=True)\n\n    parser.add_argument('--logdir', dest='logdir',\n                          help='path to directory to store summary files and checkpoint files', type=str,\n                          required=True)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file',\n                        required=True, type=str)\n\n    parser.add_argument('--pretrained_model', dest='pretrained_model',\n                        help='Path to a model to restore. 
This is ignored if there is a model in the logdir.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--trainable_scopes', dest='trainable_scopes',\n                        help='Only variables within these scopes will be trained.',\n                        type=str, nargs='+', default=None, required=False)\n\n    parser.add_argument('--checkpoint_exclude_scopes', dest='checkpoint_exclude_scopes',\n                        help='Variables within these scopes will not be restored from the checkpoint files.',\n                        type=str, nargs='+', default=None, required=False)\n\n    parser.add_argument('--max_number_of_steps', dest='max_number_of_steps',\n                        help='The maximum number of iterations to run.',\n                        required=False, type=int, default=None)\n\n    parser.add_argument('--learning_rate_decay_type', dest='learning_rate_decay_type',\n                          help='Type of learning rate decay: exponential, fixed, or polynomial.', type=str,\n                          required=False, default=None)\n\n    parser.add_argument('--lr', dest='learning_rate',\n                          help='Initial learning rate', type=float,\n                          required=False, default=None)\n\n    parser.add_argument('--batch_size', dest='batch_size',\n                        help='The number of images in a batch.',\n                        required=False, type=int, default=None)\n\n    parser.add_argument('--model_name', dest='model_name',\n                        help='The name of the architecture to use.',\n                        required=False, type=str, default=None)\n\n    parser.add_argument('--restore_variables_with_moving_averages', dest='restore_variables_with_moving_averages',\n                        help='If set, then we restore variables with their moving average values.',\n                        required=False, action='store_true', default=False)\n\n    parser.add_argument('--restore_moving_averages', 
dest='restore_moving_averages',\n                        help='If set, then we restore the variable that tracks the moving average of each trainable variable.',\n                        required=False, action='store_true', default=False)\n\n    parser.add_argument('--read_images', dest='read_images',\n                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',\n                        action='store_true', default=False)\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n    args = parse_args()\n\n    cfg = parse_config_file(args.config_file)\n\n    # Replace cfg parameters with the command line values\n    if args.max_number_of_steps is not None:\n        cfg.NUM_TRAIN_ITERATIONS = args.max_number_of_steps\n\n    if args.learning_rate_decay_type is not None:\n        cfg.LEARNING_RATE_DECAY_TYPE = args.learning_rate_decay_type\n\n    if args.learning_rate is not None:\n        cfg.INITIAL_LEARNING_RATE = args.learning_rate\n\n    if args.batch_size is not None:\n        cfg.BATCH_SIZE = args.batch_size\n\n    if args.model_name is not None:\n        cfg.MODEL_NAME = args.model_name\n\n    train(\n        tfrecords=args.tfrecords,\n        logdir=args.logdir,\n        cfg=cfg,\n        pretrained_model_path=args.pretrained_model,\n        trainable_scopes = args.trainable_scopes,\n        checkpoint_exclude_scopes = args.checkpoint_exclude_scopes,\n        restore_variables_with_moving_averages=args.restore_variables_with_moving_averages,\n        restore_moving_averages=args.restore_moving_averages,\n        read_images=args.read_images\n    )\n\nif __name__ == '__main__':\n  main()\n"
  },
  {
    "path": "visualize_train_inputs.py",
    "content": "from __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport argparse\n\nfrom matplotlib import pyplot as plt\nimport numpy as np\nimport tensorflow as tf\n\nfrom config.parse_config import parse_config_file\nfrom preprocessing.inputs import input_nodes\n\ndef visualize_train_inputs(tfrecords, cfg, show_text_labels=False, read_images=False):\n\n    graph = tf.Graph()\n    sess = tf.Session(graph = graph)\n\n    # run a session to look at the images...\n    with sess.as_default(), graph.as_default():\n\n        # Input Nodes\n        with tf.device('/cpu:0'):\n            batch_dict = input_nodes(\n                tfrecords=tfrecords,\n                cfg=cfg.IMAGE_PROCESSING,\n                num_epochs=1,\n                batch_size=cfg.BATCH_SIZE,\n                num_threads=cfg.NUM_INPUT_THREADS,\n                shuffle_batch =cfg.SHUFFLE_QUEUE,\n                random_seed=cfg.RANDOM_SEED,\n                capacity=cfg.QUEUE_CAPACITY,\n                min_after_dequeue=cfg.QUEUE_MIN,\n                add_summaries=False,\n                input_type='visualize',\n                fetch_text_labels=show_text_labels,\n                read_filenames=read_images\n            )\n\n        # Convert float images to uint8 images\n        image_to_convert = tf.placeholder(dtype=tf.float32,\n                                          shape=[cfg.IMAGE_PROCESSING.INPUT_SIZE,\n                                                 cfg.IMAGE_PROCESSING.INPUT_SIZE, 3])\n        uint8_image = tf.image.convert_image_dtype(image_to_convert, dtype=tf.uint8)\n\n\n        coord = tf.train.Coordinator()\n        tf.global_variables_initializer().run()\n        tf.local_variables_initializer().run()\n        threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n\n        plt.ion()\n        done = False\n        while not done:\n\n            output = sess.run(batch_dict)\n\n            original_images = 
output['original_inputs']\n            distorted_images = output['inputs']\n            image_ids = output['ids']\n            labels = output['labels']\n            if show_text_labels:\n                text_labels = output['text_labels']\n\n            for b in range(cfg.BATCH_SIZE):\n\n                original_image = original_images[b]\n                distorted_image = distorted_images[b]\n\n                if original_image.dtype != np.uint8:\n                    original_image = sess.run(uint8_image, {image_to_convert : original_image})\n\n                if distorted_image.dtype != np.uint8:\n                    distorted_image = sess.run(uint8_image, {image_to_convert : distorted_image})\n\n                image_id = image_ids[b]\n                label = labels[b]\n\n                fig = plt.figure('Train Inputs')\n\n                if show_text_labels:\n                    text_label = text_labels[b]\n                    st = fig.suptitle(\"Image: %s\\nLabel: %d\\nText: %s\" %\n                                      (image_id, label, text_label), fontsize=12)\n                else:\n                    st = fig.suptitle(\"Image: %s\\nLabel: %d\" % (image_id, label), fontsize=12)\n\n                plt.subplot(2, 1, 1)\n                plt.imshow(original_image)\n                plt.title(\"Original\")\n                plt.axis('off')\n\n                plt.subplot(2, 1, 2)\n                plt.imshow(distorted_image)\n                plt.title(\"Modified\")\n                plt.axis('off')\n\n                # Shift the subplots down a bit to make room for the super title\n                st.set_y(0.95)\n                fig.subplots_adjust(top=0.75)\n\n                plt.show(block=False)\n\n                t = raw_input(\"Press Enter to view next image. 
Press any key followed \" \\\n                              \"by Enter to quit: \")\n                if t != '':\n                    done = True\n                    break\n                plt.clf()\n\n\ndef parse_args():\n\n    parser = argparse.ArgumentParser(description='Visualize the inputs to train the classification system.')\n\n    parser.add_argument('--tfrecords', dest='tfrecords',\n                        help='Paths to tfrecord files.', type=str,\n                        nargs='+', required=True)\n\n    parser.add_argument('--config', dest='config_file',\n                        help='Path to the configuration file',\n                        required=True, type=str)\n\n    parser.add_argument('--text_labels', dest='show_text_labels',\n                        help='If text labels have been stored in the tfrecords, then you can use this flag to show them.',\n                        action='store_true', default=False)\n\n    parser.add_argument('--read_images', dest='read_images',\n                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',\n                        action='store_true', default=False)\n\n    args = parser.parse_args()\n    return args\n\ndef main():\n  args = parse_args()\n  cfg = parse_config_file(args.config_file)\n  visualize_train_inputs(\n    tfrecords=args.tfrecords,\n    cfg=cfg,\n    show_text_labels=args.show_text_labels,\n    read_images=args.read_images\n  )\n\n\n\nif __name__ == '__main__':\n  main()\n"
  }
]