Repository: visipedia/tf_classification
Branch: master
Commit: 7dac8bb3a419
Files: 47
Total size: 397.8 KB

Directory structure:
gitextract_pxfoh05k/

├── .gitignore
├── LICENSE
├── README.md
├── classify.py
├── config/
│   ├── README.md
│   ├── __init__.py
│   ├── config_classify.yaml
│   ├── config_export.yaml
│   ├── config_test.yaml
│   ├── config_train.yaml
│   └── parse_config.py
├── export.py
├── extract.py
├── nets/
│   ├── README.md
│   ├── __init__.py
│   ├── inception.py
│   ├── inception_resnet_v2.py
│   ├── inception_resnet_v2_test.py
│   ├── inception_utils.py
│   ├── inception_v1.py
│   ├── inception_v1_test.py
│   ├── inception_v2.py
│   ├── inception_v2_test.py
│   ├── inception_v3.py
│   ├── inception_v3_test.py
│   ├── inception_v4.py
│   ├── inception_v4_test.py
│   ├── mobilenet_v1.py
│   ├── mobilenet_v1_test.py
│   ├── net_profile.py
│   ├── nets_factory.py
│   ├── nets_factory_test.py
│   ├── resnet_utils.py
│   ├── resnet_v2.py
│   └── resnet_v2_test.py
├── preprocessing/
│   ├── __init__.py
│   ├── decode_example.py
│   └── inputs.py
├── requirements.txt
├── test.py
├── tfserving/
│   ├── README.md
│   ├── __init__.py
│   ├── client.py
│   ├── inputs.py
│   └── tfserver.py
├── train.py
└── visualize_train_inputs.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject

# Mac stuff
.DS_Store

# Visual Studio Code stuff
.vscode/

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2017 Visipedia

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# TensorFlow Classification
This repo contains training, testing and classification code for image classification using [TensorFlow](https://www.tensorflow.org/). Whole-image classification and multi-instance bounding box classification are both supported.

Check out the [Wiki](https://github.com/visipedia/tf_classification/wiki) for more detailed tutorials.

---

## Requirements
TensorFlow 1.0+ is required. The code is tested with TensorFlow 1.3 and Python 2.7 on Ubuntu 16.04 and Mac OSX 10.11. Check out the [requirements.txt](requirements.txt) file for a list of python dependencies. 

---

## Prepare the Data
The models require the image data to be in a specific format. You can use the Visipedia [tfrecords repo](https://github.com/visipedia/tfrecords) to produce the files. 

For the commands below, I'll assume that you have created a `DATASET_DIR` environment variable that points to the directory that contains your tfrecords:
```
$ export DATASET_DIR=/home/ubuntu/tf_datasets/cub
```

---

## Directory Structure
I have found that it's useful to have the following directory and file setup:
* experiment/
  * logdir/
    * train_summaries/
    * val_summaries/
    * test_summaries/
    * results/
    * finetune/
      * train_summaries/
      * val_summaries/
  * cmds.txt
  * config_train.yaml
  * config_test.yaml
  * config_export.yaml

The purpose of each directory and file will be explained below. 

The `cmds.txt` file is useful for saving the different training and testing commands. Some of the scripts take quite a few command-line arguments, so it's convenient to compose the commands in an editor.

For the commands below, I'll assume that you have created an `EXPERIMENT_DIR` environment variable that points to your experiment directory:
```
$ export EXPERIMENT_DIR=/home/ubuntu/tf_experiments/cub
```
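
The layout above can also be created from a script. The following is a minimal sketch, not part of the repo: the helper name `make_experiment_dir` is hypothetical, and the demo builds the tree in a temporary directory rather than under `EXPERIMENT_DIR`.

```python
import os
import tempfile

# Sub-directories from the layout listed above, relative to the experiment root.
SUBDIRS = [
    "logdir/train_summaries",
    "logdir/val_summaries",
    "logdir/test_summaries",
    "logdir/results",
    "logdir/finetune/train_summaries",
    "logdir/finetune/val_summaries",
]


def make_experiment_dir(root):
    """Create the experiment tree and an empty cmds.txt (hypothetical helper)."""
    for sub in SUBDIRS:
        path = os.path.join(root, *sub.split("/"))
        if not os.path.isdir(path):
            os.makedirs(path)
    cmds = os.path.join(root, "cmds.txt")
    if not os.path.exists(cmds):
        # Touch the file so there is a place to keep the commands.
        open(cmds, "a").close()
    return root


# Example: build the layout in a throwaway location.
demo_root = make_experiment_dir(os.path.join(tempfile.mkdtemp(), "experiment"))
```

The configuration files are left out on purpose, since they are copied from the `config` directory and edited by hand.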

---

## Configuration
There are example configuration files in the [config directory](config/). At the very least you'll need a `config_train.yaml` file, and you'll probably want a `config_test.yaml` file. It is convenient to copy the example configuration files into your `experiment` directory. See the configuration [README](config/README.md) for more details.

### Choose a Network Architecture
This repo currently supports the Google Inception, ResNet and MobileNet flavors of networks. See the nets [README](nets/README.md) for more information on the different Inception versions. At the moment, `inception_v3` probably offers the best tradeoff in terms of size and performance, although it's always worth experimenting with a few different architectures. The [README](nets/README.md) also contains links where you can download checkpoint files for the models. In most cases you should start your training from these checkpoint files rather than training from scratch.

You can specify the name of the chosen network in the configuration yaml file. Alternatively, you can pass it in as a command-line argument to most of the scripts.

For the commands below, I'll assume that you have created an environment variable that points to the pretrained checkpoint file that you downloaded:
```
$ export PRETRAINED_MODEL=/home/ubuntu/tf_models/inception_v3.ckpt
```

---

## Data Visualization
Now that you have a configuration script for training, it is a good idea to visualize the inputs to the network and ensure that they look good. This allows you to debug any problems with your tfrecords and lets you play with different augmentation techniques. Visualize your data by doing:
```
$ CUDA_VISIBLE_DEVICES=1 python visualize_train_inputs.py \
--tfrecords $DATASET_DIR/train* \
--config $EXPERIMENT_DIR/config_train.yaml
```

If you are in a virtualenv and Matplotlib is complaining, then you may need to modify your environment. See this [FAQ](http://matplotlib.org/faq/virtualenv_faq.html) and [this document](http://matplotlib.org/faq/osx_framework.html#osxframework-faq) for fixing this issue. I use a virtualenv on my Mac OSX 10.11 machine and I needed to do the `PYTHONHOME` [work around](http://matplotlib.org/faq/osx_framework.html#pythonhome-function) for Matplotlib to work properly. In this case the command looks like:
```
$ CUDA_VISIBLE_DEVICES=1 frameworkpython visualize_train_inputs.py \
--tfrecords $DATASET_DIR/train* \
--config $EXPERIMENT_DIR/config_train.yaml
```

---

## Training and Validating
It's recommended to start from a pretrained network when training a network on your own data. However, this isn't necessary and you can train from scratch if you have enough data. The following warmup section assumes you are starting from a pretrained network. See the nets [README](nets/README.md) to find links to pretrained checkpoint files.

### Finetune A Pretrained Network
Finetuning a pretrained network essentially uses the pretrained network as a generic feature extractor and learns a new final layer that will output predictions for your target classes (rather than the original classes that the pretrained network was trained on). To do this, we will specify the pretrained model as the starting point, and only allow the logits layers to be modified. We can put the trained models in the `experiment/logdir/finetune` directory. 

```
$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $PRETRAINED_MODEL \
--trainable_scopes InceptionV3/Logits InceptionV3/AuxLogits \
--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits \
--learning_rate_decay_type fixed \
--lr 0.01 
```
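
To illustrate what the two scope flags select, here is a small sketch that splits variable names by scope prefix. It assumes that a scope matches every variable whose name starts with it, which is how TF-Slim style scope filters commonly behave; whether `train.py` matches in exactly this way is an assumption, and the variable names below are made up.

```python
def split_by_scopes(variable_names, scopes):
    """Split names into (matching, non_matching) by scope prefix."""
    matching, rest = [], []
    for name in variable_names:
        if any(name.startswith(scope) for scope in scopes):
            matching.append(name)
        else:
            rest.append(name)
    return matching, rest


# Hypothetical variable names, for illustration only.
names = [
    "InceptionV3/Conv2d_1a_3x3/weights",
    "InceptionV3/Mixed_7c/Branch_0/Conv2d_0a_1x1/weights",
    "InceptionV3/AuxLogits/Conv2d_2b_1x1/weights",
    "InceptionV3/Logits/Conv2d_1c_1x1/weights",
]
scopes = ["InceptionV3/Logits", "InceptionV3/AuxLogits"]

# With --trainable_scopes, only `trainable` would be updated by the optimizer.
# With --checkpoint_exclude_scopes, the same names would be skipped on restore.
trainable, frozen = split_by_scopes(names, scopes)
```

Passing the same scopes to both flags means the logits layers are freshly initialized (not restored) and are the only layers that change during the warmup.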

#### Monitoring Progress
We'll want to monitor performance of the model on a validation set. Once the model performance starts to plateau we can assume that the final layer is warmed up and we can switch to full training. We can monitor the validation performance by running:
```
$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/val* \
--save_dir $EXPERIMENT_DIR/logdir/finetune/val_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300
```

You may want to also monitor the accuracy on the train set. Simply pass in the train tfrecords to the `test.py` script and change the output directory:
```
$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/train* \
--save_dir $EXPERIMENT_DIR/logdir/finetune/train_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300
```

Keeping the train summaries and val summaries in separate directories will keep the TensorBoard UI clean. To monitor the training process you can fire up TensorBoard:
```
$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006
```

### Training the Entire Network
The benefit of finetuning a network is that the training is very fast, as only the last layer is modified. However, to get the best performance you'll typically want to modify more (or all) of the layers of the network. Starting from a pretrained network (which can happen to be a finetuned network), this full training step essentially adapts the network to operating on the domain of your specific dataset.  We'll store the generated files in the `experiment/logdir` directory. You can do the finetuning process as a warmup and then start the full train:
```
$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $EXPERIMENT_DIR/logdir/finetune
```

Or you can just start the full train from a pretrained model:
```
$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $PRETRAINED_MODEL \
--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits
```

Or if you have enough data, you may not want to even use the pretrained model. Rather you can train from scratch:
```
$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir/ \
--config $EXPERIMENT_DIR/config_train.yaml
``` 

#### Monitoring Progress

For watching the validation performance we can do:
```
$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/val* \
--save_dir $EXPERIMENT_DIR/logdir/val_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300
```

Similarly for the train data:
```
$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/train* \
--save_dir $EXPERIMENT_DIR/logdir/train_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300
```

The command for tensorboard doesn't need to change:
```
$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006
```
You will be able to see the fine-tune and the full train data plotted on the same plots. 

---

## Test
Once performance on the validation data has plateaued (or some other criterion has been met), you can test the model on a held out set of images to see how well it generalizes to new data:
```
$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/test* \
--save_dir $EXPERIMENT_DIR/logdir/test_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batch_size 32 \
--batches 100
```

If you are happy with the performance of the model, then you are ready to classify new images and export the model for production use. Otherwise it's back to the drawing board to figure out how to increase performance.

---

## Classifying 
If you want to classify data offline using the trained model then you can do:
```
CUDA_VISIBLE_DEVICES=1 python classify.py \
--tfrecords $DATASET_DIR/new/* \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--save_path $EXPERIMENT_DIR/logdir/results/classification_results.npz \
--config $EXPERIMENT_DIR/config_test.yaml \
--batch_size 32 \
--batches 1000 \
--save_logits
```

The script writes an uncompressed NumPy `.npz` file to `--save_path`. The file contains at least two arrays: `ids`, which holds the example ids, and `labels`, which holds the predicted class labels. If `--save_logits` is specified, the raw logits (before going through the softmax) are also saved in a `logits` array.
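
A minimal sketch of reading the results back, assuming the array keys used by the `np.savez` call in `classify.py` (`labels`, `ids`, `logits`). To stay self-contained, the example first writes a tiny stand-in file with made-up values; in practice you would point `save_path` at your own results file.

```python
import os
import tempfile

import numpy as np

# Build a tiny stand-in results file so the example is self-contained.
# The keys mirror the np.savez call in classify.py; the values are made up.
save_path = os.path.join(tempfile.mkdtemp(), "classification_results.npz")
np.savez(
    save_path,
    labels=np.array([2, 0], dtype=np.int32),
    ids=np.array(["img_a", "img_b"], dtype=object),
    logits=np.array([[0.1, 0.2, 3.0], [2.0, 0.5, 0.1]], dtype=np.float32),
)

# The ids are stored as an object array, so recent NumPy needs allow_pickle=True.
results = np.load(save_path, allow_pickle=True)
labels = results["labels"]
ids = results["ids"]

if "logits" in results.files:
    logits = results["logits"]
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    top1 = probs.argmax(axis=1)
```

Only load files you trust with `allow_pickle=True`, since unpickling can execute arbitrary code.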

---

## Export & Compress
To export a model for easy use on a mobile device you can use:
```
python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir ./export \
--export_version 1 \
--config config_export.yaml \
--class_names class-codes.txt
```
The input node is called `images` and the output node is called `Predictions`. Check out [this](https://github.com/visipedia/tf_classification/wiki/Exporting-an-Optimized-Model) wiki article for more tips.

If you are going to use the model with [TensorFlow Serving](https://www.tensorflow.org/deploy/tfserve) then you can use the following:
```
python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir ./export \
--export_version 1 \
--config config_export.yaml \
--serving \
--add_preprocess \
--class_names class-codes.txt
```
Check out the resources in the [tfserving](tfserving/) directory for more help with deploying on TensorFlow Serving.


================================================
FILE: classify.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import time

import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim

from config.parse_config import parse_config_file
from nets import nets_factory
from preprocessing import inputs

def classify(tfrecords, checkpoint_path, save_path, max_iterations, save_logits, cfg, read_images=False):
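    """Classify the records in `tfrecords` and save the results to `save_path`.

    Args:
        tfrecords (list): paths to the tfrecord files to classify
        checkpoint_path (str): a checkpoint file, or a directory containing checkpoints
        save_path (str): path of the .npz file to write the results to
        max_iterations (int): maximum number of batches to process
        save_logits (bool): if True, also save the raw logits
        cfg (EasyDict): the parsed configuration
        read_images (bool): if True, read images from disk using the `filename` field
    """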
    tf.logging.set_verbosity(tf.logging.DEBUG)

    graph = tf.Graph()

    with graph.as_default():

        global_step = slim.get_or_create_global_step()

        with tf.device('/cpu:0'):
            batch_dict = inputs.input_nodes(
                tfrecords=tfrecords,
                cfg=cfg.IMAGE_PROCESSING,
                num_epochs=1,
                batch_size=cfg.BATCH_SIZE,
                num_threads=cfg.NUM_INPUT_THREADS,
                shuffle_batch =cfg.SHUFFLE_QUEUE,
                random_seed=cfg.RANDOM_SEED,
                capacity=cfg.QUEUE_CAPACITY,
                min_after_dequeue=cfg.QUEUE_MIN,
                add_summaries=False,
                input_type='classification',
                read_filenames=read_images
            )

        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()

        with slim.arg_scope(arg_scope):
            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](
                inputs=batch_dict['inputs'],
                num_classes=cfg.NUM_CLASSES,
                is_training=False
            )

            predicted_labels = tf.argmax(end_points['Predictions'], 1)

        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:
            variable_averages = tf.train.ExponentialMovingAverage(
                cfg.MOVING_AVERAGE_DECAY, global_step)
            variables_to_restore = variable_averages.variables_to_restore(
                slim.get_model_variables())
            variables_to_restore[global_step.op.name] = global_step
        else:
            variables_to_restore = slim.get_variables_to_restore()
            variables_to_restore.append(global_step)

        saver = tf.train.Saver(variables_to_restore, reshape=True)

        num_batches = max_iterations
        num_images = num_batches * cfg.BATCH_SIZE
        label_array = np.empty(num_images, dtype=np.int32)
        id_array = np.empty(num_images, dtype=object)
        fetches = [predicted_labels, batch_dict['ids']]
        if save_logits:
            fetches.append(logits)
            logits_array = np.empty((num_images, cfg.NUM_CLASSES), dtype=np.float32)

        if os.path.isdir(checkpoint_path):
            checkpoint_dir = checkpoint_path
            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)

            if checkpoint_path is None:
                raise ValueError("Unable to find a model checkpoint in the " \
                                 "directory %s" % (checkpoint_dir,))

        tf.logging.info('Classifying records using %s' % checkpoint_path)

        coord = tf.train.Coordinator()

        sess_config = tf.ConfigProto(
                log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,
                allow_soft_placement = True,
                gpu_options = tf.GPUOptions(
                    per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION
                ),
                intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,
                inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None
            )
        sess = tf.Session(graph=graph, config=sess_config)

        with sess.as_default():

            tf.global_variables_initializer().run()
            tf.local_variables_initializer().run()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)

            try:

                # Restore from checkpoint
                saver.restore(sess, checkpoint_path)

                print_str = ', '.join([
                  'Step: %d',
                  'Time/image (ms): %.1f'
                ])

                step = 0
                while not coord.should_stop():

                    t = time.time()
                    outputs = sess.run(fetches)
                    dt = time.time()-t

                    idx1 = cfg.BATCH_SIZE * step
                    idx2 = idx1 + cfg.BATCH_SIZE
                    label_array[idx1:idx2] = outputs[0]
                    id_array[idx1:idx2] = outputs[1]
                    if save_logits:
                        logits_array[idx1:idx2] = outputs[2]

                    step += 1
                    print(print_str % (step, (dt / cfg.BATCH_SIZE) * 1000))

                    if max_iterations > 0 and step == max_iterations:
                        break

            except tf.errors.OutOfRangeError:
                # The input queue is exhausted: all records have been read.
                pass

        coord.request_stop()
        coord.join(threads)

        # save the results
        if save_logits:
            np.savez(save_path, labels=label_array, ids=id_array, logits=logits_array)
        else:
            np.savez(save_path, labels=label_array, ids=id_array)


def parse_args():

    parser = argparse.ArgumentParser(description='Classify images, optionally saving the logits.')

    parser.add_argument('--tfrecords', dest='tfrecords',
                        help='Paths to tfrecords.', type=str,
                        nargs='+', required=True)

    parser.add_argument('--checkpoint_path', dest='checkpoint_path',
                          help='Path to a specific model to test against. If a directory, then the newest checkpoint file will be used.', type=str,
                          required=True, default=None)

    parser.add_argument('--save_path', dest='save_path',
                          help='Path of the file in which to save the classification results.', type=str,
                          required=True, default=None)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file',
                        required=True, type=str)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='The number of images in a batch.',
                        required=True, type=int, default=None)

    parser.add_argument('--batches', dest='batches',
                        help='Maximum number of iterations to run. Default is all records (modulo the batch size).',
                        required=True, type=int, default=0)

    parser.add_argument('--save_logits', dest='save_logits',
                        help='Should the logits be saved?',
                        action='store_true', default=False)

    parser.add_argument('--model_name', dest='model_name',
                        help='The name of the architecture to use.',
                        required=False, type=str, default=None)

    parser.add_argument('--read_images', dest='read_images',
                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',
                        action='store_true', default=False)


    args = parser.parse_args()
    return args

def main():
    args = parse_args()

    cfg = parse_config_file(args.config_file)

    if args.batch_size is not None:
        cfg.BATCH_SIZE = args.batch_size

    if args.model_name is not None:
        cfg.MODEL_NAME = args.model_name

    classify(
        tfrecords=args.tfrecords,
        checkpoint_path=args.checkpoint_path,
        save_path = args.save_path,
        max_iterations=args.batches,
        save_logits=args.save_logits,
        cfg=cfg,
        read_images=args.read_images
    )

if __name__ == '__main__':
    main()


================================================
FILE: config/README.md
================================================
This directory contains example configuration scripts for training, testing, classifying and exporting models. I find it easy to copy these configuration files to my experiment directory and make the necessary changes. 

## Training Configuration
See the [example training config file](config_train.yaml). 

The training configuration script contains the most configurations. The other scripts mainly contain subsets of the training configuration. The `Learning Rate Parameters`, `Regularization`, and `Optimization` configurations provide experimenters with fine-grained control over the learning process. Non-researchers will probably find most of the default settings adequate. I will not go into detail for these configuration parameters, but there are comments for these parameters in the [example training config file](config_train.yaml).

The configuration sections that you will want to pay attention to are the `Dataset Info` section and the `Image Processing and Augmentation` section. You'll most likely be modifying these for each experiment. Once you determine good settings for the `Queues` and `Saving Models and Summaries` you'll probably reuse these values across experiments.

### Dataset Info
| Config Name | Type | Description |
:----:|:----:|------------|
NUM_CLASSES | int | This is how you specify how many classes are in your dataset. |
NUM_TRAIN_EXAMPLES | int | This is the number of images (or bounding boxes) in your training tfrecords. This value, along with the `BATCH_SIZE` is used to compute the number of iterations in an epoch (i.e. the number of batches it takes to go through the whole training set) |
NUM_TRAIN_ITERATIONS | int | The maximum number of iterations to execute before stopping. If you are manually monitoring the training, then you can set this to a large number (e.g. 1000000) |
BATCH_SIZE | int | The number of images to process in one iteration. This number is constrained by the amount of GPU memory you have. The larger the batch size, the more GPU memory you need. You typically want the largest batch size that will fit on your GPU. |
MODEL_NAME | str | The architecture to use. It's important to keep this configuration parameter constant in all of your configuration files. |
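
As a worked example of how `NUM_TRAIN_EXAMPLES`, `BATCH_SIZE` and `NUM_TRAIN_ITERATIONS` relate, the sketch below computes the number of iterations in an epoch and the number of passes over the data that an iteration budget covers. The function names and the dataset size are hypothetical, and rounding up the final partial batch is an assumption; the training code may round differently.

```python
import math


def iterations_per_epoch(num_train_examples, batch_size):
    """Number of batches needed to see every training example once."""
    return int(math.ceil(num_train_examples / float(batch_size)))


def epochs_covered(num_train_iterations, num_train_examples, batch_size):
    """Approximate number of passes over the data for an iteration budget."""
    return num_train_iterations * batch_size / float(num_train_examples)


# Hypothetical dataset: 5994 training images, batches of 32.
steps = iterations_per_epoch(5994, 32)
passes = epochs_covered(1000000, 5994, 32)
```

Here `steps` is 188, and an iteration cap of 1000000 corresponds to several thousand passes over the data, which is why such a cap is effectively "run until stopped manually".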

### Image Processing and Augmentation
Deep neural networks are notoriously data hungry. One technique for increasing the amount of data that you can pass through the network is to augment your training data. Augmentations can be as simple as randomly flipping the images horizontally, or as complex as extracting crops and perturbing the pixel values. You will typically only want to augment data for the training phase. 

`IMAGE_PROCESSING` contains the parameters for controlling how to extract data from the images:

| Config Name | Type | Description |
:----:|:----:|------------|
INPUT_SIZE | int | All images will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. You'll want to set this to the same value that the pretrained model used. See the nets [README](../nets/README.md) for the input size of each model architecture. |
REGION_TYPE | str | Which region should be used when creating an example? Possible values are `image` and `bbox`. |
MAINTAIN_ASPECT_RATIO | bool | When we resize an extracted region, should we maintain the aspect ratio? Or just squish it? |
RESIZE_FAST | bool | If true, then slower resize operations will be avoided and only [bilinear resizing](https://en.wikipedia.org/wiki/Bilinear_interpolation) will be used. Otherwise, a random choice between [bilinear](https://en.wikipedia.org/wiki/Bilinear_interpolation), [nearest neighbor](https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation), [bicubic](https://en.wikipedia.org/wiki/Bicubic_interpolation) and area interpolation will be used. |
DO_RANDOM_FLIP_LEFT_RIGHT | bool | If true, then each region has a 50% chance of being flipped. | 
DO_COLOR_DISTORTION | float | Value between 0 and 1. 0 means never distort the color, and 1 means always distort the color. |
COLOR_DISTORT_FAST | bool | It's possible to distort the brightness, saturation, hue and contrast of an image. If true, then slower modifications (hue and contrast) are avoided. |
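
Several of the options in this README (for example `DO_COLOR_DISTORTION`) are probabilities rather than on/off switches. The sketch below shows one way such a gate can work in plain Python; the helper `maybe_apply` is hypothetical and is not the repo's TensorFlow implementation.

```python
import random


def maybe_apply(prob, transform, value, rng=random):
    """Apply `transform` with probability `prob`; otherwise return `value` unchanged."""
    # rng.random() is in [0, 1), so prob=0 never fires and prob=1 always fires.
    if rng.random() < prob:
        return transform(value)
    return value


def flip(pixels):
    """Stand-in augmentation: reverse a row of pixels."""
    return pixels[::-1]


row = [1, 2, 3]
never = maybe_apply(0.0, flip, row)
always = maybe_apply(1.0, flip, row)
```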

#### Region Extraction

Currently there are two different region extraction protocols: 
* `image`: The entire image is extracted and passed to the next phase of augmentation 
* `bbox`: Each bounding box in the tfrecord is used to crop out an image region. These regions are passed on to the next phase of augmentation. If there are `n` bounding boxes in a tfrecord, then `n` regions will be extracted from the image. 

For bounding boxes, we can specify whether we want to enlarge the box. This can be used as another form of augmentation (loose bounding boxes vs tight bounding boxes).

| Config Name | Type | Description |
:----:|:----:|------------|
DO_EXPANSION | float | Value between 0 and 1. 0 means never expand the box. 1 means always expand the box. |
EXPANSION_CFG | | Contains the parameters controlling the expansion of the bounding box. | 
EXPANSION_CFG.<br />WIDTH_EXPANSION_FACTOR | float | Scaling factor for the width of the box. | 
EXPANSION_CFG.<br />HEIGHT_EXPANSION_FACTOR | float | Scaling factor for the height of the box. | 
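
A minimal sketch of a central box expansion, under two assumptions that the table does not confirm: box coordinates are normalized to [0, 1], and each expansion factor multiplies the corresponding box dimension. The function name `expand_bbox` is hypothetical.

```python
def expand_bbox(xmin, ymin, xmax, ymax, width_factor, height_factor):
    """Scale a normalized box about its center and clip it to the image."""
    cx = (xmin + xmax) / 2.0
    cy = (ymin + ymax) / 2.0
    half_w = (xmax - xmin) * width_factor / 2.0
    half_h = (ymax - ymin) * height_factor / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(1.0, cx + half_w),
        min(1.0, cy + half_h),
    )


# A tight box made 50% wider and taller around the same center.
loose = expand_bbox(0.25, 0.25, 0.75, 0.75, 1.5, 1.5)

# A box near the corner is clipped at the image border.
clipped = expand_bbox(0.0, 0.0, 0.5, 0.5, 2.0, 2.0)
```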


#### Random Cropping

Each region that is extracted from an image can then be randomly cropped. Again, this is a form of data augmentation. We are trying to make the network robust to changes in the data that do not affect the class label.

`RANDOM_CROP_CFG` contains parameters for cropping out a rectangular patch from each region. 

| Config Name | Type | Description |
:----:|:----:|------------|
DO_RANDOM_CROP | float | Value between 0 and 1. 0 means never crop a region. 1 means always take a crop. |
RANDOM_CROP_CFG | | This contains parameters that control the types of crops that are possible. |
RANDOM_CROP_CFG.<br />MIN_AREA | float | Value between 0 and 1. This controls how much of the region is required to be in the crop, essentially controlling how small a crop can be. |
RANDOM_CROP_CFG.<br />MAX_AREA | float | Value between 0 and 1. This controls the maximum size of the crop. |
RANDOM_CROP_CFG.<br />MIN_ASPECT_RATIO | float | The minimum [aspect ratio](https://en.wikipedia.org/wiki/Aspect_ratio_(image)) of the crop. Don't forget that this crop will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. |
RANDOM_CROP_CFG.<br />MAX_ASPECT_RATIO | float | The maximum [aspect ratio](https://en.wikipedia.org/wiki/Aspect_ratio_(image)) of the crop. Don't forget that this crop will be resized to [`INPUT_SIZE`, `INPUT_SIZE`, 3] prior to passing through the network. |
RANDOM_CROP_CFG.<br />MAX_ATTEMPTS | int | The number of crop attempts to try before returning the whole region. |
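
To show how these parameters interact, here is a pure-Python sketch of rejection sampling a crop. It is an illustration under stated assumptions, not the repo's implementation: coordinates are normalized to the region, aspect ratio is taken as width divided by height, and the function name `sample_crop` is hypothetical.

```python
import math
import random


def sample_crop(min_area, max_area, min_aspect, max_aspect, max_attempts, rng=random):
    """Return (x, y, w, h) of a crop in normalized region coordinates."""
    for _ in range(max_attempts):
        area = rng.uniform(min_area, max_area)
        aspect = rng.uniform(min_aspect, max_aspect)  # width / height
        w = math.sqrt(area * aspect)
        h = math.sqrt(area / aspect)
        if w <= 1.0 and h <= 1.0:
            # The crop fits: place it at a random valid offset.
            x = rng.uniform(0.0, 1.0 - w)
            y = rng.uniform(0.0, 1.0 - h)
            return (x, y, w, h)
    # Every attempt failed: fall back to the whole region.
    return (0.0, 0.0, 1.0, 1.0)


rng = random.Random(0)
crop = sample_crop(0.5, 1.0, 0.7, 1.4, max_attempts=100, rng=rng)

# Impossible constraints (full area at a 2:1 aspect ratio) trigger the fallback.
fallback = sample_crop(1.0, 1.0, 2.0, 2.0, max_attempts=5, rng=rng)
```

Tight area and aspect-ratio ranges make failed attempts more likely, which is where `MAX_ATTEMPTS` matters.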

### Queues
This section of the config file contains parameters for controlling the queueing of data to feed the network. These settings depend on the number of cores in your machine and the amount of memory available. Please see the comments in the example config file for more information.

### Saving Models and Summaries 
This section of the config file contains parameters for controlling how often a model checkpoint should be created and how often tensorboard summary files should be generated. Please see the comments in the example config file for more information. 

## Testing Configuration
See the [example testing config file](config_test.yaml). 

The `Learning Rate Parameters`, `Optimization`, and `Saving Models and Summaries` parameters are not necessary for testing. The remaining parameters from the training config carry over to testing. In addition there are a few new configurations:

| Config Name | Type | Description |
:----:|:----:|------------|
PRECISION_AT_K_METRIC | array of ints | You can track top-k metrics using this array. Top-1 (i.e. accuracy) will always be plotted |
NUM_TEST_EXAMPLES | int | The number of images (or bounding boxes) in the tfrecords. This can be ignored if you use the `--batches` command line flag. | 
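
For readers unfamiliar with top-k metrics, the sketch below computes the fraction of examples whose true label is among the `k` highest-scoring classes. It is a plain-Python illustration with made-up scores; the function name `precision_at_k` is hypothetical and the testing script computes its metrics inside TensorFlow.

```python
def precision_at_k(scores_per_example, labels, k):
    """Fraction of examples whose true label is among the k highest scores."""
    hits = 0
    for scores, label in zip(scores_per_example, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / float(len(labels))


# Made-up scores for three examples over four classes.
scores = [
    [0.1, 0.7, 0.15, 0.05],
    [0.5, 0.2, 0.25, 0.05],
    [0.1, 0.2, 0.3, 0.4],
]
labels = [1, 2, 0]

top1 = precision_at_k(scores, labels, 1)
top3 = precision_at_k(scores, labels, 3)
```

With `k = 1` this reduces to ordinary accuracy, matching the note that top-1 is always plotted.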

Typically in a testing situation you'll want to turn off the augmentations to the extracted image regions. This way you are passing "real" data to the network. See the `Image Processing and Augmentation` section of the [example testing config file](config_test.yaml) to see how to extract regions without augmentations.

## Classification Configuration
See the [example classification config file](config_classify.yaml).

The classification configuration contains even fewer necessary fields than the testing configuration. The `Metrics` section is removed and you'll need to pass batch size and total batch information through command-line arguments. 

## Export Configuration
See the [example export config file](config_export.yaml).

The export configuration is the smallest configuration file. See the [example](config_export.yaml) for which fields are required. 
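
The sketch below checks a loaded export config for the fields that `export.py` reads directly. The helper `missing_export_keys` and the `REQUIRED_KEYS` list are hypothetical and were assembled by reading `export.py`; treat the list as an assumption rather than an authoritative schema. `MOVING_AVERAGE_DECAY` is left out because `export.py` only uses it when present.

```python
# Key paths that export.py accesses on the config (assumed from reading the script).
REQUIRED_KEYS = [
    ("NUM_CLASSES",),
    ("MODEL_NAME",),
    ("IMAGE_PROCESSING", "INPUT_SIZE"),
    ("SESSION_CONFIG", "LOG_DEVICE_PLACEMENT"),
    ("SESSION_CONFIG", "PER_PROCESS_GPU_MEMORY_FRACTION"),
]


def missing_export_keys(cfg):
    """Return the dotted paths of required keys that are absent from cfg."""
    missing = []
    for path in REQUIRED_KEYS:
        node = cfg
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


# Values copied from the example export config.
example_cfg = {
    "SESSION_CONFIG": {
        "LOG_DEVICE_PLACEMENT": False,
        "PER_PROCESS_GPU_MEMORY_FRACTION": 0.9,
    },
    "NUM_CLASSES": 200,
    "MODEL_NAME": "inception_v3",
    "IMAGE_PROCESSING": {"INPUT_SIZE": 299},
    "MOVING_AVERAGE_DECAY": 0.9999,
}
print(missing_export_keys(example_cfg))
```

Because the check only relies on dictionary access, it should also work on the `EasyDict` returned by `parse_config_file`, which is a `dict` subclass.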


================================================
FILE: config/__init__.py
================================================


================================================
FILE: config/config_classify.yaml
================================================
# Classification specific configuration

RANDOM_SEED : 1.0

SESSION_CONFIG : {
  # If true, then the device location of each variable will be printed
  LOG_DEVICE_PLACEMENT : false,

  # How much GPU memory we are allowed to pre-allocate
  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,

  # Set the number of accessible cpu threads. Leave as null to use everything.
  # Set to 1 to help with debugging (makes the print statements legible)
  INTRA_OP_PARALLELISM_THREADS : null,
  INTER_OP_PARALLELISM_THREADS : null
}

#################################################
# Dataset Info
# The number of classes we are classifying
NUM_CLASSES : 200

# The model architecture to use.
MODEL_NAME : 'inception_v3'

# END: Dataset Info
#################################################
# Image Processing and Augmentation
# There are 5 steps to image processing:
# 1) Extract regions from the image
# 2) Extract crops from each region
# 3) Resize the crops for the network architecture
# 4) Flip the crops
# 5) Modify the colors of the crops
IMAGE_PROCESSING : {
    # All images will be resized to [INPUT_SIZE, INPUT_SIZE, 3]
    INPUT_SIZE : 299,

    # 1) First we extract regions from the image
    # What type of region should be extracted, either 'image' or 'bbox'
    REGION_TYPE : 'image',

    # Specific whole image region extraction configuration
    WHOLE_IMAGE_CFG : {},

    # Specific bounding box region extraction configuration
    BBOX_CFG : {
        # We can centrally expand a bbox (i.e. turn a tight crop into a loose crop)
        # The fraction of time to expand the bounding box, 0 is never, 1 is always
        DO_EXPANSION : 1,
        EXPANSION_CFG : {
            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)
            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)
        }
    },

    # 2) Then we take a random crop from the region
    # The fraction of time to take a random crop, 0 is never, 1 is always
    DO_RANDOM_CROP : 0,
    RANDOM_CROP_CFG : {
        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included
        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included
        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop
        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop
        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region
    },

    # Alternatively we can take a central crop from the image
    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always
    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop

    # 3) We need to resize the extracted regions to feed into the network.
    MAINTAIN_ASPECT_RATIO : false,
    # Avoid slower resize operations (bi-cubic, etc.)
    RESIZE_FAST : true,

    # 4) We can flip the regions
    # Randomly flip the image left right, 50% chance of flipping
    DO_RANDOM_FLIP_LEFT_RIGHT : false,

    # 5) We can distort the colors of the regions
    # The fraction of time to distort the color, 0 is never, 1 is always
    DO_COLOR_DISTORTION : 0,
    # Avoids slower ops (random_hue and random_contrast)
    COLOR_DISTORT_FAST : false
}

# END: Image Processing and Augmentation
#################################################
# Queues
#
# Number of threads to populate the batch queue
NUM_INPUT_THREADS : 2
# Should the data be shuffled?
SHUFFLE_QUEUE : false
# Capacity of the queue producing batched examples
QUEUE_CAPACITY : 1000
# Minimum size of the queue to ensure good shuffling
QUEUE_MIN :  200

# END: Queues
#################################################
# Regularization
#
# The decay to use for the moving average. If 0, then moving average is not computed
# When restoring models, this value is needed to determine whether to restore moving
# average variables or not.
MOVING_AVERAGE_DECAY : 0.9999

# End: Regularization
#################################################

================================================
FILE: config/config_export.yaml
================================================
# Export specific configuration

RANDOM_SEED : 1.0

SESSION_CONFIG : {
  # If true, then the device location of each variable will be printed
  LOG_DEVICE_PLACEMENT : false,

  # How much GPU memory we are allowed to pre-allocate
  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,

  # Set the number of accessible cpu threads. Leave as null to use everything.
  # Set to 1 to help with debugging (makes the print statements legible)
  INTRA_OP_PARALLELISM_THREADS : null,
  INTER_OP_PARALLELISM_THREADS : null
}

#################################################
# Dataset Info
# The number of classes we are classifying
NUM_CLASSES : 200

# The model architecture to use.
MODEL_NAME : 'inception_v3'

# END: Dataset Info
#################################################
# Image Processing and Augmentation

IMAGE_PROCESSING : {
    # Images are assumed to be raveled, and have length  INPUT_SIZE * INPUT_SIZE * 3
    INPUT_SIZE : 299
}

# END: Image Processing and Augmentation
#################################################
# Regularization
#
# The decay to use for the moving average. If 0, then moving average is not computed
# When restoring models, this value is needed to determine whether to restore moving
# average variables or not.
MOVING_AVERAGE_DECAY : 0.9999

# End: Regularization
#################################################

================================================
FILE: config/config_test.yaml
================================================
# Testing specific configuration

RANDOM_SEED : 1.0

SESSION_CONFIG : {
  # If true, then the device location of each variable will be printed
  LOG_DEVICE_PLACEMENT : false,

  # How much GPU memory we are allowed to pre-allocate
  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,

  # Set the number of accessible cpu threads. Leave as null to use everything.
  # Set to 1 to help with debugging (makes the print statements legible)
  INTRA_OP_PARALLELISM_THREADS : null,
  INTER_OP_PARALLELISM_THREADS : null
}

#################################################
# Metrics
#
# Top-k precision information. Each entry is a different k value.
ACCURACY_AT_K_METRIC : [3, 5]

# END: Metrics
#################################################
# Dataset Info
# The number of classes we are classifying
NUM_CLASSES : 200

# Number of test examples in the tfrecords. This is needed to compute the total number of
# batches to pass through the network.
NUM_TEST_EXAMPLES : 5794

# The number of images to pass through the network on each iteration
BATCH_SIZE : 32

# The model architecture to use.
MODEL_NAME : 'inception_v3'

# END: Dataset Info
#################################################
# Image Processing and Augmentation
# There are 5 steps to image processing:
# 1) Extract regions from the image
# 2) Extract crops from each region
# 3) Resize the crops for the network architecture
# 4) Flip the crops
# 5) Modify the colors of the crops
IMAGE_PROCESSING : {
    # All images will be resized to [INPUT_SIZE, INPUT_SIZE, 3]
    INPUT_SIZE : 299,

    # 1) First we extract regions from the image
    # What type of region should be extracted, either 'image' or 'bbox'
    REGION_TYPE : 'image',

    # Specific whole image region extraction configuration
    WHOLE_IMAGE_CFG : {},

    # Specific bounding box region extraction configuration
    BBOX_CFG : {
        # We can centrally expand a bbox (i.e. turn a tight crop into a loose crop)
        # The fraction of time to expand the bounding box, 0 is never, 1 is always
        DO_EXPANSION : 1,
        EXPANSION_CFG : {
            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)
            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)
        }
    },

    # 2) Then we take a random crop from the region
    # The fraction of time to take a random crop, 0 is never, 1 is always
    DO_RANDOM_CROP : 0,
    RANDOM_CROP_CFG : {
        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included
        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included
        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop
        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop
        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region
    },

    # Alternatively we can take a central crop from the image
    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always
    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop

    # 3) We need to resize the extracted regions to feed into the network.
    MAINTAIN_ASPECT_RATIO : false,
    # Avoid slower resize operations (bi-cubic, etc.)
    RESIZE_FAST : true,

    # 4) We can flip the regions
    # Randomly flip the image left right, 50% chance of flipping
    DO_RANDOM_FLIP_LEFT_RIGHT : false,

    # 5) We can distort the colors of the regions
    # The fraction of time to distort the color, 0 is never, 1 is always
    DO_COLOR_DISTORTION : 0,
    # Avoids slower ops (random_hue and random_contrast)
    COLOR_DISTORT_FAST : false
}

# END: Image Processing and Augmentation
#################################################
# Queues
#
# Number of threads to populate the batch queue
NUM_INPUT_THREADS : 2
# Should the data be shuffled?
SHUFFLE_QUEUE : false
# Capacity of the queue producing batched examples
QUEUE_CAPACITY : 1000
# Minimum size of the queue to ensure good shuffling
QUEUE_MIN :  200

# END: Queues
#################################################
# Regularization
#
# The decay to use for the moving average. If 0, then moving average is not computed
# When restoring models, this value is needed to determine whether to restore moving
# average variables or not.
MOVING_AVERAGE_DECAY : 0.9999

# End: Regularization
#################################################

================================================
FILE: config/config_train.yaml
================================================
# Training specific configuration

RANDOM_SEED : 1.0

SESSION_CONFIG : {
  # If true, then the device location of each variable will be printed
  LOG_DEVICE_PLACEMENT : false,

  # How much GPU memory we are allowed to pre-allocate
  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9,

  # Set the number of accessible cpu threads. Leave as null to use everything.
  # Set to 1 to help with debugging (makes the print statements legible)
  INTRA_OP_PARALLELISM_THREADS : null,
  INTER_OP_PARALLELISM_THREADS : null
}

#################################################
# Dataset Info
#
# The number of classes we are classifying
NUM_CLASSES : 200

# Number of training examples in the tfrecords. This is needed to compute the number of
# batches in an epoch
NUM_TRAIN_EXAMPLES : 5994

# Maximum number of iterations to run before stopping
NUM_TRAIN_ITERATIONS : 20000

# The number of images to pass through the network in a single iteration
BATCH_SIZE : 32

# Which model architecture to use.
MODEL_NAME : 'inception_v3'

# END: Dataset Info
#################################################
# Image Processing and Augmentation
# There are 5 steps to image processing:
# 1) Extract regions from the image
# 2) Extract crops from each region
# 3) Resize the crops for the network architecture
# 4) Flip the crops
# 5) Modify the colors of the crops
IMAGE_PROCESSING : {
    # All images will be resized to [INPUT_SIZE, INPUT_SIZE, 3]
    INPUT_SIZE : 299,

    # 1) First we extract regions from the image
    # What type of region should be extracted, either 'image' or 'bbox'
    REGION_TYPE : 'image',

    # Specific whole image region extraction configuration
    WHOLE_IMAGE_CFG : {},

    # Specific bounding box region extraction configuration
    BBOX_CFG : {
        # We can centrally expand a bbox (i.e. turn a tight crop into a loose crop)
        # The fraction of time to expand the bounding box, 0 is never, 1 is always
        DO_EXPANSION : 1,
        EXPANSION_CFG : {
            WIDTH_EXPANSION_FACTOR : 2.0, # Expand the width by a factor of 2 (centrally)
            HEIGHT_EXPANSION_FACTOR : 2.0, # Expand the height by a factor of 2 (centrally)
        }
    },

    # 2) Then we take a random crop from the region
    # The fraction of time to take a random crop, 0 is never, 1 is always
    DO_RANDOM_CROP : 1,
    RANDOM_CROP_CFG : {
        MIN_AREA : 0.5, # between 0 and 1, how much of the region must be included
        MAX_AREA : 1.0, # between 0 and 1, how much of the region can be included
        MIN_ASPECT_RATIO : 0.7, # minimum aspect ratio of the crop
        MAX_ASPECT_RATIO : 1.33, # maximum aspect ratio of the crop
        MAX_ATTEMPTS : 100, # maximum number of attempts before returning the whole region
    },

    # Alternatively we can take a central crop from the image
    DO_CENTRAL_CROP : 0, # Fraction of the time to take a central crop, 0 is never, 1 is always
    CENTRAL_CROP_FRACTION : 0.875, # Between 0 and 1, fraction of size to crop

    # 3) We need to resize the extracted regions to feed into the network.
    MAINTAIN_ASPECT_RATIO : false,
    # Avoid slower resize operations (bi-cubic, etc.)
    RESIZE_FAST : false,

    # 4) We can flip the regions
    # Randomly flip the image left right, 50% chance of flipping
    DO_RANDOM_FLIP_LEFT_RIGHT : true,

    # 5) We can distort the colors of the regions
    # The fraction of time to distort the color, 0 is never, 1 is always
    DO_COLOR_DISTORTION : 0.3,
    # Avoids slower ops (random_hue and random_contrast)
    COLOR_DISTORT_FAST : false
}

# END: Image Processing and Augmentation
#################################################
# Queues
#
# Number of threads to populate the batch queue
NUM_INPUT_THREADS : 4
# Should the data be shuffled?
SHUFFLE_QUEUE : true
# Capacity of the queue producing batched examples
QUEUE_CAPACITY : 1000
# Minimum size of the queue to ensure good shuffling
QUEUE_MIN :  200

# END: Queues
#################################################
# Saving Models and Summaries
#
# How often, in seconds, to save summaries.
SAVE_SUMMARY_SECS : 30

# How often, in seconds, to save the model
SAVE_INTERVAL_SECS : 1800

# The maximum number of recent checkpoint files to keep.
MAX_TO_KEEP : 3

# In addition to keeping the most recent `max_to_keep` checkpoint files,
# you might want to keep one checkpoint file for every N hours of training
# The default value of 10,000 hours effectively disables the feature.
KEEP_CHECKPOINT_EVERY_N_HOURS : 10000

# The frequency, in terms of global steps, at which the loss and global step are logged.
LOG_EVERY_N_STEPS : 10

# END: Saving Models and Summaries
#################################################
# Learning Rate Parameters
LEARNING_RATE_DECAY_TYPE : 'exponential' # One of "fixed", "exponential", or "polynomial"

INITIAL_LEARNING_RATE : 0.01

# The minimal end learning rate used by a polynomial decay learning rate.
END_LEARNING_RATE : 0.0001

# The amount of label smoothing.
LABEL_SMOOTHING : 0.1

# How much to decay the learning rate
LEARNING_RATE_DECAY_FACTOR : 0.94
# Number of epochs between decaying the learning rate
NUM_EPOCHS_PER_DELAY : 4

LEARNING_RATE_STAIRCASE : true

# END: Learning Rate Parameters
#################################################
# Regularization
#
# The decay to use for the moving average. If 0, then moving average is not computed
MOVING_AVERAGE_DECAY : 0.9999

# The weight decay on the model weights
WEIGHT_DECAY : 0.00004

BATCHNORM_MOVING_AVERAGE_DECAY : 0.9997
BATCHNORM_EPSILON : 0.001

DROPOUT_KEEP_PROB : 0.5

CLIP_GRADIENT_NORM : 0 # If 0, no clipping is performed. Otherwise acts as a threshold to clip the gradients.

# End: Regularization
#################################################
# Optimization
#
# The name of the optimizer, one of "adadelta", "adagrad", "adam", "ftrl", "momentum", "sgd" or "rmsprop"
OPTIMIZER : 'rmsprop'
OPTIMIZER_EPSILON : 1.0

# The decay rate for adadelta.
ADADELTA_RHO: 0.95

# Starting value for the AdaGrad accumulators.
ADAGRAD_INITIAL_ACCUMULATOR_VALUE: 0.1

# The exponential decay rate for the 1st moment estimates.
ADAM_BETA1 : 0.9
# The exponential decay rate for the 2nd moment estimates.
ADAM_BETA2 : 0.99

# The learning rate power.
FTRL_LEARNING_RATE_POWER : -0.5
# Starting value for the FTRL accumulators.
FTRL_INITIAL_ACCUMULATOR_VALUE : 0.1
# The FTRL l1 regularization strength.
FTRL_L1 : 0.0
# The FTRL l2 regularization strength.
FTRL_L2 : 0.0

# The momentum for the MomentumOptimizer and RMSPropOptimizer
MOMENTUM : 0.9

# Decay term for RMSProp.
RMSPROP_DECAY : 0.9

# END: Optimization
#################################################

================================================
FILE: config/parse_config.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import yaml
from easydict import EasyDict as easydict

def parse_config_file(path_to_config):

    with open(path_to_config) as f:
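        # safe_load avoids arbitrary object construction; newer PyYAML versions
        # require an explicit Loader argument for plain yaml.load().
        cfg = yaml.safe_load(f)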

    return easydict(cfg)

================================================
FILE: export.py
================================================
"""
Export a trained model for application use.

Example for use with TensorFlow Serving:
python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir export \
--export_version 1 \
--config config_export.yaml \
--serving \
--add_preprocess \
--class_names class-codes.txt

Example for use with TensorFlow Mobile:
python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir export \
--export_version 1 \
--config config_export.yaml \
--class_names class-codes.txt

Author: Grant Van Horn
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os

import tensorflow as tf
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import graph_util
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils
from tensorflow.python.tools import optimize_for_inference_lib
slim = tf.contrib.slim

from config.parse_config import parse_config_file
from nets import nets_factory


def export(checkpoint_path,
           export_dir, export_version, export_for_serving, export_tflite, export_coreml,
           add_preprocess_step,
           output_classes, class_names,
           batch_size, raveled_input,
           cfg):
    """Export a model for use with TensorFlow Serving or for more convenient use on mobile devices, etc.
    Arguments:
      checkpoint_path (str): Path to the specific model checkpoint file to export.
      export_dir (str): Path to a directory to store the export files.
      export_version (int): The version number of this export. If `export_for_serving` is True, then this version
        number must not exist in the `export_dir`.
      export_for_serving (bool): Export a model for use with TensorFlow Serving.
      export_tflite (bool): Export a model for tensorflow lite.
      export_coreml (bool): Export a model for coreml.
      add_preprocess_step (bool): If True, then an input path for handling image byte strings will be added to the graph.
      output_classes (bool): If True, then the class indices (or `class_names` if provided) will be output along with the scores.
      class_names (list): A list of semantic class identifiers to embed within the model that correspond to the prediction
        indices. Set to None to not embed.
      batch_size (int or None): Specify a fixed batch size, or use None to keep it flexible. For tflite export you'll need a fixed batch size.
      raveled_input (bool): If True, then the input is considered to be a raveled vector that will be reshaped to a fixed height and width. Otherwise it will be treated as the proper shape.
      cfg (dict): Configuration dictionary.
    """

    if not os.path.exists(export_dir):
        print("Making export directory: %s" % (export_dir,))
        os.makedirs(export_dir)

    graph = tf.Graph()

    array_input_node_name = "images"
    bytes_input_node_name = "image_bytes"

    output_node_name = "Predictions"
    class_names_node_name = "names"

    input_height = cfg.IMAGE_PROCESSING.INPUT_SIZE
    input_width = cfg.IMAGE_PROCESSING.INPUT_SIZE
    input_depth = 3

    with graph.as_default():

        global_step = slim.get_or_create_global_step()

        # We want to store the preprocessing operation in the graph
        if add_preprocess_step:

            # The TensorFlow map_fn() function passes one argument only,
            # so I have put this method here to take advantage of scope
            # (to access input_height, etc.)
            def preprocess_image(image_buffer):
                """Preprocess image bytes to 3D float Tensor."""

                # Decode image bytes
                image = tf.image.decode_image(image_buffer)
                image = tf.image.convert_image_dtype(image, dtype=tf.float32)

                # make sure the image is of rank 3
                image = tf.cond(
                    tf.equal(tf.rank(image), 2),
                    lambda: tf.expand_dims(image, 2),
                    lambda: image
                )

                num_channels = tf.shape(image)[2]

                # if we decoded 1 channel (grayscale), then convert to a RGB image
                image = tf.cond(
                    tf.equal(num_channels, 1),
                    lambda: tf.image.grayscale_to_rgb(image),
                    lambda: image
                )

                # if we decoded 2 channels (grayscale + alpha), then strip off the last dim and convert to rgb
                image = tf.cond(
                    tf.equal(num_channels, 2),
                    lambda: tf.image.grayscale_to_rgb(
                        tf.expand_dims(image[:, :, 0], 2)),
                    lambda: image
                )

                # if we decoded 4 or more channels (rgb + alpha), then take the first three channels
                image = tf.cond(
                    tf.greater(num_channels, 3),
                    lambda: image[:, :, :3],
                    lambda: image
                )

                # Resize the image to the input height and width for the network.
                image = tf.expand_dims(image, 0)
                image = tf.image.resize_bilinear(image,
                                                 [input_height, input_width],
                                                 align_corners=False)
                image = tf.squeeze(image, [0])
                # Finally, rescale to [-1,1] instead of [0, 1)
                image = tf.subtract(image, 0.5)
                image = tf.multiply(image, 2.0)
                return image

            image_bytes_placeholder = tf.placeholder(
                tf.string, name=bytes_input_node_name)
            preped_images = tf.map_fn(
                preprocess_image, image_bytes_placeholder, dtype=tf.float32)
            # Explicit name (we can't name the map_fn)
            input_placeholder = tf.identity(
                preped_images, name=array_input_node_name)

        # We assume the client has preprocessed the data for us
        else:
            # Is the input coming in as a raveled vector? Or is it a tensor?
            if raveled_input:
                input_placeholder = tf.placeholder(tf.float32, shape=[batch_size, input_height * input_width * input_depth], name=array_input_node_name)
            else:
                input_placeholder = tf.placeholder(tf.float32, shape=[batch_size, input_height, input_width, input_depth], name=array_input_node_name)

        # Reshape the images to proper tensors if they are coming in as vectors.
        if raveled_input:
            images = tf.reshape(input_placeholder,
                                [-1, input_height, input_width, input_depth])
        else:
            images = input_placeholder

        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()

        with slim.arg_scope(arg_scope):
            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](
                inputs=images,
                num_classes=cfg.NUM_CLASSES,
                is_training=False
            )

        class_scores = end_points['Predictions']
        if output_classes:
            if class_names is None:
                class_names = tf.range(class_scores.get_shape().as_list()[1])
            predicted_classes = tf.tile(tf.expand_dims(class_names, 0), [
                                        tf.shape(class_scores)[0], 1], name=class_names_node_name)

        # GVH: I would like to use tf.identity here, but the function tensorflow.python.framework.graph_util.remove_training_nodes
        # called in (optimize_for_inference_lib.optimize_for_inference) removes the identity function.
        # Sticking with an add 0 operation for now.
        # We are doing this so that we can rename the output to `output_node_name` (i.e. something consistent)
        output_node = tf.add(
            end_points['Predictions'], 0., name=output_node_name)
        output_node_name = output_node.op.name

        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:
            variable_averages = tf.train.ExponentialMovingAverage(
                cfg.MOVING_AVERAGE_DECAY, global_step)
            variables_to_restore = variable_averages.variables_to_restore(
                slim.get_model_variables())
        else:
            variables_to_restore = slim.get_variables_to_restore()

        saver = tf.train.Saver(variables_to_restore, reshape=True)

        if os.path.isdir(checkpoint_path):
            checkpoint_dir = checkpoint_path
            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)

            if checkpoint_path is None:
                raise ValueError("Unable to find a model checkpoint in the "
                                 "directory %s" % (checkpoint_dir,))

        tf.logging.info('Exporting model: %s' % checkpoint_path)

        sess_config = tf.ConfigProto(
            log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,
            allow_soft_placement=True,
            gpu_options=tf.GPUOptions(
                per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION
            )
        )
        sess = tf.Session(graph=graph, config=sess_config)

        if export_for_serving:

            # Reuse the session created above so that `sess_config` is honored.
            with sess.as_default():

                tf.global_variables_initializer().run()

                saver.restore(sess, checkpoint_path)

                save_path = os.path.join(export_dir, "%d" % (export_version,))

                builder = saved_model_builder.SavedModelBuilder(save_path)

                # Build the signature_def_map.
                signature_def_map = {}
                signature_def_outputs = {
                    'scores': utils.build_tensor_info(class_scores)}
                if output_classes:
                    signature_def_outputs['classes'] = utils.build_tensor_info(
                        predicted_classes)

                # image bytes input
                if add_preprocess_step:
                    image_bytes_tensor_info = utils.build_tensor_info(
                        image_bytes_placeholder)
                    image_bytes_prediction_signature = signature_def_utils.build_signature_def(
                        inputs={'images': image_bytes_tensor_info},
                        outputs=signature_def_outputs,
                        method_name=signature_constants.PREDICT_METHOD_NAME
                    )
                    signature_def_map['predict_image_bytes'] = image_bytes_prediction_signature

                # image array input
                image_array_tensor_info = utils.build_tensor_info(
                    input_placeholder)
                image_array_prediction_signature = signature_def_utils.build_signature_def(
                    inputs={'images': image_array_tensor_info},
                    outputs=signature_def_outputs,
                    method_name=signature_constants.PREDICT_METHOD_NAME
                )
                signature_def_map['predict_image_array'] = image_array_prediction_signature
                signature_def_map[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = image_array_prediction_signature

                legacy_init_op = tf.group(
                    tf.tables_initializer(), name='legacy_init_op')

                builder.add_meta_graph_and_variables(
                    sess, [tag_constants.SERVING],
                    signature_def_map=signature_def_map,
                    legacy_init_op=legacy_init_op
                )

                builder.save()

                print("Saved optimized model for TensorFlow Serving.")

        else:
            with sess.as_default():

                tf.global_variables_initializer().run()

                saver.restore(sess, checkpoint_path)

                input_graph_def = graph.as_graph_def()
                input_node_names = [array_input_node_name]
                if add_preprocess_step:
                    input_node_names.append(bytes_input_node_name)
                output_node_names = [output_node_name]
                if output_classes:
                    output_node_names.append(class_names_node_name)

                constant_graph_def = graph_util.convert_variables_to_constants(
                    sess=sess,
                    input_graph_def=input_graph_def,
                    output_node_names=output_node_names,
                    variable_names_whitelist=None,
                    variable_names_blacklist=None
                )

                if add_preprocess_step:
                    optimized_graph_def = constant_graph_def
                else:
                    optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
                        input_graph_def=constant_graph_def,
                        input_node_names=input_node_names,
                        output_node_names=output_node_names,
                        placeholder_type_enum=dtypes.float32.as_datatype_enum
                    )

                save_dir = os.path.join(export_dir, str(export_version))
                if not os.path.exists(save_dir):
                    print("Making version directory in export directory: %s" %
                          (save_dir,))
                    os.makedirs(save_dir)
                save_path = os.path.join(save_dir, 'optimized_model.pb')
                # SerializeToString() returns bytes, so open the file in binary mode.
                with open(save_path, 'wb') as f:
                    f.write(optimized_graph_def.SerializeToString())

                print("Saved optimized model for mobile devices at: %s." %
                      (save_path,))
                print("Input node names: %s" % (input_node_names,))
                print("Output node names: %s" % (output_node_names,))

                if export_tflite:

                    # Patch the tensorflow lite conversion module
                    # See here: https://github.com/tensorflow/tensorflow/issues/15410
                    import tempfile
                    import subprocess
                    tf.contrib.lite.tempfile = tempfile
                    tf.contrib.lite.subprocess = subprocess

                    assert batch_size is not None, "We need a fixed batch size for the tensorflow lite export. (e.g. set --batch_size=1)"

                    tflite_model = tf.contrib.lite.toco_convert(
                        optimized_graph_def, [input_placeholder], [output_node])
                    tflite_save_path = os.path.join(
                        save_dir, 'optimized_model.tflite')
                    with open(tflite_save_path, 'wb') as f:
                        f.write(tflite_model)

                    print()
                    print("Saved optimized model for tensorflow lite: %s." %
                          (tflite_save_path,))
                    print("Input node names: %s" % (input_node_names,))
                    print("Output node name: %s" % (output_node_name,))

    # We have to get out of the graph scope.
    if export_coreml:
        try:
            import tfcoreml as tf_converter
        except ImportError:
            raise ValueError("Can't import tfcoreml, so we can't create a coreml model.")

        assert batch_size is not None, "We need a fixed batch size for the coreml export. (e.g. set --batch_size=1)"
        assert not raveled_input, "The input cannot be raveled. CoreML does not support `reshape()`."

        coreml_save_path = os.path.join(save_dir, 'optimized_model.mlmodel')
        tf_converter.convert(tf_model_path=save_path,
                             mlmodel_path=coreml_save_path,
                             output_feature_names=[output_node_name + ":0"],
                             input_name_shape_dict={'images:0': [
                                 batch_size, input_height, input_width, input_depth]}
                             )

        print()
        print("Saved optimized model for coreml: %s." % (coreml_save_path,))
        print("Input node names: %s" % (input_node_names,))
        print("Output node name: %s" % (output_node_name,))


def parse_args():

    parser = argparse.ArgumentParser(
        description='Export a trained classification model')

    parser.add_argument('--checkpoint_path', dest='checkpoint_path',
                        help='Path to the specific model you want to export.',
                        required=True, type=str)

    parser.add_argument('--export_dir', dest='export_dir',
                        help='Path to a directory where the exported model will be saved.',
                        required=True, type=str)

    parser.add_argument('--export_version', dest='export_version',
                        help='Version number of the model.',
                        required=True, type=int)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file',
                        required=True, type=str)

    parser.add_argument('--serving', dest='serving',
                        help='Export for TensorFlow Serving usage. Otherwise, a constant graph will be generated.',
                        action='store_true', default=False)

    parser.add_argument('--export_tflite', dest='export_tflite',
                        help='If True, then a tensorflow lite file will be produced along with the normal tensorflow model export (This is ignored if --serving is present).',
                        action='store_true', default=False)

    parser.add_argument('--export_coreml', dest='export_coreml',
                        help='If True, then a coreml file will be produced along with the normal tensorflow model export (This is ignored if --serving is present).',
                        action='store_true', default=False)

    parser.add_argument('--add_preprocess', dest='add_preprocess',
                        help='Add the image decoding and preprocessing nodes to the graph so that image bytes can be passed in.',
                        action='store_true', default=False)

    parser.add_argument('--output_classes', dest='output_classes',
                        help='If True, then class indices (or names if `class_names` is provided) are output along with the scores.',
                        action='store_true', default=False)

    parser.add_argument('--class_names', dest='class_names_path',
                        help='Path to the class names corresponding to each entry in the predictions output. This file should have one line for each index.',
                        required=False, type=str, default=None)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='Use this to specify a fixed batch size. Leave as None to have a flexible batch size. This must be specified to create tflite and coreml exports.',
                        required=False, type=int, default=None)

    parser.add_argument('--raveled_input', dest='raveled_input',
                        help='If True, then the input is considered to be a vector that will be reshaped to the proper tensor form. This cannot be used with coreml',
                        action='store_true', default=False)

    args = parser.parse_args()

    return args


if __name__ == '__main__':

    args = parse_args()
    cfg = parse_config_file(args.config_file)

    if args.class_names_path is not None:
        class_names = []
        with open(args.class_names_path) as f:
            for line in f:
                class_names.append(line.strip())
    else:
        class_names = None

    export(checkpoint_path=args.checkpoint_path,
           export_dir=args.export_dir,
           export_version=args.export_version,
           export_for_serving=args.serving,
           export_tflite=args.export_tflite,
           export_coreml=args.export_coreml,
           add_preprocess_step=args.add_preprocess,
           output_classes=args.output_classes,
           class_names=class_names,
           batch_size=args.batch_size,
           raveled_input=args.raveled_input,
           cfg=cfg
    )


================================================
FILE: extract.py
================================================
"""
Extract features.
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import time

import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim

from config.parse_config import parse_config_file
from nets import nets_factory
from preprocessing import inputs

def extract_features(tfrecords, checkpoint_path, num_iterations, feature_keys, cfg, read_images=False):
    """
    Extract and return the features
    """

    tf.logging.set_verbosity(tf.logging.INFO)

    graph = tf.Graph()

    with graph.as_default():

        global_step = slim.get_or_create_global_step()

        with tf.device('/cpu:0'):
            batch_dict = inputs.input_nodes(
                tfrecords=tfrecords,
                cfg=cfg.IMAGE_PROCESSING,
                num_epochs=1,
                batch_size=cfg.BATCH_SIZE,
                num_threads=cfg.NUM_INPUT_THREADS,
                shuffle_batch =cfg.SHUFFLE_QUEUE,
                random_seed=cfg.RANDOM_SEED,
                capacity=cfg.QUEUE_CAPACITY,
                min_after_dequeue=cfg.QUEUE_MIN,
                add_summaries=False,
                input_type='classification',
                read_filenames=read_images
            )

        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()

        with slim.arg_scope(arg_scope):
            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](
                inputs=batch_dict['inputs'],
                num_classes=cfg.NUM_CLASSES,
                is_training=False
            )

            predicted_labels = tf.argmax(end_points['Predictions'], 1)

        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:
            variable_averages = tf.train.ExponentialMovingAverage(
                cfg.MOVING_AVERAGE_DECAY, global_step)
            variables_to_restore = variable_averages.variables_to_restore(
                slim.get_model_variables())
            variables_to_restore[global_step.op.name] = global_step
        else:
            variables_to_restore = slim.get_variables_to_restore()
            variables_to_restore.append(global_step)


        saver = tf.train.Saver(variables_to_restore, reshape=True)

        num_batches = num_iterations
        num_items = num_batches * cfg.BATCH_SIZE

        fetches = []
        feature_stores = []
        for feature_key in feature_keys:
            feature = tf.reshape(end_points[feature_key], [cfg.BATCH_SIZE, -1])
            num_elements = feature.get_shape().as_list()[1]
            feature_stores.append(np.empty([num_items, num_elements], dtype=np.float32))
            fetches.append(feature)

        fetches.append(batch_dict['ids'])
        # `np.object` was removed from recent NumPy releases; the builtin `object` is equivalent.
        feature_stores.append(np.empty(num_items, dtype=object))

        if os.path.isdir(checkpoint_path):
            checkpoint_dir = checkpoint_path
            checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)

            if checkpoint_path is None:
                raise ValueError("Unable to find a model checkpoint in the " \
                                 "directory %s" % (checkpoint_dir,))

        tf.logging.info('Extracting features using %s' % checkpoint_path)

        coord = tf.train.Coordinator()

        sess_config = tf.ConfigProto(
                log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,
                allow_soft_placement = True,
                gpu_options = tf.GPUOptions(
                    per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION
                )
            )
        sess = tf.Session(graph=graph, config=sess_config)

        with sess.as_default():

            tf.global_variables_initializer().run()
            tf.local_variables_initializer().run()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)

            try:

                # Restore from checkpoint
                saver.restore(sess, checkpoint_path)

                print_str = ', '.join([
                  'Step: %d',
                  'Time/image (ms): %.1f'
                ])

                step = 0
                while not coord.should_stop():

                    t = time.time()
                    outputs = sess.run(fetches)
                    dt = time.time()-t

                    idx1 = cfg.BATCH_SIZE * step
                    idx2 = idx1 + cfg.BATCH_SIZE

                    for i in range(len(outputs)):
                        feature_stores[i][idx1:idx2] = outputs[i]

                    step += 1
                    print(print_str % (step, (dt / cfg.BATCH_SIZE) * 1000))

                    if num_iterations > 0 and step == num_iterations:
                        break

            except tf.errors.OutOfRangeError:
                # The input queue is exhausted: every record has been processed.
                pass

        coord.request_stop()
        coord.join(threads)

        feature_dict = {feature_key : feature for feature_key, feature in zip(feature_keys, feature_stores[:-1])}
        feature_dict['ids'] = feature_stores[-1]

        return feature_dict

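def extract_and_save(tfrecords, checkpoint_path, save_path, num_iterations, feature_keys, cfg, read_images=False):
    """Extract the features and save them to a NumPy `.npz` archive.
    Args:
        tfrecords (list): paths to the tfrecord files.
        checkpoint_path (str): a checkpoint file, or a directory containing checkpoints.
        save_path (str): file path for the saved archive.
        num_iterations (int): number of batches to process.
        feature_keys (list): keys into the model's `end_points` dictionary.
        cfg (EasyDict): the parsed configuration.
        read_images (bool): read images from disk using the `filename` field.
    """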

    feature_dict = extract_features(tfrecords, checkpoint_path, num_iterations, feature_keys, cfg, read_images=read_images)

    # save the results
    np.savez(save_path, **feature_dict)


def parse_args():

    parser = argparse.ArgumentParser(description='Extract features from a trained model and save them.')

    parser.add_argument('--tfrecords', dest='tfrecords',
                        help='Paths to tfrecords.', type=str,
                        nargs='+', required=True)

    parser.add_argument('--checkpoint_path', dest='checkpoint_path',
                          help='Path to a specific model to test against. If a directory, then the newest checkpoint file will be used.', type=str,
                          required=True)

    parser.add_argument('--save_path', dest='save_path',
                          help='File path where the extracted features will be saved.', type=str,
                          required=True)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file',
                        required=True, type=str)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='The number of images in a batch.',
                        required=True, type=int)

    parser.add_argument('--batches', dest='batches',
                        help='Number of batches to process. Storage for the extracted features is preallocated from this value and the batch size.',
                        required=True, type=int)

    parser.add_argument('--features', dest='features',
                        help='The features to extract. These are keys into the end_points dictionary returned by the model architecture.',
                        type=str, nargs='+', required=True)

    parser.add_argument('--model_name', dest='model_name',
                        help='The name of the architecture to use.',
                        required=False, type=str, default=None)

    parser.add_argument('--read_images', dest='read_images',
                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',
                        action='store_true', default=False)


    args = parser.parse_args()
    return args

def main():
    args = parse_args()

    cfg = parse_config_file(args.config_file)

    if args.batch_size is not None:
        cfg.BATCH_SIZE = args.batch_size

    if args.model_name is not None:
        cfg.MODEL_NAME = args.model_name

    extract_and_save(
        tfrecords=args.tfrecords,
        checkpoint_path=args.checkpoint_path,
        save_path = args.save_path,
        num_iterations=args.batches,
        feature_keys=args.features,
        cfg=cfg,
        read_images=args.read_images
    )

if __name__ == '__main__':
    main()


================================================
FILE: nets/README.md
================================================
# Models

This directory contains the available classification models. All of these models were copied from the [TensorFlow Models repo](https://github.com/tensorflow/models/tree/master/slim/nets) and updated to TensorFlow r1.0.

The table below lists relevant information for each model. To use one of these models (e.g. with the training scripts), set the `--model_name` flag to the appropriate name. The number of parameters and the number of flops were computed using the `profile` function in [net_profile.py](net_profile.py), assuming a batch size of 1 and 1000 classes for all models. All available checkpoint files are from models trained on the [ILSVRC-2012-CLS](http://www.image-net.org/challenges/LSVRC/2012/) dataset, and the Top-1 and Top-5 numbers correspond to performance on that dataset. When fine-tuning from one of these checkpoints, it is recommended to use the default image size for that model.

| Model | Name | TF-Slim File | Checkpoint | Top-1 Accuracy | Top-5 Accuracy | Default Image Size | Num Params | Num Flops |
:----:|:----:|:------------:|:----------:|:-------:|:--------:|:--------:|:--------:|:--------:|
[Inception V1](http://arxiv.org/abs/1409.4842v1) | inception_v1 | [Code](inception_v1.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz) | 69.8 | 89.6 | 224px | 6,617,624 | 3.00b |
[Inception V2](http://arxiv.org/abs/1502.03167) | inception_v2 | [Code](inception_v2.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v2_2016_08_28.tar.gz) | 73.9 | 91.8 | 224px | 11,178,336 | 3.87b |
[Inception V3](http://arxiv.org/abs/1512.00567) | inception_v3 | [Code](inception_v3.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz) | 78.0 | 93.9 | 299px | 27,143,152 | 11.44b |
[Inception V4](http://arxiv.org/abs/1602.07261) | inception_v4 | [Code](inception_v4.py) | [Checkpoint](http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz) | 80.2 | 95.2 | 299px | 46,006,800 | 24.52b |
[Inception-ResNet-v2](http://arxiv.org/abs/1602.07261) | inception_resnet_v2 | [Code](inception_resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz) | 80.4 | 95.3 | 299px | 59,179,952 | 26.34b |
[ResNet V2 50](https://arxiv.org/abs/1603.05027) | resnet_v2_50 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz) | 75.6 | 92.8 | 299px | 25,568,360 | 13.08b |
[ResNet V2 101](https://arxiv.org/abs/1603.05027) | resnet_v2_101 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_101_2017_04_14.tar.gz) | 77.0 | 93.7 | 299px | 44,577,896 | 26.77b |
[ResNet V2 152](https://arxiv.org/abs/1603.05027) | resnet_v2_152 | [Code](resnet_v2.py) | [Checkpoint](http://download.tensorflow.org/models/resnet_v2_152_2017_04_14.tar.gz) | 77.8 | 94.1 | 299px | 60,236,904 | 40.45b |
[MobileNet-v1](https://arxiv.org/abs/1704.04861) | mobilenet_v1 | [Code](mobilenet_v1.py) | [Checkpoint](http://download.tensorflow.org/models/mobilenet_v1_1.0_224_2017_06_14.tar.gz) | 70.7 | 89.5 | 224px | 4,231,976 | 1.14b |
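
For readers unfamiliar with the Top-1 and Top-5 columns, the sketch below shows how a top-k accuracy is usually computed from per-class scores. It is plain Python for illustration only; the function name `top_k_accuracy` and the toy scores are hypothetical and are not part of this repository.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += int(label in top_k)
    return hits / float(len(labels))


# Hypothetical scores for 3 images over 4 classes.
scores = [
    [0.10, 0.60, 0.20, 0.10],
    [0.50, 0.10, 0.30, 0.10],
    [0.25, 0.15, 0.50, 0.10],
]
labels = [1, 2, 3]

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
print("top-1: %.3f, top-2: %.3f" % (top1, top2))
```

With `k=5` and 1000 classes, the same computation gives the Top-5 figure reported in the table.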

# Finetuning

When you finetune one of the above models, you'll start the training procedure using something like:
```
python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $PRETRAINED_MODEL \
--checkpoint_exclude_scopes <model specific scopes>
```

The `--checkpoint_exclude_scopes` argument prevents variables whose sizes differ from the checkpoint from being restored. These are typically the logit variables, whose size depends on the number of classes, which in your application will usually differ from the number of classes in ImageNet. The table below provides the proper value for `--checkpoint_exclude_scopes` for each model.

| Model | Name | TF-Slim File | Default Image Size | Exclude Scopes |
:----:|:----:|:------------:|:----------:|:-------:|
[Inception V1](http://arxiv.org/abs/1409.4842v1) | inception_v1 | [Code](inception_v1.py) | 224px | InceptionV1/Logits |
[Inception V2](http://arxiv.org/abs/1502.03167) | inception_v2 | [Code](inception_v2.py) | 224px | InceptionV2/Logits |
[Inception V3](http://arxiv.org/abs/1512.00567) | inception_v3 | [Code](inception_v3.py) | 299px | InceptionV3/Logits InceptionV3/AuxLogits |
[Inception V4](http://arxiv.org/abs/1602.07261) | inception_v4 | [Code](inception_v4.py) | 299px | InceptionV4/Logits InceptionV4/AuxLogits |
[Inception-ResNet-v2](http://arxiv.org/abs/1602.07261) | inception_resnet_v2 | [Code](inception_resnet_v2.py) | 299px | InceptionResnetV2/Logits InceptionResnetV2/AuxLogits |
[ResNet V2 50](https://arxiv.org/abs/1603.05027) | resnet_v2_50 | [Code](resnet_v2.py) | 224px | resnet_v2_50/logits |
[ResNet V2 101](https://arxiv.org/abs/1603.05027) | resnet_v2_101 | [Code](resnet_v2.py) | 224px | resnet_v2_101/logits |
[ResNet V2 152](https://arxiv.org/abs/1603.05027) | resnet_v2_152 | [Code](resnet_v2.py) | 224px | resnet_v2_152/logits |
[MobileNet-v1](https://arxiv.org/abs/1704.04861) | mobilenet_v1 | [Code](mobilenet_v1.py) | 224px | MobilenetV1/Logits |
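
To make the effect of the exclude scopes concrete, here is a minimal sketch of prefix-based filtering, the approach commonly used by TF-Slim style training scripts. Whether `train.py` filters in exactly this way is an assumption; the helper `filter_restorable` and the variable names below are hypothetical and for illustration only.

```python
def filter_restorable(variable_names, exclude_scopes):
    """Return the variable names that do not fall under any excluded scope."""
    prefixes = [scope.strip() for scope in exclude_scopes]
    return [
        name for name in variable_names
        if not any(name.startswith(prefix) for prefix in prefixes)
    ]


# Hypothetical variable names for an Inception V3 model.
variable_names = [
    "InceptionV3/Conv2d_1a_3x3/weights",
    "InceptionV3/Mixed_7c/Branch_0/Conv2d_0a_1x1/weights",
    "InceptionV3/AuxLogits/Conv2d_2b_1x1/weights",
    "InceptionV3/Logits/Conv2d_1c_1x1/weights",
]
exclude_scopes = ["InceptionV3/Logits", "InceptionV3/AuxLogits"]

restorable = filter_restorable(variable_names, exclude_scopes)
for name in restorable:
    print(name)
```

Only the two backbone variables remain in `restorable`; the logit variables are left at their fresh initialization so that they can be sized for the new number of classes.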


================================================
FILE: nets/__init__.py
================================================


================================================
FILE: nets/inception.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Brings all inception models under one namespace."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# pylint: disable=unused-import
from nets.inception_resnet_v2 import inception_resnet_v2
from nets.inception_resnet_v2 import inception_resnet_v2_arg_scope
from nets.inception_v1 import inception_v1
from nets.inception_v1 import inception_v1_arg_scope
from nets.inception_v1 import inception_v1_base
from nets.inception_v2 import inception_v2
from nets.inception_v2 import inception_v2_arg_scope
from nets.inception_v2 import inception_v2_base
from nets.inception_v3 import inception_v3
from nets.inception_v3 import inception_v3_arg_scope
from nets.inception_v3 import inception_v3_base
from nets.inception_v4 import inception_v4
from nets.inception_v4 import inception_v4_arg_scope
from nets.inception_v4 import inception_v4_base
# pylint: enable=unused-import


================================================
FILE: nets/inception_resnet_v2.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition of the Inception Resnet V2 architecture.

As described in http://arxiv.org/abs/1602.07261.

  Inception-v4, Inception-ResNet and the Impact of Residual Connections
    on Learning
  Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function


import tensorflow as tf

slim = tf.contrib.slim


def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 35x35 resnet block."""
  with tf.variable_scope(scope, 'Block35', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3')
    with tf.variable_scope('Branch_2'):
      tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
      tower_conv2_1 = slim.conv2d(tower_conv2_0, 48, 3, scope='Conv2d_0b_3x3')
      tower_conv2_2 = slim.conv2d(tower_conv2_1, 64, 3, scope='Conv2d_0c_3x3')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_1, tower_conv2_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')
    net += scale * up
    if activation_fn:
      net = activation_fn(net)
  return net


def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 17x17 resnet block."""
  with tf.variable_scope(scope, 'Block17', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 160, [1, 7],
                                  scope='Conv2d_0b_1x7')
      tower_conv1_2 = slim.conv2d(tower_conv1_1, 192, [7, 1],
                                  scope='Conv2d_0c_7x1')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')
    net += scale * up
    if activation_fn:
      net = activation_fn(net)
  return net


def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 8x8 resnet block."""
  with tf.variable_scope(scope, 'Block8', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 224, [1, 3],
                                  scope='Conv2d_0b_1x3')
      tower_conv1_2 = slim.conv2d(tower_conv1_1, 256, [3, 1],
                                  scope='Conv2d_0c_3x1')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')
    net += scale * up
    if activation_fn:
      net = activation_fn(net)
  return net


def inception_resnet_v2(inputs, num_classes=1001, is_training=True,
                        dropout_keep_prob=0.8,
                        reuse=None,
                        scope='InceptionResnetV2'):
  """Creates the Inception Resnet V2 model.

  Args:
    inputs: a 4-D tensor of size [batch_size, height, width, 3].
    num_classes: number of predicted classes.
    is_training: whether the model is being built for training.
    dropout_keep_prob: float, the fraction to keep before final layer.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the logits outputs of the model.
    end_points: the set of end_points from the inception model.
  """
  end_points = {}

  with tf.variable_scope(scope, 'InceptionResnetV2', [inputs], reuse=reuse):
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                          stride=1, padding='SAME'):

        # 149 x 149 x 32
        net = slim.conv2d(inputs, 32, 3, stride=2, padding='VALID',
                          scope='Conv2d_1a_3x3')
        end_points['Conv2d_1a_3x3'] = net
        # 147 x 147 x 32
        net = slim.conv2d(net, 32, 3, padding='VALID',
                          scope='Conv2d_2a_3x3')
        end_points['Conv2d_2a_3x3'] = net
        # 147 x 147 x 64
        net = slim.conv2d(net, 64, 3, scope='Conv2d_2b_3x3')
        end_points['Conv2d_2b_3x3'] = net
        # 73 x 73 x 64
        net = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                              scope='MaxPool_3a_3x3')
        end_points['MaxPool_3a_3x3'] = net
        # 73 x 73 x 80
        net = slim.conv2d(net, 80, 1, padding='VALID',
                          scope='Conv2d_3b_1x1')
        end_points['Conv2d_3b_1x1'] = net
        # 71 x 71 x 192
        net = slim.conv2d(net, 192, 3, padding='VALID',
                          scope='Conv2d_4a_3x3')
        end_points['Conv2d_4a_3x3'] = net
        # 35 x 35 x 192
        net = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                              scope='MaxPool_5a_3x3')
        end_points['MaxPool_5a_3x3'] = net

        # 35 x 35 x 320
        with tf.variable_scope('Mixed_5b'):
          with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 96, 1, scope='Conv2d_1x1')
          with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 48, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 64, 5,
                                        scope='Conv2d_0b_5x5')
          with tf.variable_scope('Branch_2'):
            tower_conv2_0 = slim.conv2d(net, 64, 1, scope='Conv2d_0a_1x1')
            tower_conv2_1 = slim.conv2d(tower_conv2_0, 96, 3,
                                        scope='Conv2d_0b_3x3')
            tower_conv2_2 = slim.conv2d(tower_conv2_1, 96, 3,
                                        scope='Conv2d_0c_3x3')
          with tf.variable_scope('Branch_3'):
            tower_pool = slim.avg_pool2d(net, 3, stride=1, padding='SAME',
                                         scope='AvgPool_0a_3x3')
            tower_pool_1 = slim.conv2d(tower_pool, 64, 1,
                                       scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[tower_conv, tower_conv1_1,
                              tower_conv2_2, tower_pool_1])

        end_points['Mixed_5b'] = net
        net = slim.repeat(net, 10, block35, scale=0.17)

        # 17 x 17 x 1024
        with tf.variable_scope('Mixed_6a'):
          with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 384, 3, stride=2, padding='VALID',
                                     scope='Conv2d_1a_3x3')
          with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 256, 3,
                                        scope='Conv2d_0b_3x3')
            tower_conv1_2 = slim.conv2d(tower_conv1_1, 384, 3,
                                        stride=2, padding='VALID',
                                        scope='Conv2d_1a_3x3')
          with tf.variable_scope('Branch_2'):
            tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                                         scope='MaxPool_1a_3x3')
          net = tf.concat(axis=3, values=[tower_conv, tower_conv1_2, tower_pool])

        end_points['Mixed_6a'] = net
        net = slim.repeat(net, 20, block17, scale=0.10)

        # Auxiliary tower
        with tf.variable_scope('AuxLogits'):
          # Originally, kernel_size = 5
          # However, if we change the input size then we need to change the kernel size
          # We want to pool the feature map to be 5x5xC
          # With padding = 0, and stride 3, this means our kernel is H - 12
          kernel_size = [net.get_shape().as_list()[1] - 12] * 2
          aux = slim.avg_pool2d(net, kernel_size, stride=3, padding='VALID',
                                scope='Conv2d_1a_3x3')
          aux = slim.conv2d(aux, 128, 1, scope='Conv2d_1b_1x1')
          aux = slim.conv2d(aux, 768, aux.get_shape()[1:3],
                            padding='VALID', scope='Conv2d_2a_5x5')
          aux = slim.flatten(aux)
          aux = slim.fully_connected(aux, num_classes, activation_fn=None,
                                     scope='Logits')
          end_points['AuxLogits'] = aux

        with tf.variable_scope('Mixed_7a'):
          with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
            tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2,
                                       padding='VALID', scope='Conv2d_1a_3x3')
          with tf.variable_scope('Branch_1'):
            tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1, 288, 3, stride=2,
                                        padding='VALID', scope='Conv2d_1a_3x3')
          with tf.variable_scope('Branch_2'):
            tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
            tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3,
                                        scope='Conv2d_0b_3x3')
            tower_conv2_2 = slim.conv2d(tower_conv2_1, 320, 3, stride=2,
                                        padding='VALID', scope='Conv2d_1a_3x3')
          with tf.variable_scope('Branch_3'):
            tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                                         scope='MaxPool_1a_3x3')
          net = tf.concat(axis=3, values=[tower_conv_1, tower_conv1_1,
                              tower_conv2_2, tower_pool])

        end_points['Mixed_7a'] = net

        net = slim.repeat(net, 9, block8, scale=0.20)
        net = block8(net, activation_fn=None)

        net = slim.conv2d(net, 1536, 1, scope='Conv2d_7b_1x1')
        end_points['Conv2d_7b_1x1'] = net

        with tf.variable_scope('Logits'):
          end_points['PrePool'] = net
          net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',
                                scope='AvgPool_1a_8x8')
          net = slim.flatten(net)

          net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                             scope='Dropout')

          end_points['PreLogitsFlatten'] = net
          logits = slim.fully_connected(net, num_classes, activation_fn=None,
                                        scope='Logits')
          end_points['Logits'] = logits
          end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions')

    return logits, end_points
inception_resnet_v2.default_image_size = 299


def inception_resnet_v2_arg_scope(weight_decay=0.00004,
                                  batch_norm_decay=0.9997,
                                  batch_norm_epsilon=0.001):
  """Yields the scope with the default parameters for inception_resnet_v2.

  Args:
    weight_decay: the weight decay for weights variables.
    batch_norm_decay: decay for the batch_norm moving averages.
    batch_norm_epsilon: small float added to variance to avoid dividing by zero.

  Returns:
    an arg_scope with the parameters needed for inception_resnet_v2.
  """
  # Set weight_decay for weights and biases in conv2d and fully_connected
  # layers.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      weights_regularizer=slim.l2_regularizer(weight_decay),
                      biases_regularizer=slim.l2_regularizer(weight_decay)):

    batch_norm_params = {
        'decay': batch_norm_decay,
        'epsilon': batch_norm_epsilon,
    }
    # Set activation_fn and parameters for batch_norm.
    with slim.arg_scope([slim.conv2d], activation_fn=tf.nn.relu,
                        normalizer_fn=slim.batch_norm,
                        normalizer_params=batch_norm_params) as scope:
      return scope
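

# Illustrative sketch, not part of the network definition: the AuxLogits tower
# above sizes its average-pooling kernel as H - 12. The two helpers below
# (hypothetical names, plain Python, one spatial dimension, 'VALID' padding
# assumed) reproduce that arithmetic so it can be checked without TensorFlow.
def valid_pool_output_size(size, kernel, stride):
  """Returns the output length of a 'VALID' pooling op along one dimension."""
  return (size - kernel) // stride + 1


def aux_pool_kernel_size(size, target=5, stride=3):
  """Returns the largest kernel that pools `size` down to `target`."""
  kernel = size - stride * (target - 1)
  if kernel < 1:
    raise ValueError('A feature map of size %d cannot be pooled to %d.' %
                     (size, target))
  return kernel


# Usage: for a 17x17 feature map, aux_pool_kernel_size(17) is 5 (the original
# kernel size), and valid_pool_output_size(17, 5, 3) is 5. Feature maps smaller
# than 13 raise ValueError, because H - 12 would not be a valid kernel size.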


================================================
FILE: nets/inception_resnet_v2_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for slim.inception_resnet_v2."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception


class InceptionTest(tf.test.TestCase):

  def testBuildLogits(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    with self.test_session():
      inputs = tf.random_uniform((batch_size, height, width, 3))
      logits, _ = inception.inception_resnet_v2(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])

  def testBuildEndPoints(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    with self.test_session():
      inputs = tf.random_uniform((batch_size, height, width, 3))
      _, end_points = inception.inception_resnet_v2(inputs, num_classes)
      self.assertTrue('Logits' in end_points)
      logits = end_points['Logits']
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      self.assertTrue('AuxLogits' in end_points)
      aux_logits = end_points['AuxLogits']
      self.assertListEqual(aux_logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['PrePool']
      self.assertListEqual(pre_pool.get_shape().as_list(),
                           [batch_size, 8, 8, 1536])

  def testVariablesSetDevice(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    with self.test_session():
      inputs = tf.random_uniform((batch_size, height, width, 3))
      # Force all Variables to reside on the device.
      with tf.variable_scope('on_cpu'), tf.device('/cpu:0'):
        inception.inception_resnet_v2(inputs, num_classes)
      with tf.variable_scope('on_gpu'), tf.device('/gpu:0'):
        inception.inception_resnet_v2(inputs, num_classes)
      for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_cpu'):
        self.assertDeviceEqual(v.device, '/cpu:0')
      for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_gpu'):
        self.assertDeviceEqual(v.device, '/gpu:0')

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 150, 150
    num_classes = 1000
    with self.test_session():
      inputs = tf.random_uniform((batch_size, height, width, 3))
      logits, end_points = inception.inception_resnet_v2(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['PrePool']
      self.assertListEqual(pre_pool.get_shape().as_list(),
                           [batch_size, 3, 3, 1536])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 299, 299
    num_classes = 1000
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, (None, height, width, 3))
      logits, _ = inception.inception_resnet_v2(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionResnetV2/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [None, num_classes])
      images = tf.random_uniform((batch_size, height, width, 3))
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 299, 299
    num_classes = 1000
    with self.test_session() as sess:
      eval_inputs = tf.random_uniform((batch_size, height, width, 3))
      logits, _ = inception.inception_resnet_v2(eval_inputs,
                                                num_classes,
                                                is_training=False)
      predictions = tf.argmax(logits, 1)
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 150, 150
    num_classes = 1000
    with self.test_session() as sess:
      train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
      inception.inception_resnet_v2(train_inputs, num_classes)
      eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
      logits, _ = inception.inception_resnet_v2(eval_inputs,
                                                num_classes,
                                                is_training=False,
                                                reuse=True)
      predictions = tf.argmax(logits, 1)
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))


if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/inception_utils.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains common code shared by all inception models.

Usage of arg scope:
  with slim.arg_scope(inception_arg_scope()):
    logits, end_points = inception.inception_v3(images, num_classes,
                                                is_training=is_training)

"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

slim = tf.contrib.slim


def inception_arg_scope(weight_decay=0.00004,
                        use_batch_norm=True,
                        batch_norm_decay=0.9997,
                        batch_norm_epsilon=0.001):
  """Defines the default arg scope for inception models.

  Args:
    weight_decay: The weight decay to use for regularizing the model.
    use_batch_norm: If `True`, batch_norm is applied after each convolution.
    batch_norm_decay: Decay for batch norm moving average.
    batch_norm_epsilon: Small float added to variance to avoid dividing by zero
      in batch norm.

  Returns:
    An `arg_scope` to use for the inception models.
  """
  batch_norm_params = {
      # Decay for the moving averages.
      'decay': batch_norm_decay,
      # epsilon to prevent 0s in variance.
      'epsilon': batch_norm_epsilon,
      # collection containing update_ops.
      'updates_collections': tf.GraphKeys.UPDATE_OPS,
  }
  if use_batch_norm:
    normalizer_fn = slim.batch_norm
    normalizer_params = batch_norm_params
  else:
    normalizer_fn = None
    normalizer_params = {}
  # Set weight_decay for weights in Conv and FC layers.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      weights_regularizer=slim.l2_regularizer(weight_decay)):
    with slim.arg_scope(
        [slim.conv2d],
        weights_initializer=slim.variance_scaling_initializer(),
        activation_fn=tf.nn.relu,
        normalizer_fn=normalizer_fn,
        normalizer_params=normalizer_params) as sc:
      return sc


================================================
FILE: nets/inception_v1.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition for inception v1 classification network."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


def inception_v1_base(inputs,
                      final_endpoint='Mixed_5c',
                      scope='InceptionV1'):
  """Defines the Inception V1 base architecture.

  This architecture is defined in:
    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
    http://arxiv.org/pdf/1409.4842v1.pdf.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
      'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',
      'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b', 'Mixed_5c']
    scope: Optional variable_scope.

  Returns:
    net: the output tensor of the layer named by `final_endpoint`.
    end_points: a dictionary from components of the network to the
      corresponding activation.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values.
  """
  end_points = {}
  with tf.variable_scope(scope, 'InceptionV1', [inputs]):
    with slim.arg_scope(
        [slim.conv2d, slim.fully_connected],
        weights_initializer=trunc_normal(0.01)):
      with slim.arg_scope([slim.conv2d, slim.max_pool2d],
                          stride=1, padding='SAME'):
        end_point = 'Conv2d_1a_7x7'
        net = slim.conv2d(inputs, 64, [7, 7], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
        end_point = 'MaxPool_2a_3x3'
        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
        end_point = 'Conv2d_2b_1x1'
        net = slim.conv2d(net, 64, [1, 1], scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
        end_point = 'Conv2d_2c_3x3'
        net = slim.conv2d(net, 192, [3, 3], scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
        end_point = 'MaxPool_3a_3x3'
        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_3b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 32, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_3c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 192, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_4a_3x3'
        net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 208, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 48, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4d'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 256, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4e'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 144, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 288, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_4f'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'MaxPool_5a_2x2'
        net = slim.max_pool2d(net, [2, 2], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5b'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
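            # NOTE: the 'Conv2d_0a_3x3' scope (rather than 'Conv2d_0b_3x3') is
            # kept as-is; renaming it would change the variable names.
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0a_3x3')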
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points

        end_point = 'Mixed_5c'
        with tf.variable_scope(end_point):
          with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
          with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = slim.conv2d(branch_1, 384, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if final_endpoint == end_point: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)
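

# Illustrative sketch, not used by the model: each Mixed_* module above
# concatenates its four branches along the channel axis, so its output depth
# is the sum of the last conv width in each branch. The table and helper names
# are assumptions; the widths are copied from inception_v1_base.
MIXED_BRANCH_DEPTHS = {
    'Mixed_3b': (64, 128, 32, 32),
    'Mixed_3c': (128, 192, 96, 64),
    'Mixed_4b': (192, 208, 48, 64),
    'Mixed_4c': (160, 224, 64, 64),
    'Mixed_4d': (128, 256, 64, 64),
    'Mixed_4e': (112, 288, 64, 64),
    'Mixed_4f': (256, 320, 128, 128),
    'Mixed_5b': (256, 320, 128, 128),
    'Mixed_5c': (384, 384, 128, 128),
}


def mixed_output_depth(end_point):
  """Returns the channel depth of a Mixed_* module's concatenated output."""
  return sum(MIXED_BRANCH_DEPTHS[end_point])


# Usage: mixed_output_depth('Mixed_3b') is 64 + 128 + 32 + 32, and
# mixed_output_depth('Mixed_5c') is the depth of the final base feature map.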


def inception_v1(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV1'):
  """Defines the Inception V1 architecture.

  This architecture is defined in:

    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
    http://arxiv.org/pdf/1409.4842v1.pdf.

  The default image size used to train this network is 224x224.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether the network is being built for training.
    dropout_keep_prob: the fraction of activation values that are retained.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits has shape [B, C]; if False, logits has
        shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse, 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes]
    end_points: a dictionary from components of the network to the corresponding
      activation.
  """
  # Final pooling and prediction
  with tf.variable_scope(scope, 'InceptionV1', [inputs, num_classes],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v1_base(inputs, scope=scope)
      with tf.variable_scope('Logits'):
        net = slim.avg_pool2d(net, [7, 7], stride=1, scope='MaxPool_0a_7x7')
        net = slim.dropout(net,
                           dropout_keep_prob, scope='Dropout_0b')
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_0c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')

        end_points['Logits'] = logits
        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
inception_v1.default_image_size = 224

inception_v1_arg_scope = inception_utils.inception_arg_scope
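

# Illustrative sketch, not used by the model: inception_v1_base applies five
# stride-2 layers with 'SAME' padding (Conv2d_1a_7x7 and the four MaxPool_*
# layers), and each one maps a spatial size n to ceil(n / 2). The helper name
# and its default argument are assumptions made for this example.
def same_padding_output_sizes(size, num_stride2_layers=5):
  """Returns the spatial size after each stride-2 'SAME' layer in turn."""
  sizes = []
  for _ in range(num_stride2_layers):
    size = (size + 1) // 2  # ceil(size / 2) in integer arithmetic.
    sizes.append(size)
  return sizes


# Usage: same_padding_output_sizes(224) walks the default input down to the
# 7x7 map that the Logits block pools, while same_padding_output_sizes(112)
# ends at 4 because odd intermediate sizes round up.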


================================================
FILE: nets/inception_v1_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for nets.inception_v1."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from nets import inception

slim = tf.contrib.slim


class InceptionV1Test(tf.test.TestCase):

  def testBuildClassificationNetwork(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v1(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('Predictions' in end_points)
    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),
                         [batch_size, num_classes])

  def testBuildBaseNetwork(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    mixed_5c, end_points = inception.inception_v1_base(inputs)
    self.assertTrue(mixed_5c.op.name.startswith('InceptionV1/Mixed_5c'))
    self.assertListEqual(mixed_5c.get_shape().as_list(),
                         [batch_size, 7, 7, 1024])
    expected_endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
                          'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b',
                          'Mixed_3c', 'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c',
                          'Mixed_4d', 'Mixed_4e', 'Mixed_4f', 'MaxPool_5a_2x2',
                          'Mixed_5b', 'Mixed_5c']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)

  def testBuildOnlyUptoFinalEndpoint(self):
    batch_size = 5
    height, width = 224, 224
    endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
                 'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
                 'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d',
                 'Mixed_4e', 'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b',
                 'Mixed_5c']
    for index, endpoint in enumerate(endpoints):
      with tf.Graph().as_default():
        inputs = tf.random_uniform((batch_size, height, width, 3))
        out_tensor, end_points = inception.inception_v1_base(
            inputs, final_endpoint=endpoint)
        self.assertTrue(out_tensor.op.name.startswith(
            'InceptionV1/' + endpoint))
        self.assertItemsEqual(endpoints[:index+1], end_points)

  def testBuildAndCheckAllEndPointsUptoMixed5c(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v1_base(inputs,
                                                final_endpoint='Mixed_5c')
    endpoints_shapes = {'Conv2d_1a_7x7': [5, 112, 112, 64],
                        'MaxPool_2a_3x3': [5, 56, 56, 64],
                        'Conv2d_2b_1x1': [5, 56, 56, 64],
                        'Conv2d_2c_3x3': [5, 56, 56, 192],
                        'MaxPool_3a_3x3': [5, 28, 28, 192],
                        'Mixed_3b': [5, 28, 28, 256],
                        'Mixed_3c': [5, 28, 28, 480],
                        'MaxPool_4a_3x3': [5, 14, 14, 480],
                        'Mixed_4b': [5, 14, 14, 512],
                        'Mixed_4c': [5, 14, 14, 512],
                        'Mixed_4d': [5, 14, 14, 512],
                        'Mixed_4e': [5, 14, 14, 528],
                        'Mixed_4f': [5, 14, 14, 832],
                        'MaxPool_5a_2x2': [5, 7, 7, 832],
                        'Mixed_5b': [5, 7, 7, 832],
                        'Mixed_5c': [5, 7, 7, 1024]}

    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name in endpoints_shapes:
      expected_shape = endpoints_shapes[endpoint_name]
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testModelHasExpectedNumberOfParameters(self):
    batch_size = 5
    height, width = 224, 224
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope(inception.inception_v1_arg_scope()):
      inception.inception_v1_base(inputs)
    total_params, _ = slim.model_analyzer.analyze_vars(
        slim.get_model_variables())
    self.assertAlmostEqual(5607184, total_params)

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 112, 112

    inputs = tf.random_uniform((batch_size, height, width, 3))
    mixed_5c, _ = inception.inception_v1_base(inputs)
    self.assertTrue(mixed_5c.op.name.startswith('InceptionV1/Mixed_5c'))
    self.assertListEqual(mixed_5c.get_shape().as_list(),
                         [batch_size, 4, 4, 1024])

  def testUnknownImageShape(self):
    tf.reset_default_graph()
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000
    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))
      logits, end_points = inception.inception_v1(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['Mixed_5c']
      feed_dict = {inputs: input_np}
      tf.global_variables_initializer().run()
      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)
      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.placeholder(tf.float32, (None, height, width, 3))
    logits, _ = inception.inception_v1(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV1/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [None, num_classes])
    images = tf.random_uniform((batch_size, height, width, 3))

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000

    eval_inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, _ = inception.inception_v1(eval_inputs, num_classes,
                                       is_training=False)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 224, 224
    num_classes = 1000

    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
    inception.inception_v1(train_inputs, num_classes)
    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
    logits, _ = inception.inception_v1(eval_inputs, num_classes, reuse=True)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))

  def testLogitsNotSqueezed(self):
    num_classes = 25
    images = tf.random_uniform([1, 224, 224, 3])
    logits, _ = inception.inception_v1(images,
                                       num_classes=num_classes,
                                       spatial_squeeze=False)

    with self.test_session() as sess:
      tf.global_variables_initializer().run()
      logits_out = sess.run(logits)
      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])


if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/inception_v2.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition for inception v2 classification network."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


def inception_v2_base(inputs,
                      final_endpoint='Mixed_5c',
                      min_depth=16,
                      depth_multiplier=1.0,
                      scope=None):
  """Inception v2 (6a2).

  Constructs an Inception v2 network from inputs to the given final endpoint.
  This method can construct the network up to the layer inception(5b) as
  described in http://arxiv.org/abs/1502.03167.

  Args:
    inputs: a tensor of shape [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c', 'Mixed_4a',
      'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e', 'Mixed_5a', 'Mixed_5b',
      'Mixed_5c'].
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    scope: Optional variable_scope.

  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a dictionary of activations for external use, for example
                summaries or losses.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values,
                or depth_multiplier <= 0
  """

  # end_points will collect relevant activations for external use, for example
  # summaries or losses.
  end_points = {}

  # Used to find thinned depths for each layer.
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  depth = lambda d: max(int(d * depth_multiplier), min_depth)

  with tf.variable_scope(scope, 'InceptionV2', [inputs]):
    with slim.arg_scope(
        [slim.conv2d, slim.max_pool2d, slim.avg_pool2d, slim.separable_conv2d],
        stride=1, padding='SAME'):

      # Note that sizes in the comments below assume an input spatial size of
      # 224x224; however, the inputs can be of any size greater than 32x32.

      # 224 x 224 x 3
      end_point = 'Conv2d_1a_7x7'
      # depthwise_multiplier here is different from depth_multiplier.
      # depthwise_multiplier determines the output channels of the initial
      # depthwise conv (see docs for tf.nn.separable_conv2d), while
      # depth_multiplier controls the # channels of the subsequent 1x1
      # convolution. Must have
      #   in_channels * depthwise_multiplier <= out_channels
      # so that the separable convolution is not overparameterized.
      depthwise_multiplier = min(int(depth(64) / 3), 8)
      net = slim.separable_conv2d(
          inputs, depth(64), [7, 7], depth_multiplier=depthwise_multiplier,
          stride=2, weights_initializer=trunc_normal(1.0),
          scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 112 x 112 x 64
      end_point = 'MaxPool_2a_3x3'
      net = slim.max_pool2d(net, [3, 3], scope=end_point, stride=2)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 56 x 56 x 64
      end_point = 'Conv2d_2b_1x1'
      net = slim.conv2d(net, depth(64), [1, 1], scope=end_point,
                        weights_initializer=trunc_normal(0.1))
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 56 x 56 x 64
      end_point = 'Conv2d_2c_3x3'
      net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 56 x 56 x 192
      end_point = 'MaxPool_3a_3x3'
      net = slim.max_pool2d(net, [3, 3], scope=end_point, stride=2)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 28 x 28 x 192
      # Inception module.
      end_point = 'Mixed_3b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(32), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 28 x 28 x 256
      end_point = 'Mixed_3c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 28 x 28 x 320
      end_point = 'Mixed_4a'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(
              net, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_0 = slim.conv2d(branch_0, depth(160), [3, 3], stride=2,
                                 scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(
              branch_1, depth(96), [3, 3], scope='Conv2d_0b_3x3')
          branch_1 = slim.conv2d(
              branch_1, depth(96), [3, 3], stride=2, scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(
              net, [3, 3], stride=2, scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 14 x 14 x 576
      end_point = 'Mixed_4b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(224), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(64), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(
              branch_1, depth(96), [3, 3], scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(96), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 14 x 14 x 576
      end_point = 'Mixed_4c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(96), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(128), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(96), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(128), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 14 x 14 x 576
      end_point = 'Mixed_4d'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(160), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(160), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(160), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(96), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points

      # 14 x 14 x 576
      end_point = 'Mixed_4e'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(96), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(160), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(192), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(96), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 14 x 14 x 576
      end_point = 'Mixed_5a'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(
              net, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_0 = slim.conv2d(branch_0, depth(192), [3, 3], stride=2,
                                 scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(192), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(256), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_1 = slim.conv2d(branch_1, depth(256), [3, 3], stride=2,
                                 scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(net, [3, 3], stride=2,
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
      # 7 x 7 x 1024
      end_point = 'Mixed_5b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(352), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(192), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(320), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(160), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points

      # 7 x 7 x 1024
      end_point = 'Mixed_5c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(352), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(
              net, depth(192), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(320), [3, 3],
                                 scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(
              net, depth(192), [1, 1],
              weights_initializer=trunc_normal(0.09),
              scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(224), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.max_pool2d(net, [3, 3], scope='MaxPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(128), [1, 1],
              weights_initializer=trunc_normal(0.1),
              scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
        end_points[end_point] = net
        if end_point == final_endpoint: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)
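

# Illustrative sketch (not part of the original module): the helper names
# below are hypothetical. They only restate, in plain Python, the thinning
# rule used by inception_v2_base, max(int(d * depth_multiplier), min_depth),
# how it determines the concatenated depth of Mixed_3b, and the cap applied to
# depthwise_multiplier for Conv2d_1a_7x7. They assume a 3-channel input.
def _example_thinned_depth(d, depth_multiplier=1.0, min_depth=16):
  """Returns the channel count used for a nominal depth d."""
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  return max(int(d * depth_multiplier), min_depth)


def _example_mixed_3b_depth(depth_multiplier=1.0, min_depth=16):
  """Sums the four Mixed_3b branch output depths (64, 64, 96, 32 nominal)."""
  return sum(_example_thinned_depth(d, depth_multiplier, min_depth)
             for d in (64, 64, 96, 32))


def _example_depthwise_multiplier(depth_multiplier=1.0, min_depth=16):
  """Mirrors min(int(depth(64) / 3), 8) from the Conv2d_1a_7x7 layer."""
  thinned = _example_thinned_depth(64, depth_multiplier, min_depth)
  return min(int(thinned / 3), 8)

# With the defaults, Mixed_3b concatenates to 256 channels. Halving the
# multiplier halves that to 128, which is the relationship the depth
# multiplier tests rely on. Smaller multipliers hit the min_depth floor, so
# the scaling is no longer exact.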


def inception_v2(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 min_depth=16,
                 depth_multiplier=1.0,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV2'):
  """Inception v2 model for classification.

  Constructs an Inception v2 network for classification as described in
  http://arxiv.org/abs/1502.03167.

  The default image size used to train this network is 224x224.

  Args:
    inputs: a tensor of shape [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether the network is being built for training.
    dropout_keep_prob: the fraction of activation values that are retained.
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is
        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse, 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes], or [batch_size, 1, 1, num_classes] if
      spatial_squeeze is False.
    end_points: a dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: if depth_multiplier <= 0.
  """
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')

  # Final pooling and prediction
  with tf.variable_scope(scope, 'InceptionV2', [inputs, num_classes],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v2_base(
          inputs, scope=scope, min_depth=min_depth,
          depth_multiplier=depth_multiplier)
      with tf.variable_scope('Logits'):
        kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])
        net = slim.avg_pool2d(net, kernel_size, padding='VALID',
                              scope='AvgPool_1a_{}x{}'.format(*kernel_size))
        # 1 x 1 x 1024
        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_1c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
      end_points['Logits'] = logits
      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
inception_v2.default_image_size = 224


def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
  """Define kernel size which is automatically reduced for small input.

  If the shape of the input images is unknown at graph construction time this
  function assumes that the input images are large enough.

  Args:
    input_tensor: input tensor of size [batch_size, height, width, channels].
    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]

  Returns:
    a list with the kernel size, [kernel_height, kernel_width].

  TODO(jrru): Make this function work with unknown shapes. Theoretically, this
  can be done with the code below. Problems are two-fold: (1) If the shape was
  known, it will be lost. (2) inception.slim.ops._two_element_tuple cannot
  handle tensors that define the kernel size.
      shape = tf.shape(input_tensor)
      return tf.stack([tf.minimum(shape[1], kernel_size[0]),
                       tf.minimum(shape[2], kernel_size[1])])

  """
  shape = input_tensor.get_shape().as_list()
  if shape[1] is None or shape[2] is None:
    kernel_size_out = kernel_size
  else:
    kernel_size_out = [min(shape[1], kernel_size[0]),
                       min(shape[2], kernel_size[1])]
  return kernel_size_out
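

# Illustrative sketch (not part of the original module): a hypothetical
# pure-Python twin of _reduced_kernel_size_for_small_input. It takes a static
# shape list instead of a tensor, to show the clamping behaviour without
# building a graph.
def _example_reduced_kernel_size(shape, kernel_size):
  """shape is [batch, height, width, channels]; None marks an unknown dim."""
  if shape[1] is None or shape[2] is None:
    # Unknown spatial size: keep the requested kernel unchanged.
    return kernel_size
  return [min(shape[1], kernel_size[0]), min(shape[2], kernel_size[1])]

# A 7x7 feature map keeps the requested [7, 7] kernel, while a smaller map,
# for example 4x4, shrinks the kernel to [4, 4] so the 'VALID' average pool in
# the Logits scope still yields a 1x1 output.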


inception_v2_arg_scope = inception_utils.inception_arg_scope


================================================
FILE: nets/inception_v2_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for nets.inception_v2."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from nets import inception

slim = tf.contrib.slim


class InceptionV2Test(tf.test.TestCase):

  def testBuildClassificationNetwork(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v2(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('Predictions' in end_points)
    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),
                         [batch_size, num_classes])

  def testBuildBaseNetwork(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    mixed_5c, end_points = inception.inception_v2_base(inputs)
    self.assertTrue(mixed_5c.op.name.startswith('InceptionV2/Mixed_5c'))
    self.assertListEqual(mixed_5c.get_shape().as_list(),
                         [batch_size, 7, 7, 1024])
    expected_endpoints = ['Mixed_3b', 'Mixed_3c', 'Mixed_4a', 'Mixed_4b',
                          'Mixed_4c', 'Mixed_4d', 'Mixed_4e', 'Mixed_5a',
                          'Mixed_5b', 'Mixed_5c', 'Conv2d_1a_7x7',
                          'MaxPool_2a_3x3', 'Conv2d_2b_1x1', 'Conv2d_2c_3x3',
                          'MaxPool_3a_3x3']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)

  def testBuildOnlyUptoFinalEndpoint(self):
    batch_size = 5
    height, width = 224, 224
    endpoints = ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
                 'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
                 'Mixed_4a', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',
                 'Mixed_5a', 'Mixed_5b', 'Mixed_5c']
    for index, endpoint in enumerate(endpoints):
      with tf.Graph().as_default():
        inputs = tf.random_uniform((batch_size, height, width, 3))
        out_tensor, end_points = inception.inception_v2_base(
            inputs, final_endpoint=endpoint)
        self.assertTrue(out_tensor.op.name.startswith(
            'InceptionV2/' + endpoint))
        self.assertItemsEqual(endpoints[:index+1], end_points)

  def testBuildAndCheckAllEndPointsUptoMixed5c(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v2_base(inputs,
                                                final_endpoint='Mixed_5c')
    endpoints_shapes = {'Mixed_3b': [batch_size, 28, 28, 256],
                        'Mixed_3c': [batch_size, 28, 28, 320],
                        'Mixed_4a': [batch_size, 14, 14, 576],
                        'Mixed_4b': [batch_size, 14, 14, 576],
                        'Mixed_4c': [batch_size, 14, 14, 576],
                        'Mixed_4d': [batch_size, 14, 14, 576],
                        'Mixed_4e': [batch_size, 14, 14, 576],
                        'Mixed_5a': [batch_size, 7, 7, 1024],
                        'Mixed_5b': [batch_size, 7, 7, 1024],
                        'Mixed_5c': [batch_size, 7, 7, 1024],
                        'Conv2d_1a_7x7': [batch_size, 112, 112, 64],
                        'MaxPool_2a_3x3': [batch_size, 56, 56, 64],
                        'Conv2d_2b_1x1': [batch_size, 56, 56, 64],
                        'Conv2d_2c_3x3': [batch_size, 56, 56, 192],
                        'MaxPool_3a_3x3': [batch_size, 28, 28, 192]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name in endpoints_shapes:
      expected_shape = endpoints_shapes[endpoint_name]
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testModelHasExpectedNumberOfParameters(self):
    batch_size = 5
    height, width = 224, 224
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope(inception.inception_v2_arg_scope()):
      inception.inception_v2_base(inputs)
    total_params, _ = slim.model_analyzer.analyze_vars(
        slim.get_model_variables())
    self.assertAlmostEqual(10173112, total_params)

  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v2(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys()
                     if key.startswith('Mixed') or key.startswith('Conv')]

    _, end_points_with_multiplier = inception.inception_v2(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=0.5)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(0.5 * original_depth, new_depth)

  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v2(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys()
                     if key.startswith('Mixed') or key.startswith('Conv')]

    _, end_points_with_multiplier = inception.inception_v2(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=2.0)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(2.0 * original_depth, new_depth)

  def testRaiseValueErrorWithInvalidDepthMultiplier(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with self.assertRaises(ValueError):
      _ = inception.inception_v2(inputs, num_classes, depth_multiplier=-0.1)
    with self.assertRaises(ValueError):
      _ = inception.inception_v2(inputs, num_classes, depth_multiplier=0.0)

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 112, 112
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v2(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    pre_pool = end_points['Mixed_5c']
    self.assertListEqual(pre_pool.get_shape().as_list(),
                         [batch_size, 4, 4, 1024])

  def testUnknownImageShape(self):
    tf.reset_default_graph()
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000
    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))
      logits, end_points = inception.inception_v2(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['Mixed_5c']
      feed_dict = {inputs: input_np}
      tf.global_variables_initializer().run()
      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)
      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.placeholder(tf.float32, (None, height, width, 3))
    logits, _ = inception.inception_v2(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV2/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [None, num_classes])
    images = tf.random_uniform((batch_size, height, width, 3))

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000

    eval_inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, _ = inception.inception_v2(eval_inputs, num_classes,
                                       is_training=False)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 150, 150
    num_classes = 1000

    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
    inception.inception_v2(train_inputs, num_classes)
    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
    logits, _ = inception.inception_v2(eval_inputs, num_classes, reuse=True)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))

  def testLogitsNotSqueezed(self):
    num_classes = 25
    images = tf.random_uniform([1, 224, 224, 3])
    logits, _ = inception.inception_v2(images,
                                       num_classes=num_classes,
                                       spatial_squeeze=False)

    with self.test_session() as sess:
      tf.global_variables_initializer().run()
      logits_out = sess.run(logits)
      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])


if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/inception_v3.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition for inception v3 classification network."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
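

# Illustrative sketch, not part of the original network definition: the
# channel-scaling rule that inception_v3_base and inception_v3 apply through
# their local `depth` lambdas. `_example_scaled_depth` is a hypothetical helper
# name introduced here only to show the arithmetic.
def _example_scaled_depth(d, depth_multiplier=1.0, min_depth=16):
  """Returns the channel count used for a layer whose nominal depth is `d`."""
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  # int() truncates toward zero; min_depth acts as a floor on the result.
  return max(int(d * depth_multiplier), min_depth)
# For example, with depth_multiplier=0.5 a nominal depth of 192 becomes 96,
# while with depth_multiplier=0.25 a nominal depth of 32 is clamped to 16.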


def inception_v3_base(inputs,
                      final_endpoint='Mixed_7c',
                      min_depth=16,
                      depth_multiplier=1.0,
                      scope=None):
  """Inception model from http://arxiv.org/abs/1512.00567.

  Constructs an Inception v3 network from inputs to the given final endpoint.
  This method can construct the network up to the final inception block
  Mixed_7c.

  Note that the names of the layers in the paper do not correspond to the names
  of the endpoints registered by this function although they build the same
  network.

  Here is a mapping from the old names to the new names:
  Old name          | New name
  =======================================
  conv0             | Conv2d_1a_3x3
  conv1             | Conv2d_2a_3x3
  conv2             | Conv2d_2b_3x3
  pool1             | MaxPool_3a_3x3
  conv3             | Conv2d_3b_1x1
  conv4             | Conv2d_4a_3x3
  pool2             | MaxPool_5a_3x3
  mixed_35x35x256a  | Mixed_5b
  mixed_35x35x288a  | Mixed_5c
  mixed_35x35x288b  | Mixed_5d
  mixed_17x17x768a  | Mixed_6a
  mixed_17x17x768b  | Mixed_6b
  mixed_17x17x768c  | Mixed_6c
  mixed_17x17x768d  | Mixed_6d
  mixed_17x17x768e  | Mixed_6e
  mixed_8x8x1280a   | Mixed_7a
  mixed_8x8x2048a   | Mixed_7b
  mixed_8x8x2048b   | Mixed_7c

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
      'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3', 'MaxPool_5a_3x3',
      'Mixed_5b', 'Mixed_5c', 'Mixed_5d', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c',
      'Mixed_6d', 'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c'].
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    scope: Optional variable_scope.

  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a dictionary of activations, keyed by endpoint name, for
                external use, for example summaries or losses.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values,
                or depth_multiplier <= 0
  """
  # end_points will collect relevant activations for external use, for example
  # summaries or losses.
  end_points = {}

  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  depth = lambda d: max(int(d * depth_multiplier), min_depth)

  with tf.variable_scope(scope, 'InceptionV3', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='VALID'):
      # 299 x 299 x 3
      end_point = 'Conv2d_1a_3x3'
      net = slim.conv2d(inputs, depth(32), [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 149 x 149 x 32
      end_point = 'Conv2d_2a_3x3'
      net = slim.conv2d(net, depth(32), [3, 3], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 147 x 147 x 32
      end_point = 'Conv2d_2b_3x3'
      net = slim.conv2d(net, depth(64), [3, 3], padding='SAME', scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 147 x 147 x 64
      end_point = 'MaxPool_3a_3x3'
      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 73 x 73 x 64
      end_point = 'Conv2d_3b_1x1'
      net = slim.conv2d(net, depth(80), [1, 1], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 73 x 73 x 80.
      end_point = 'Conv2d_4a_3x3'
      net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 71 x 71 x 192.
      end_point = 'MaxPool_5a_3x3'
      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 35 x 35 x 192.

    # Inception blocks
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='SAME'):
      # mixed: 35 x 35 x 256.
      end_point = 'Mixed_5b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv2d_0b_5x5')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(32), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_1: 35 x 35 x 288.
      end_point = 'Mixed_5c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0b_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv_1_0c_5x5')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1],
                                 scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_2: 35 x 35 x 288.
      end_point = 'Mixed_5d'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv2d_0b_5x5')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_3: 17 x 17 x 768.
      end_point = 'Mixed_6a'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(384), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_1x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_4: 17 x 17 x 768.
      end_point = 'Mixed_6b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(128), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],
                                 scope='Conv2d_0b_7x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [1, 7],
                                 scope='Conv2d_0c_1x7')
          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],
                                 scope='Conv2d_0d_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0e_1x7')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_5: 17 x 17 x 768.
      end_point = 'Mixed_6c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(160), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                                 scope='Conv2d_0b_7x1')
          branch_2 = slim.conv2d(branch_2, depth(160), [1, 7],
                                 scope='Conv2d_0c_1x7')
          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                                 scope='Conv2d_0d_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0e_1x7')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # mixed_6: 17 x 17 x 768.
      end_point = 'Mixed_6d'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(160), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                                 scope='Conv2d_0b_7x1')
          branch_2 = slim.conv2d(branch_2, depth(160), [1, 7],
                                 scope='Conv2d_0c_1x7')
          branch_2 = slim.conv2d(branch_2, depth(160), [7, 1],
                                 scope='Conv2d_0d_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0e_1x7')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_7: 17 x 17 x 768.
      end_point = 'Mixed_6e'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [7, 1],
                                 scope='Conv2d_0b_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0c_1x7')
          branch_2 = slim.conv2d(branch_2, depth(192), [7, 1],
                                 scope='Conv2d_0d_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0e_1x7')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_8: 8 x 8 x 1280.
      end_point = 'Mixed_7a'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_0 = slim.conv2d(branch_0, depth(320), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # mixed_9: 8 x 8 x 2048.
      end_point = 'Mixed_7b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = tf.concat(axis=3, values=[
              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),
              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0b_3x1')])
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(
              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')
          branch_2 = tf.concat(axis=3, values=[
              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),
              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_10: 8 x 8 x 2048.
      end_point = 'Mixed_7c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = tf.concat(axis=3, values=[
              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),
              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0c_3x1')])
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(
              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')
          branch_2 = tf.concat(axis=3, values=[
              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),
              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)
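

# Illustrative sketch, not part of the original module: reproduces the spatial
# sizes quoted in the comments of inception_v3_base (299 -> 149 -> ... -> 35)
# from the usual TensorFlow padding rules. The helper names are hypothetical.
def _example_output_size(size, kernel, stride=1, padding='VALID'):
  """Output length along one spatial dimension of a conv or pooling op."""
  if padding == 'SAME':
    return -(-size // stride)  # ceil(size / stride)
  return (size - kernel) // stride + 1  # 'VALID': no implicit padding


def _example_stem_sizes(size=299):
  """Spatial size after each stem layer, Conv2d_1a_3x3 to MaxPool_5a_3x3."""
  stem = [(3, 2, 'VALID'),  # Conv2d_1a_3x3
          (3, 1, 'VALID'),  # Conv2d_2a_3x3
          (3, 1, 'SAME'),   # Conv2d_2b_3x3
          (3, 2, 'VALID'),  # MaxPool_3a_3x3
          (1, 1, 'VALID'),  # Conv2d_3b_1x1
          (3, 1, 'VALID'),  # Conv2d_4a_3x3
          (3, 2, 'VALID')]  # MaxPool_5a_3x3
  sizes = []
  for kernel, stride, padding in stem:
    size = _example_output_size(size, kernel, stride, padding)
    sizes.append(size)
  return sizes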


def inception_v3(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 min_depth=16,
                 depth_multiplier=1.0,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV3'):
  """Inception model from http://arxiv.org/abs/1512.00567.

  "Rethinking the Inception Architecture for Computer Vision"

  Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens,
  Zbigniew Wojna.

  With the default arguments this method constructs the exact model defined in
  the paper. However, one can experiment with variations of the inception_v3
  network by changing arguments dropout_keep_prob, min_depth and
  depth_multiplier.

  The default image size used to train this network is 299x299.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether the network is being built for training.
    dropout_keep_prob: the percentage of activation values that are retained.
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is
      of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes]
    end_points: a dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: if 'depth_multiplier' is less than or equal to zero.
  """
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  depth = lambda d: max(int(d * depth_multiplier), min_depth)

  with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v3_base(
          inputs, scope=scope, min_depth=min_depth,
          depth_multiplier=depth_multiplier)

      # Auxiliary Head logits
      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                          stride=1, padding='SAME'):
        aux_logits = end_points['Mixed_6e']
        with tf.variable_scope('AuxLogits'):
          # We want to pool the feature map down to 5x5xC.
          # With 'VALID' padding and stride 3, a kernel of H - 12 does this.
          kernel_size = [aux_logits.get_shape().as_list()[1] - 12] * 2
          aux_logits = slim.avg_pool2d(
              aux_logits, kernel_size, stride=3, padding='VALID',
              scope='AvgPool_1a_5x5')
          aux_logits = slim.conv2d(aux_logits, depth(128), [1, 1],
                                   scope='Conv2d_1b_1x1')

          # Shape of feature map before the final layer.
          kernel_size = _reduced_kernel_size_for_small_input(aux_logits, [5, 5])
          aux_logits = slim.conv2d(
              aux_logits, depth(768), kernel_size,
              weights_initializer=trunc_normal(0.01),
              padding='VALID', scope='Conv2d_2a_{}x{}'.format(*kernel_size))
          aux_logits = slim.conv2d(
              aux_logits, num_classes, [1, 1], activation_fn=None,
              normalizer_fn=None, weights_initializer=trunc_normal(0.001),
              scope='Conv2d_2b_1x1')
          if spatial_squeeze:
            aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
          end_points['AuxLogits'] = aux_logits

      # Final pooling and prediction
      with tf.variable_scope('Logits'):
        #kernel_size = _reduced_kernel_size_for_small_input(net, [8, 8])
        kernel_size = _kernel_to_1x1_for_specific_input(net)
        net = slim.avg_pool2d(net, kernel_size, padding='VALID',
                              scope='AvgPool_1a_{}x{}'.format(*kernel_size))
        # 1 x 1 x 2048
        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
        end_points['PreLogits'] = net
        # 2048
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_1c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
        # 1000
      end_points['Logits'] = logits
      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
inception_v3.default_image_size = 299


def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
  """Define kernel size which is automatically reduced for small input.

  If the shape of the input images is unknown at graph construction time this
  function assumes that the input images are large enough.

  Args:
    input_tensor: input tensor of size [batch_size, height, width, channels].
    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]

  Returns:
    a tensor with the kernel size.

  TODO(jrru): Make this function work with unknown shapes. Theoretically, this
  can be done with the code below. Problems are two-fold: (1) If the shape was
  known, it will be lost. (2) inception.slim.ops._two_element_tuple cannot
  handle tensors that define the kernel size.
      shape = tf.shape(input_tensor)
      return tf.stack([tf.minimum(shape[1], kernel_size[0]),
                       tf.minimum(shape[2], kernel_size[1])])

  """
  shape = input_tensor.get_shape().as_list()
  if shape[1] is None or shape[2] is None:
    kernel_size_out = kernel_size
  else:
    kernel_size_out = [min(shape[1], kernel_size[0]),
                       min(shape[2], kernel_size[1])]
  return kernel_size_out

def _kernel_to_1x1_for_specific_input(input_tensor):
  """Return a kernel that will transform the input_tensor into a vector.

  We want any input tensor of shape [B, H, W, C] to be transformed into
  [B, 1, 1, C]. We assume a known input shape.
  """
  shape = input_tensor.get_shape().as_list()
  return [shape[1], shape[2]]
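

# Illustrative sketch, not part of the original module: it works through the
# "kernel is H - 12" arithmetic used for the auxiliary head above. The helper
# name `_valid_output_size` is hypothetical. It assumes the standard 'VALID'
# rule out = (in - kernel) // stride + 1, which holds when in >= kernel > 0.
def _valid_output_size(input_size, kernel_size, stride):
  """Returns the output length of a 'VALID' window along one dimension."""
  return (input_size - kernel_size) // stride + 1


# Example: for the default 299 x 299 input, Mixed_6e is 17 x 17, so the kernel
# is 17 - 12 = 5 and _valid_output_size(17, 5, stride=3) evaluates to 5. Any
# H >= 13 gives the same 5 x 5 map; a smaller H would make H - 12 non-positive,
# which this sketch does not cover.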


inception_v3_arg_scope = inception_utils.inception_arg_scope


================================================
FILE: nets/inception_v3_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for nets.inception_v3."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from nets import inception

slim = tf.contrib.slim


class InceptionV3Test(tf.test.TestCase):

  def testBuildClassificationNetwork(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v3(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('Predictions' in end_points)
    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),
                         [batch_size, num_classes])

  def testBuildBaseNetwork(self):
    batch_size = 5
    height, width = 299, 299

    inputs = tf.random_uniform((batch_size, height, width, 3))
    final_endpoint, end_points = inception.inception_v3_base(inputs)
    self.assertTrue(final_endpoint.op.name.startswith(
        'InceptionV3/Mixed_7c'))
    self.assertListEqual(final_endpoint.get_shape().as_list(),
                         [batch_size, 8, 8, 2048])
    expected_endpoints = ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
                          'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3',
                          'MaxPool_5a_3x3', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
                          'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',
                          'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)

  def testBuildOnlyUptoFinalEndpoint(self):
    batch_size = 5
    height, width = 299, 299
    endpoints = ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
                 'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3',
                 'MaxPool_5a_3x3', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
                 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',
                 'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c']

    for index, endpoint in enumerate(endpoints):
      with tf.Graph().as_default():
        inputs = tf.random_uniform((batch_size, height, width, 3))
        out_tensor, end_points = inception.inception_v3_base(
            inputs, final_endpoint=endpoint)
        self.assertTrue(out_tensor.op.name.startswith(
            'InceptionV3/' + endpoint))
        self.assertItemsEqual(endpoints[:index+1], end_points)

  def testBuildAndCheckAllEndPointsUptoMixed7c(self):
    batch_size = 5
    height, width = 299, 299

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v3_base(
        inputs, final_endpoint='Mixed_7c')
    endpoints_shapes = {'Conv2d_1a_3x3': [batch_size, 149, 149, 32],
                        'Conv2d_2a_3x3': [batch_size, 147, 147, 32],
                        'Conv2d_2b_3x3': [batch_size, 147, 147, 64],
                        'MaxPool_3a_3x3': [batch_size, 73, 73, 64],
                        'Conv2d_3b_1x1': [batch_size, 73, 73, 80],
                        'Conv2d_4a_3x3': [batch_size, 71, 71, 192],
                        'MaxPool_5a_3x3': [batch_size, 35, 35, 192],
                        'Mixed_5b': [batch_size, 35, 35, 256],
                        'Mixed_5c': [batch_size, 35, 35, 288],
                        'Mixed_5d': [batch_size, 35, 35, 288],
                        'Mixed_6a': [batch_size, 17, 17, 768],
                        'Mixed_6b': [batch_size, 17, 17, 768],
                        'Mixed_6c': [batch_size, 17, 17, 768],
                        'Mixed_6d': [batch_size, 17, 17, 768],
                        'Mixed_6e': [batch_size, 17, 17, 768],
                        'Mixed_7a': [batch_size, 8, 8, 1280],
                        'Mixed_7b': [batch_size, 8, 8, 2048],
                        'Mixed_7c': [batch_size, 8, 8, 2048]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name in endpoints_shapes:
      expected_shape = endpoints_shapes[endpoint_name]
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testModelHasExpectedNumberOfParameters(self):
    batch_size = 5
    height, width = 299, 299
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope(inception.inception_v3_arg_scope()):
      inception.inception_v3_base(inputs)
    total_params, _ = slim.model_analyzer.analyze_vars(
        slim.get_model_variables())
    self.assertAlmostEqual(21802784, total_params)

  def testBuildEndPoints(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v3(inputs, num_classes)
    self.assertTrue('Logits' in end_points)
    logits = end_points['Logits']
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('AuxLogits' in end_points)
    aux_logits = end_points['AuxLogits']
    self.assertListEqual(aux_logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('Mixed_7c' in end_points)
    pre_pool = end_points['Mixed_7c']
    self.assertListEqual(pre_pool.get_shape().as_list(),
                         [batch_size, 8, 8, 2048])
    self.assertTrue('PreLogits' in end_points)
    pre_logits = end_points['PreLogits']
    self.assertListEqual(pre_logits.get_shape().as_list(),
                         [batch_size, 1, 1, 2048])

  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v3(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys()
                     if key.startswith('Mixed') or key.startswith('Conv')]

    _, end_points_with_multiplier = inception.inception_v3(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=0.5)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(0.5 * original_depth, new_depth)

  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v3(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys()
                     if key.startswith('Mixed') or key.startswith('Conv')]

    _, end_points_with_multiplier = inception.inception_v3(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=2.0)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(2.0 * original_depth, new_depth)

  def testRaiseValueErrorWithInvalidDepthMultiplier(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with self.assertRaises(ValueError):
      _ = inception.inception_v3(inputs, num_classes, depth_multiplier=-0.1)
    with self.assertRaises(ValueError):
      _ = inception.inception_v3(inputs, num_classes, depth_multiplier=0.0)

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 150, 150
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v3(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    pre_pool = end_points['Mixed_7c']
    self.assertListEqual(pre_pool.get_shape().as_list(),
                         [batch_size, 3, 3, 2048])

  def testUnknownImageShape(self):
    tf.reset_default_graph()
    batch_size = 2
    height, width = 299, 299
    num_classes = 1000
    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))
      logits, end_points = inception.inception_v3(inputs, num_classes)
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['Mixed_7c']
      feed_dict = {inputs: input_np}
      tf.global_variables_initializer().run()
      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)
      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 8, 8, 2048])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 299, 299
    num_classes = 1000

    inputs = tf.placeholder(tf.float32, (None, height, width, 3))
    logits, _ = inception.inception_v3(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV3/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [None, num_classes])
    images = tf.random_uniform((batch_size, height, width, 3))

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 299, 299
    num_classes = 1000

    eval_inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, _ = inception.inception_v3(eval_inputs, num_classes,
                                       is_training=False)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 150, 150
    num_classes = 1000

    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
    inception.inception_v3(train_inputs, num_classes)
    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
    logits, _ = inception.inception_v3(eval_inputs, num_classes,
                                       is_training=False, reuse=True)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))

  def testLogitsNotSqueezed(self):
    num_classes = 25
    images = tf.random_uniform([1, 299, 299, 3])
    logits, _ = inception.inception_v3(images,
                                       num_classes=num_classes,
                                       spatial_squeeze=False)

    with self.test_session() as sess:
      tf.global_variables_initializer().run()
      logits_out = sess.run(logits)
      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])


if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/inception_v4.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains the definition of the Inception V4 architecture.

As described in http://arxiv.org/abs/1602.07261.

  Inception-v4, Inception-ResNet and the Impact of Residual Connections
    on Learning
  Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception_utils

slim = tf.contrib.slim


def block_inception_a(inputs, scope=None, reuse=None):
  """Builds Inception-A block for Inception v4 network."""
  # By default use stride=1 and SAME padding
  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],
                      stride=1, padding='SAME'):
    with tf.variable_scope(scope, 'BlockInceptionA', [inputs], reuse=reuse):
      with tf.variable_scope('Branch_0'):
        branch_0 = slim.conv2d(inputs, 96, [1, 1], scope='Conv2d_0a_1x1')
      with tf.variable_scope('Branch_1'):
        branch_1 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1')
        branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3')
      with tf.variable_scope('Branch_2'):
        branch_2 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1')
        branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
        branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
      with tf.variable_scope('Branch_3'):
        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')
        branch_3 = slim.conv2d(branch_3, 96, [1, 1], scope='Conv2d_0b_1x1')
      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])


def block_reduction_a(inputs, scope=None, reuse=None):
  """Builds Reduction-A block for Inception v4 network."""
  # By default use stride=1 and SAME padding
  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],
                      stride=1, padding='SAME'):
    with tf.variable_scope(scope, 'BlockReductionA', [inputs], reuse=reuse):
      with tf.variable_scope('Branch_0'):
        branch_0 = slim.conv2d(inputs, 384, [3, 3], stride=2, padding='VALID',
                               scope='Conv2d_1a_3x3')
      with tf.variable_scope('Branch_1'):
        branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')
        branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')
        branch_1 = slim.conv2d(branch_1, 256, [3, 3], stride=2,
                               padding='VALID', scope='Conv2d_1a_3x3')
      with tf.variable_scope('Branch_2'):
        branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID',
                                   scope='MaxPool_1a_3x3')
      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2])


def block_inception_b(inputs, scope=None, reuse=None):
  """Builds Inception-B block for Inception v4 network."""
  # By default use stride=1 and SAME padding
  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],
                      stride=1, padding='SAME'):
    with tf.variable_scope(scope, 'BlockInceptionB', [inputs], reuse=reuse):
      with tf.variable_scope('Branch_0'):
        branch_0 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')
      with tf.variable_scope('Branch_1'):
        branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')
        branch_1 = slim.conv2d(branch_1, 224, [1, 7], scope='Conv2d_0b_1x7')
        branch_1 = slim.conv2d(branch_1, 256, [7, 1], scope='Conv2d_0c_7x1')
      with tf.variable_scope('Branch_2'):
        branch_2 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')
        branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1')
        branch_2 = slim.conv2d(branch_2, 224, [1, 7], scope='Conv2d_0c_1x7')
        branch_2 = slim.conv2d(branch_2, 224, [7, 1], scope='Conv2d_0d_7x1')
        branch_2 = slim.conv2d(branch_2, 256, [1, 7], scope='Conv2d_0e_1x7')
      with tf.variable_scope('Branch_3'):
        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')
        branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])


def block_reduction_b(inputs, scope=None, reuse=None):
  """Builds Reduction-B block for Inception v4 network."""
  # By default use stride=1 and SAME padding
  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],
                      stride=1, padding='SAME'):
    with tf.variable_scope(scope, 'BlockReductionB', [inputs], reuse=reuse):
      with tf.variable_scope('Branch_0'):
        branch_0 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1')
        branch_0 = slim.conv2d(branch_0, 192, [3, 3], stride=2,
                               padding='VALID', scope='Conv2d_1a_3x3')
      with tf.variable_scope('Branch_1'):
        branch_1 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1')
        branch_1 = slim.conv2d(branch_1, 256, [1, 7], scope='Conv2d_0b_1x7')
        branch_1 = slim.conv2d(branch_1, 320, [7, 1], scope='Conv2d_0c_7x1')
        branch_1 = slim.conv2d(branch_1, 320, [3, 3], stride=2,
                               padding='VALID', scope='Conv2d_1a_3x3')
      with tf.variable_scope('Branch_2'):
        branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID',
                                   scope='MaxPool_1a_3x3')
      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2])


def block_inception_c(inputs, scope=None, reuse=None):
  """Builds Inception-C block for Inception v4 network."""
  # By default use stride=1 and SAME padding
  with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d],
                      stride=1, padding='SAME'):
    with tf.variable_scope(scope, 'BlockInceptionC', [inputs], reuse=reuse):
      with tf.variable_scope('Branch_0'):
        branch_0 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1')
      with tf.variable_scope('Branch_1'):
        branch_1 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')
        branch_1 = tf.concat(axis=3, values=[
            slim.conv2d(branch_1, 256, [1, 3], scope='Conv2d_0b_1x3'),
            slim.conv2d(branch_1, 256, [3, 1], scope='Conv2d_0c_3x1')])
      with tf.variable_scope('Branch_2'):
        branch_2 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1')
        branch_2 = slim.conv2d(branch_2, 448, [3, 1], scope='Conv2d_0b_3x1')
        branch_2 = slim.conv2d(branch_2, 512, [1, 3], scope='Conv2d_0c_1x3')
        branch_2 = tf.concat(axis=3, values=[
            slim.conv2d(branch_2, 256, [1, 3], scope='Conv2d_0d_1x3'),
            slim.conv2d(branch_2, 256, [3, 1], scope='Conv2d_0e_3x1')])
      with tf.variable_scope('Branch_3'):
        branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3')
        branch_3 = slim.conv2d(branch_3, 256, [1, 1], scope='Conv2d_0b_1x1')
      return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])


def inception_v4_base(inputs, final_endpoint='Mixed_7d', scope=None):
  """Creates the Inception V4 network up to the given final endpoint.

  Args:
    inputs: a 4-D tensor of size [batch_size, height, width, 3].
    final_endpoint: specifies the endpoint to construct the network up to.
      It can be one of [ 'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
      'Mixed_3a', 'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
      'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d', 'Mixed_6e',
      'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c',
      'Mixed_7d']
    scope: Optional variable_scope.

  Returns:
    net: the output tensor of the layer selected by final_endpoint.
    end_points: the set of end_points from the inception model.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values.
  """
  end_points = {}

  def add_and_check_final(name, net):
    end_points[name] = net
    return name == final_endpoint

  with tf.variable_scope(scope, 'InceptionV4', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='SAME'):
      # 299 x 299 x 3
      net = slim.conv2d(inputs, 32, [3, 3], stride=2,
                        padding='VALID', scope='Conv2d_1a_3x3')
      if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points
      # 149 x 149 x 32
      net = slim.conv2d(net, 32, [3, 3], padding='VALID',
                        scope='Conv2d_2a_3x3')
      if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points
      # 147 x 147 x 32
      net = slim.conv2d(net, 64, [3, 3], scope='Conv2d_2b_3x3')
      if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points
      # 147 x 147 x 64
      with tf.variable_scope('Mixed_3a'):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_0a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, 96, [3, 3], stride=2, padding='VALID',
                                 scope='Conv2d_0a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1])
        if add_and_check_final('Mixed_3a', net): return net, end_points

      # 73 x 73 x 160
      with tf.variable_scope('Mixed_4a'):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          branch_0 = slim.conv2d(branch_0, 96, [3, 3], padding='VALID',
                                 scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, 64, [1, 7], scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, 64, [7, 1], scope='Conv2d_0c_7x1')
          branch_1 = slim.conv2d(branch_1, 96, [3, 3], padding='VALID',
                                 scope='Conv2d_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1])
        if add_and_check_final('Mixed_4a', net): return net, end_points

      # 71 x 71 x 192
      with tf.variable_scope('Mixed_5a'):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, 192, [3, 3], stride=2, padding='VALID',
                                 scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1])
        if add_and_check_final('Mixed_5a', net): return net, end_points

      # 35 x 35 x 384
      # 4 x Inception-A blocks
      for idx in range(4):
        block_scope = 'Mixed_5' + chr(ord('b') + idx)
        net = block_inception_a(net, block_scope)
        if add_and_check_final(block_scope, net): return net, end_points

      # 35 x 35 x 384
      # Reduction-A block
      net = block_reduction_a(net, 'Mixed_6a')
      if add_and_check_final('Mixed_6a', net): return net, end_points

      # 17 x 17 x 1024
      # 7 x Inception-B blocks
      for idx in range(7):
        block_scope = 'Mixed_6' + chr(ord('b') + idx)
        net = block_inception_b(net, block_scope)
        if add_and_check_final(block_scope, net): return net, end_points

      # 17 x 17 x 1024
      # Reduction-B block
      net = block_reduction_b(net, 'Mixed_7a')
      if add_and_check_final('Mixed_7a', net): return net, end_points

      # 8 x 8 x 1536
      # 3 x Inception-C blocks
      for idx in range(3):
        block_scope = 'Mixed_7' + chr(ord('b') + idx)
        net = block_inception_c(net, block_scope)
        if add_and_check_final(block_scope, net): return net, end_points
  raise ValueError('Unknown final endpoint %s' % final_endpoint)
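

# Illustrative sketch, not part of the original module: the three loops above
# build block scope names with 'Mixed_N' + chr(ord('b') + idx). The helper
# name `_looped_block_scopes` is hypothetical; it only reproduces the naming,
# not the network construction.
def _looped_block_scopes():
  """Lists the scope names generated by the Inception-A/B/C loops, in order."""
  names = []
  for prefix, count in [('Mixed_5', 4), ('Mixed_6', 7), ('Mixed_7', 3)]:
    names.extend(prefix + chr(ord('b') + idx) for idx in range(count))
  return names


# The result runs 'Mixed_5b'..'Mixed_5e', 'Mixed_6b'..'Mixed_6h' and
# 'Mixed_7b'..'Mixed_7d'; 'Mixed_6a' and 'Mixed_7a' are the reduction blocks
# built outside the loops.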


def inception_v4(inputs, num_classes=1001, is_training=True,
                 dropout_keep_prob=0.8,
                 reuse=None,
                 scope='InceptionV4',
                 create_aux_logits=True):
  """Creates the Inception V4 model.

  Args:
    inputs: a 4-D tensor of size [batch_size, height, width, 3].
    num_classes: number of predicted classes.
    is_training: whether the model is being built for training.
    dropout_keep_prob: float, the fraction to keep before final layer.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.
    create_aux_logits: Whether to include the auxiliary logits.

  Returns:
    logits: the logits outputs of the model.
    end_points: the set of end_points from the inception model.
  """
  end_points = {}
  with tf.variable_scope(scope, 'InceptionV4', [inputs], reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v4_base(inputs, scope=scope)

      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                          stride=1, padding='SAME'):
        # Auxiliary Head logits
        if create_aux_logits:
          with tf.variable_scope('AuxLogits'):
            # 17 x 17 x 1024
            aux_logits = end_points['Mixed_6h']
            # Originally kernel_size = 5, which assumes a 17 x 17 feature map.
            # To support other input sizes, derive the kernel from the feature
            # map height H so that the pooled map is still 5 x 5 x C: with
            # 'VALID' padding and stride 3, the kernel must be H - 12.
            kernel_size = [aux_logits.get_shape().as_list()[1] - 12] * 2
            aux_logits = slim.avg_pool2d(aux_logits, kernel_size, stride=3,
                                         padding='VALID',
                                         scope='AvgPool_1a_5x5')
            aux_logits = slim.conv2d(aux_logits, 128, [1, 1],
                                     scope='Conv2d_1b_1x1')
            aux_logits = slim.conv2d(aux_logits, 768,
                                     aux_logits.get_shape()[1:3],
                                     padding='VALID', scope='Conv2d_2a')
            aux_logits = slim.flatten(aux_logits)
            aux_logits = slim.fully_connected(aux_logits, num_classes,
                                              activation_fn=None,
                                              scope='Aux_logits')
            end_points['AuxLogits'] = aux_logits

        # Final pooling and prediction
        with tf.variable_scope('Logits'):
          # 8 x 8 x 1536
          net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',
                                scope='AvgPool_1a')
          # 1 x 1 x 1536
          net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b')
          net = slim.flatten(net, scope='PreLogitsFlatten')
          end_points['PreLogitsFlatten'] = net
          # 1536
          logits = slim.fully_connected(net, num_classes, activation_fn=None,
                                        scope='Logits')
          end_points['Logits'] = logits
          end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions')
    return logits, end_points
inception_v4.default_image_size = 299


inception_v4_arg_scope = inception_utils.inception_arg_scope


================================================
FILE: nets/inception_v4_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for nets.inception_v4."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import inception


class InceptionTest(tf.test.TestCase):

  def testBuildLogits(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v4(inputs, num_classes)
    auxlogits = end_points['AuxLogits']
    predictions = end_points['Predictions']
    self.assertTrue(auxlogits.op.name.startswith('InceptionV4/AuxLogits'))
    self.assertListEqual(auxlogits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue(predictions.op.name.startswith(
        'InceptionV4/Logits/Predictions'))
    self.assertListEqual(predictions.get_shape().as_list(),
                         [batch_size, num_classes])

  def testBuildWithoutAuxLogits(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, endpoints = inception.inception_v4(inputs, num_classes,
                                               create_aux_logits=False)
    self.assertFalse('AuxLogits' in endpoints)
    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])

  def testAllEndPointsShapes(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = inception.inception_v4(inputs, num_classes)
    endpoints_shapes = {'Conv2d_1a_3x3': [batch_size, 149, 149, 32],
                        'Conv2d_2a_3x3': [batch_size, 147, 147, 32],
                        'Conv2d_2b_3x3': [batch_size, 147, 147, 64],
                        'Mixed_3a': [batch_size, 73, 73, 160],
                        'Mixed_4a': [batch_size, 71, 71, 192],
                        'Mixed_5a': [batch_size, 35, 35, 384],
                        # 4 x Inception-A blocks
                        'Mixed_5b': [batch_size, 35, 35, 384],
                        'Mixed_5c': [batch_size, 35, 35, 384],
                        'Mixed_5d': [batch_size, 35, 35, 384],
                        'Mixed_5e': [batch_size, 35, 35, 384],
                        # Reduction-A block
                        'Mixed_6a': [batch_size, 17, 17, 1024],
                        # 7 x Inception-B blocks
                        'Mixed_6b': [batch_size, 17, 17, 1024],
                        'Mixed_6c': [batch_size, 17, 17, 1024],
                        'Mixed_6d': [batch_size, 17, 17, 1024],
                        'Mixed_6e': [batch_size, 17, 17, 1024],
                        'Mixed_6f': [batch_size, 17, 17, 1024],
                        'Mixed_6g': [batch_size, 17, 17, 1024],
                        'Mixed_6h': [batch_size, 17, 17, 1024],
                        # Reduction-B block
                        'Mixed_7a': [batch_size, 8, 8, 1536],
                        # 3 x Inception-C blocks
                        'Mixed_7b': [batch_size, 8, 8, 1536],
                        'Mixed_7c': [batch_size, 8, 8, 1536],
                        'Mixed_7d': [batch_size, 8, 8, 1536],
                        # Logits and predictions
                        'AuxLogits': [batch_size, num_classes],
                        'PreLogitsFlatten': [batch_size, 1536],
                        'Logits': [batch_size, num_classes],
                        'Predictions': [batch_size, num_classes]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name in endpoints_shapes:
      expected_shape = endpoints_shapes[endpoint_name]
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testBuildBaseNetwork(self):
    batch_size = 5
    height, width = 299, 299
    inputs = tf.random_uniform((batch_size, height, width, 3))
    net, end_points = inception.inception_v4_base(inputs)
    self.assertTrue(net.op.name.startswith(
        'InceptionV4/Mixed_7d'))
    self.assertListEqual(net.get_shape().as_list(), [batch_size, 8, 8, 1536])
    expected_endpoints = [
        'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 'Mixed_3a',
        'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
        'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',
        'Mixed_6e', 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a',
        'Mixed_7b', 'Mixed_7c', 'Mixed_7d']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)
    for name, op in end_points.items():
      self.assertTrue(op.name.startswith('InceptionV4/' + name))

  def testBuildOnlyUpToFinalEndpoint(self):
    batch_size = 5
    height, width = 299, 299
    all_endpoints = [
        'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 'Mixed_3a',
        'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
        'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d',
        'Mixed_6e', 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a',
        'Mixed_7b', 'Mixed_7c', 'Mixed_7d']
    for index, endpoint in enumerate(all_endpoints):
      with tf.Graph().as_default():
        inputs = tf.random_uniform((batch_size, height, width, 3))
        out_tensor, end_points = inception.inception_v4_base(
            inputs, final_endpoint=endpoint)
        self.assertTrue(out_tensor.op.name.startswith(
            'InceptionV4/' + endpoint))
        self.assertItemsEqual(all_endpoints[:index+1], end_points)

  def testVariablesSetDevice(self):
    batch_size = 5
    height, width = 299, 299
    num_classes = 1000
    inputs = tf.random_uniform((batch_size, height, width, 3))
    # Force all Variables to reside on the device.
    with tf.variable_scope('on_cpu'), tf.device('/cpu:0'):
      inception.inception_v4(inputs, num_classes)
    with tf.variable_scope('on_gpu'), tf.device('/gpu:0'):
      inception.inception_v4(inputs, num_classes)
    for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_cpu'):
      self.assertDeviceEqual(v.device, '/cpu:0')
    for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='on_gpu'):
      self.assertDeviceEqual(v.device, '/gpu:0')

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 150, 150
    num_classes = 1000
    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = inception.inception_v4(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    pre_pool = end_points['Mixed_7d']
    self.assertListEqual(pre_pool.get_shape().as_list(),
                         [batch_size, 3, 3, 1536])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 299, 299
    num_classes = 1000
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, (None, height, width, 3))
      logits, _ = inception.inception_v4(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('InceptionV4/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [None, num_classes])
      images = tf.random_uniform((batch_size, height, width, 3))
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 299, 299
    num_classes = 1000
    with self.test_session() as sess:
      eval_inputs = tf.random_uniform((batch_size, height, width, 3))
      logits, _ = inception.inception_v4(eval_inputs,
                                         num_classes,
                                         is_training=False)
      predictions = tf.argmax(logits, 1)
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 150, 150
    num_classes = 1000
    with self.test_session() as sess:
      train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
      inception.inception_v4(train_inputs, num_classes)
      eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
      logits, _ = inception.inception_v4(eval_inputs,
                                         num_classes,
                                         is_training=False,
                                         reuse=True)
      predictions = tf.argmax(logits, 1)
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))


if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/mobilenet_v1.py
================================================
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================
"""MobileNet v1.

MobileNet is a general architecture and can be used for multiple use cases.
Depending on the use case, it can use different input layer size and different
head (for example: embeddings, localization and classification).

As described in https://arxiv.org/abs/1704.04861.

  MobileNets: Efficient Convolutional Neural Networks for
    Mobile Vision Applications
  Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang,
    Tobias Weyand, Marco Andreetto, Hartwig Adam

100% Mobilenet V1 (base) with input size 224x224:

See mobilenet_v1()

Layer                                                     params           macs
--------------------------------------------------------------------------------
MobilenetV1/Conv2d_0/Conv2D:                                 864      10,838,016
MobilenetV1/Conv2d_1_depthwise/depthwise:                    288       3,612,672
MobilenetV1/Conv2d_1_pointwise/Conv2D:                     2,048      25,690,112
MobilenetV1/Conv2d_2_depthwise/depthwise:                    576       1,806,336
MobilenetV1/Conv2d_2_pointwise/Conv2D:                     8,192      25,690,112
MobilenetV1/Conv2d_3_depthwise/depthwise:                  1,152       3,612,672
MobilenetV1/Conv2d_3_pointwise/Conv2D:                    16,384      51,380,224
MobilenetV1/Conv2d_4_depthwise/depthwise:                  1,152         903,168
MobilenetV1/Conv2d_4_pointwise/Conv2D:                    32,768      25,690,112
MobilenetV1/Conv2d_5_depthwise/depthwise:                  2,304       1,806,336
MobilenetV1/Conv2d_5_pointwise/Conv2D:                    65,536      51,380,224
MobilenetV1/Conv2d_6_depthwise/depthwise:                  2,304         451,584
MobilenetV1/Conv2d_6_pointwise/Conv2D:                   131,072      25,690,112
MobilenetV1/Conv2d_7_depthwise/depthwise:                  4,608         903,168
MobilenetV1/Conv2d_7_pointwise/Conv2D:                   262,144      51,380,224
MobilenetV1/Conv2d_8_depthwise/depthwise:                  4,608         903,168
MobilenetV1/Conv2d_8_pointwise/Conv2D:                   262,144      51,380,224
MobilenetV1/Conv2d_9_depthwise/depthwise:                  4,608         903,168
MobilenetV1/Conv2d_9_pointwise/Conv2D:                   262,144      51,380,224
MobilenetV1/Conv2d_10_depthwise/depthwise:                 4,608         903,168
MobilenetV1/Conv2d_10_pointwise/Conv2D:                  262,144      51,380,224
MobilenetV1/Conv2d_11_depthwise/depthwise:                 4,608         903,168
MobilenetV1/Conv2d_11_pointwise/Conv2D:                  262,144      51,380,224
MobilenetV1/Conv2d_12_depthwise/depthwise:                 4,608         225,792
MobilenetV1/Conv2d_12_pointwise/Conv2D:                  524,288      25,690,112
MobilenetV1/Conv2d_13_depthwise/depthwise:                 9,216         451,584
MobilenetV1/Conv2d_13_pointwise/Conv2D:                1,048,576      51,380,224
--------------------------------------------------------------------------------
Total:                                                 3,185,088     567,716,352


75% Mobilenet V1 (base) with input size 128x128:

See mobilenet_v1_075()

Layer                                                     params           macs
--------------------------------------------------------------------------------
MobilenetV1/Conv2d_0/Conv2D:                                 648       2,654,208
MobilenetV1/Conv2d_1_depthwise/depthwise:                    216         884,736
MobilenetV1/Conv2d_1_pointwise/Conv2D:                     1,152       4,718,592
MobilenetV1/Conv2d_2_depthwise/depthwise:                    432         442,368
MobilenetV1/Conv2d_2_pointwise/Conv2D:                     4,608       4,718,592
MobilenetV1/Conv2d_3_depthwise/depthwise:                    864         884,736
MobilenetV1/Conv2d_3_pointwise/Conv2D:                     9,216       9,437,184
MobilenetV1/Conv2d_4_depthwise/depthwise:                    864         221,184
MobilenetV1/Conv2d_4_pointwise/Conv2D:                    18,432       4,718,592
MobilenetV1/Conv2d_5_depthwise/depthwise:                  1,728         442,368
MobilenetV1/Conv2d_5_pointwise/Conv2D:                    36,864       9,437,184
MobilenetV1/Conv2d_6_depthwise/depthwise:                  1,728         110,592
MobilenetV1/Conv2d_6_pointwise/Conv2D:                    73,728       4,718,592
MobilenetV1/Conv2d_7_depthwise/depthwise:                  3,456         221,184
MobilenetV1/Conv2d_7_pointwise/Conv2D:                   147,456       9,437,184
MobilenetV1/Conv2d_8_depthwise/depthwise:                  3,456         221,184
MobilenetV1/Conv2d_8_pointwise/Conv2D:                   147,456       9,437,184
MobilenetV1/Conv2d_9_depthwise/depthwise:                  3,456         221,184
MobilenetV1/Conv2d_9_pointwise/Conv2D:                   147,456       9,437,184
MobilenetV1/Conv2d_10_depthwise/depthwise:                 3,456         221,184
MobilenetV1/Conv2d_10_pointwise/Conv2D:                  147,456       9,437,184
MobilenetV1/Conv2d_11_depthwise/depthwise:                 3,456         221,184
MobilenetV1/Conv2d_11_pointwise/Conv2D:                  147,456       9,437,184
MobilenetV1/Conv2d_12_depthwise/depthwise:                 3,456          55,296
MobilenetV1/Conv2d_12_pointwise/Conv2D:                  294,912       4,718,592
MobilenetV1/Conv2d_13_depthwise/depthwise:                 6,912         110,592
MobilenetV1/Conv2d_13_pointwise/Conv2D:                  589,824       9,437,184
--------------------------------------------------------------------------------
Total:                                                 1,800,144     106,002,432
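
The totals in both tables can be re-derived from the layer definitions. The
following is a minimal sketch and not part of this module: `LAYERS` and
`count_params_and_macs` are hypothetical names introduced for illustration.
It assumes a square 3-channel input and 'SAME' padding, and it counts only
convolution weights (no batch norm parameters and no logits layer).

```python
# (stride, depth) for each layer, mirroring _CONV_DEFS in this module.
# The first entry is a regular 3x3 convolution; the rest are depthwise
# separable convolutions (3x3 depthwise followed by 1x1 pointwise).
LAYERS = [(2, 32), (1, 64), (2, 128), (1, 128), (2, 256), (1, 256),
          (2, 512), (1, 512), (1, 512), (1, 512), (1, 512), (1, 512),
          (2, 1024), (1, 1024)]


def count_params_and_macs(input_size=224, depth_multiplier=1.0, min_depth=8):
  # Same thinning rule as mobilenet_v1_base().
  depth = lambda d: max(int(d * depth_multiplier), min_depth)
  size, channels = input_size, 3
  params, macs = 0, 0
  for i, (stride, d) in enumerate(LAYERS):
    size = -(-size // stride)  # 'SAME' padding: ceil(size / stride)
    out = depth(d)
    if i == 0:
      layer_params = [3 * 3 * channels * out]          # full 3x3 conv
    else:
      layer_params = [3 * 3 * channels,                # 3x3 depthwise
                      channels * out]                  # 1x1 pointwise
    for p in layer_params:
      params += p
      macs += p * size * size  # one MAC per weight per output position
    channels = out
  return params, macs


print(count_params_and_macs())                                # 100%, 224x224
print(count_params_and_macs(input_size=128, depth_multiplier=0.75))
```

Under these assumptions the two calls reproduce the "Total" rows of the
100% / 224x224 and 75% / 128x128 tables respectively.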

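mobilenet_v1_base() also accepts an output_stride argument. Once the running
product of strides reaches output_stride, later layers use stride 1 and an
atrous rate instead. The sketch below replays only that bookkeeping in plain
Python; `STRIDES` and `plan_strides` are hypothetical names, not part of this
module. Note that in mobilenet_v1_base() the computed stride and rate are
passed to the depthwise convolutions only; the first regular convolution
always uses its own stride.

```python
# Stride of each layer, mirroring _CONV_DEFS in this module.
STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def plan_strides(output_stride=None):
  # Returns one (layer_stride, layer_rate) pair per layer.
  current_stride = 1  # running product of the strides applied so far
  rate = 1            # atrous rate to use for the next layer
  plan = []
  for stride in STRIDES:
    if output_stride is not None and current_stride == output_stride:
      # Target reached: keep the resolution and dilate instead.
      layer_stride, layer_rate = 1, rate
      rate *= stride
    else:
      layer_stride, layer_rate = stride, 1
      current_stride *= stride
    plan.append((layer_stride, layer_rate))
  return plan


for i, (s, r) in enumerate(plan_strides(output_stride=16)):
  print('Conv2d_%d: stride=%d rate=%d' % (i, s, r))
```

With output_stride=16 this sketch keeps the original strides through
Conv2d_11, then uses stride 1 for Conv2d_12 and Conv2d_13, with atrous rate 2
on Conv2d_13.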
"""

# Tensorflow mandates these.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from collections import namedtuple
import functools

import tensorflow as tf

slim = tf.contrib.slim

# Conv and DepthSepConv namedtuple define layers of the MobileNet architecture
# Conv defines 3x3 convolution layers
# DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution.
# stride is the stride of the convolution
# depth is the number of channels or filters in a layer
Conv = namedtuple('Conv', ['kernel', 'stride', 'depth'])
DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth'])

# _CONV_DEFS specifies the MobileNet body
_CONV_DEFS = [
    Conv(kernel=[3, 3], stride=2, depth=32),
    DepthSepConv(kernel=[3, 3], stride=1, depth=64),
    DepthSepConv(kernel=[3, 3], stride=2, depth=128),
    DepthSepConv(kernel=[3, 3], stride=1, depth=128),
    DepthSepConv(kernel=[3, 3], stride=2, depth=256),
    DepthSepConv(kernel=[3, 3], stride=1, depth=256),
    DepthSepConv(kernel=[3, 3], stride=2, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=2, depth=1024),
    DepthSepConv(kernel=[3, 3], stride=1, depth=1024)
]


def mobilenet_v1_base(inputs,
                      final_endpoint='Conv2d_13_pointwise',
                      min_depth=8,
                      depth_multiplier=1.0,
                      conv_defs=None,
                      output_stride=None,
                      scope=None):
  """Mobilenet v1.

  Constructs a Mobilenet v1 network from inputs to the given final endpoint.

  Args:
    inputs: a tensor of shape [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise',
      'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5_pointwise',
      'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise',
      'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise',
      'Conv2d_12_pointwise', 'Conv2d_13_pointwise'].
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    conv_defs: A list of Conv and DepthSepConv namedtuples specifying the net
      architecture. Defaults to _CONV_DEFS when None.
    output_stride: An integer that specifies the requested ratio of input to
      output spatial resolution. If not None, then we invoke atrous convolution
      if necessary to prevent the network from reducing the spatial resolution
      of the activation maps. Allowed values are 8 (accurate fully convolutional
      mode), 16 (fast fully convolutional mode), 32 (classification mode).
    scope: Optional variable_scope.

  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a dictionary of activations for external use, for example
                summaries or losses.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values,
                or depth_multiplier <= 0, or the target output_stride is not
                allowed.
  """
  # Used to find thinned depths for each layer.
  depth = lambda d: max(int(d * depth_multiplier), min_depth)
  end_points = {}

  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')

  if conv_defs is None:
    conv_defs = _CONV_DEFS

  if output_stride is not None and output_stride not in [8, 16, 32]:
    raise ValueError('Only allowed output_stride values are 8, 16, 32.')

  with tf.variable_scope(scope, 'MobilenetV1', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding='SAME'):
      # The current_stride variable keeps track of the output stride of the
      # activations, i.e., the running product of convolution strides up to the
      # current network layer. This allows us to invoke atrous convolution
      # whenever applying the next convolution would result in the activations
      # having output stride larger than the target output_stride.
      current_stride = 1

      # The atrous convolution rate parameter.
      rate = 1

      net = inputs
      for i, conv_def in enumerate(conv_defs):
        end_point_base = 'Conv2d_%d' % i

        if output_stride is not None and current_stride == output_stride:
          # If we have reached the target output_stride, then we need to employ
          # atrous convolution with stride=1 and multiply the atrous rate by the
          # current unit's stride for use in subsequent layers.
          layer_stride = 1
          layer_rate = rate
          rate *= conv_def.stride
        else:
          layer_stride = conv_def.stride
          layer_rate = 1
          current_stride *= conv_def.stride

        if isinstance(conv_def, Conv):
          end_point = end_point_base
          net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel,
                            stride=conv_def.stride,
                            normalizer_fn=slim.batch_norm,
                            scope=end_point)
          end_points[end_point] = net
          if end_point == final_endpoint:
            return net, end_points

        elif isinstance(conv_def, DepthSepConv):
          end_point = end_point_base + '_depthwise'

          # By passing filters=None
          # separable_conv2d produces only a depthwise convolution layer
          net = slim.separable_conv2d(net, None, conv_def.kernel,
                                      depth_multiplier=1,
                                      stride=layer_stride,
                                      rate=layer_rate,
                                      normalizer_fn=slim.batch_norm,
                                      scope=end_point)

          end_points[end_point] = net
          if end_point == final_endpoint:
            return net, end_points

          end_point = end_point_base + '_pointwise'

          net = slim.conv2d(net, depth(conv_def.depth), [1, 1],
                            stride=1,
                            normalizer_fn=slim.batch_norm,
                            scope=end_point)

          end_points[end_point] = net
          if end_point == final_endpoint:
            return net, end_points
        else:
          # The namedtuples have no 'ltype' field, so report the class name.
          raise ValueError('Unknown convolution type %s for layer %d'
                           % (type(conv_def).__name__, i))
  raise ValueError('Unknown final endpoint %s' % final_endpoint)


def mobilenet_v1(inputs,
                 num_classes=1000,
                 dropout_keep_prob=0.999,
                 is_training=True,
                 min_depth=8,
                 depth_multiplier=1.0,
                 conv_defs=None,
                 prediction_fn=tf.contrib.layers.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='MobilenetV1'):
  """Mobilenet v1 model for classification.

  Args:
    inputs: a tensor of shape [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    dropout_keep_prob: the fraction of activation values that are retained.
    is_training: whether the network is being built for training.
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    conv_defs: A list of Conv and DepthSepConv namedtuples specifying the net
      architecture. Defaults to _CONV_DEFS when None.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is
        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse, 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes]
    end_points: a dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: Input rank is invalid.
  """
  input_shape = inputs.get_shape().as_list()
  if len(input_shape) != 4:
    raise ValueError('Invalid input tensor rank, expected 4, was: %d' %
                     len(input_shape))

  with tf.variable_scope(scope, 'MobilenetV1', [inputs, num_classes],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = mobilenet_v1_base(inputs, scope=scope,
                                          min_depth=min_depth,
                                          depth_multiplier=depth_multiplier,
                                          conv_defs=conv_defs)
      with tf.variable_scope('Logits'):
        #kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])
        kernel_size = net.get_shape()[1:3]
        net = slim.avg_pool2d(net, kernel_size, padding='VALID',
                              scope='AvgPool_1a')
        end_points['AvgPool_1a'] = net
        # 1 x 1 x 1024
        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_1c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
      end_points['Logits'] = logits
      if prediction_fn:
        end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points

mobilenet_v1.default_image_size = 224


def wrapped_partial(func, *args, **kwargs):
  partial_func = functools.partial(func, *args, **kwargs)
  functools.update_wrapper(partial_func, func)
  return partial_func


mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75)
mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50)
mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25)


def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
  """Define kernel size which is automatically reduced for small input.

  If the shape of the input images is unknown at graph construction time this
  function assumes that the input images are large enough.

  Args:
    input_tensor: input tensor of size [batch_size, height, width, channels].
    kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]

  Returns:
    a list [kernel_height, kernel_width] with the (possibly reduced) kernel
    size.
  """
  shape = input_tensor.get_shape().as_list()
  if shape[1] is None or shape[2] is None:
    kernel_size_out = kernel_size
  else:
    kernel_size_out = [min(shape[1], kernel_size[0]),
                       min(shape[2], kernel_size[1])]
  return kernel_size_out


def mobilenet_v1_arg_scope(is_training=True,
                           weight_decay=0.00004,
                           stddev=0.09,
                           batch_norm_decay=0.9997,
                           batch_norm_epsilon=0.001,
                           regularize_depthwise=False):
  """Defines the default MobilenetV1 arg scope.

  Args:
    is_training: Whether or not we're training the model.
    weight_decay: The weight decay to use for regularizing the model.
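    stddev: The standard deviation of the truncated normal weight initializer.
    batch_norm_decay: Decay for the batch norm moving averages.
    batch_norm_epsilon: Small float added to the variance to avoid dividing by
      zero in batch norm.
    regularize_depthwise: Whether or not to apply regularization to the
      depthwise convolution weights.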

  Returns:
    An `arg_scope` to use for the mobilenet v1 model.
  """
  batch_norm_params = {
      'is_training': is_training,
      'center': True,
      'scale': True,
      'decay': batch_norm_decay,
      'epsilon': batch_norm_epsilon,
  }

  # Set weight_decay for weights in Conv and DepthSepConv layers.
  weights_init = tf.truncated_normal_initializer(stddev=stddev)
  regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
  if regularize_depthwise:
    depthwise_regularizer = regularizer
  else:
    depthwise_regularizer = None
  with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                      weights_initializer=weights_init,
                      activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm):
    with slim.arg_scope([slim.batch_norm], **batch_norm_params):
      with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer):
        with slim.arg_scope([slim.separable_conv2d],
                            weights_regularizer=depthwise_regularizer) as sc:
          return sc

================================================
FILE: nets/mobilenet_v1_test.py
================================================
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================
"""Tests for MobileNet v1."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from nets import mobilenet_v1

slim = tf.contrib.slim


class MobilenetV1Test(tf.test.TestCase):

  def testBuildClassificationNetwork(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    self.assertTrue('Predictions' in end_points)
    self.assertListEqual(end_points['Predictions'].get_shape().as_list(),
                         [batch_size, num_classes])

  def testBuildBaseNetwork(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    net, end_points = mobilenet_v1.mobilenet_v1_base(inputs)
    self.assertTrue(net.op.name.startswith('MobilenetV1/Conv2d_13'))
    self.assertListEqual(net.get_shape().as_list(),
                         [batch_size, 7, 7, 1024])
    expected_endpoints = ['Conv2d_0',
                          'Conv2d_1_depthwise', 'Conv2d_1_pointwise',
                          'Conv2d_2_depthwise', 'Conv2d_2_pointwise',
                          'Conv2d_3_depthwise', 'Conv2d_3_pointwise',
                          'Conv2d_4_depthwise', 'Conv2d_4_pointwise',
                          'Conv2d_5_depthwise', 'Conv2d_5_pointwise',
                          'Conv2d_6_depthwise', 'Conv2d_6_pointwise',
                          'Conv2d_7_depthwise', 'Conv2d_7_pointwise',
                          'Conv2d_8_depthwise', 'Conv2d_8_pointwise',
                          'Conv2d_9_depthwise', 'Conv2d_9_pointwise',
                          'Conv2d_10_depthwise', 'Conv2d_10_pointwise',
                          'Conv2d_11_depthwise', 'Conv2d_11_pointwise',
                          'Conv2d_12_depthwise', 'Conv2d_12_pointwise',
                          'Conv2d_13_depthwise', 'Conv2d_13_pointwise']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)

  def testBuildOnlyUptoFinalEndpoint(self):
    batch_size = 5
    height, width = 224, 224
    endpoints = ['Conv2d_0',
                 'Conv2d_1_depthwise', 'Conv2d_1_pointwise',
                 'Conv2d_2_depthwise', 'Conv2d_2_pointwise',
                 'Conv2d_3_depthwise', 'Conv2d_3_pointwise',
                 'Conv2d_4_depthwise', 'Conv2d_4_pointwise',
                 'Conv2d_5_depthwise', 'Conv2d_5_pointwise',
                 'Conv2d_6_depthwise', 'Conv2d_6_pointwise',
                 'Conv2d_7_depthwise', 'Conv2d_7_pointwise',
                 'Conv2d_8_depthwise', 'Conv2d_8_pointwise',
                 'Conv2d_9_depthwise', 'Conv2d_9_pointwise',
                 'Conv2d_10_depthwise', 'Conv2d_10_pointwise',
                 'Conv2d_11_depthwise', 'Conv2d_11_pointwise',
                 'Conv2d_12_depthwise', 'Conv2d_12_pointwise',
                 'Conv2d_13_depthwise', 'Conv2d_13_pointwise']
    for index, endpoint in enumerate(endpoints):
      with tf.Graph().as_default():
        inputs = tf.random_uniform((batch_size, height, width, 3))
        out_tensor, end_points = mobilenet_v1.mobilenet_v1_base(
            inputs, final_endpoint=endpoint)
        self.assertTrue(out_tensor.op.name.startswith(
            'MobilenetV1/' + endpoint))
        self.assertItemsEqual(endpoints[:index+1], end_points)

  def testBuildCustomNetworkUsingConvDefs(self):
    batch_size = 5
    height, width = 224, 224
    conv_defs = [
        mobilenet_v1.Conv(kernel=[3, 3], stride=2, depth=32),
        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=64),
        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=2, depth=128),
        mobilenet_v1.DepthSepConv(kernel=[3, 3], stride=1, depth=512)
    ]

    inputs = tf.random_uniform((batch_size, height, width, 3))
    net, end_points = mobilenet_v1.mobilenet_v1_base(
        inputs, final_endpoint='Conv2d_3_pointwise', conv_defs=conv_defs)
    self.assertTrue(net.op.name.startswith('MobilenetV1/Conv2d_3'))
    self.assertListEqual(net.get_shape().as_list(),
                         [batch_size, 56, 56, 512])
    expected_endpoints = ['Conv2d_0',
                          'Conv2d_1_depthwise', 'Conv2d_1_pointwise',
                          'Conv2d_2_depthwise', 'Conv2d_2_pointwise',
                          'Conv2d_3_depthwise', 'Conv2d_3_pointwise']
    self.assertItemsEqual(end_points.keys(), expected_endpoints)

  def testBuildAndCheckAllEndPointsUptoConv2d_13(self):
    batch_size = 5
    height, width = 224, 224

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                        normalizer_fn=slim.batch_norm):
      _, end_points = mobilenet_v1.mobilenet_v1_base(
          inputs, final_endpoint='Conv2d_13_pointwise')
    endpoints_shapes = {'Conv2d_0': [batch_size, 112, 112, 32],
                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],
                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],
                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],
                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_depthwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],
                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_6_depthwise': [batch_size, 14, 14, 256],
                        'Conv2d_6_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_7_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_7_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_8_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_8_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_9_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_9_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_10_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_10_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_11_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_11_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_12_depthwise': [batch_size, 7, 7, 512],
                        'Conv2d_12_pointwise': [batch_size, 7, 7, 1024],
                        'Conv2d_13_depthwise': [batch_size, 7, 7, 1024],
                        'Conv2d_13_pointwise': [batch_size, 7, 7, 1024]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name, expected_shape in endpoints_shapes.items():
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testOutputStride16BuildAndCheckAllEndPointsUptoConv2d_13(self):
    batch_size = 5
    height, width = 224, 224
    output_stride = 16

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                        normalizer_fn=slim.batch_norm):
      _, end_points = mobilenet_v1.mobilenet_v1_base(
          inputs, output_stride=output_stride,
          final_endpoint='Conv2d_13_pointwise')
    endpoints_shapes = {'Conv2d_0': [batch_size, 112, 112, 32],
                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],
                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],
                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],
                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_depthwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],
                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_6_depthwise': [batch_size, 14, 14, 256],
                        'Conv2d_6_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_7_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_7_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_8_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_8_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_9_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_9_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_10_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_10_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_11_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_11_pointwise': [batch_size, 14, 14, 512],
                        'Conv2d_12_depthwise': [batch_size, 14, 14, 512],
                        'Conv2d_12_pointwise': [batch_size, 14, 14, 1024],
                        'Conv2d_13_depthwise': [batch_size, 14, 14, 1024],
                        'Conv2d_13_pointwise': [batch_size, 14, 14, 1024]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name, expected_shape in endpoints_shapes.items():
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testOutputStride8BuildAndCheckAllEndPointsUptoConv2d_13(self):
    batch_size = 5
    height, width = 224, 224
    output_stride = 8

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                        normalizer_fn=slim.batch_norm):
      _, end_points = mobilenet_v1.mobilenet_v1_base(
          inputs, output_stride=output_stride,
          final_endpoint='Conv2d_13_pointwise')
    endpoints_shapes = {'Conv2d_0': [batch_size, 112, 112, 32],
                        'Conv2d_1_depthwise': [batch_size, 112, 112, 32],
                        'Conv2d_1_pointwise': [batch_size, 112, 112, 64],
                        'Conv2d_2_depthwise': [batch_size, 56, 56, 64],
                        'Conv2d_2_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_depthwise': [batch_size, 56, 56, 128],
                        'Conv2d_3_pointwise': [batch_size, 56, 56, 128],
                        'Conv2d_4_depthwise': [batch_size, 28, 28, 128],
                        'Conv2d_4_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_depthwise': [batch_size, 28, 28, 256],
                        'Conv2d_5_pointwise': [batch_size, 28, 28, 256],
                        'Conv2d_6_depthwise': [batch_size, 28, 28, 256],
                        'Conv2d_6_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_7_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_7_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_8_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_8_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_9_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_9_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_10_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_10_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_11_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_11_pointwise': [batch_size, 28, 28, 512],
                        'Conv2d_12_depthwise': [batch_size, 28, 28, 512],
                        'Conv2d_12_pointwise': [batch_size, 28, 28, 1024],
                        'Conv2d_13_depthwise': [batch_size, 28, 28, 1024],
                        'Conv2d_13_pointwise': [batch_size, 28, 28, 1024]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name, expected_shape in endpoints_shapes.items():
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testBuildAndCheckAllEndPointsApproximateFaceNet(self):
    batch_size = 5
    height, width = 128, 128

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                        normalizer_fn=slim.batch_norm):
      _, end_points = mobilenet_v1.mobilenet_v1_base(
          inputs, final_endpoint='Conv2d_13_pointwise', depth_multiplier=0.75)
    # For the Conv2d_0 layer FaceNet has depth=16
    endpoints_shapes = {'Conv2d_0': [batch_size, 64, 64, 24],
                        'Conv2d_1_depthwise': [batch_size, 64, 64, 24],
                        'Conv2d_1_pointwise': [batch_size, 64, 64, 48],
                        'Conv2d_2_depthwise': [batch_size, 32, 32, 48],
                        'Conv2d_2_pointwise': [batch_size, 32, 32, 96],
                        'Conv2d_3_depthwise': [batch_size, 32, 32, 96],
                        'Conv2d_3_pointwise': [batch_size, 32, 32, 96],
                        'Conv2d_4_depthwise': [batch_size, 16, 16, 96],
                        'Conv2d_4_pointwise': [batch_size, 16, 16, 192],
                        'Conv2d_5_depthwise': [batch_size, 16, 16, 192],
                        'Conv2d_5_pointwise': [batch_size, 16, 16, 192],
                        'Conv2d_6_depthwise': [batch_size, 8, 8, 192],
                        'Conv2d_6_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_7_depthwise': [batch_size, 8, 8, 384],
                        'Conv2d_7_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_8_depthwise': [batch_size, 8, 8, 384],
                        'Conv2d_8_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_9_depthwise': [batch_size, 8, 8, 384],
                        'Conv2d_9_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_10_depthwise': [batch_size, 8, 8, 384],
                        'Conv2d_10_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_11_depthwise': [batch_size, 8, 8, 384],
                        'Conv2d_11_pointwise': [batch_size, 8, 8, 384],
                        'Conv2d_12_depthwise': [batch_size, 4, 4, 384],
                        'Conv2d_12_pointwise': [batch_size, 4, 4, 768],
                        'Conv2d_13_depthwise': [batch_size, 4, 4, 768],
                        'Conv2d_13_pointwise': [batch_size, 4, 4, 768]}
    self.assertItemsEqual(endpoints_shapes.keys(), end_points.keys())
    for endpoint_name, expected_shape in endpoints_shapes.items():
      self.assertTrue(endpoint_name in end_points)
      self.assertListEqual(end_points[endpoint_name].get_shape().as_list(),
                           expected_shape)

  def testModelHasExpectedNumberOfParameters(self):
    batch_size = 5
    height, width = 224, 224
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                        normalizer_fn=slim.batch_norm):
      mobilenet_v1.mobilenet_v1_base(inputs)
      total_params, _ = slim.model_analyzer.analyze_vars(
          slim.get_model_variables())
      self.assertAlmostEqual(3217920, total_params)

  def testBuildEndPointsWithDepthMultiplierLessThanOne(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys() if key.startswith('Conv')]

    _, end_points_with_multiplier = mobilenet_v1.mobilenet_v1(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=0.5)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(0.5 * original_depth, new_depth)

  def testBuildEndPointsWithDepthMultiplierGreaterThanOne(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    _, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)

    endpoint_keys = [key for key in end_points.keys()
                     if key.startswith('Mixed') or key.startswith('Conv')]

    _, end_points_with_multiplier = mobilenet_v1.mobilenet_v1(
        inputs, num_classes, scope='depth_multiplied_net',
        depth_multiplier=2.0)

    for key in endpoint_keys:
      original_depth = end_points[key].get_shape().as_list()[3]
      new_depth = end_points_with_multiplier[key].get_shape().as_list()[3]
      self.assertEqual(2.0 * original_depth, new_depth)

  def testRaiseValueErrorWithInvalidDepthMultiplier(self):
    batch_size = 5
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    with self.assertRaises(ValueError):
      _ = mobilenet_v1.mobilenet_v1(
          inputs, num_classes, depth_multiplier=-0.1)
    with self.assertRaises(ValueError):
      _ = mobilenet_v1.mobilenet_v1(
          inputs, num_classes, depth_multiplier=0.0)

  def testHalfSizeImages(self):
    batch_size = 5
    height, width = 112, 112
    num_classes = 1000

    inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [batch_size, num_classes])
    pre_pool = end_points['Conv2d_13_pointwise']
    self.assertListEqual(pre_pool.get_shape().as_list(),
                         [batch_size, 4, 4, 1024])

  def testUnknownImageShape(self):
    tf.reset_default_graph()
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000
    input_np = np.random.uniform(0, 1, (batch_size, height, width, 3))
    with self.test_session() as sess:
      inputs = tf.placeholder(tf.float32, shape=(batch_size, None, None, 3))
      logits, end_points = mobilenet_v1.mobilenet_v1(inputs, num_classes)
      self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))
      self.assertListEqual(logits.get_shape().as_list(),
                           [batch_size, num_classes])
      pre_pool = end_points['Conv2d_13_pointwise']
      feed_dict = {inputs: input_np}
      tf.global_variables_initializer().run()
      pre_pool_out = sess.run(pre_pool, feed_dict=feed_dict)
      self.assertListEqual(list(pre_pool_out.shape), [batch_size, 7, 7, 1024])

  def testUnknownBatchSize(self):
    batch_size = 1
    height, width = 224, 224
    num_classes = 1000

    inputs = tf.placeholder(tf.float32, (None, height, width, 3))
    logits, _ = mobilenet_v1.mobilenet_v1(inputs, num_classes)
    self.assertTrue(logits.op.name.startswith('MobilenetV1/Logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [None, num_classes])
    images = tf.random_uniform((batch_size, height, width, 3))

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch_size, num_classes))

  def testEvaluation(self):
    batch_size = 2
    height, width = 224, 224
    num_classes = 1000

    eval_inputs = tf.random_uniform((batch_size, height, width, 3))
    logits, _ = mobilenet_v1.mobilenet_v1(eval_inputs, num_classes,
                                          is_training=False)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (batch_size,))

  def testTrainEvalWithReuse(self):
    train_batch_size = 5
    eval_batch_size = 2
    height, width = 150, 150
    num_classes = 1000

    train_inputs = tf.random_uniform((train_batch_size, height, width, 3))
    mobilenet_v1.mobilenet_v1(train_inputs, num_classes)
    eval_inputs = tf.random_uniform((eval_batch_size, height, width, 3))
    logits, _ = mobilenet_v1.mobilenet_v1(eval_inputs, num_classes,
                                          reuse=True)
    predictions = tf.argmax(logits, 1)

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(predictions)
      self.assertEqual(output.shape, (eval_batch_size,))

  def testLogitsNotSqueezed(self):
    num_classes = 25
    images = tf.random_uniform([1, 224, 224, 3])
    logits, _ = mobilenet_v1.mobilenet_v1(images,
                                          num_classes=num_classes,
                                          spatial_squeeze=False)

    with self.test_session() as sess:
      tf.global_variables_initializer().run()
      logits_out = sess.run(logits)
      self.assertListEqual(list(logits_out.shape), [1, 1, 1, num_classes])


if __name__ == '__main__':
  tf.test.main()
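
# Note: the endpoint shape tables asserted in the tests above follow from
# simple stride arithmetic. The sketch below is not part of the test suite; it
# is a minimal, TensorFlow-free illustration. LAYER_STRIDES is read off the
# expected shapes in those tests, and expected_spatial_sizes is a hypothetical
# helper that assumes 'SAME'-style rounding (ceil division) at each strided
# layer and that, once output_stride is reached, later layers stop subsampling.

```python
# Per-layer strides for Conv2d_0 .. Conv2d_13, inferred from the shape tables
# in the tests (an assumption, not read from mobilenet_v1.py itself).
LAYER_STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def expected_spatial_sizes(input_size, output_stride=None):
    """Returns the feature-map side length after each of the 14 layers."""
    sizes = []
    current_stride = 1
    size = input_size
    for stride in LAYER_STRIDES:
        if output_stride is None or current_stride != output_stride:
            # Still below the requested stride: subsample as usual.
            current_stride *= stride
            size = -(-size // stride)  # ceil division
        # Otherwise the target stride is reached and the size is kept.
        sizes.append(size)
    return sizes


if __name__ == '__main__':
    for requested in (None, 16, 8):
        print(requested, expected_spatial_sizes(224, requested))
```

# The depthwise and pointwise endpoints of a layer share the same spatial
# size, so one entry per layer is enough to compare against the tables.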

================================================
FILE: nets/net_profile.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse

import tensorflow as tf

from nets import nets_factory

def profile(model_name, num_classes, image_size, batch_size):

    graph = tf.Graph()
    sess = tf.Session(graph=graph)

    with graph.as_default(), sess.as_default():

        network_fn = nets_factory.get_network_fn(model_name, num_classes=num_classes)
        inputs = tf.random_uniform((batch_size, image_size, image_size, 3))
        logits, _ = network_fn(inputs)

        print("Profiling model %s" % model_name)

        # Print trainable variable parameter statistics to stdout.
        param_stats = tf.contrib.tfprof.model_analyzer.print_model_analysis(
            tf.get_default_graph(),
            tfprof_options=tf.contrib.tfprof.model_analyzer.
                TRAINABLE_VARS_PARAMS_STAT_OPTIONS)

        # param_stats is a tensorflow.tfprof.TFProfNode proto. It organizes the
        # statistics of each graph node in a tree structure. Print the root below.
        print('total_params: %d\n' % param_stats.total_parameters)

        print()

        # Print to stdout an analysis of the number of floating point operations in the
        # model broken down by individual operations.
        tf.contrib.tfprof.model_analyzer.print_model_analysis(
            tf.get_default_graph(),
            tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)


def parse_args():

    parser = argparse.ArgumentParser(
        description='Print parameter and floating point operation statistics '
                    'for a network.')

    parser.add_argument('--model_name', dest='model_name',
                        help='The name of the architecture to profile.', type=str,
                        required=False, default='inception_v3')

    parser.add_argument('--num_classes', dest='num_classes',
                        help='The number of classes.', type=int,
                        required=False, default=1000)

    parser.add_argument('--image_size', dest='image_size',
                        help='The size of the input image.', type=int,
                        required=False, default=299)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='The number of images in a batch.', type=int,
                        required=False, default=1)

    args = parser.parse_args()
    return args

def main():
    args = parse_args()

    profile(args.model_name, args.num_classes, args.image_size, args.batch_size)

if __name__ == '__main__':
    main()

================================================
FILE: nets/nets_factory.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains a factory for building various models."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools

import tensorflow as tf

from nets import inception
from nets import mobilenet_v1
from nets import resnet_v2

slim = tf.contrib.slim

networks_map = {
                'inception_v1': inception.inception_v1,
                'inception_v2': inception.inception_v2,
                'inception_v3': inception.inception_v3,
                'inception_v4': inception.inception_v4,
                'inception_resnet_v2': inception.inception_resnet_v2,
                'resnet_v2_50': resnet_v2.resnet_v2_50,
                'resnet_v2_101': resnet_v2.resnet_v2_101,
                'resnet_v2_152': resnet_v2.resnet_v2_152,
                'resnet_v2_200': resnet_v2.resnet_v2_200,
                'mobilenet_v1': mobilenet_v1.mobilenet_v1,
                'mobilenet_v1_075': mobilenet_v1.mobilenet_v1_075,
                'mobilenet_v1_050': mobilenet_v1.mobilenet_v1_050,
                'mobilenet_v1_025': mobilenet_v1.mobilenet_v1_025,
               }

arg_scopes_map = {'inception_v1': inception.inception_v3_arg_scope,
                  'inception_v2': inception.inception_v3_arg_scope,
                  'inception_v3': inception.inception_v3_arg_scope,
                  'inception_v4': inception.inception_v4_arg_scope,
                  'inception_resnet_v2': inception.inception_resnet_v2_arg_scope,
                  'resnet_v2_50': resnet_v2.resnet_arg_scope,
                  'resnet_v2_101': resnet_v2.resnet_arg_scope,
                  'resnet_v2_152': resnet_v2.resnet_arg_scope,
                  'resnet_v2_200': resnet_v2.resnet_arg_scope,
                  'mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope,
                  'mobilenet_v1_075': mobilenet_v1.mobilenet_v1_arg_scope,
                  'mobilenet_v1_050': mobilenet_v1.mobilenet_v1_arg_scope,
                  'mobilenet_v1_025': mobilenet_v1.mobilenet_v1_arg_scope,
                 }


def get_network_fn(name, num_classes, weight_decay=0.0, is_training=False):
  """Returns a network_fn such that `logits, end_points = network_fn(images)`.

  Args:
    name: The name of the network.
    num_classes: The number of classes to use for classification.
    weight_decay: The l2 coefficient for the model weights.
    is_training: `True` if the model is being used for training and `False`
      otherwise.

  Returns:
    network_fn: A function that applies the model to a batch of images. It has
      the following signature:
        logits, end_points = network_fn(images)

  Raises:
    ValueError: If network `name` is not recognized.
  """
  if name not in networks_map:
    raise ValueError('Name of network unknown %s' % name)
  arg_scope = arg_scopes_map[name](weight_decay=weight_decay)
  func = networks_map[name]
  @functools.wraps(func)
  def network_fn(images):
    with slim.arg_scope(arg_scope):
      return func(images, num_classes, is_training=is_training)
  if hasattr(func, 'default_image_size'):
    network_fn.default_image_size = func.default_image_size

  return network_fn
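

# Note: get_network_fn above combines a name-to-function registry with a
# closure that fixes num_classes and is_training. The sketch below shows the
# same pattern without TensorFlow. toy_net, toy_networks_map and
# get_toy_network_fn are hypothetical names used only for illustration, and
# the slim.arg_scope step is left out.

```python
import functools


def toy_net(images, num_classes, is_training=False):
    """Hypothetical stand-in for a real network function."""
    logits = [[0.0] * num_classes for _ in images]
    end_points = {'is_training': is_training}
    return logits, end_points


toy_net.default_image_size = 224

toy_networks_map = {'toy_net': toy_net}


def get_toy_network_fn(name, num_classes, is_training=False):
    """Looks up a network by name and binds its configuration."""
    if name not in toy_networks_map:
        raise ValueError('Name of network unknown %s' % name)
    func = toy_networks_map[name]

    @functools.wraps(func)
    def network_fn(images):
        # Only the images remain as an argument; the rest is captured.
        return func(images, num_classes, is_training=is_training)

    return network_fn


if __name__ == '__main__':
    net_fn = get_toy_network_fn('toy_net', num_classes=3)
    logits, end_points = net_fn(['image_a', 'image_b'])
    print(len(logits), len(logits[0]), net_fn.default_image_size)
```

# functools.wraps copies the wrapped function's __dict__ onto the wrapper, so
# attributes such as default_image_size are already visible on network_fn.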


================================================
FILE: nets/nets_factory_test.py
================================================
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Tests for nets.nets_factory."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function


import tensorflow as tf

from nets import nets_factory


class NetworksTest(tf.test.TestCase):

  def testGetNetworkFn(self):
    batch_size = 5
    num_classes = 1000
    for net in nets_factory.networks_map:
      with self.test_session():
        net_fn = nets_factory.get_network_fn(net, num_classes)
        # Most networks use 224 as their default_image_size
        image_size = getattr(net_fn, 'default_image_size', 224)
        inputs = tf.random_uniform((batch_size, image_size, image_size, 3))
        logits, end_points = net_fn(inputs)
        self.assertTrue(isinstance(logits, tf.Tensor))
        self.assertTrue(isinstance(end_points, dict))
        self.assertEqual(logits.get_shape().as_list()[0], batch_size)
        self.assertEqual(logits.get_shape().as_list()[-1], num_classes)

if __name__ == '__main__':
  tf.test.main()


================================================
FILE: nets/resnet_utils.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains building blocks for various versions of Residual Networks.

Residual networks (ResNets) were proposed in:
  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015

More variants were introduced in:
  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  Identity Mappings in Deep Residual Networks. arXiv:1603.05027, 2016

We can obtain different ResNet variants by changing the network depth, width,
and form of residual unit. This module implements the infrastructure for
building them. Concrete ResNet units and full ResNet networks are implemented in
the accompanying resnet_v2.py module.

Compared to https://github.com/KaimingHe/deep-residual-networks, in the current
implementation we subsample the output activations in the last residual unit of
each block, instead of subsampling the input activations in the first residual
unit of each block. The two implementations give identical results but our
implementation is more memory efficient.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections
import tensorflow as tf

slim = tf.contrib.slim


class Block(collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])):
  """A named tuple describing a ResNet block.

  Its parts are:
    scope: The scope of the `Block`.
    unit_fn: The ResNet unit function which takes as input a `Tensor` and
      returns another `Tensor` with the output of the ResNet unit.
    args: A list of length equal to the number of units in the `Block`. The list
      contains one (depth, depth_bottleneck, stride) tuple for each unit in the
      block to serve as argument to unit_fn.
  """


def subsample(inputs, factor, scope=None):
  """Subsamples the input along the spatial dimensions.

  Args:
    inputs: A `Tensor` of size [batch, height_in, width_in, channels].
    factor: The subsampling factor.
    scope: Optional variable_scope.

  Returns:
    output: A `Tensor` of size [batch, height_out, width_out, channels] with the
      input, either intact (if factor == 1) or subsampled (if factor > 1).
  """
  if factor == 1:
    return inputs
  else:
    return slim.max_pool2d(inputs, [1, 1], stride=factor, scope=scope)
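

# Note: a max pool with a [1, 1] window and stride=factor takes no maximum
# over neighbours; it just keeps every factor-th row and column. The sketch
# below shows this on a plain nested list standing in for one channel of one
# image. subsample_grid is a hypothetical helper and assumes the pool uses
# 'VALID' padding.

```python
def subsample_grid(grid, factor):
    """Keeps every `factor`-th row and column of a 2-D nested list."""
    if factor == 1:
        return grid
    return [row[::factor] for row in grid[::factor]]


if __name__ == '__main__':
    grid = [[4 * r + c for c in range(4)] for r in range(4)]
    print(subsample_grid(grid, 2))
```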


def conv2d_same(inputs, num_outputs, kernel_size, stride, rate=1, scope=None):
  """Strided 2-D convolution with 'SAME' padding.

  When stride > 1, then we do explicit zero-padding, followed by conv2d with
  'VALID' padding.

  Note that

     net = conv2d_same(inputs, num_outputs, 3, stride=stride)

  is equivalent to

     net = slim.conv2d(inputs, num_outputs, 3, stride=1, padding='SAME')
     net = subsample(net, factor=stride)

  whereas

     net = slim.conv2d(inputs, num_outputs, 3, stride=stride, padding='SAME')

  is different when the input's height or width is even, which is why we add the
  current function. For more details, see ResnetUtilsTest.testConv2DSameEven().

  Args:
    inputs: A 4-D tensor of size [batch, height_in, width_in, channels].
    num_outputs: An integer, the number of output filters.
    kernel_size: An int with the kernel_size of the filters.
    stride: An integer, the output stride.
    rate: An integer, rate for atrous convolution.
    scope: Scope.

  Returns:
    output: A 4-D tensor of size [batch, height_out, width_out, channels] with
      the convolution output.
  """
  if stride == 1:
    return slim.conv2d(inputs, num_outputs, kernel_size, stride=1, rate=rate,
                       padding='SAME', scope=scope)
  else:
    kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
    pad_total = kernel_size_effective - 1
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    inputs = tf.pad(inputs,
                    [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
    return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride,
                       rate=rate, padding='VALID', scope=scope)
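

# Illustrative sketch (not part of the original module): the explicit padding
# arithmetic of conv2d_same() above, reproduced in plain Python so it can be
# checked without TensorFlow. Both helper names are hypothetical. The second
# helper assumes a 'VALID' convolution applied to the padded input, which is
# what conv2d_same() does when stride > 1.
def _sketch_same_padding(kernel_size, rate=1):
  """Returns the (pad_beg, pad_end) pair that conv2d_same() would apply."""
  kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
  pad_total = kernel_size_effective - 1
  pad_beg = pad_total // 2
  return pad_beg, pad_total - pad_beg


def _sketch_conv2d_same_output_size(size, kernel_size, stride, rate=1):
  """Returns the spatial output size along one axis after pad + 'VALID' conv."""
  kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
  pad_beg, pad_end = _sketch_same_padding(kernel_size, rate)
  padded = size + pad_beg + pad_end
  return (padded - kernel_size_effective) // stride + 1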


@slim.add_arg_scope
def stack_blocks_dense(net, blocks, output_stride=None,
                       outputs_collections=None):
  """Stacks ResNet `Blocks` and controls output feature density.

  First, this function creates scopes for the ResNet in the form of
  'block_name/unit_1', 'block_name/unit_2', etc.

  Second, this function allows the user to explicitly control the ResNet
  output_stride, which is the ratio of the input to output spatial resolution.
  This is useful for dense prediction tasks such as semantic segmentation or
  object detection.

  Most ResNets consist of 4 ResNet blocks and subsample the activations by a
  factor of 2 when transitioning between consecutive ResNet blocks. This results
  in a nominal ResNet output_stride equal to 8. If we set the output_stride to
  half the nominal network stride (e.g., output_stride=4), then we compute
  responses twice as densely along each spatial dimension.

  Control of the output feature density is implemented by atrous convolution.

  Args:
    net: A `Tensor` of size [batch, height, width, channels].
    blocks: A list of length equal to the number of ResNet `Blocks`. Each
      element is a ResNet `Block` object describing the units in the `Block`.
    output_stride: If `None`, then the output will be computed at the nominal
      network stride. If output_stride is not `None`, it specifies the requested
      ratio of input to output spatial resolution, which needs to be equal to
      the product of unit strides from the start up to some level of the ResNet.
      For example, if the ResNet employs units with strides 1, 2, 1, 3, 4, 1,
      then valid values for the output_stride are 1, 2, 6, 24 or None (which
      is equivalent to output_stride=24).
    outputs_collections: Collection to add the ResNet block outputs.

  Returns:
    net: Output tensor with stride equal to the specified output_stride.

  Raises:
    ValueError: If the target output_stride is not valid.
  """
  # The current_stride variable keeps track of the effective stride of the
  # activations. This allows us to invoke atrous convolution whenever applying
  # the next residual unit would result in the activations having stride larger
  # than the target output_stride.
  current_stride = 1

  # The atrous convolution rate parameter.
  rate = 1

  for block in blocks:
    with tf.variable_scope(block.scope, 'block', [net]) as sc:
      for i, unit in enumerate(block.args):
        if output_stride is not None and current_stride > output_stride:
          raise ValueError('The target output_stride cannot be reached.')

        with tf.variable_scope('unit_%d' % (i + 1), values=[net]):
          # If we have reached the target output_stride, then we need to employ
          # atrous convolution with stride=1 and multiply the atrous rate by the
          # current unit's stride for use in subsequent layers.
          if output_stride is not None and current_stride == output_stride:
            net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))
            rate *= unit.get('stride', 1)

          else:
            net = block.unit_fn(net, rate=1, **unit)
            current_stride *= unit.get('stride', 1)
      net = slim.utils.collect_named_outputs(outputs_collections, sc.name, net)

  if output_stride is not None and current_stride != output_stride:
    raise ValueError('The target output_stride cannot be reached.')

  return net
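

# Illustrative sketch (not part of the original module): the stride and atrous
# rate bookkeeping of stack_blocks_dense() above, reduced to plain Python over
# a flat list of unit strides. The helper name _sketch_stride_schedule is
# hypothetical; it creates no TensorFlow ops and only reports which stride and
# rate each unit would be called with.
def _sketch_stride_schedule(unit_strides, output_stride=None):
  """Returns one (applied_stride, applied_rate) pair per unit."""
  current_stride = 1
  rate = 1
  schedule = []
  for stride in unit_strides:
    if output_stride is not None and current_stride > output_stride:
      raise ValueError('The target output_stride cannot be reached.')
    if output_stride is not None and current_stride == output_stride:
      # Target reached: run the unit at stride 1 with the accumulated rate,
      # then fold this unit's stride into the rate for later units.
      schedule.append((1, rate))
      rate *= stride
    else:
      schedule.append((stride, 1))
      current_stride *= stride
  if output_stride is not None and current_stride != output_stride:
    raise ValueError('The target output_stride cannot be reached.')
  return schedule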


def resnet_arg_scope(weight_decay=0.0001,
                     batch_norm_decay=0.997,
                     batch_norm_epsilon=1e-5,
                     batch_norm_scale=True,
                     activation_fn=tf.nn.relu,
                     use_batch_norm=True):
  """Defines the default ResNet arg scope.

  TODO(gpapan): The batch-normalization related default values above are
    appropriate for use in conjunction with the reference ResNet models
    released at https://github.com/KaimingHe/deep-residual-networks. When
    training ResNets from scratch, they might need to be tuned.

  Args:
    weight_decay: The weight decay to use for regularizing the model.
    batch_norm_decay: The moving average decay when estimating layer activation
      statistics in batch normalization.
    batch_norm_epsilon: Small constant to prevent division by zero when
      normalizing activations by their variance in batch normalization.
    batch_norm_scale: If True, uses an explicit `gamma` multiplier to scale the
      activations in the batch normalization layer.
    activation_fn: The activation function which is used in ResNet.
    use_batch_norm: Whether or not to use batch normalization.

  Returns:
    An `arg_scope` to use for the resnet models.
  """
  batch_norm_params = {
      'decay': batch_norm_decay,
      'epsilon': batch_norm_epsilon,
      'scale': batch_norm_scale,
      'updates_collections': tf.GraphKeys.UPDATE_OPS,
  }

  with slim.arg_scope(
      [slim.conv2d],
      weights_regularizer=slim.l2_regularizer(weight_decay),
      weights_initializer=slim.variance_scaling_initializer(),
      activation_fn=activation_fn,
      normalizer_fn=slim.batch_norm if use_batch_norm else None,
      normalizer_params=batch_norm_params):
    with slim.arg_scope([slim.batch_norm], **batch_norm_params):
      # The following implies padding='SAME' for pool1, which makes feature
      # alignment easier for dense prediction tasks. This is also used in
      # https://github.com/facebook/fb.resnet.torch. However the accompanying
      # code of 'Deep Residual Learning for Image Recognition' uses
      # padding='VALID' for pool1. You can switch to that choice by setting
      # slim.arg_scope([slim.max_pool2d], padding='VALID').
      with slim.arg_scope([slim.max_pool2d], padding='SAME') as arg_sc:
        return arg_sc

================================================
FILE: nets/resnet_v2.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains definitions for the preactivation form of Residual Networks.

Residual networks (ResNets) were originally proposed in:
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    Deep Residual Learning for Image Recognition. arXiv:1512.03385

The full preactivation 'v2' ResNet variant implemented in this module was
introduced by:
[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    Identity Mappings in Deep Residual Networks. arXiv:1603.05027

The key difference of the full preactivation 'v2' variant compared to the
'v1' variant in [1] is the use of batch normalization before every weight layer.

Typical use:

   from nets import resnet_v2

ResNet-101 for image classification into 1000 classes:

   # inputs has shape [batch, 224, 224, 3]
   with slim.arg_scope(resnet_v2.resnet_arg_scope()):
      net, end_points = resnet_v2.resnet_v2_101(inputs, 1000, is_training=False)

ResNet-101 for semantic segmentation into 21 classes:

   # inputs has shape [batch, 513, 513, 3]
   with slim.arg_scope(resnet_v2.resnet_arg_scope()):
      net, end_points = resnet_v2.resnet_v2_101(inputs,
                                                21,
                                                is_training=False,
                                                global_pool=False,
                                                output_stride=16)
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from nets import resnet_utils

slim = tf.contrib.slim
resnet_arg_scope = resnet_utils.resnet_arg_scope


@slim.add_arg_scope
def bottleneck(inputs, depth, depth_bottleneck, stride, rate=1,
               outputs_collections=None, scope=None):
  """Bottleneck residual unit variant with BN before convolutions.

  This is the full preactivation residual unit variant proposed in [2]. See
  Fig. 1(b) of [2] for its definition. Note that we use here the bottleneck
  variant which has an extra bottleneck layer.

  When putting together two consecutive ResNet blocks that use this unit, one
  should use stride = 2 in the last unit of the first block.

  Args:
    inputs: A tensor of size [batch, height, width, channels].
    depth: The depth of the ResNet unit output.
    depth_bottleneck: The depth of the bottleneck layers.
    stride: The ResNet unit's stride. Determines the amount of downsampling of
      the unit's output compared to its input.
    rate: An integer, rate for atrous convolution.
    outputs_collections: Collection to add the ResNet unit output.
    scope: Optional variable_scope.

  Returns:
    The ResNet unit's output.
  """
  with tf.variable_scope(scope, 'bottleneck_v2', [inputs]) as sc:
    depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank=4)
    preact = slim.batch_norm(inputs, activation_fn=tf.nn.relu, scope='preact')
    if depth == depth_in:
      shortcut = resnet_utils.subsample(inputs, stride, 'shortcut')
    else:
      shortcut = slim.conv2d(preact, depth, [1, 1], stride=stride,
                             normalizer_fn=None, activation_fn=None,
                             scope='shortcut')

    residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride=1,
                           scope='conv1')
    residual = resnet_utils.conv2d_same(residual, depth_bottleneck, 3, stride,
                                        rate=rate, scope='conv2')
    residual = slim.conv2d(residual, depth, [1, 1], stride=1,
                           normalizer_fn=None, activation_fn=None,
                           scope='conv3')

    output = shortcut + residual

    return slim.utils.collect_named_outputs(outputs_collections,
                                            sc.original_name_scope,
                                            output)


def resnet_v2(inputs,
              blocks,
              num_classes=None,
              is_training=True,
              global_pool=True,
              output_stride=None,
              include_root_block=True,
              spatial_squeeze=True,
              dropout_keep_prob=1.,
              reuse=None,
              scope=None):
  """Generator for v2 (preactivation) ResNet models.

  This function generates a family of ResNet v2 models. See the resnet_v2_*()
  methods for specific model instantiations, obtained by selecting different
  block instantiations that produce ResNets of various depths.

  Training for image classification on Imagenet is usually done with [224, 224]
  inputs, resulting in [7, 7] feature maps at the output of the last ResNet
  block for the ResNets defined in [1] that have nominal stride equal to 32.
  However, for dense prediction tasks we advise that one uses inputs with
  spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In
  this case the feature maps at the ResNet output will have spatial shape
  [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1]
  and corners exactly aligned with the input image corners, which greatly
  facilitates alignment of the features to the image. Using as input [225, 225]
  images results in [8, 8] feature maps at the output of the last ResNet block.

  For dense prediction tasks, the ResNet needs to run in fully-convolutional
  (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all
  have nominal stride equal to 32 and a good choice in FCN mode is to use
  output_stride=16 in order to increase the density of the computed features at
  small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915.

  Args:
    inputs: A tensor of size [batch, height_in, width_in, channels].
    blocks: A list of length equal to the number of ResNet blocks. Each element
      is a resnet_utils.Block object describing the units in the block.
    num_classes: Number of predicted classes for classification tasks. If None
      we return the features before the logit layer.
    is_training: Whether the model is being built for training, as opposed to
      evaluation or inference.
    global_pool: If True, we perform global average pooling before computing the
      logits. Set to True for image classification, False for dense prediction.
    output_stride: If None, then the output will be computed at the nominal
      network stride. If output_stride is not None, it specifies the requested
      ratio of input to output spatial resolution.
    include_root_block: If True, include the initial convolution followed by
      max-pooling, if False excludes it. If excluded, `inputs` should be the
      results of an activation-less convolution.
    spatial_squeeze: If True, logits is of shape [B, C]; if False, logits is
        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
        To use this parameter, the input images must be smaller than 300x300
        pixels, in which case the output logit layer does not contain spatial
        information and can be removed.
    dropout_keep_prob: The probability of keeping an activation in the dropout
      layer applied before the logits. The default of 1. disables dropout.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.


  Returns:
    net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].
      If global_pool is False, then height_out and width_out are reduced by a
      factor of output_stride compared to the respective height_in and width_in,
      else both height_out and width_out equal one. If num_classes is None, then
      net is the output of the last ResNet block, potentially after global
      average pooling. If num_classes is not None, net contains the pre-softmax
      activations.
    end_points: A dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: If the target output_stride is not valid.
  """
  with tf.variable_scope(scope, 'resnet_v2', [inputs], reuse=reuse) as sc:
    end_points_collection = sc.name + '_end_points'
    with slim.arg_scope([slim.conv2d, bottleneck,
                         resnet_utils.stack_blocks_dense],
                        outputs_collections=end_points_collection):
      with slim.arg_scope([slim.batch_norm], is_training=is_training):
        net = inputs
        if include_root_block:
          if output_stride is not None:
            if output_stride % 4 != 0:
              raise ValueError('The output_stride needs to be a multiple of 4.')
            # Integer division keeps output_stride an int under
            # `from __future__ import division`.
            output_stride //= 4
          # We do not include batch normalization or activation functions in
          # conv1 because the first ResNet unit will perform these. Cf.
          # Appendix of [2].
          with slim.arg_scope([slim.conv2d],
                              activation_fn=None, normalizer_fn=None):
            net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
          net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')
        net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)
        # This is needed because the pre-activation variant does not have batch
        # normalization or activation functions in the residual unit output. See
        # Appendix of [2].
        net = slim.batch_norm(net, activation_fn=tf.nn.relu, scope='postnorm')
        if global_pool:
          # Global average pooling.
          net = tf.reduce_mean(net, [1, 2], name='pool5', keep_dims=True)
        if num_classes is not None:
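          # Pass is_training explicitly: the arg_scope above only covers
          # slim.batch_norm, and slim.dropout defaults to is_training=True.
          net = slim.dropout(net, keep_prob=dropout_keep_prob,
                             is_training=is_training, scope='Dropout_1b')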
          net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                            normalizer_fn=None, scope='logits')
          if spatial_squeeze:
            net = tf.squeeze(net, [1, 2], name='SpatialSqueeze')
        # Convert end_points_collection into a dictionary of end_points.
        end_points = slim.utils.convert_collection_to_dict(
            end_points_collection)
        if num_classes is not None:
          end_points['predictions'] = slim.softmax(net, scope='predictions')
        return net, end_points
resnet_v2.default_image_size = 224
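

# Illustrative sketch (not part of the original module): the feature map sizes
# quoted in the resnet_v2() docstring, computed in plain Python. It assumes
# every strided layer pads 'SAME'-style, as arranged by conv2d_same() and
# resnet_arg_scope(), so each axis shrinks to ceil(size / output_stride). The
# helper name _sketch_feature_map_size is hypothetical.
def _sketch_feature_map_size(input_size, output_stride=32):
  """Returns ceil(input_size / output_stride) using integer arithmetic."""
  return (input_size - 1) // output_stride + 1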


def resnet_v2_block(scope, base_depth, num_units, stride):
  """Helper function for creating a resnet_v2 bottleneck block.

  Args:
    scope: The scope of the block.
    base_depth: The depth of the bottleneck layer for each unit.
    num_units: The number of units in the block.
    stride: The stride of the block, implemented as a stride in the last unit.
      All other units have stride=1.

  Returns:
    A resnet_v2 bottleneck block.
  """
  return resnet_utils.Block(scope, bottleneck, [{
      'depth': base_depth * 4,
      'depth_bottleneck': base_depth,
      'stride': 1
  }] * (num_units - 1) + [{
      'depth': base_depth * 4,
      'depth_bottleneck': base_depth,
      'stride': stride
  }])


def resnet_v2_50(inputs,
                 num_classes=None,
                 is_training=True,
                 global_pool=True,
                 output_stride=None,
                 spatial_squeeze=True,
                 dropout_keep_prob=1.,
                 reuse=None,
                 scope='resnet_v2_50'):
  """ResNet-50 model of [1]. See resnet_v2() for arg and return description."""
  blocks = [
      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),
      resnet_v2_block('block2', base_depth=128, num_units=4, stride=2),
      resnet_v2_block('block3', base_depth=256, num_units=6, stride=2),
      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),
  ]
  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,
                   global_pool=global_pool, output_stride=output_stride,
                   include_root_block=True, spatial_squeeze=spatial_squeeze,
                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)
resnet_v2_50.default_image_size = resnet_v2.default_image_size


def resnet_v2_101(inputs,
                  num_classes=None,
                  is_training=True,
                  global_pool=True,
                  output_stride=None,
                  spatial_squeeze=True,
                  dropout_keep_prob=1.,
                  reuse=None,
                  scope='resnet_v2_101'):
  """ResNet-101 model of [1]. See resnet_v2() for arg and return description."""
  blocks = [
      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),
      resnet_v2_block('block2', base_depth=128, num_units=4, stride=2),
      resnet_v2_block('block3', base_depth=256, num_units=23, stride=2),
      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),
  ]
  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,
                   global_pool=global_pool, output_stride=output_stride,
                   include_root_block=True, spatial_squeeze=spatial_squeeze,
                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)
resnet_v2_101.default_image_size = resnet_v2.default_image_size


def resnet_v2_152(inputs,
                  num_classes=None,
                  is_training=True,
                  global_pool=True,
                  output_stride=None,
                  spatial_squeeze=True,
                  dropout_keep_prob=1.,
                  reuse=None,
                  scope='resnet_v2_152'):
  """ResNet-152 model of [1]. See resnet_v2() for arg and return description."""
  blocks = [
      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),
      resnet_v2_block('block2', base_depth=128, num_units=8, stride=2),
      resnet_v2_block('block3', base_depth=256, num_units=36, stride=2),
      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),
  ]
  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,
                   global_pool=global_pool, output_stride=output_stride,
                   include_root_block=True, spatial_squeeze=spatial_squeeze,
                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)
resnet_v2_152.default_image_size = resnet_v2.default_image_size


def resnet_v2_200(inputs,
                  num_classes=None,
                  is_training=True,
                  global_pool=True,
                  output_stride=None,
                  spatial_squeeze=True,
                  dropout_keep_prob=1.,
                  reuse=None,
                  scope='resnet_v2_200'):
  """ResNet-200 model of [2]. See resnet_v2() for arg and return description."""
  blocks = [
      resnet_v2_block('block1', base_depth=64, num_units=3, stride=2),
      resnet_v2_block('block2', base_depth=128, num_units=24, stride=2),
      resnet_v2_block('block3', base_depth=256, num_units=36, stride=2),
      resnet_v2_block('block4', base_depth=512, num_units=3, stride=1),
  ]
  return resnet_v2(inputs, blocks, num_classes, is_training=is_training,
                   global_pool=global_pool, output_stride=output_stride,
                   include_root_block=True, spatial_squeeze=spatial_squeeze,
                   dropout_keep_prob=dropout_keep_prob, reuse=reuse, scope=scope)
resnet_v2_200.default_image_size = resnet_v2.default_image_size
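

# Illustrative sketch (not part of the original module): how the num_units
# values in the resnet_v2_*() constructors above relate to the model names.
# It uses the conventional layer count, assumed here: three convolutions per
# bottleneck unit, plus the root conv1 and the logits layer, with shortcut
# projections not counted. The helper name _sketch_resnet_depth is
# hypothetical.
def _sketch_resnet_depth(num_units_per_block):
  """Returns the nominal depth for a list of per-block unit counts."""
  return 3 * sum(num_units_per_block) + 2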

================================================
FILE: nets/resnet_v2_test.py
================================================
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for slim.nets.resnet_v2."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

from nets import resnet_utils
from nets import resnet_v2

slim = tf.contrib.slim


def create_test_input(batch_size, height, width, channels):
  """Create test input tensor.

  Args:
    batch_size: The number of images per batch or `None` if unknown.
    height: The height of each image or `None` if unknown.
    width: The width of each image or `None` if unknown.
    channels: The number of channels per image or `None` if unknown.

  Returns:
    Either a placeholder `Tensor` of dimension
      [batch_size, height, width, channels] if any of the inputs are `None` or a
    constant `Tensor` with the mesh grid values along the spatial dimensions.
  """
  if None in [batch_size, height, width, channels]:
    return tf.placeholder(tf.float32, (batch_size, height, width, channels))
  else:
    return tf.to_float(
        np.tile(
            np.reshape(
                np.reshape(np.arange(height), [height, 1]) +
                np.reshape(np.arange(width), [1, width]),
                [1, height, width, 1]),
            [batch_size, 1, 1, channels]))
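

# Illustrative sketch (not part of the original tests): the constant branch of
# create_test_input() above fills every batch item and channel with the same
# grid, whose value at (row, col) is row + col. The plain-Python helper below
# builds that single grid; the name _sketch_mesh_grid is hypothetical.
def _sketch_mesh_grid(height, width):
  """Returns a height x width nested list with entries row + col."""
  return [[row + col for col in range(width)] for row in range(height)]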


class ResnetUtilsTest(tf.test.TestCase):

  def testSubsampleThreeByThree(self):
    x = tf.reshape(tf.to_float(tf.range(9)), [1, 3, 3, 1])
    x = resnet_utils.subsample(x, 2)
    expected = tf.reshape(tf.constant([0, 2, 6, 8]), [1, 2, 2, 1])
    with self.test_session():
      self.assertAllClose(x.eval(), expected.eval())

  def testSubsampleFourByFour(self):
    x = tf.reshape(tf.to_float(tf.range(16)), [1, 4, 4, 1])
    x = resnet_utils.subsample(x, 2)
    expected = tf.reshape(tf.constant([0, 2, 8, 10]), [1, 2, 2, 1])
    with self.test_session():
      self.assertAllClose(x.eval(), expected.eval())

  def testConv2DSameEven(self):
    n, n2 = 4, 2

    # Input image.
    x = create_test_input(1, n, n, 1)

    # Convolution kernel.
    w = create_test_input(1, 3, 3, 1)
    w = tf.reshape(w, [3, 3, 1, 1])

    tf.get_variable('Conv/weights', initializer=w)
    tf.get_variable('Conv/biases', initializer=tf.zeros([1]))
    tf.get_variable_scope().reuse_variables()

    y1 = slim.conv2d(x, 1, [3, 3], stride=1, scope='Conv')
    y1_expected = tf.to_float([[14, 28, 43, 26],
                               [28, 48, 66, 37],
                               [43, 66, 84, 46],
                               [26, 37, 46, 22]])
    y1_expected = tf.reshape(y1_expected, [1, n, n, 1])

    y2 = resnet_utils.subsample(y1, 2)
    y2_expected = tf.to_float([[14, 43],
                               [43, 84]])
    y2_expected = tf.reshape(y2_expected, [1, n2, n2, 1])

    y3 = resnet_utils.conv2d_same(x, 1, 3, stride=2, scope='Conv')
    y3_expected = y2_expected

    y4 = slim.conv2d(x, 1, [3, 3], stride=2, scope='Conv')
    y4_expected = tf.to_float([[48, 37],
                               [37, 22]])
    y4_expected = tf.reshape(y4_expected, [1, n2, n2, 1])

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      self.assertAllClose(y1.eval(), y1_expected.eval())
      self.assertAllClose(y2.eval(), y2_expected.eval())
      self.assertAllClose(y3.eval(), y3_expected.eval())
      self.assertAllClose(y4.eval(), y4_expected.eval())

  def testConv2DSameOdd(self):
    n, n2 = 5, 3

    # Input image.
    x = create_test_input(1, n, n, 1)

    # Convolution kernel.
    w = create_test_input(1, 3, 3, 1)
    w = tf.reshape(w, [3, 3, 1, 1])

    tf.get_variable('Conv/weights', initializer=w)
    tf.get_variable('Conv/biases', initializer=tf.zeros([1]))
    tf.get_variable_scope().reuse_variables()

    y1 = slim.conv2d(x, 1, [3, 3], stride=1, scope='Conv')
    y1_expected = tf.to_float([[14, 28, 43, 58, 34],
                               [28, 48, 66, 84, 46],
                               [43, 66, 84, 102, 55],
                               [58, 84, 102, 120, 64],
                               [34, 46, 55, 64, 30]])
    y1_expected = tf.reshape(y1_expected, [1, n, n, 1])

    y2 = resnet_utils.subsample(y1, 2)
    y2_expected = tf.to_float([[14, 43, 34],
                               [43, 84, 55],
                               [34, 55, 30]])
    y2_expected = tf.reshape(y2_expected, [1, n2, n2, 1])

    y3 = resnet_utils.conv2d_same(x, 1, 3, stride=2, scope='Conv')
    y3_expected = y2_expected

    y4 = slim.conv2d(x, 1, [3, 3], stride=2, scope='Conv')
    y4_expected = y2_expected

    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      self.assertAllClose(y1.eval(), y1_expected.eval())
      self.assertAllClose(y2.eval(), y2_expected.eval())
      self.assertAllClose(y3.eval(), y3_expected.eval())
      self.assertAllClose(y4.eval(), y4_expected.eval())

  def _resnet_plain(self, inputs, blocks, output_stride=None, scope=None):
    """A plain ResNet without extra layers before or after the ResNet blocks."""
    with tf.variable_scope(scope, values=[inputs]):
      with slim.arg_scope([slim.conv2d], outputs_collections='end_points'):
        net = resnet_utils.stack_blocks_dense(inputs, blocks, output_stride)
        end_points = slim.utils.convert_collection_to_dict('end_points')
        return net, end_points

  def testEndPointsV2(self):
    """Test the end points of a tiny v2 bottleneck network."""
    blocks = [
        resnet_v2.resnet_v2_block(
            'block1', base_depth=1, num_units=2, stride=2),
        resnet_v2.resnet_v2_block(
            'block2', base_depth=2, num_units=2, stride=1),
    ]
    inputs = create_test_input(2, 32, 16, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      _, end_points = self._resnet_plain(inputs, blocks, scope='tiny')
    expected = [
        'tiny/block1/unit_1/bottleneck_v2/shortcut',
        'tiny/block1/unit_1/bottleneck_v2/conv1',
        'tiny/block1/unit_1/bottleneck_v2/conv2',
        'tiny/block1/unit_1/bottleneck_v2/conv3',
        'tiny/block1/unit_2/bottleneck_v2/conv1',
        'tiny/block1/unit_2/bottleneck_v2/conv2',
        'tiny/block1/unit_2/bottleneck_v2/conv3',
        'tiny/block2/unit_1/bottleneck_v2/shortcut',
        'tiny/block2/unit_1/bottleneck_v2/conv1',
        'tiny/block2/unit_1/bottleneck_v2/conv2',
        'tiny/block2/unit_1/bottleneck_v2/conv3',
        'tiny/block2/unit_2/bottleneck_v2/conv1',
        'tiny/block2/unit_2/bottleneck_v2/conv2',
        'tiny/block2/unit_2/bottleneck_v2/conv3']
    self.assertItemsEqual(expected, end_points)

  def _stack_blocks_nondense(self, net, blocks):
    """A simplified ResNet Block stacker without output stride control."""
    for block in blocks:
      with tf.variable_scope(block.scope, 'block', [net]):
        for i, unit in enumerate(block.args):
          with tf.variable_scope('unit_%d' % (i + 1), values=[net]):
            net = block.unit_fn(net, rate=1, **unit)
    return net

  def testAtrousValuesBottleneck(self):
    """Verify the values of dense feature extraction by atrous convolution.

    Make sure that dense feature extraction by stack_blocks_dense() followed by
    subsampling gives identical results to feature extraction at the nominal
    network output stride using the simple self._stack_blocks_nondense() above.
    """
    block = resnet_v2.resnet_v2_block
    blocks = [
        block('block1', base_depth=1, num_units=2, stride=2),
        block('block2', base_depth=2, num_units=2, stride=2),
        block('block3', base_depth=4, num_units=2, stride=2),
        block('block4', base_depth=8, num_units=2, stride=1),
    ]
    nominal_stride = 8

    # Test both odd and even input dimensions.
    height = 30
    width = 31
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      with slim.arg_scope([slim.batch_norm], is_training=False):
        for output_stride in [1, 2, 4, 8, None]:
          with tf.Graph().as_default():
            with self.test_session() as sess:
              tf.set_random_seed(0)
              inputs = create_test_input(1, height, width, 3)
              # Dense feature extraction followed by subsampling.
              output = resnet_utils.stack_blocks_dense(inputs,
                                                       blocks,
                                                       output_stride)
              if output_stride is None:
                factor = 1
              else:
                factor = nominal_stride // output_stride

              output = resnet_utils.subsample(output, factor)
              # Make the two networks use the same weights.
              tf.get_variable_scope().reuse_variables()
              # Feature extraction at the nominal network rate.
              expected = self._stack_blocks_nondense(inputs, blocks)
              sess.run(tf.global_variables_initializer())
              output, expected = sess.run([output, expected])
              self.assertAllClose(output, expected, atol=1e-4, rtol=1e-4)


class ResnetCompleteNetworkTest(tf.test.TestCase):
  """Tests with complete small ResNet v2 networks."""

  def _resnet_small(self,
                    inputs,
                    num_classes=None,
                    is_training=True,
                    global_pool=True,
                    output_stride=None,
                    include_root_block=True,
                    spatial_squeeze=True,
                    reuse=None,
                    scope='resnet_v2_small'):
    """A shallow and thin ResNet v2 for faster tests."""
    block = resnet_v2.resnet_v2_block
    blocks = [
        block('block1', base_depth=1, num_units=3, stride=2),
        block('block2', base_depth=2, num_units=3, stride=2),
        block('block3', base_depth=4, num_units=3, stride=2),
        block('block4', base_depth=8, num_units=2, stride=1),
    ]
    return resnet_v2.resnet_v2(inputs, blocks, num_classes,
                               is_training=is_training,
                               global_pool=global_pool,
                               output_stride=output_stride,
                               include_root_block=include_root_block,
                               spatial_squeeze=spatial_squeeze,
                               reuse=reuse,
                               scope=scope)

  def testClassificationEndPoints(self):
    global_pool = True
    num_classes = 10
    inputs = create_test_input(2, 224, 224, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      logits, end_points = self._resnet_small(inputs, num_classes,
                                              global_pool=global_pool,
                                              spatial_squeeze=False,
                                              scope='resnet')
    self.assertTrue(logits.op.name.startswith('resnet/logits'))
    self.assertListEqual(logits.get_shape().as_list(), [2, 1, 1, num_classes])
    self.assertTrue('predictions' in end_points)
    self.assertListEqual(end_points['predictions'].get_shape().as_list(),
                         [2, 1, 1, num_classes])

  def testClassificationShapes(self):
    global_pool = True
    num_classes = 10
    inputs = create_test_input(2, 224, 224, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      _, end_points = self._resnet_small(inputs, num_classes,
                                         global_pool=global_pool,
                                         scope='resnet')
      endpoint_to_shape = {
          'resnet/block1': [2, 28, 28, 4],
          'resnet/block2': [2, 14, 14, 8],
          'resnet/block3': [2, 7, 7, 16],
          'resnet/block4': [2, 7, 7, 32]}
      for endpoint in endpoint_to_shape:
        shape = endpoint_to_shape[endpoint]
        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)

  def testFullyConvolutionalEndpointShapes(self):
    global_pool = False
    num_classes = 10
    inputs = create_test_input(2, 321, 321, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      _, end_points = self._resnet_small(inputs, num_classes,
                                         global_pool=global_pool,
                                         spatial_squeeze=False,
                                         scope='resnet')
      endpoint_to_shape = {
          'resnet/block1': [2, 41, 41, 4],
          'resnet/block2': [2, 21, 21, 8],
          'resnet/block3': [2, 11, 11, 16],
          'resnet/block4': [2, 11, 11, 32]}
      for endpoint in endpoint_to_shape:
        shape = endpoint_to_shape[endpoint]
        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)

  def testRootlessFullyConvolutionalEndpointShapes(self):
    global_pool = False
    num_classes = 10
    inputs = create_test_input(2, 128, 128, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      _, end_points = self._resnet_small(inputs, num_classes,
                                         global_pool=global_pool,
                                         include_root_block=False,
                                         spatial_squeeze=False,
                                         scope='resnet')
      endpoint_to_shape = {
          'resnet/block1': [2, 64, 64, 4],
          'resnet/block2': [2, 32, 32, 8],
          'resnet/block3': [2, 16, 16, 16],
          'resnet/block4': [2, 16, 16, 32]}
      for endpoint in endpoint_to_shape:
        shape = endpoint_to_shape[endpoint]
        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)

  def testAtrousFullyConvolutionalEndpointShapes(self):
    global_pool = False
    num_classes = 10
    output_stride = 8
    inputs = create_test_input(2, 321, 321, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      _, end_points = self._resnet_small(inputs,
                                         num_classes,
                                         global_pool=global_pool,
                                         output_stride=output_stride,
                                         spatial_squeeze=False,
                                         scope='resnet')
      endpoint_to_shape = {
          'resnet/block1': [2, 41, 41, 4],
          'resnet/block2': [2, 41, 41, 8],
          'resnet/block3': [2, 41, 41, 16],
          'resnet/block4': [2, 41, 41, 32]}
      for endpoint in endpoint_to_shape:
        shape = endpoint_to_shape[endpoint]
        self.assertListEqual(end_points[endpoint].get_shape().as_list(), shape)

  def testAtrousFullyConvolutionalValues(self):
    """Verify dense feature extraction with atrous convolution."""
    nominal_stride = 32
    for output_stride in [4, 8, 16, 32, None]:
      with slim.arg_scope(resnet_utils.resnet_arg_scope()):
        with tf.Graph().as_default():
          with self.test_session() as sess:
            tf.set_random_seed(0)
            inputs = create_test_input(2, 81, 81, 3)
            # Dense feature extraction followed by subsampling.
            output, _ = self._resnet_small(inputs, None,
                                           is_training=False,
                                           global_pool=False,
                                           output_stride=output_stride)
            if output_stride is None:
              factor = 1
            else:
              factor = nominal_stride // output_stride
            output = resnet_utils.subsample(output, factor)
            # Make the two networks use the same weights.
            tf.get_variable_scope().reuse_variables()
            # Feature extraction at the nominal network rate.
            expected, _ = self._resnet_small(inputs, None,
                                             is_training=False,
                                             global_pool=False)
            sess.run(tf.global_variables_initializer())
            self.assertAllClose(output.eval(), expected.eval(),
                                atol=1e-4, rtol=1e-4)

  def testUnknownBatchSize(self):
    batch = 2
    height, width = 65, 65
    global_pool = True
    num_classes = 10
    inputs = create_test_input(None, height, width, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      logits, _ = self._resnet_small(inputs, num_classes,
                                     global_pool=global_pool,
                                     spatial_squeeze=False,
                                     scope='resnet')
    self.assertTrue(logits.op.name.startswith('resnet/logits'))
    self.assertListEqual(logits.get_shape().as_list(),
                         [None, 1, 1, num_classes])
    images = create_test_input(batch, height, width, 3)
    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(logits, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch, 1, 1, num_classes))

  def testFullyConvolutionalUnknownHeightWidth(self):
    batch = 2
    height, width = 65, 65
    global_pool = False
    inputs = create_test_input(batch, None, None, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      output, _ = self._resnet_small(inputs, None,
                                     global_pool=global_pool)
    self.assertListEqual(output.get_shape().as_list(),
                         [batch, None, None, 32])
    images = create_test_input(batch, height, width, 3)
    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(output, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch, 3, 3, 32))

  def testAtrousFullyConvolutionalUnknownHeightWidth(self):
    batch = 2
    height, width = 65, 65
    global_pool = False
    output_stride = 8
    inputs = create_test_input(batch, None, None, 3)
    with slim.arg_scope(resnet_utils.resnet_arg_scope()):
      output, _ = self._resnet_small(inputs,
                                     None,
                                     global_pool=global_pool,
                                     output_stride=output_stride)
    self.assertListEqual(output.get_shape().as_list(),
                         [batch, None, None, 32])
    images = create_test_input(batch, height, width, 3)
    with self.test_session() as sess:
      sess.run(tf.global_variables_initializer())
      output = sess.run(output, {inputs: images.eval()})
      self.assertEqual(output.shape, (batch, 9, 9, 32))
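
# Illustration only: the spatial sizes asserted in the tests above are
# consistent with ceil(input_size / stride). This is an observation about the
# numbers in this file, not a description of how resnet_v2 computes shapes;
# `expected_spatial_size` is a hypothetical helper.

```python
def expected_spatial_size(input_size, stride):
    """Ceiling division without floating point."""
    return -(-input_size // stride)


# 321x321 input, endpoints at overall strides 8, 16 and 32.
print([expected_spatial_size(321, s) for s in (8, 16, 32)])
# 65x65 input at the nominal stride (32) and with output_stride=8.
print(expected_spatial_size(65, 32), expected_spatial_size(65, 8))
```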


if __name__ == '__main__':
  tf.test.main()

================================================
FILE: preprocessing/__init__.py
================================================


================================================
FILE: preprocessing/decode_example.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

def decode_serialized_example(serialized_example, features_to_fetch, decode_image=True):
    """
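    Args:
        serialized_example : a serialized tf.Example read from a tfrecord file
        features_to_fetch : a list of tuples (feature key, name for feature)
        decode_image : if True, 'image/encoded' is decoded as a 3-channel JPEG;
            otherwise the encoded bytes are returned unchanged
    Returns:
        dictionary : maps each requested name to its parsed feature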
    """

    feature_map = {}
    for feature_key, feature_name in features_to_fetch:
        feature_map[feature_key] = {
            'image/height': tf.FixedLenFeature([], tf.int64),
            'image/width': tf.FixedLenFeature([], tf.int64),
            'image/colorspace': tf.FixedLenFeature([], tf.string),
            'image/channels': tf.FixedLenFeature([], tf.int64),
            'image/format': tf.FixedLenFeature([], tf.string),
            'image/filename': tf.FixedLenFeature([], tf.string),
            'image/id': tf.FixedLenFeature([], tf.string),
            'image/encoded': tf.FixedLenFeature([], tf.string),
            'image/extra': tf.FixedLenFeature([], tf.string),
            'image/class/label': tf.FixedLenFeature([], tf.int64),
            'image/class/text': tf.FixedLenFeature([], tf.string),
            'image/class/conf':  tf.FixedLenFeature([], tf.float32),
            'image/object/bbox/xmin': tf.VarLenFeature(dtype=tf.float32),
            'image/object/bbox/xmax': tf.VarLenFeature(dtype=tf.float32),
            'image/object/bbox/ymin': tf.VarLenFeature(dtype=tf.float32),
            'image/object/bbox/ymax': tf.VarLenFeature(dtype=tf.float32),
            'image/object/bbox/label': tf.VarLenFeature(dtype=tf.int64),
            'image/object/bbox/text': tf.VarLenFeature(dtype=tf.string),
            'image/object/bbox/conf': tf.VarLenFeature(dtype=tf.float32),
            'image/object/bbox/score' : tf.VarLenFeature(dtype=tf.float32),
            'image/object/parts/x' : tf.VarLenFeature(dtype=tf.float32),
            'image/object/parts/y' : tf.VarLenFeature(dtype=tf.float32),
            'image/object/parts/v' : tf.VarLenFeature(dtype=tf.int64),
            'image/object/parts/score' : tf.VarLenFeature(dtype=tf.float32),
            'image/object/count' : tf.FixedLenFeature([], tf.int64),
            'image/object/area' : tf.VarLenFeature(dtype=tf.float32),
            'image/object/id' : tf.VarLenFeature(dtype=tf.string)
        }[feature_key]

    features = tf.parse_single_example(
      serialized_example,
      features = feature_map
    )

    # return a dictionary of the features
    parsed_features = {}

    for feature_key, feature_name in features_to_fetch:
        if feature_key == 'image/height':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/width':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/colorspace':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/channels':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/format':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/filename':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/id':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/encoded':
            if decode_image:
                parsed_features[feature_name] = tf.image.decode_jpeg(features[feature_key], channels=3)
            else:
                parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/extra':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/class/label':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/class/text':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/class/conf':
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/object/bbox/xmin':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/xmax':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/ymin':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/ymax':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/label':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/text':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/conf':
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/bbox/score' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/parts/x' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/parts/y' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/parts/v' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/parts/score' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/count' :
            parsed_features[feature_name] = features[feature_key]
        elif feature_key == 'image/object/area' :
            parsed_features[feature_name] = features[feature_key].values
        elif feature_key == 'image/object/id' :
            parsed_features[feature_name] = features[feature_key].values

    return parsed_features
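
# Illustration only: a plain-Python sketch of the (feature key, name) contract
# used by `features_to_fetch`. A dict stands in for the parsed example, and
# `rename_features` is a hypothetical helper; the real function additionally
# decodes the JPEG and unwraps `.values` for variable-length features.

```python
def rename_features(parsed, features_to_fetch):
    """Return {name: parsed[key]} for each (key, name) pair."""
    return {name: parsed[key] for key, name in features_to_fetch}


# Stand-in values, made up for the example.
parsed = {'image/id': b'42', 'image/class/label': 7, 'image/height': 480}
features_to_fetch = [('image/id', 'id'), ('image/class/label', 'label')]
result = rename_features(parsed, features_to_fetch)
print(sorted(result))
```

# Only the requested keys appear in the result, under the caller's names.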

================================================
FILE: preprocessing/inputs.py
================================================
# Some of this code came from the https://github.com/tensorflow/models/tree/master/slim
# directory, so let's keep the Google license around for now.
#
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Provides utilities to preprocess images for the Inception networks."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from easydict import EasyDict
import tensorflow as tf
from tensorflow.python.ops import control_flow_ops

from preprocessing.decode_example import decode_serialized_example


def apply_with_random_selector(x, func, num_cases):
  """Computes func(x, sel), with sel sampled from [0...num_cases-1].
  Args:
    x: input Tensor.
    func: Python function to apply.
    num_cases: Python int32, number of cases to sample sel from.
  Returns:
    The result of func(x, sel), where func receives the value of the
    selector as a python integer, but sel is sampled dynamically.
  """
  sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
  # Pass the real x only to one of the func calls.
  return control_flow_ops.merge([
      func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
      for case in range(num_cases)])[0]
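
# Illustration only: an eager, plain-Python analogue of the selector above.
# The TensorFlow version builds `func` once per case at graph-construction time
# and routes `x` through switch/merge so that only the sampled branch receives
# the real input; this sketch simply samples `sel` and calls `func` once.
# `apply_with_random_selector_py` and its `rng` argument are hypothetical.

```python
import random


def apply_with_random_selector_py(x, func, num_cases, rng=random):
    """Call func(x, sel) with sel drawn uniformly from [0, num_cases - 1]."""
    sel = rng.randrange(num_cases)
    return func(x, sel)


rng = random.Random(0)
out = apply_with_random_selector_py(
    10, lambda x, case: x + case, num_cases=4, rng=rng)
print(out)
```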


def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
  """Distort the color of a Tensor image.
  Each color distortion is non-commutative and thus ordering of the color ops
  matters. Ideally we would randomly permute the ordering of the color ops.
  Rather than adding that level of complication, we select a distinct ordering
  of color ops for each preprocessing thread.
  Args:
    image: 3-D Tensor containing single image in [0, 1].
    color_ordering: Python int, a type of distortion (valid values: 0-3).
    fast_mode: Avoids slower ops (random_hue and random_contrast)
    scope: Optional scope for name_scope.
  Returns:
    3-D Tensor color-distorted image on range [0, 1]
  Raises:
    ValueError: if color_ordering not in [0, 3]
  """
  with tf.name_scope(scope, 'distort_color', [image]):
    if fast_mode:
      if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
      else:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
    else:
      if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
      elif color_ordering == 1:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
      elif color_ordering == 2:
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
      elif color_ordering == 3:
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32. / 255.)
      else:
        raise ValueError('color_ordering must be in [0, 3]')

    # The random_* ops do not necessarily clamp.
    return tf.clip_by_value(image, 0.0, 1.0)

def distorted_bounding_box_crop(image,
                                bbox,
                                min_object_covered=0.1,
                                aspect_ratio_range=(0.75, 1.33),
                                area_range=(0.05, 1.0),
                                max_attempts=100,
                                scope=None):
  """Generates cropped_image using one of the bboxes randomly distorted.
  See `tf.image.sample_distorted_bounding_box` for more documentation.
  Args:
    image: 3-D Tensor of image (it will be converted to floats in [0, 1]).
    bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
      where each coordinate is [0, 1) and the coordinates are arranged
      as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole
      image.
    min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
      area of the image must contain at least this fraction of any bounding box
      supplied.
    aspect_ratio_range: An optional list of `floats`. The cropped area of the
      image must have an aspect ratio = width / height within this range.
    area_range: An optional list of `floats`. The cropped area of the image
      must contain a fraction of the supplied image within this range.
    max_attempts: An optional `int`. Number of attempts at generating a cropped
      region of the image of the specified constraints. After `max_attempts`
      failures, return the entire image.
    scope: Optional scope for name_scope.
  Returns:
    A tuple, a 3-D Tensor cropped_image and the distorted bbox
  """
  with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
    # Each bounding box has shape [1, num_boxes, box coords] and
    # the coordinates are ordered [ymin, xmin, ymax, xmax].

    # A large fraction of image datasets contain a human-annotated bounding
    # box delineating the region of the image containing the object of interest.
    # We choose to create a new bounding box for the object which is a randomly
    # distorted version of the human-annotated bounding box that obeys an
    # allowed range of aspect ratios, sizes and overlap with the human-annotated
    # bounding box. If no box is supplied, then we assume the bounding box is
    # the entire image.
    sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
        tf.shape(image),
        bounding_boxes=bbox,
        min_object_covered=min_object_covered,
        aspect_ratio_range=aspect_ratio_range,
        area_range=area_range,
        max_attempts=max_attempts,
        use_image_if_no_bounding_boxes=True)
    bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box

    # Crop the image to the specified bounding box.
    cropped_image = tf.slice(image, bbox_begin, bbox_size)
    return tf.tuple([cropped_image, distort_bbox])

def _largest_size_at_most(height, width, largest_side):
  """Computes new shape with the largest side equal to `largest_side`.
  Computes new shape with the largest side equal to `largest_side` while
  preserving the original aspect ratio.
  Args:
    height: an int32 scalar tensor indicating the current height.
    width: an int32 scalar tensor indicating the current width.
    largest_side: A python integer or scalar `Tensor` indicating the size of
      the largest side after resize.
  Returns:
    new_height: an int32 scalar tensor indicating the new height.
    new_width: an int32 scalar tensor indicating the new width.
  """
  largest_side = tf.convert_to_tensor(largest_side, dtype=tf.int32)

  height = tf.to_float(height)
  width = tf.to_float(width)
  largest_side = tf.to_float(largest_side)

  scale = tf.cond(tf.greater(height, width),
                  lambda: largest_side / height,
                  lambda: largest_side / width)
  new_height = tf.to_int32(height * scale)
  new_width = tf.to_int32(width * scale)
  return new_height, new_width
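
# Illustration only: the same arithmetic in plain Python, for checking expected
# sizes by hand. `largest_size_at_most_py` is a hypothetical helper. Like the
# tensor version it truncates toward zero, so non-exact scales may land one
# pixel below `largest_side`.

```python
def largest_size_at_most_py(height, width, largest_side):
    """Scale (height, width) so the larger side becomes `largest_side`."""
    if height > width:
        scale = largest_side / float(height)
    else:
        scale = largest_side / float(width)
    return int(height * scale), int(width * scale)


print(largest_size_at_most_py(400, 200, 200))
print(largest_size_at_most_py(100, 400, 200))
```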

class DistortedInputs():

    def __init__(self, cfg, add_summaries):
        self.cfg = cfg
        self.add_summaries = add_summaries

    def apply(self, original_image, bboxes, distorted_inputs, image_summaries, current_index):

        cfg = self.cfg
        add_summaries = self.add_summaries

        image_shape = tf.shape(original_image)
        image_height = tf.cast(image_shape[0], dtype=tf.float32) # cast so that we can multiply them by the bbox coords
        image_width = tf.cast(image_shape[1], dtype=tf.float32)

        # First thing we need to do is crop out the bbox region from the image
        bbox = bboxes[current_index]
        xmin = tf.cast(bbox[0] * image_width, tf.int32)
        ymin = tf.cast(bbox[1] * image_height, tf.int32)
        xmax = tf.cast(bbox[2] * image_width, tf.int32)
        ymax = tf.cast(bbox[3] * image_height, tf.int32)
        bbox_width = xmax - xmin
        bbox_height = ymax - ymin

        image = tf.image.crop_to_bounding_box(
            image=original_image,
            offset_height=ymin,
            offset_width=xmin,
            target_height=bbox_height,
            target_width=bbox_width
        )
        image_height = bbox_height
        image_width = bbox_width

        # Convert the pixel values to be in the range [0,1]
        if image.dtype != tf.float32:
          image = tf.image.convert_image_dtype(image, dtype=tf.float32)

        # Add a summary of the original data
        if add_summaries:
            new_height, new_width = _largest_size_at_most(image_height, image_width, cfg.INPUT_SIZE)
            resized_original_image = tf.image.resize_bilinear(tf.expand_dims(image, 0), [new_height, new_width])
            resized_original_image = tf.squeeze(resized_original_image)
            resized_original_image = tf.image.pad_to_bounding_box(resized_original_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)

            # If there are multiple boxes for an image, we only want to write to the TensorArray once.
            #image_summaries = image_summaries.write(0, tf.expand_dims(resized_original_image, 0))
            image_summaries = tf.cond(tf.equal(current_index, 0),
                lambda: image_summaries.write(0, tf.expand_dims(resized_original_image, 0)),
                lambda: image_summaries.identity()
            )

        # Extract a distorted bbox
        if cfg.DO_RANDOM_CROP > 0:
            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)
            do_crop = tf.less(r, cfg.DO_RANDOM_CROP)
            rc_cfg = cfg.RANDOM_CROP_CFG
            bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
            distorted_image, distorted_bbox = tf.cond(do_crop,
                    lambda: distorted_bounding_box_crop(image, bbox,
                                                           aspect_ratio_range=(rc_cfg.MIN_ASPECT_RATIO, rc_cfg.MAX_ASPECT_RATIO),
                                                           area_range=(rc_cfg.MIN_AREA, rc_cfg.MAX_AREA),
                                                           max_attempts=rc_cfg.MAX_ATTEMPTS),
                    lambda: tf.tuple([image, bbox])
                )
        else:
            distorted_image = tf.identity(image)
            distorted_bbox = tf.constant([[[0.0, 0.0, 1.0, 1.0]]]) # ymin, xmin, ymax, xmax

        if cfg.DO_CENTRAL_CROP > 0:
            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)
            do_crop = tf.less(r, cfg.DO_CENTRAL_CROP)
            distorted_image = tf.cond(do_crop,
                lambda: tf.image.central_crop(distorted_image, cfg.CENTRAL_CROP_FRACTION),
                lambda: tf.identity(distorted_image)
            )

        distorted_image.set_shape([None, None, 3])

        # Add a summary
        if add_summaries:
            image_with_bbox = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), distorted_bbox)
            new_height, new_width = _largest_size_at_most(image_height, image_width, cfg.INPUT_SIZE)
            resized_image_with_bbox = tf.image.resize_bilinear(image_with_bbox, [new_height, new_width])
            resized_image_with_bbox = tf.squeeze(resized_image_with_bbox)
            resized_image_with_bbox = tf.image.pad_to_bounding_box(resized_image_with_bbox, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)
            #image_summaries = image_summaries.write(1, tf.expand_dims(resized_image_with_bbox, 0))
            image_summaries = tf.cond(tf.equal(current_index, 0),
                lambda: image_summaries.write(1, tf.expand_dims(resized_image_with_bbox, 0)),
                lambda: image_summaries.identity()
            )

        # Resize the distorted image to the correct dimensions for the network
        if cfg.MAINTAIN_ASPECT_RATIO:
            shape = tf.shape(distorted_image)
            height = shape[0]
            width = shape[1]
            new_height, new_width = _largest_size_at_most(height, width, cfg.INPUT_SIZE)
        else:
            new_height = cfg.INPUT_SIZE
            new_width = cfg.INPUT_SIZE

        num_resize_cases = 1 if cfg.RESIZE_FAST else 4
        distorted_image = apply_with_random_selector(
            distorted_image,
            lambda x, method: tf.image.resize_images(x, [new_height, new_width], method=method),
            num_cases=num_resize_cases)

        distorted_image = tf.image.pad_to_bounding_box(distorted_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)

        if add_summaries:
            #image_summaries = image_summaries.write(2, tf.expand_dims(distorted_image, 0))
            image_summaries = tf.cond(tf.equal(current_index, 0),
                lambda: image_summaries.write(2, tf.expand_dims(distorted_image, 0)),
                lambda: image_summaries.identity()
            )

        # Randomly flip the image (with probability 0.5 whenever DO_RANDOM_FLIP_LEFT_RIGHT > 0):
        if cfg.DO_RANDOM_FLIP_LEFT_RIGHT > 0:
          r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)
          do_flip = tf.less(r, 0.5)
          distorted_image = tf.cond(do_flip, lambda: tf.image.flip_left_right(distorted_image), lambda: tf.identity(distorted_image))

        # TODO: Can this be changed so that we don't always distort the colors?
        # Distort the colors
        if cfg.DO_COLOR_DISTORTION > 0:
            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)
            do_color_distortion = tf.less(r, cfg.DO_COLOR_DISTORTION)
            num_color_cases = 1 if cfg.COLOR_DISTORT_FAST else 4
            distorted_color_image = apply_with_random_selector(
              distorted_image,
              lambda x, ordering: distort_color(x, ordering, fast_mode=cfg.COLOR_DISTORT_FAST),
              num_cases=num_color_cases)
            distorted_image = tf.cond(do_color_distortion, lambda: tf.identity(distorted_color_image), lambda: tf.identity(distorted_image))

        distorted_image.set_shape([cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])

        # Add a summary
        if add_summaries:
            #image_summaries = image_summaries.write(3, tf.expand_dims(distorted_image, 0))
            image_summaries = tf.cond(tf.equal(current_index, 0),
                lambda: image_summaries.write(3, tf.expand_dims(distorted_image, 0)),
                lambda: image_summaries.identity()
            )

        # Add the distorted image to the TensorArray
        distorted_inputs = distorted_inputs.write(current_index, tf.expand_dims(distorted_image, 0))

        return [original_image, bboxes, distorted_inputs, image_summaries, current_index + 1]

def check_normalized_box_values(xmin, ymin, xmax, ymax, maximum_normalized_coordinate=1.01, prefix=""):
    """ Make sure the normalized coordinates do not exceed
    `maximum_normalized_coordinate` (1.01 by default).
    """

    # Compare against the configured maximum rather than a hard-coded 1.01,
    # so that the check and the error message always agree.
    xmin_maximum = tf.reduce_max(xmin)
    xmin_assert = tf.Assert(
        tf.greater_equal(maximum_normalized_coordinate, xmin_maximum),
        ['%s, maximum xmin coordinate value is larger '
        'than %f: ' % (prefix, maximum_normalized_coordinate), xmin_maximum])
    with tf.control_dependencies([xmin_assert]):
        xmin = tf.identity(xmin)

    ymin_maximum = tf.reduce_max(ymin)
    ymin_assert = tf.Assert(
        tf.greater_equal(maximum_normalized_coordinate, ymin_maximum),
        ['%s, maximum ymin coordinate value is larger '
        'than %f: ' % (prefix, maximum_normalized_coordinate), ymin_maximum])
    with tf.control_dependencies([ymin_assert]):
        ymin = tf.identity(ymin)

    xmax_maximum = tf.reduce_max(xmax)
    xmax_assert = tf.Assert(
        tf.greater_equal(maximum_normalized_coordinate, xmax_maximum),
        ['%s, maximum xmax coordinate value is larger '
        'than %f: ' % (prefix, maximum_normalized_coordinate), xmax_maximum])
    with tf.control_dependencies([xmax_assert]):
        xmax = tf.identity(xmax)

    ymax_maximum = tf.reduce_max(ymax)
    ymax_assert = tf.Assert(
        tf.greater_equal(maximum_normalized_coordinate, ymax_maximum),
        ['%s, maximum ymax coordinate value is larger '
        'than %f: ' % (prefix, maximum_normalized_coordinate), ymax_maximum])
    with tf.control_dependencies([ymax_assert]):
        ymax = tf.identity(ymax)

    return xmin, ymin, xmax, ymax

def expand_bboxes(xmin, xmax, ymin, ymax, cfg):
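    """
    Expand the bboxes. Each side is pushed outward by half of the box's
    width (or height) scaled by the corresponding expansion factor, and the
    result is clipped to the normalized range [0, 1].
    """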

    w = xmax - xmin
    h = ymax - ymin

    w = w * cfg.WIDTH_EXPANSION_FACTOR
    h = h * cfg.HEIGHT_EXPANSION_FACTOR

    half_w = w / 2.
    half_h = h / 2.

    xmin = tf.clip_by_value(xmin - half_w, 0, 1)
    xmax = tf.clip_by_value(xmax + half_w, 0, 1)
    ymin = tf.clip_by_value(ymin - half_h, 0, 1)
    ymax = tf.clip_by_value(ymax + half_h, 0, 1)

    return tf.tuple([xmin, xmax, ymin, ymax])

def get_region_data(serialized_example, cfg, fetch_ids=True, fetch_labels=True, fetch_text_labels=True, read_filename=False):
    """
    Return a dict with the image ('image'), an array of bounding boxes
    ('bboxes') and, when requested, the ids ('ids'), labels ('labels') and
    text labels ('text').
    """

    feature_dict = {}

    if cfg.REGION_TYPE == 'bbox':

        bbox_cfg = cfg.BBOX_CFG

        features_to_extract = [('image/object/bbox/xmin', 'xmin'),
                               ('image/object/bbox/xmax', 'xmax'),
                               ('image/object/bbox/ymin', 'ymin'),
                               ('image/object/bbox/ymax', 'ymax')]

        if read_filename:
            features_to_extract.append(('image/filename', 'filename'))
        else:
            features_to_extract.append(('image/encoded', 'image'))

        if fetch_ids:
            features_to_extract.append(('image/object/id', 'id'))

        if fetch_labels:
            features_to_extract.append(('image/object/bbox/label', 'label'))

        if fetch_text_labels:
            features_to_extract.append(('image/object/bbox/text', 'text'))

        features = decode_serialized_example(serialized_example, features_to_extract)

        if read_filename:
            image_buffer = tf.read_file(features['filename'])
            image = tf.image.decode_jpeg(image_buffer, channels=3)
        else:
            image = features['image']

        feature_dict['image'] = image

        xmin = tf.expand_dims(features['xmin'], 0)
        ymin = tf.expand_dims(features['ymin'], 0)
        xmax = tf.expand_dims(features['xmax'], 0)
        ymax = tf.expand_dims(features['ymax'], 0)

        xmin, ymin, xmax, ymax = check_normalized_box_values(xmin, ymin, xmax, ymax, prefix="From tfrecords ")

        if 'DO_EXPANSION' in bbox_cfg and bbox_cfg.DO_EXPANSION > 0:
            r = tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32)
            do_expansion = tf.less(r, bbox_cfg.DO_EXPANSION)
            xmin, xmax, ymin, ymax = tf.cond(do_expansion,
                lambda: expand_bboxes(xmin, xmax, ymin, ymax, bbox_cfg.EXPANSION_CFG),
                lambda: tf.tuple([xmin, xmax, ymin, ymax])
            )

            xmin, ymin, xmax, ymax = check_normalized_box_values(xmin, ymin, xmax, ymax, prefix="After expansion ")

        # combine the bounding boxes
        bboxes = tf.concat(values=[xmin, ymin, xmax, ymax], axis=0)
        # order the bboxes so that they have the shape: [num_bboxes, bbox_coords]
        bboxes = tf.transpose(bboxes, [1, 0])

        feature_dict['bboxes'] = bboxes

        if fetch_ids:
            ids = features['id']
            feature_dict['ids'] = ids

        if fetch_labels:
            labels = features['label']
            feature_dict['labels'] = labels

        if fetch_text_labels:
            text = features['text']
            feature_dict['text'] = text

    elif cfg.REGION_TYPE == 'image':

        features_to_extract = []

        if read_filename:
            features_to_extract.append(('image/filename', 'filename'))
        else:
            features_to_extract.append(('image/encoded', 'image'))

        if fetch_ids:
            features_to_extract.append(('image/id', 'id'))

        if fetch_labels:
            features_to_extract.append(('image/class/label', 'label'))

        if fetch_text_labels:
            features_to_extract.append(('image/class/text', 'text'))

        features = decode_serialized_example(serialized_example, features_to_extract)

        if read_filename:
            image_buffer = tf.read_file(features['filename'])
            image = tf.image.decode_jpeg(image_buffer, channels=3)
        else:
            image = features['image']

        feature_dict['image'] = image

        bboxes = tf.constant([[0.0, 0.0, 1.0, 1.0]])
        feature_dict['bboxes'] = bboxes

        if fetch_ids:
            ids = [features['id']]
            feature_dict['ids'] = ids

        if fetch_labels:
            labels = [features['label']]
            feature_dict['labels'] = labels

        if fetch_text_labels:
            text = [features['text']]
            feature_dict['text'] = text

    else:
        raise ValueError("Unknown REGION_TYPE: %s" % (cfg.REGION_TYPE,))

    return feature_dict

def bbox_crop_loop_cond(original_image, bboxes, distorted_inputs, image_summaries, current_index):
    num_bboxes = tf.shape(bboxes)[0]
    return current_index < num_bboxes

def get_distorted_inputs(original_image, bboxes, cfg, add_summaries):

    distorter = DistortedInputs(cfg, add_summaries)
    num_bboxes = tf.shape(bboxes)[0]
    distorted_inputs = tf.TensorArray(
        dtype=tf.float32,
        size=num_bboxes,
        element_shape=tf.TensorShape([1, cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])
    )

    if add_summaries:
        image_summaries = tf.TensorArray(
            dtype=tf.float32,
            size=4,
            element_shape=tf.TensorShape([1, cfg.INPUT_SIZE, cfg.INPUT_SIZE, 3])
        )
    else:
        image_summaries = tf.constant([])

    current_index = tf.constant(0, dtype=tf.int32)

    loop_vars = [original_image, bboxes, distorted_inputs, image_summaries, current_index]
    original_image, bboxes, distorted_inputs, image_summaries, current_index = tf.while_loop(
        cond=bbox_crop_loop_cond,
        body=distorter.apply,
        loop_vars=loop_vars,
        parallel_iterations=10, back_prop=False, swap_memory=False
    )

    distorted_inputs = distorted_inputs.concat()

    if add_summaries:
        tf.summary.image('0.original_image', image_summaries.read(0))
        tf.summary.image('1.image_with_random_crop', image_summaries.read(1))
        tf.summary.image('2.cropped_resized_image', image_summaries.read(2))
        tf.summary.image('3.final_distorted_image', image_summaries.read(3))


    return distorted_inputs

def create_training_batch(serialized_example, cfg, add_summaries, read_filenames=False):

    features = get_region_data(serialized_example, cfg, fetch_ids=False,
                               fetch_labels=True, fetch_text_labels=False, read_filename=read_filenames)

    original_image = features['image']
    bboxes = features['bboxes']
    labels = features['labels']

    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)

    distorted_inputs = tf.subtract(distorted_inputs, 0.5)
    distorted_inputs = tf.multiply(distorted_inputs, 2.0)

    names = ('inputs', 'labels')
    tensors = [distorted_inputs, labels]
    return [names, tensors]

def create_visualization_batch(serialized_example, cfg, add_summaries, fetch_text_labels=False, read_filenames=False):

    features = get_region_data(serialized_example, cfg, fetch_ids=True,
                               fetch_labels=True, fetch_text_labels=fetch_text_labels, read_filename=read_filenames)

    original_image = features['image']
    ids = features['ids']
    bboxes = features['bboxes']
    labels = features['labels']
    if fetch_text_labels:
        text_labels = features['text']

    cpy_original_image = tf.identity(original_image)

    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)

    original_image = cpy_original_image

    # Resize the original image
    if original_image.dtype != tf.float32:
      original_image = tf.image.convert_image_dtype(original_image, dtype=tf.float32)
    shape = tf.shape(original_image)
    height = shape[0]
    width = shape[1]
    new_height, new_width = _largest_size_at_most(height, width, cfg.INPUT_SIZE)
    original_image = tf.image.resize_images(original_image, [new_height, new_width], method=0)
    original_image = tf.image.pad_to_bounding_box(original_image, 0, 0, cfg.INPUT_SIZE, cfg.INPUT_SIZE)
    original_image = tf.image.convert_image_dtype(original_image, dtype=tf.uint8)

    # make a copy of the original image for each bounding box
    num_bboxes = tf.shape(bboxes)[0]
    expanded_original_image = tf.expand_dims(original_image, 0)
    concatenated_original_images = tf.tile(expanded_original_image, [num_bboxes, 1, 1, 1])

    names = ['original_inputs', 'inputs', 'ids', 'labels']
    tensors = [concatenated_original_images, distorted_inputs, ids, labels]

    if fetch_text_labels:
        names.append('text_labels')
        tensors.append(text_labels)

    return [names, tensors]

def create_classification_batch(serialized_example, cfg, add_summaries, read_filenames=False):

    features = get_region_data(serialized_example, cfg, fetch_ids=True,
                               fetch_labels=False, fetch_text_labels=False, read_filename=read_filenames)

    original_image = features['image']
    bboxes = features['bboxes']
    ids = features['ids']

    distorted_inputs = get_distorted_inputs(original_image, bboxes, cfg, add_summaries)

    distorted_inputs = tf.subtract(distorted_inputs, 0.5)
    distorted_inputs = tf.multiply(distorted_inputs, 2.0)

    names = ('inputs', 'ids')
    tensors = [distorted_inputs, ids]
    return [names, tensors]

def input_nodes(tfrecords, cfg, num_epochs=None, batch_size=32, num_threads=2,
                shuffle_batch = True, random_seed=1, capacity = 1000, min_after_dequeue = 96,
                add_summaries=True, input_type='train', fetch_text_labels=False,
                read_filenames=False):
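    """
    Args:
        tfrecords: list of paths to tfrecord files
        cfg: image processing configuration
        num_epochs: number of times to read the tfrecords
        batch_size: number of examples in each batch
        num_threads: number of threads used to enqueue examples
        shuffle_batch: if True, batch with `tf.train.shuffle_batch`, otherwise with `tf.train.batch`
        random_seed: seed passed to `tf.train.shuffle_batch`
        capacity: maximum number of elements in the batching queue
        min_after_dequeue: minimum number of elements left in the queue after a dequeue (shuffling only)
        add_summaries: Add tensorboard summaries of the images
        input_type: 'train', 'visualize', 'test', 'classification'
        fetch_text_labels: also fetch text labels (only used when input_type is 'visualize')
        read_filenames: read images from disk using the `filename` field instead of the `encoded` field
    Returns:
        dict mapping batch names (e.g. 'inputs', 'labels') to batched tensors
    """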
    with tf.name_scope('inputs'):

        # A producer to generate tfrecord file paths
        filename_queue = tf.train.string_input_producer(
          tfrecords,
          num_epochs=num_epochs
        )

        # Construct a Reader to read examples from the tfrecords file
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)

        if input_type=='train' or input_type=='test':
            batch_keys, data_to_batch = create_training_batch(serialized_example, cfg, add_summaries, read_filenames)
        elif input_type=='visualize':
            batch_keys, data_to_batch = create_visualization_batch(serialized_example, cfg, add_summaries, fetch_text_labels, read_filenames)
        elif input_type=='classification':
            batch_keys, data_to_batch = create_classification_batch(serialized_example, cfg, add_summaries, read_filenames)
        else:
            raise ValueError("Unknown input type: %s. Options are `train`, `test`, " \
                             "`visualize`, and `classification`." % (input_type,))

        if shuffle_batch:
            batch = tf.train.shuffle_batch(
                data_to_batch,
                batch_size=batch_size,
                num_threads=num_threads,
                capacity= capacity,
                min_after_dequeue= min_after_dequeue,
                seed = random_seed,
                enqueue_many=True
            )

        else:
            batch = tf.train.batch(
                data_to_batch,
                batch_size=batch_size,
                num_threads=num_threads,
                capacity= capacity,
                enqueue_many=True
            )

        batch_dict = {k : v for k, v in zip(batch_keys, batch)}

        return batch_dict

================================================
FILE: requirements.txt
================================================
easydict>=1.6
matplotlib>=2.0.0
numpy>=1.12.0
PyYAML>=3.11
tensorflow>=1.0.0

================================================
FILE: test.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os

import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim

from config.parse_config import parse_config_file
from nets import nets_factory
from preprocessing import inputs

def test(tfrecords, checkpoint_path, save_dir, max_iterations, eval_interval_secs, cfg, read_images=False):
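    """
    Args:
        tfrecords (list): paths to the tfrecord files to evaluate
        checkpoint_path (str): path to a checkpoint file, or to a directory of checkpoints
        save_dir (str): directory to store summary files
        max_iterations (int): number of batches to evaluate; if <= 0, make a single pass over the data
        eval_interval_secs (int): if > 0, evaluate in a loop, waiting this many seconds between evaluations
        cfg (EasyDict)
        read_images (bool): read images from the file system using the `filename` field of the tfrecord
    """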
    tf.logging.set_verbosity(tf.logging.DEBUG)

    graph = tf.Graph()

    with graph.as_default():

        global_step = slim.get_or_create_global_step()

        with tf.device('/cpu:0'):
            batch_dict = inputs.input_nodes(
                tfrecords=tfrecords,
                cfg=cfg.IMAGE_PROCESSING,
                num_epochs=1,
                batch_size=cfg.BATCH_SIZE,
                num_threads=cfg.NUM_INPUT_THREADS,
                shuffle_batch =cfg.SHUFFLE_QUEUE,
                random_seed=cfg.RANDOM_SEED,
                capacity=cfg.QUEUE_CAPACITY,
                min_after_dequeue=cfg.QUEUE_MIN,
                add_summaries=False,
                input_type='test',
                read_filenames=read_images
            )

            batched_one_hot_labels = slim.one_hot_encoding(batch_dict['labels'],
                                                        num_classes=cfg.NUM_CLASSES)

        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME]()

        with slim.arg_scope(arg_scope):
            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](
                inputs=batch_dict['inputs'],
                num_classes=cfg.NUM_CLASSES,
                is_training=False
            )

            predictions = end_points['Predictions']
            #labels = tf.squeeze(batch_dict['labels'])
            labels = batch_dict['labels']

            # Add the loss summary
            loss = tf.losses.softmax_cross_entropy(
                logits=logits, onehot_labels=batched_one_hot_labels, label_smoothing=0., weights=1.0)

        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:
            variable_averages = tf.train.ExponentialMovingAverage(
                cfg.MOVING_AVERAGE_DECAY, global_step)
            variables_to_restore = variable_averages.variables_to_restore(
                slim.get_model_variables())
            variables_to_restore[global_step.op.name] = global_step
        else:
            variables_to_restore = slim.get_variables_to_restore()
            variables_to_restore.append(global_step)


        # Define the metrics:
        metric_map = {
            'Accuracy': tf.metrics.accuracy(labels=labels, predictions=tf.argmax(predictions, 1)),#slim.metrics.streaming_accuracy(labels=labels, predictions=tf.argmax(predictions, 1)),
            loss.op.name : slim.metrics.streaming_mean(loss)
        }
        if len(cfg.ACCURACY_AT_K_METRIC) > 0:
            bool_labels = tf.ones([cfg.BATCH_SIZE], dtype=tf.bool)
            for k in cfg.ACCURACY_AT_K_METRIC:
                if k <= 1 or k > cfg.NUM_CLASSES:
                    continue
                in_top_k = tf.nn.in_top_k(predictions=predictions, targets=labels, k=k)
                metric_map['Accuracy_at_%s' % k] = tf.metrics.accuracy(labels=bool_labels, predictions=in_top_k)#slim.metrics.streaming_accuracy(labels=bool_labels, predictions=in_top_k)

        names_to_values, names_to_updates = slim.metrics.aggregate_metric_map(metric_map)

        # Print the summaries to screen.
        print_global_step = True
        # `items()` works on both Python 2 and Python 3 (`iteritems()` is Python 2 only).
        for name, value in names_to_values.items():
            summary_name = 'eval/%s' % name
            op = tf.summary.scalar(summary_name, value, collections=[])
            if print_global_step:
                op=tf.Print(op, [global_step], "Model Step ")
                print_global_step = False
            op = tf.Print(op, [value], summary_name)
            tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)

        if max_iterations > 0:
            num_batches = max_iterations
        else:
            # This ensures that we make a single pass over all of the data.
            # We could use ceil if the batch queue is allowed to pad the last batch
            num_batches = int(np.floor(cfg.NUM_TEST_EXAMPLES / float(cfg.BATCH_SIZE)))


        sess_config = tf.ConfigProto(
            log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,
            allow_soft_placement = True,
            gpu_options = tf.GPUOptions(
                per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION
            ),
            intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,
            inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None
        )

        if eval_interval_secs > 0:

            if not os.path.isdir(checkpoint_path):
                raise ValueError("checkpoint_path should be a path to a directory when " \
                                 "evaluating in a loop.")

            slim.evaluation.evaluation_loop(
                master='',
                checkpoint_dir=checkpoint_path,
                logdir=save_dir,
                num_evals=num_batches,
                initial_op=None,
                initial_op_feed_dict=None,
                eval_op=list(names_to_updates.values()),
                eval_op_feed_dict=None,
                final_op=None,
                final_op_feed_dict=None,
                summary_op=tf.summary.merge_all(),
                summary_op_feed_dict=None,
                variables_to_restore=variables_to_restore,
                eval_interval_secs=eval_interval_secs,
                max_number_of_evaluations=None,
                session_config=sess_config,
                timeout=None
            )

        else:
            if os.path.isdir(checkpoint_path):
                checkpoint_dir = checkpoint_path
                checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)

                if checkpoint_path is None:
                    raise ValueError("Unable to find a model checkpoint in the " \
                                     "directory %s" % (checkpoint_dir,))

            tf.logging.info('Evaluating %s' % checkpoint_path)

            slim.evaluation.evaluate_once(
                master='',
                checkpoint_path=checkpoint_path,
                logdir=save_dir,
                num_evals=num_batches,
                eval_op=list(names_to_updates.values()),
                variables_to_restore=variables_to_restore,
                session_config=sess_config
            )

def parse_args():

    parser = argparse.ArgumentParser(description='Test a classification model')

    parser.add_argument('--tfrecords', dest='tfrecords',
                        help='Paths to tfrecords.', type=str,
                        nargs='+', required=True)

    parser.add_argument('--checkpoint_path', dest='checkpoint_path',
                          help='Path to a specific model to test against. If a directory, then the newest checkpoint file will be used.', type=str,
                          required=True, default=None)

    parser.add_argument('--save_dir', dest='savedir',
                          help='Path to directory to store summary files.', type=str,
                          required=True)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file.',
                        required=True, type=str)

    parser.add_argument('--eval_interval_secs', dest='eval_interval_secs',
                        help='Go into an evaluation loop, waiting this many seconds between evaluations. Default is to evaluate once.',
                        required=False, type=int, default=0)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='The number of images in a batch.',
                        required=False, type=int, default=None)

    parser.add_argument('--batches', dest='batches',
                        help='Maximum number of iterations to run. Default is all records (modulo the batch size).',
                        required=False, type=int, default=0)

    parser.add_argument('--model_name', dest='model_name',
                        help='The name of the architecture to use.',
                        required=False, type=str, default=None)

    parser.add_argument('--read_images', dest='read_images',
                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',
                        action='store_true', default=False)

    args = parser.parse_args()
    return args

def main():

    args = parse_args()

    cfg = parse_config_file(args.config_file)

    if args.batch_size is not None:
        cfg.BATCH_SIZE = args.batch_size

    if args.model_name is not None:
        cfg.MODEL_NAME = args.model_name

    test(
        tfrecords=args.tfrecords,
        checkpoint_path=args.checkpoint_path,
        save_dir=args.savedir,
        max_iterations=args.batches,
        eval_interval_secs=args.eval_interval_secs,
        cfg=cfg,
        read_images=args.read_images
    )

if __name__ == '__main__':
    main()


================================================
FILE: tfserving/README.md
================================================
# TensorFlow Serving Utilities

This directory contains utility code for interacting with a [TensorFlow Serving](https://www.tensorflow.org/serving/) instance. I'll walk through the basic steps of using TensorFlow Serving below.

## Export a Trained Model
When your training process has finished you will be left with a training checkpoint file created by the [tf.train.Saver](https://www.tensorflow.org/api_docs/python/tf/train/Saver) class. We need to convert this checkpoint file for use with TensorFlow Serving. You'll need to create a yaml configuration file for the export (essentially specifying the number of classes, input size, and a few other things). An example:

```yaml
# Export specific configuration

RANDOM_SEED : 1.0

SESSION_CONFIG : {
  # If true, then the device location of each variable will be printed
  LOG_DEVICE_PLACEMENT : false,

  # How much GPU memory we are allowed to pre-allocate
  PER_PROCESS_GPU_MEMORY_FRACTION : 0.9
}

#################################################
# Dataset Info
# The number of classes we are classifying
NUM_CLASSES : 200

# The model architecture to use.
MODEL_NAME : 'inception_v3'

# END: Dataset Info
#################################################
# Image Processing and Augmentation 

IMAGE_PROCESSING : {
    # Images are assumed to be raveled, and have length  INPUT_SIZE * INPUT_SIZE * 3
    INPUT_SIZE : 299
}

# END: Image Processing and Augmentation
#################################################
# Regularization 
#
# The decay to use for the moving average. If 0, then moving average is not computed
# When restoring models, this value is needed to determine whether to restore moving
# average variables or not.
MOVING_AVERAGE_DECAY : 0.9999

# End: Regularization
#################################################
```

To export the model, we'll use the [export.py](../export.py) script in the repository root:
```
python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir export \
--export_version 1 \
--config config_export.yaml \
--serving \
--add_preprocess \
--class_names class-codes.txt
```
This creates a directory called `1` inside the `export_dir` directory, containing the files that TensorFlow Serving requires. The `--class_names` argument supplies semantic identifiers for the classes, so clients receive meaningful identifiers along with the prediction results and do not have to map score indices to identifiers themselves. The class-codes.txt file contains one identifier per line, with each line corresponding to one index in the scores array. For example:
```txt
car
pedestrian
light post
trash can
bench
```
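
To make the index-to-identifier correspondence concrete, here is a minimal sketch of the mapping a client would otherwise have to do itself. The helper names `load_class_names` and `top_k_named` are hypothetical (they are not part of this repository), and the scores are made-up values for illustration only:
```python
import io

def load_class_names(lines):
    """Return one identifier per non-empty line, in file order.

    The position of each identifier is assumed to be its index in the scores array.
    """
    return [line.strip() for line in lines if line.strip()]

def top_k_named(scores, class_names, k):
    """Pair each score with its identifier and return the k best, highest first."""
    if len(scores) != len(class_names):
        raise ValueError("expected exactly one class name per score")
    ranked = sorted(zip(class_names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Stand-in for open('class-codes.txt'); the contents mirror the example file above.
class_file = io.StringIO(u"car\npedestrian\nlight post\ntrash can\nbench\n")
class_names = load_class_names(class_file)

# Illustrative scores, not real model output.
scores = [0.05, 0.70, 0.15, 0.02, 0.08]
print(top_k_named(scores, class_names, k=2))
```
Because the exported model already carries these identifiers, a client only needs this kind of lookup when it works with raw score arrays.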

## Server Machine
Spin up an Ubuntu 16.04 instance on your favorite cloud provider, or use your personal machine. You'll need to add the TensorFlow Serving distribution URI as a package source prior to installing (notes [here](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/setup.md#installing-using-apt-get)):
```
echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list

curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

sudo apt-get update && sudo apt-get install tensorflow-model-server
```
You can also install from [source](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/setup.md#installation).

Create a models directory, such as `/home/ubuntu/serving/models`, and copy your `1` directory (that was created with the export.py script) to this directory. Alternatively, you can just specify `/home/ubuntu/serving/models` as your `--export_dir` when calling the export.py script.
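
Before starting the server, it can help to confirm that the models directory contains the numbered version directories you expect. The following is a small sketch using only the standard library; `list_model_versions` is a hypothetical helper (not part of this repository), and the temporary directory stands in for `/home/ubuntu/serving/models`:
```python
import os
import tempfile

def list_model_versions(model_base_path):
    """Return the numeric version subdirectories of model_base_path, sorted ascending."""
    versions = []
    for name in os.listdir(model_base_path):
        full_path = os.path.join(model_base_path, name)
        # Only directories whose names are plain integers count as versions here.
        if name.isdigit() and os.path.isdir(full_path):
            versions.append(int(name))
    return sorted(versions)

# Build a throwaway layout that mimics a models directory.
base = tempfile.mkdtemp()
for version in ("1", "2", "10"):
    os.mkdir(os.path.join(base, version))
os.mkdir(os.path.join(base, "notes"))  # ignored: not a numeric version

print(list_model_versions(base))
```
Sorting numerically rather than lexically keeps version `10` after version `2`.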

Now you can start the server:
```
tensorflow_model_server --port=9000 --model_name=inception --model_base_path=/home/ubuntu/serving/models
```
Note the `--model_name` value: the client will need it when querying the server.

## Client Machine
To query the server from a client machine you'll need to install the `tensorflow-serving-api` pip package along with the `tensorflow` package. I use `numpy` for some operations so I'll install that too:
```
pip install numpy tensorflow tensorflow-serving-api
```

We can now query the server using the [client.py](client.py) file:
```
python client.py \
--images IMG_0932_sm.jpg \
--num_results 10 \
--model_name inception \
--host localhost \
--port 9000 \
--timeout 10
```
This command will send the `IMG_0932_sm.jpg` file to the TensorFlow Serving instance at `localhost:9000` and print the top 10 class predictions. 

Rather than sending the raw image bytes to the TensorFlow Serving instance, we can send the prepared image array. This image array will be fed directly into the network, so it must be the proper size and have had any transformations already applied. The [inputs.py](inputs.py) file has a convenience function to prepare an image for Inception-style networks. For example:
```python
from scipy.misc import imread

import inputs
import tfserver

image = imread('IMG_0898.jpg')

prepped_image = inputs.prepare_image(image)
image_data = [prepped_image]

predictions = tfserver.predict(image_data)
results = tfserver.process_classification_prediction(predictions, max_classes=10)

print(results)
```
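
The arithmetic behind the prepared array can be checked without TensorFlow or SciPy. This sketch mirrors the centering formula used by `prepare_image` in [inputs.py](inputs.py) and the raveled length implied by the default 299x299 input; the function names `center_pixel` and `expected_length` are hypothetical and exist only for this illustration:
```python
def center_pixel(value):
    """Map a uint8 intensity in [0, 255] to a float in [-1.0, 1.0]."""
    return ((value / 255.0) - 0.5) * 2.0

def expected_length(input_height=299, input_width=299, channels=3):
    """Number of floats in one raveled, prepared image."""
    return input_height * input_width * channels

# The darkest and brightest intensities land on the ends of the range.
print(center_pixel(0), center_pixel(255))

# Length of the flat list sent to the server for one default-sized image.
print(expected_length())
```
An array with a different length, or with values outside this range, has not been prepared the way the network expects.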


================================================
FILE: tfserving/__init__.py
================================================


================================================
FILE: tfserving/client.py
================================================
"""
A simple client to query a TensorFlow Serving instance.

Example:
$ python client.py \
--images IMG_0932_sm.jpg \
--num_results 10 \
--model_name inception \
--host localhost \
--port 9000 \
--timeout 10

Author: Grant Van Horn
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import time

import tfserver

def parse_args():

  parser = argparse.ArgumentParser(description='Command line classification client. Sorts and prints the classification results.')

  parser.add_argument('--images', dest='image_paths',
                        help='Path to one or more images to classify (jpeg or png).',
                        type=str, nargs='+', required=True)

  parser.add_argument('--num_results', dest='num_results',
                      help='The number of results to print. Set to 0 to print all classes.',
                      required=False, type=int, default=0)

  parser.add_argument('--model_name', dest='model_name',
                        help='The name of the model to query.',
                        required=False, type=str, default='inception')

  parser.add_argument('--host', dest='host',
                        help='Machine host where the TensorFlow Serving model is.',
                        required=False, type=str, default='localhost')

  parser.add_argument('--port', dest='port',
                      help='Port that the TensorFlow Server is listening on.',
                      required=False, type=int, default=9000)

  parser.add_argument('--timeout', dest='timeout',
                      help='Amount of time to wait before failing.',
                      required=False, type=int, default=10)

  args = parser.parse_args()

  return args

def main():

  args = parse_args()

  # Read in the image bytes
  image_data = []
  for fp in args.image_paths:
    # Image files are binary, so read them in binary mode.
    with open(fp, 'rb') as f:
      data = f.read()
    image_data.append(data)

  # Get the predictions
  t = time.time()
  predictions = tfserver.predict(image_data, model_name=args.model_name,
    host=args.host, port=args.port, timeout=args.timeout
  )
  dt = time.time() - t
  print("Prediction call took %0.4f seconds" % (dt,))

  # Process the results
  results = tfserver.process_classification_prediction(predictions, max_classes=args.num_results)

  # Print the results
  for i, fp in enumerate(args.image_paths):
    print("Results for image: %s" % (fp,))
    for name, score in results[i]:
      print("%s: %0.3f" % (name, score))
    print()

if __name__ == '__main__':
  main()

================================================
FILE: tfserving/inputs.py
================================================
"""
Numpy and scipy image preparation.

Author: Grant Van Horn
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
from scipy.misc import imresize

def prepare_image(image, input_height=299, input_width=299):
  """ Prepare an image to be passed through a network.
  Arguments:
    image (numpy.ndarray): A uint8 RGB image
  Returns:
    list: the image resized, centered and raveled
  """

  # We assume a uint8 RGB image
  assert image.dtype == np.uint8
  assert image.ndim == 3
  assert image.shape[2] == 3

  resized_image = imresize(image, (input_height, input_width, 3))
  float_image = resized_image.astype(np.float32)
  centered_image = ((float_image / 255.) - 0.5) * 2.0

  return centered_image.ravel().tolist()


================================================
FILE: tfserving/tfserver.py
================================================
"""
TensorFlow Serving caller code.

Requirements:
pip install numpy tensorflow tensorflow-serving-api

Author: Grant Van Horn
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from grpc.beta import implementations
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

def predict(image_data,
            model_name='inception',
            host='localhost',
            port=9000,
            timeout=10):
  """
  Arguments:
    image_data (list): A list of image data. Each item should be either the encoded image bytes
      or a flat list of floats. All items must be of the same kind.
    model_name (str): The name of the model to query (specified when you started the server).
    host (str): The machine host identifier that the classifier is running on.
    port (int): The port that the classifier is listening on.
    timeout (int): Time in seconds before timing out.

  The signature that is queried ('predict_image_bytes' or 'predict_image_array') is chosen
  from the type of the first item in `image_data`.

  Returns:
    PredictResponse protocol buffer, or None if `image_data` is empty.
    See here: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/predict.proto
  """

  if len(image_data) <= 0:
    return None

  channel = implementations.insecure_channel(host, int(port))
  stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
  request = predict_pb2.PredictRequest()
  request.model_spec.name = model_name

  # Encoded image bytes are `str` on Python 2 and `bytes` on Python 3.
  if isinstance(image_data[0], (bytes, str)):
    request.model_spec.signature_name = 'predict_image_bytes'
    request.inputs['images'].CopyFrom(
        tf.contrib.util.make_tensor_proto(image_data, shape=[len(image_data)]))
  else:
    request.model_spec.signature_name = 'predict_image_array'
    request.inputs['images'].CopyFrom(
        tf.contrib.util.make_tensor_proto(image_data, shape=[len(image_data), len(image_data[0])]))

  result = stub.Predict(request, timeout)
  return result

def process_classification_prediction(predictions, max_classes=10):
  """
  Arguments:
    predictions (PredictResponse protocol buffer): TensorFlow Serving prediction response.
    max_classes (int): Maximum number of results to return per input. Set to 0 for all results.
  Returns:
    list of lists: For each input, a list of (name, score) tuples sorted by descending score.
  """

  # Determine how many outputs there are
  dims = predictions.outputs['classes'].tensor_shape.dim
  num_inputs = dims[0].size
  num_classes = dims[1].size

  all_class_names = np.array(predictions.outputs['classes'].string_val).reshape(num_inputs, num_classes)
  all_scores = np.array(predictions.outputs['scores'].float_val).reshape(num_inputs, num_classes)

  results = []
  for i in range(num_inputs):

    scores = all_scores[i]
    class_names = all_class_names[i]

    idxs = np.argsort(scores)[::-1]
    scores = scores[idxs]
    class_names = class_names[idxs]

    num_to_return = min(num_classes, max_classes)
    if num_to_return <= 0:
      num_to_return = scores.shape[-1]

    # Use a separate index name so the outer loop variable `i` is not shadowed
    names_scores = [(class_names[j], scores[j]) for j in range(num_to_return)]
    results.append(names_scores)

  return results

================================================
FILE: train.py
================================================
# Some of this code came from the https://github.com/tensorflow/models/tree/master/slim
# directory, so let's keep the Google license around for now.
#
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import copy
import os

import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim

from config.parse_config import parse_config_file
from nets import nets_factory
from preprocessing.inputs import input_nodes


def _configure_learning_rate(global_step, cfg):
    """Configures the learning rate.
    Args:
        global_step: The global_step tensor.
        cfg: The parsed configuration holding the learning rate and decay settings.
    Returns:
        A `Tensor` representing the learning rate.
    Raises:
        ValueError: if cfg.LEARNING_RATE_DECAY_TYPE is not recognized.
    """


    decay_steps = int(cfg.NUM_TRAIN_EXAMPLES / cfg.BATCH_SIZE * cfg.NUM_EPOCHS_PER_DELAY)

    if cfg.LEARNING_RATE_DECAY_TYPE == 'exponential':
        return tf.train.exponential_decay(cfg.INITIAL_LEARNING_RATE,
                                          global_step,
                                          decay_steps,
                                          cfg.LEARNING_RATE_DECAY_FACTOR,
                                          staircase=cfg.LEARNING_RATE_STAIRCASE,
                                          name='exponential_decay_learning_rate')

    elif cfg.LEARNING_RATE_DECAY_TYPE == 'fixed':
        return tf.constant(cfg.INITIAL_LEARNING_RATE, name='fixed_learning_rate')

    elif cfg.LEARNING_RATE_DECAY_TYPE == 'polynomial':
        return tf.train.polynomial_decay(cfg.INITIAL_LEARNING_RATE,
                                         global_step,
                                         decay_steps,
                                         cfg.END_LEARNING_RATE,
                                         power=1.0,
                                         cycle=False,
                                         name='polynomial_decay_learning_rate')
    else:
        raise ValueError('learning_rate_decay_type [%s] was not recognized' %
                         cfg.LEARNING_RATE_DECAY_TYPE)


def _configure_optimizer(learning_rate, cfg):
    """Configures the optimizer used for training.
    Args:
        learning_rate: A scalar or `Tensor` learning rate.
        cfg: The parsed configuration holding the optimizer name and its hyperparameters.
    Returns:
        An instance of an optimizer.
    Raises:
        ValueError: if cfg.OPTIMIZER is not recognized.
    """
    if cfg.OPTIMIZER == 'adadelta':
        optimizer = tf.train.AdadeltaOptimizer(
            learning_rate,
            rho=cfg.ADADELTA_RHO,
            epsilon=cfg.OPTIMIZER_EPSILON)
    elif cfg.OPTIMIZER == 'adagrad':
        optimizer = tf.train.AdagradOptimizer(
            learning_rate,
            initial_accumulator_value=cfg.ADAGRAD_INITIAL_ACCUMULATOR_VALUE)
    elif cfg.OPTIMIZER == 'adam':
        optimizer = tf.train.AdamOptimizer(
            learning_rate,
            beta1=cfg.ADAM_BETA1,
            beta2=cfg.ADAM_BETA2,
            epsilon=cfg.OPTIMIZER_EPSILON)
    elif cfg.OPTIMIZER == 'ftrl':
        optimizer = tf.train.FtrlOptimizer(
            learning_rate,
            learning_rate_power=cfg.FTRL_LEARNING_RATE_POWER,
            initial_accumulator_value=cfg.FTRL_INITIAL_ACCUMULATOR_VALUE,
            l1_regularization_strength=cfg.FTRL_L1,
            l2_regularization_strength=cfg.FTRL_L2)
    elif cfg.OPTIMIZER == 'momentum':
        optimizer = tf.train.MomentumOptimizer(
            learning_rate,
            momentum=cfg.MOMENTUM,
            name='Momentum')
    elif cfg.OPTIMIZER == 'rmsprop':
        optimizer = tf.train.RMSPropOptimizer(
            learning_rate,
            decay=cfg.RMSPROP_DECAY,
            momentum=cfg.MOMENTUM,
            epsilon=cfg.OPTIMIZER_EPSILON)
    elif cfg.OPTIMIZER == 'sgd':
        optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    else:
        raise ValueError('Optimizer [%s] was not recognized' % cfg.OPTIMIZER)
    return optimizer

def get_trainable_variables(trainable_scopes):
    """Returns a list of variables to train.
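    Args:
        trainable_scopes: A list of scope names whose variables should be trained,
            or None to train all trainable variables.
    Returns:
        A list of variables to train by the optimizer.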
    """

    if trainable_scopes is None:
        return tf.trainable_variables()

    trainable_scopes = [scope.strip() for scope in trainable_scopes]

    variables_to_train = []
    for scope in trainable_scopes:
        variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
        variables_to_train.extend(variables)
    return variables_to_train


def get_init_function(logdir, pretrained_model_path, checkpoint_exclude_scopes, restore_variables_with_moving_averages=False, restore_moving_averages=False, ema=None):
    """
    Args:
        logdir : location of where we will be storing checkpoint files.
        pretrained_model_path : a path to a specific model, or a directory with a checkpoint file. The latest model will be used.
        checkpoint_exclude_scopes : a list of scope names; model variables under these scopes will not be restored.
        restore_variables_with_moving_averages : If True, then each variable is restored from its moving average value in the checkpoint.
        restore_moving_averages : If True, then the moving average variables themselves will also be restored.
        ema : The exponential moving average object
    Returns:
        A function that takes a session and restores the variables, or None if there is nothing to restore.
    """


    if pretrained_model_path is None:
        return None

    # Warn the user if a checkpoint already exists in the logdir. Training will resume
    # from that checkpoint, so the pretrained model is ignored.
    if tf.train.latest_checkpoint(logdir):
        tf.logging.info(
            'Ignoring --pretrained_model because a checkpoint already exists in %s'
            % logdir)
        return None

    exclusions = []
    if checkpoint_exclude_scopes:
        exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]

    variables_to_restore = []
    for var in slim.get_model_variables():
        excluded = False
        for exclusion in exclusions:
            if var.op.name.startswith(exclusion):
                excluded = True
                break
        if not excluded:
            variables_to_restore.append(var)

    #for variable in variables_to_restore:
    #    print(variable.name)

    if os.path.isdir(pretrained_model_path):
        checkpoint_path = tf.train.latest_checkpoint(pretrained_model_path)
        if checkpoint_path is None:
            raise ValueError(
                "No model checkpoint file found in directory %s" % (pretrained_model_path))

    else:
        checkpoint_path = pretrained_model_path

    tf.logging.info('Restoring variables from %s' % checkpoint_path)

    if ema is not None:
        # # Restore each variable with its moving average value
        # if restore_variables_with_moving_averages:

        #     # Also restore the moving average variables
        #     if restore_moving_averages:
        #         variables_to_restore_with_ma = variables_to_restore + [ema.average(var) for var in variables_to_restore]
        #         normal_saver = tf.train.Saver(variables_to_restore_with_ma, reshape=False)
        #     else:
        #         normal_saver = tf.train.Saver(variables_to_restore, reshape=False)
        #     ema_saver = tf.train.Saver({
        #         ema.average_name(var) : ema.average(var)
        #         for var in variables_to_restore
        #     }, reshape=False)

        #     def callback(session):
        #         normal_saver.restore(session, checkpoint_path)
        #         ema_saver.restore(session, checkpoint_path)
        #     return callback

        # elif restore_moving_averages:
        #     variables_to_restore += [ema.average(var) for var in variables_to_restore]

        # Load in the moving average value for a variable, rather than the variable itself
        if restore_variables_with_moving_averages:

            variables_to_restore = {
                ema.average_name(var) : var
                for var in variables_to_restore
            }

        # Do we want to restore the moving average variables? Otherwise they will be reinitialized
        if restore_moving_averages:

            # If we are already using the moving averages to restore the variables, then we will need
            # two Saver() objects (since the names in the dictionaries will clash)
            if restore_variables_with_moving_averages:

                normal_saver = tf.train.Saver(variables_to_restore, reshape=False)
                ema_saver = tf.train.Saver({
                    ema.average_name(var) : ema.average(var)
                    for var in variables_to_restore.values()
                }, reshape=False)

                def callback(session):
                    normal_saver.restore(session, checkpoint_path)
                    ema_saver.restore(session, checkpoint_path)
                return callback

            else:
                # variables_to_restore is still a list here: the dict is only built
                # when restore_variables_with_moving_averages is True.
                variables_to_restore += [ema.average(var) for var in variables_to_restore]

    return slim.assign_from_checkpoint_fn(
        checkpoint_path,
        variables_to_restore,
        ignore_missing_vars=False)


def train(tfrecords, logdir, cfg, pretrained_model_path=None, trainable_scopes=None, checkpoint_exclude_scopes=None, restore_variables_with_moving_averages=False, restore_moving_averages=False, read_images=False):
    """
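    Args:
        tfrecords (list) : paths to the tfrecord files
        logdir (str) : directory to store summary files and checkpoint files
        cfg (EasyDict) : the parsed configuration
        pretrained_model_path (str) : path to a pretrained model to restore
        trainable_scopes (list) : only variables within these scopes will be trained
        checkpoint_exclude_scopes (list) : variables within these scopes will not be restored
        restore_variables_with_moving_averages (bool) : restore variables with their moving average values
        restore_moving_averages (bool) : also restore the moving average variables
        read_images (bool) : read images from the file system using the `filename` field of the tfrecord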
    """
    tf.logging.set_verbosity(tf.logging.INFO)

    graph = tf.Graph()

    # Build the training graph. The input pipeline below is pinned to the CPU.
    with graph.as_default():

        # Create a variable to count the number of train() calls.
        global_step = slim.get_or_create_global_step()

        with tf.device('/cpu:0'):
            batch_dict = input_nodes(
                tfrecords=tfrecords,
                cfg=cfg.IMAGE_PROCESSING,
                num_epochs=None,
                batch_size=cfg.BATCH_SIZE,
                num_threads=cfg.NUM_INPUT_THREADS,
                shuffle_batch =cfg.SHUFFLE_QUEUE,
                random_seed=cfg.RANDOM_SEED,
                capacity=cfg.QUEUE_CAPACITY,
                min_after_dequeue=cfg.QUEUE_MIN,
                add_summaries=True,
                input_type='train',
                read_filenames=read_images
            )

            batched_one_hot_labels = slim.one_hot_encoding(batch_dict['labels'],
                                                        num_classes=cfg.NUM_CLASSES)

        # GVH: This doesn't seem to help with the poor queueing performance...
        # batch_queue = slim.prefetch_queue.prefetch_queue(
        #                   [batch_dict['inputs'], batched_one_hot_labels], capacity=2)
        # inputs, labels = batch_queue.dequeue()

        arg_scope = nets_factory.arg_scopes_map[cfg.MODEL_NAME](
            weight_decay=cfg.WEIGHT_DECAY,
            batch_norm_decay=cfg.BATCHNORM_MOVING_AVERAGE_DECAY,
            batch_norm_epsilon=cfg.BATCHNORM_EPSILON
        )

        with slim.arg_scope(arg_scope):
            logits, end_points = nets_factory.networks_map[cfg.MODEL_NAME](
                inputs=batch_dict['inputs'],
                num_classes=cfg.NUM_CLASSES,
                dropout_keep_prob=cfg.DROPOUT_KEEP_PROB,
                is_training=True
            )

            # Add the losses
            if 'AuxLogits' in end_points:
                tf.losses.softmax_cross_entropy(
                    logits=end_points['AuxLogits'], onehot_labels=batched_one_hot_labels,
                    label_smoothing=cfg.LABEL_SMOOTHING, weights=0.4, scope='aux_loss')

            tf.losses.softmax_cross_entropy(
                logits=logits, onehot_labels=batched_one_hot_labels, label_smoothing=cfg.LABEL_SMOOTHING, weights=1.0)


        summaries = set(tf.get_collection(tf.GraphKeys.SUMMARIES))

        # Summarize the losses
        for loss in tf.get_collection(tf.GraphKeys.LOSSES):
            summaries.add(tf.summary.scalar(name='losses/%s' % loss.op.name, tensor=loss))

        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
        if regularization_losses:
            regularization_loss = tf.add_n(regularization_losses, name='regularization_loss')
            summaries.add(tf.summary.scalar(name='losses/regularization_loss', tensor=regularization_loss))

        total_loss = tf.losses.get_total_loss()
        summaries.add(tf.summary.scalar(name='losses/total_loss', tensor=total_loss))


        if 'MOVING_AVERAGE_DECAY' in cfg and cfg.MOVING_AVERAGE_DECAY > 0:
            moving_average_variables = slim.get_model_variables()
            ema = tf.train.ExponentialMovingAverage(
                decay=cfg.MOVING_AVERAGE_DECAY,
                num_updates=global_step
            )
        elif restore_variables_with_moving_averages or restore_moving_averages:
            # Perhaps we are finetuning the last layer of a pretrained model?
            # So we just need something to load in the moving averages, for use in get_init_function()
            moving_average_variables = None
            ema = tf.train.ExponentialMovingAverage(
                decay=1,
                num_updates=global_step
            )
        else:
            moving_average_variables = None
            ema = None


        # Calculate the learning rate schedule.
        lr = _configure_learning_rate(global_step, cfg)

        # Create an optimizer that performs gradient descent.
        optimizer = _configure_optimizer(lr, cfg)

        summaries.add(tf.summary.scalar(tensor=lr,
                                        name='learning_rate'))

        # Add the moving average update ops to the graph
        if ema is not None and moving_average_variables is not None:
            tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, ema.apply(moving_average_variables))

        trainable_vars = get_trainable_variables(trainable_scopes)
        train_op = slim.learning.create_train_op(total_loss=total_loss,
                                                 optimizer=optimizer,
                                                 global_step=global_step,
                                                 variables_to_train=trainable_vars,
                                                 clip_gradient_norm=cfg.CLIP_GRADIENT_NORM)

        # Merge all of the summaries
        summaries |= set(tf.get_collection(tf.GraphKeys.SUMMARIES))
        summary_op = tf.summary.merge(inputs=list(summaries), name='summary_op')

        sess_config = tf.ConfigProto(
          log_device_placement=cfg.SESSION_CONFIG.LOG_DEVICE_PLACEMENT,
          allow_soft_placement = True,
          gpu_options = tf.GPUOptions(
              per_process_gpu_memory_fraction=cfg.SESSION_CONFIG.PER_PROCESS_GPU_MEMORY_FRACTION
          ),
          intra_op_parallelism_threads=cfg.SESSION_CONFIG.INTRA_OP_PARALLELISM_THREADS if 'INTRA_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None,
          inter_op_parallelism_threads=cfg.SESSION_CONFIG.INTER_OP_PARALLELISM_THREADS if 'INTER_OP_PARALLELISM_THREADS' in cfg.SESSION_CONFIG else None
        )

        saver = tf.train.Saver(
          # Save all variables
          max_to_keep = cfg.MAX_TO_KEEP,
          keep_checkpoint_every_n_hours = cfg.KEEP_CHECKPOINT_EVERY_N_HOURS
        )

        # Run training.
        slim.learning.train(
            train_op=train_op,
            logdir=logdir,
            init_fn=get_init_function(logdir, pretrained_model_path, checkpoint_exclude_scopes, restore_variables_with_moving_averages=restore_variables_with_moving_averages, restore_moving_averages=restore_moving_averages, ema=ema),
            number_of_steps=cfg.NUM_TRAIN_ITERATIONS,
            save_summaries_secs=cfg.SAVE_SUMMARY_SECS,
            save_interval_secs=cfg.SAVE_INTERVAL_SECS,
            saver=saver,
            session_config=sess_config,
            summary_op = summary_op,
            log_every_n_steps = cfg.LOG_EVERY_N_STEPS
        )

def parse_args():

    parser = argparse.ArgumentParser(description='Train the classification system')

    parser.add_argument('--tfrecords', dest='tfrecords',
                        help='Paths to tfrecord files.', type=str,
                        nargs='+', required=True)

    parser.add_argument('--logdir', dest='logdir',
                          help='path to directory to store summary files and checkpoint files', type=str,
                          required=True)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file',
                        required=True, type=str)

    parser.add_argument('--pretrained_model', dest='pretrained_model',
                        help='Path to a model to restore. This is ignored if there is a model in the logdir.',
                        required=False, type=str, default=None)

    parser.add_argument('--trainable_scopes', dest='trainable_scopes',
                        help='Only variables within these scopes will be trained.',
                        type=str, nargs='+', default=None, required=False)

    parser.add_argument('--checkpoint_exclude_scopes', dest='checkpoint_exclude_scopes',
                        help='Variables within these scopes will not be restored from the checkpoint files.',
                        type=str, nargs='+', default=None, required=False)

    parser.add_argument('--max_number_of_steps', dest='max_number_of_steps',
                        help='The maximum number of iterations to run.',
                        required=False, type=int, default=None)

    parser.add_argument('--learning_rate_decay_type', dest='learning_rate_decay_type',
                          help='Type of the decay', type=str,
                          required=False, default=None)

    parser.add_argument('--lr', dest='learning_rate',
                          help='Initial learning rate', type=float,
                          required=False, default=None)

    parser.add_argument('--batch_size', dest='batch_size',
                        help='The number of images in a batch.',
                        required=False, type=int, default=None)

    parser.add_argument('--model_name', dest='model_name',
                        help='The name of the architecture to use.',
                        required=False, type=str, default=None)

    parser.add_argument('--restore_variables_with_moving_averages', dest='restore_variables_with_moving_averages',
                        help='If True, then we restore variables with their moving average values.',
                        required=False, action='store_true', default=False)

    parser.add_argument('--restore_moving_averages', dest='restore_moving_averages',
                        help='If True, then we restore the variable that tracks the moving average of each trainable variable.',
                        required=False, action='store_true', default=False)

    parser.add_argument('--read_images', dest='read_images',
                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',
                        action='store_true', default=False)

    args = parser.parse_args()
    return args

def main():
    args = parse_args()

    cfg = parse_config_file(args.config_file)

    # Replace cfg parameters with the command line values
    if args.max_number_of_steps is not None:
        cfg.NUM_TRAIN_ITERATIONS = args.max_number_of_steps

    if args.learning_rate_decay_type is not None:
        cfg.LEARNING_RATE_DECAY_TYPE = args.learning_rate_decay_type

    if args.learning_rate is not None:
        cfg.INITIAL_LEARNING_RATE = args.learning_rate

    if args.batch_size is not None:
        cfg.BATCH_SIZE = args.batch_size

    if args.model_name is not None:
        cfg.MODEL_NAME = args.model_name

    train(
        tfrecords=args.tfrecords,
        logdir=args.logdir,
        cfg=cfg,
        pretrained_model_path=args.pretrained_model,
        trainable_scopes = args.trainable_scopes,
        checkpoint_exclude_scopes = args.checkpoint_exclude_scopes,
        restore_variables_with_moving_averages=args.restore_variables_with_moving_averages,
        restore_moving_averages=args.restore_moving_averages,
        read_images=args.read_images
    )

if __name__ == '__main__':
  main()


================================================
FILE: visualize_train_inputs.py
================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse

from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf

from config.parse_config import parse_config_file
from preprocessing.inputs import input_nodes

def visualize_train_inputs(tfrecords, cfg, show_text_labels=False, read_images=False):

    graph = tf.Graph()
    sess = tf.Session(graph = graph)

    # run a session to look at the images...
    with sess.as_default(), graph.as_default():

        # Input Nodes
        with tf.device('/cpu:0'):
            batch_dict = input_nodes(
                tfrecords=tfrecords,
                cfg=cfg.IMAGE_PROCESSING,
                num_epochs=1,
                batch_size=cfg.BATCH_SIZE,
                num_threads=cfg.NUM_INPUT_THREADS,
                shuffle_batch =cfg.SHUFFLE_QUEUE,
                random_seed=cfg.RANDOM_SEED,
                capacity=cfg.QUEUE_CAPACITY,
                min_after_dequeue=cfg.QUEUE_MIN,
                add_summaries=False,
                input_type='visualize',
                fetch_text_labels=show_text_labels,
                read_filenames=read_images
            )

        # Convert float images to uint8 images
        image_to_convert = tf.placeholder(dtype=tf.float32,
                                          shape=[cfg.IMAGE_PROCESSING.INPUT_SIZE,
                                                 cfg.IMAGE_PROCESSING.INPUT_SIZE, 3])
        uint8_image = tf.image.convert_image_dtype(image_to_convert, dtype=tf.uint8)


        coord = tf.train.Coordinator()
        tf.global_variables_initializer().run()
        tf.local_variables_initializer().run()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        plt.ion()
        done = False
        while not done:

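            try:
                output = sess.run(batch_dict)
            except tf.errors.OutOfRangeError:
                # num_epochs=1, so the input queue is exhausted after one pass over the data
                break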

            original_images = output['original_inputs']
            distorted_images = output['inputs']
            image_ids = output['ids']
            labels = output['labels']
            if show_text_labels:
                text_labels = output['text_labels']

            for b in range(cfg.BATCH_SIZE):

                original_image = original_images[b]
                distorted_image = distorted_images[b]

                if original_image.dtype != np.uint8:
                    original_image = sess.run(uint8_image, {image_to_convert : original_image})

                if distorted_image.dtype != np.uint8:
                    distorted_image = sess.run(uint8_image, {image_to_convert : distorted_image})

                image_id = image_ids[b]
                label = labels[b]

                fig = plt.figure('Train Inputs')

                if show_text_labels:
                    text_label = text_labels[b]
                    st = fig.suptitle("Image: %s\nLabel: %d\nText: %s" %
                                      (image_id, label, text_label), fontsize=12)
                else:
                    st = fig.suptitle("Image: %s\nLabel: %d" % (image_id, label), fontsize=12)

                plt.subplot(2, 1, 1)
                plt.imshow(original_image)
                plt.title("Original")
                plt.axis('off')

                plt.subplot(2, 1, 2)
                plt.imshow(distorted_image)
                plt.title("Modified")
                plt.axis('off')

                # Shift the subplots down a bit to make room for the super title
                st.set_y(0.95)
                fig.subplots_adjust(top=0.75)

                plt.show(block=False)

                prompt = "Press Enter to view next image. Press any key followed " \
                         "by Enter to quit: "
                try:
                    t = raw_input(prompt)  # Python 2
                except NameError:
                    t = input(prompt)  # Python 3
                if t != '':
                    done = True
                    break
                plt.clf()

        # Stop the input queue threads before leaving the session
        coord.request_stop()
        coord.join(threads)


def parse_args():

    parser = argparse.ArgumentParser(description='Visualize the inputs to train the classification system.')

    parser.add_argument('--tfrecords', dest='tfrecords',
                        help='Paths to tfrecord files.', type=str,
                        nargs='+', required=True)

    parser.add_argument('--config', dest='config_file',
                        help='Path to the configuration file',
                        required=True, type=str)

    parser.add_argument('--text_labels', dest='show_text_labels',
                        help='If text labels have been stored in the tfrecords, then you can use this flag to show them.',
                        action='store_true', default=False)

    parser.add_argument('--read_images', dest='read_images',
                        help='Read the images from the file system using the `filename` field rather than using the `encoded` field of the tfrecord.',
                        action='store_true', default=False)

    args = parser.parse_args()
    return args

def main():
  args = parse_args()
  cfg = parse_config_file(args.config_file)
  visualize_train_inputs(
    tfrecords=args.tfrecords,
    cfg=cfg,
    show_text_labels=args.show_text_labels,
    read_images=args.read_images
  )


if __name__ == '__main__':
  main()