[
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2016 François Chollet\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Trained image classification models for Keras\n\n**THIS REPOSITORY IS DEPRECATED. USE THE MODULE `keras.applications` INSTEAD.**\n\nPull requests will not be reviewed nor merged. Direct any PRs to `keras.applications`. Issues are not monitored either.\n\n----\n\nThis repository contains code for the following Keras models:\n\n- VGG16\n- VGG19\n- ResNet50\n- Inception v3\n- CRNN for music tagging\n\nAll architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, \"Width-Height-Depth\".\n\nPre-trained weights can be automatically loaded upon instantiation (`weights='imagenet'` argument in model constructor for all image models, `weights='msd'` for the music tagging model). 
Weights are automatically downloaded if necessary, and cached locally in `~/.keras/models/`.\n\n## Examples\n\n### Classify images\n\n```python\nimport numpy as np\nfrom resnet50 import ResNet50\nfrom keras.preprocessing import image\nfrom imagenet_utils import preprocess_input, decode_predictions\n\nmodel = ResNet50(weights='imagenet')\n\nimg_path = 'elephant.jpg'\nimg = image.load_img(img_path, target_size=(224, 224))\nx = image.img_to_array(img)\nx = np.expand_dims(x, axis=0)\nx = preprocess_input(x)\n\npreds = model.predict(x)\nprint('Predicted:', decode_predictions(preds))\n# print: [[u'n02504458', u'African_elephant']]\n```\n\n### Extract features from images\n\n```python\nimport numpy as np\nfrom vgg16 import VGG16\nfrom keras.preprocessing import image\nfrom imagenet_utils import preprocess_input\n\nmodel = VGG16(weights='imagenet', include_top=False)\n\nimg_path = 'elephant.jpg'\nimg = image.load_img(img_path, target_size=(224, 224))\nx = image.img_to_array(img)\nx = np.expand_dims(x, axis=0)\nx = preprocess_input(x)\n\nfeatures = model.predict(x)\n```\n\n### Extract features from an arbitrary intermediate layer\n\n```python\nimport numpy as np\nfrom vgg19 import VGG19\nfrom keras.preprocessing import image\nfrom imagenet_utils import preprocess_input\nfrom keras.models import Model\n\nbase_model = VGG19(weights='imagenet')\nmodel = Model(input=base_model.input, output=base_model.get_layer('block4_pool').output)\n\nimg_path = 'elephant.jpg'\nimg = image.load_img(img_path, target_size=(224, 224))\nx = image.img_to_array(img)\nx = np.expand_dims(x, axis=0)\nx = preprocess_input(x)\n\nblock4_pool_features = model.predict(x)\n```\n\n## References\n\n- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) - please cite this paper if you use the VGG models in your work.\n- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - please cite this paper if you use the ResNet model in your work.\n- [Rethinking the Inception Architecture for Computer 
Vision](http://arxiv.org/abs/1512.00567) - please cite this paper if you use the Inception v3 model in your work.\n- [Music-auto_tagging-keras](https://github.com/keunwoochoi/music-auto_tagging-keras)\n\nAdditionally, don't forget to [cite Keras](https://keras.io/getting-started/faq/#how-should-i-cite-keras) if you use these models.\n\n\n## License\n\n- All code in this repository is under the MIT license as specified by the LICENSE file.\n- The ResNet50 weights are ported from the ones [released by Kaiming He](https://github.com/KaimingHe/deep-residual-networks) under the [MIT license](https://github.com/KaimingHe/deep-residual-networks/blob/master/LICENSE).\n- The VGG16 and VGG19 weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).\n- The Inception v3 weights are trained by ourselves and are released under the MIT license.\n"
  },
  {
    "path": "audio_conv_utils.py",
    "content": "import numpy as np\nfrom keras import backend as K\n\n\nTAGS = ['rock', 'pop', 'alternative', 'indie', 'electronic',\n        'female vocalists', 'dance', '00s', 'alternative rock', 'jazz',\n        'beautiful', 'metal', 'chillout', 'male vocalists',\n        'classic rock', 'soul', 'indie rock', 'Mellow', 'electronica',\n        '80s', 'folk', '90s', 'chill', 'instrumental', 'punk',\n        'oldies', 'blues', 'hard rock', 'ambient', 'acoustic',\n        'experimental', 'female vocalist', 'guitar', 'Hip-Hop',\n        '70s', 'party', 'country', 'easy listening',\n        'sexy', 'catchy', 'funk', 'electro', 'heavy metal',\n        'Progressive rock', '60s', 'rnb', 'indie pop',\n        'sad', 'House', 'happy']\n\n\ndef librosa_exists():\n    try:\n        __import__('librosa')\n    except ImportError:\n        return False\n    else:\n        return True\n\n\ndef preprocess_input(audio_path, dim_ordering='default'):\n    '''Reads an audio file and outputs a Mel-spectrogram.\n    '''\n    if dim_ordering == 'default':\n        dim_ordering = K.image_dim_ordering()\n    assert dim_ordering in {'tf', 'th'}\n\n    if librosa_exists():\n        import librosa\n    else:\n        raise RuntimeError('Librosa is required to process audio files.\\n' +\n                           'Install it via `pip install librosa` \\nor visit ' +\n                           'http://librosa.github.io/librosa/ for details.')\n\n    # mel-spectrogram parameters\n    SR = 12000\n    N_FFT = 512\n    N_MELS = 96\n    HOP_LEN = 256\n    DURA = 29.12\n\n    src, sr = librosa.load(audio_path, sr=SR)\n    n_sample = src.shape[0]\n    n_sample_wanted = int(DURA * SR)\n\n    # trim the signal at the center\n    if n_sample < n_sample_wanted:  # if too short\n        src = np.hstack((src, np.zeros((int(DURA * SR) - n_sample,))))\n    elif n_sample > n_sample_wanted:  # if too long\n        src = src[(n_sample - n_sample_wanted) / 2:\n                  (n_sample + n_sample_wanted) / 
2]\n\n    logam = librosa.logamplitude\n    melgram = librosa.feature.melspectrogram\n    x = logam(melgram(y=src, sr=SR, hop_length=HOP_LEN,\n                      n_fft=N_FFT, n_mels=N_MELS) ** 2,\n              ref_power=1.0)\n\n    if dim_ordering == 'th':\n        x = np.expand_dims(x, axis=0)\n    elif dim_ordering == 'tf':\n        x = np.expand_dims(x, axis=3)\n    return x\n\n\ndef decode_predictions(preds, top_n=5):\n    '''Decode the output of a music tagger model.\n\n    # Arguments\n        preds: 2-dimensional numpy array\n        top_n: integer in [0, 50], number of items to show\n\n    '''\n    assert len(preds.shape) == 2 and preds.shape[1] == 50\n    results = []\n    for pred in preds:\n        result = zip(TAGS, pred)\n        result = sorted(result, key=lambda x: x[1], reverse=True)\n        results.append(result[:top_n])\n    return results\n"
  },
  {
    "path": "imagenet_utils.py",
    "content": "import numpy as np\nimport json\n\nfrom keras.utils.data_utils import get_file\nfrom keras import backend as K\n\nCLASS_INDEX = None\nCLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'\n\n\ndef preprocess_input(x, dim_ordering='default'):\n    if dim_ordering == 'default':\n        dim_ordering = K.image_dim_ordering()\n    assert dim_ordering in {'tf', 'th'}\n\n    if dim_ordering == 'th':\n        x[:, 0, :, :] -= 103.939\n        x[:, 1, :, :] -= 116.779\n        x[:, 2, :, :] -= 123.68\n        # 'RGB'->'BGR'\n        x = x[:, ::-1, :, :]\n    else:\n        x[:, :, :, 0] -= 103.939\n        x[:, :, :, 1] -= 116.779\n        x[:, :, :, 2] -= 123.68\n        # 'RGB'->'BGR'\n        x = x[:, :, :, ::-1]\n    return x\n\n\ndef decode_predictions(preds, top=5):\n    global CLASS_INDEX\n    if len(preds.shape) != 2 or preds.shape[1] != 1000:\n        raise ValueError('`decode_predictions` expects '\n                         'a batch of predictions '\n                         '(i.e. a 2D array of shape (samples, 1000)). '\n                         'Found array with shape: ' + str(preds.shape))\n    if CLASS_INDEX is None:\n        fpath = get_file('imagenet_class_index.json',\n                         CLASS_INDEX_PATH,\n                         cache_subdir='models')\n        CLASS_INDEX = json.load(open(fpath))\n    results = []\n    for pred in preds:\n        top_indices = pred.argsort()[-top:][::-1]\n        result = [tuple(CLASS_INDEX[str(i)]) + (pred[i],) for i in top_indices]\n        results.append(result)\n    return results\n"
  },
  {
    "path": "inception_resnet_v2.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Inception-ResNet V2 model for Keras.\n\nModel naming and structure follows TF-slim implementation (which has some additional\nlayers and different number of filters from the original arXiv paper):\nhttps://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py\n\nPre-trained ImageNet weights are also converted from TF-slim, which can be found in:\nhttps://github.com/tensorflow/models/tree/master/slim#pre-trained-models\n\n# Reference\n- [Inception-v4, Inception-ResNet and the Impact of\n   Residual Connections on Learning](https://arxiv.org/abs/1602.07261)\n\n\"\"\"\nfrom __future__ import print_function\nfrom __future__ import absolute_import\n\nimport warnings\nimport numpy as np\n\nfrom keras.preprocessing import image\nfrom keras.models import Model\nfrom keras.layers import Activation\nfrom keras.layers import AveragePooling2D\nfrom keras.layers import BatchNormalization\nfrom keras.layers import Concatenate\nfrom keras.layers import Conv2D\nfrom keras.layers import Dense\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.layers import Input\nfrom keras.layers import Lambda\nfrom keras.layers import MaxPooling2D\nfrom keras.utils.data_utils import get_file\nfrom keras.engine.topology import get_source_inputs\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras import backend as K\n\n\nBASE_WEIGHT_URL = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.7/'\n\n\ndef preprocess_input(x):\n    \"\"\"Preprocesses a numpy array encoding a batch of images.\n\n    This function applies the \"Inception\" preprocessing which converts\n    the RGB values from [0, 255] to [-1, 1]. 
Note that this preprocessing\n    function is different from `imagenet_utils.preprocess_input()`.\n\n    # Arguments\n        x: a 4D numpy array consisting of RGB values within [0, 255].\n\n    # Returns\n        Preprocessed array.\n    \"\"\"\n    x /= 255.\n    x -= 0.5\n    x *= 2.\n    return x\n\n\ndef conv2d_bn(x,\n              filters,\n              kernel_size,\n              strides=1,\n              padding='same',\n              activation='relu',\n              use_bias=False,\n              name=None):\n    \"\"\"Utility function to apply conv + BN.\n\n    # Arguments\n        x: input tensor.\n        filters: filters in `Conv2D`.\n        kernel_size: kernel size as in `Conv2D`.\n        padding: padding mode in `Conv2D`.\n        activation: activation applied after the batch norm layer, if not `None`.\n        use_bias: whether to use a bias in `Conv2D`; batch norm is\n            applied only when this is `False`.\n        strides: strides in `Conv2D`.\n        name: name of the ops; will become `name + '_ac'` for the activation\n            and `name + '_bn'` for the batch norm layer.\n\n    # Returns\n        Output tensor after applying `Conv2D` and `BatchNormalization`.\n    \"\"\"\n    x = Conv2D(filters,\n               kernel_size,\n               strides=strides,\n               padding=padding,\n               use_bias=use_bias,\n               name=name)(x)\n    if not use_bias:\n        bn_axis = 1 if K.image_data_format() == 'channels_first' else 3\n        bn_name = None if name is None else name + '_bn'\n        x = BatchNormalization(axis=bn_axis, scale=False, name=bn_name)(x)\n    if activation is not None:\n        ac_name = None if name is None else name + '_ac'\n        x = Activation(activation, name=ac_name)(x)\n    return x\n\n\ndef inception_resnet_block(x, scale, block_type, block_idx, activation='relu'):\n    \"\"\"Adds an Inception-ResNet block.\n\n    This function builds 3 types of Inception-ResNet blocks mentioned\n    in the paper, controlled by the `block_type` argument (which is the\n    block name used in the official TF-slim implementation):\n        - 
Inception-ResNet-A: `block_type='block35'`\n        - Inception-ResNet-B: `block_type='block17'`\n        - Inception-ResNet-C: `block_type='block8'`\n\n    # Arguments\n        x: input tensor.\n        scale: scaling factor to scale the residuals (i.e., the output of\n            passing `x` through an inception module) before adding them\n            to the shortcut branch. If `r` is the output of the residual branch,\n            the output of this block will be `x + scale * r`.\n        block_type: `'block35'`, `'block17'` or `'block8'`, determines\n            the network structure in the residual branch.\n        block_idx: an `int` used for generating layer names. The Inception-ResNet blocks\n            are repeated many times in this network. We use `block_idx` to identify\n            each of the repetitions. For example, the first Inception-ResNet-A block\n            will have `block_type='block35', block_idx=0`, and the layer names will have\n            a common prefix `'block35_0'`.\n        activation: activation function to use at the end of the block\n            (see [activations](keras./activations.md)).\n            When `activation=None`, no activation is applied\n            (i.e., \"linear\" activation: `a(x) = x`).\n\n    # Returns\n        Output tensor for the block.\n\n    # Raises\n        ValueError: if `block_type` is not one of `'block35'`,\n            `'block17'` or `'block8'`.\n    \"\"\"\n    if block_type == 'block35':\n        branch_0 = conv2d_bn(x, 32, 1)\n        branch_1 = conv2d_bn(x, 32, 1)\n        branch_1 = conv2d_bn(branch_1, 32, 3)\n        branch_2 = conv2d_bn(x, 32, 1)\n        branch_2 = conv2d_bn(branch_2, 48, 3)\n        branch_2 = conv2d_bn(branch_2, 64, 3)\n        branches = [branch_0, branch_1, branch_2]\n    elif block_type == 'block17':\n        branch_0 = conv2d_bn(x, 192, 1)\n        branch_1 = conv2d_bn(x, 128, 1)\n        branch_1 = conv2d_bn(branch_1, 160, [1, 7])\n        branch_1 = 
conv2d_bn(branch_1, 192, [7, 1])\n        branches = [branch_0, branch_1]\n    elif block_type == 'block8':\n        branch_0 = conv2d_bn(x, 192, 1)\n        branch_1 = conv2d_bn(x, 192, 1)\n        branch_1 = conv2d_bn(branch_1, 224, [1, 3])\n        branch_1 = conv2d_bn(branch_1, 256, [3, 1])\n        branches = [branch_0, branch_1]\n    else:\n        raise ValueError('Unknown Inception-ResNet block type. '\n                         'Expects \"block35\", \"block17\" or \"block8\", '\n                         'but got: ' + str(block_type))\n\n    block_name = block_type + '_' + str(block_idx)\n    channel_axis = 1 if K.image_data_format() == 'channels_first' else 3\n    mixed = Concatenate(axis=channel_axis, name=block_name + '_mixed')(branches)\n    up = conv2d_bn(mixed,\n                   K.int_shape(x)[channel_axis],\n                   1,\n                   activation=None,\n                   use_bias=True,\n                   name=block_name + '_conv')\n\n    x = Lambda(lambda inputs, scale: inputs[0] + inputs[1] * scale,\n               output_shape=K.int_shape(x)[1:],\n               arguments={'scale': scale},\n               name=block_name)([x, up])\n    if activation is not None:\n        x = Activation(activation, name=block_name + '_ac')(x)\n    return x\n\n\ndef InceptionResNetV2(include_top=True,\n                      weights='imagenet',\n                      input_tensor=None,\n                      input_shape=None,\n                      pooling=None,\n                      classes=1000):\n    \"\"\"Instantiates the Inception-ResNet v2 architecture.\n\n    Optionally loads weights pre-trained on ImageNet.\n    Note that when using TensorFlow, for best performance you should\n    set `\"image_data_format\": \"channels_last\"` in your Keras config\n    at `~/.keras/keras.json`.\n\n    The model and the weights are compatible with both TensorFlow and Theano\n    backends (but not CNTK). 
The data format convention used by the model is\n    the one specified in your Keras config file.\n\n    Note that the default input image size for this model is 299x299, instead\n    of 224x224 as in the VGG16 and ResNet models. Also, the input preprocessing\n    function is different (i.e., do not use `imagenet_utils.preprocess_input()`\n    with this model. Use `preprocess_input()` defined in this module instead).\n\n    # Arguments\n        include_top: whether to include the fully-connected\n            layer at the top of the network.\n        weights: one of `None` (random initialization)\n            or `'imagenet'` (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is `False` (otherwise the input shape\n            has to be `(299, 299, 3)` (with `'channels_last'` data format)\n            or `(3, 299, 299)` (with `'channels_first'` data format)).\n            It should have exactly 3 input channels,\n            and width and height should be no smaller than 139.\n            E.g. 
`(150, 150, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the last convolutional layer.\n            - `'avg'` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `'max'` means that global max pooling will be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is `True`, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras `Model` instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n        RuntimeError: If attempting to run this model with an unsupported backend.\n    \"\"\"\n    if K.backend() in {'cntk'}:\n        raise RuntimeError(K.backend() + ' backend is currently unsupported for this model.')\n\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n\n    # Determine proper input shape\n    input_shape = _obtain_input_shape(\n        input_shape,\n        default_size=299,\n        min_size=139,\n        data_format=K.image_data_format(),\n        require_flatten=False,\n        weights=weights)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not 
K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n\n    # Stem block: 35 x 35 x 192\n    x = conv2d_bn(img_input, 32, 3, strides=2, padding='valid')\n    x = conv2d_bn(x, 32, 3, padding='valid')\n    x = conv2d_bn(x, 64, 3)\n    x = MaxPooling2D(3, strides=2)(x)\n    x = conv2d_bn(x, 80, 1, padding='valid')\n    x = conv2d_bn(x, 192, 3, padding='valid')\n    x = MaxPooling2D(3, strides=2)(x)\n\n    # Mixed 5b (Inception-A block): 35 x 35 x 320\n    branch_0 = conv2d_bn(x, 96, 1)\n    branch_1 = conv2d_bn(x, 48, 1)\n    branch_1 = conv2d_bn(branch_1, 64, 5)\n    branch_2 = conv2d_bn(x, 64, 1)\n    branch_2 = conv2d_bn(branch_2, 96, 3)\n    branch_2 = conv2d_bn(branch_2, 96, 3)\n    branch_pool = AveragePooling2D(3, strides=1, padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 64, 1)\n    branches = [branch_0, branch_1, branch_2, branch_pool]\n    channel_axis = 1 if K.image_data_format() == 'channels_first' else 3\n    x = Concatenate(axis=channel_axis, name='mixed_5b')(branches)\n\n    # 10x block35 (Inception-ResNet-A block): 35 x 35 x 320\n    for block_idx in range(1, 11):\n        x = inception_resnet_block(x,\n                                   scale=0.17,\n                                   block_type='block35',\n                                   block_idx=block_idx)\n\n    # Mixed 6a (Reduction-A block): 17 x 17 x 1088\n    branch_0 = conv2d_bn(x, 384, 3, strides=2, padding='valid')\n    branch_1 = conv2d_bn(x, 256, 1)\n    branch_1 = conv2d_bn(branch_1, 256, 3)\n    branch_1 = conv2d_bn(branch_1, 384, 3, strides=2, padding='valid')\n    branch_pool = MaxPooling2D(3, strides=2, padding='valid')(x)\n    branches = [branch_0, branch_1, branch_pool]\n    x = Concatenate(axis=channel_axis, name='mixed_6a')(branches)\n\n    # 20x block17 (Inception-ResNet-B block): 17 x 17 x 1088\n    for block_idx in range(1, 21):\n        x = 
inception_resnet_block(x,\n                                   scale=0.1,\n                                   block_type='block17',\n                                   block_idx=block_idx)\n\n    # Mixed 7a (Reduction-B block): 8 x 8 x 2080\n    branch_0 = conv2d_bn(x, 256, 1)\n    branch_0 = conv2d_bn(branch_0, 384, 3, strides=2, padding='valid')\n    branch_1 = conv2d_bn(x, 256, 1)\n    branch_1 = conv2d_bn(branch_1, 288, 3, strides=2, padding='valid')\n    branch_2 = conv2d_bn(x, 256, 1)\n    branch_2 = conv2d_bn(branch_2, 288, 3)\n    branch_2 = conv2d_bn(branch_2, 320, 3, strides=2, padding='valid')\n    branch_pool = MaxPooling2D(3, strides=2, padding='valid')(x)\n    branches = [branch_0, branch_1, branch_2, branch_pool]\n    x = Concatenate(axis=channel_axis, name='mixed_7a')(branches)\n\n    # 10x block8 (Inception-ResNet-C block): 8 x 8 x 2080\n    for block_idx in range(1, 10):\n        x = inception_resnet_block(x,\n                                   scale=0.2,\n                                   block_type='block8',\n                                   block_idx=block_idx)\n    x = inception_resnet_block(x,\n                               scale=1.,\n                               activation=None,\n                               block_type='block8',\n                               block_idx=10)\n\n    # Final convolution block: 8 x 8 x 1536\n    x = conv2d_bn(x, 1536, 1, name='conv_7b')\n\n    if include_top:\n        # Classification block\n        x = GlobalAveragePooling2D(name='avg_pool')(x)\n        x = Dense(classes, activation='softmax', name='predictions')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n\n    # 
Create model\n    model = Model(inputs, x, name='inception_resnet_v2')\n\n    # Load weights\n    if weights == 'imagenet':\n        if K.image_data_format() == 'channels_first':\n            if K.backend() == 'tensorflow':\n                warnings.warn('You are using the TensorFlow backend, yet you '\n                              'are using the Theano '\n                              'image data format convention '\n                              '(`image_data_format=\"channels_first\"`). '\n                              'For best performance, set '\n                              '`image_data_format=\"channels_last\"` in '\n                              'your Keras config '\n                              'at ~/.keras/keras.json.')\n        if include_top:\n            weights_filename = 'inception_resnet_v2_weights_tf_dim_ordering_tf_kernels.h5'\n            weights_path = get_file(weights_filename,\n                                    BASE_WEIGHT_URL + weights_filename,\n                                    cache_subdir='models',\n                                    md5_hash='e693bd0210a403b3192acc6073ad2e96')\n        else:\n            weights_filename = 'inception_resnet_v2_weights_tf_dim_ordering_tf_kernels_notop.h5'\n            weights_path = get_file(weights_filename,\n                                    BASE_WEIGHT_URL + weights_filename,\n                                    cache_subdir='models',\n                                    md5_hash='d19885ff4a710c122648d3b5c3b684e4')\n        model.load_weights(weights_path)\n\n    return model\n\n\nif __name__ == '__main__':\n    model = InceptionResNetV2(include_top=True, weights='imagenet')\n\n    img_path = 'elephant.jpg'\n    img = image.load_img(img_path, target_size=(299, 299))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n\n    x = preprocess_input(x)\n\n    preds = model.predict(x)\n    print('Predicted:', decode_predictions(preds))\n"
  },
  {
    "path": "inception_v3.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Inception V3 model for Keras.\n\nNote that the input image format for this model is different than for\nthe VGG16 and ResNet models (299x299 instead of 224x224),\nand that the input preprocessing function is also different (same as Xception).\n\n# Reference\n\n- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567)\n\n\"\"\"\nfrom __future__ import print_function\nfrom __future__ import absolute_import\n\nimport warnings\nimport numpy as np\n\nfrom keras.models import Model\nfrom keras import layers\nfrom keras.layers import Activation\nfrom keras.layers import Dense\nfrom keras.layers import Input\nfrom keras.layers import BatchNormalization\nfrom keras.layers import Conv2D\nfrom keras.layers import MaxPooling2D\nfrom keras.layers import AveragePooling2D\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.engine.topology import get_source_inputs\nfrom keras.utils.layer_utils import convert_all_kernels_in_model\nfrom keras.utils.data_utils import get_file\nfrom keras import backend as K\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.preprocessing import image\n\n\nWEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels.h5'\nWEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'\n\n\ndef conv2d_bn(x,\n              filters,\n              num_row,\n              num_col,\n              padding='same',\n              strides=(1, 1),\n              name=None):\n    \"\"\"Utility function to apply conv + BN.\n\n    Arguments:\n        x: input tensor.\n        filters: filters in `Conv2D`.\n        num_row: height of the convolution kernel.\n        num_col: 
width of the convolution kernel.\n        padding: padding mode in `Conv2D`.\n        strides: strides in `Conv2D`.\n        name: name of the ops; will become `name + '_conv'`\n            for the convolution and `name + '_bn'` for the\n            batch norm layer.\n\n    Returns:\n        Output tensor after applying `Conv2D` and `BatchNormalization`.\n    \"\"\"\n    if name is not None:\n        bn_name = name + '_bn'\n        conv_name = name + '_conv'\n    else:\n        bn_name = None\n        conv_name = None\n    if K.image_data_format() == 'channels_first':\n        bn_axis = 1\n    else:\n        bn_axis = 3\n    x = Conv2D(\n        filters, (num_row, num_col),\n        strides=strides,\n        padding=padding,\n        use_bias=False,\n        name=conv_name)(x)\n    x = BatchNormalization(axis=bn_axis, scale=False, name=bn_name)(x)\n    x = Activation('relu', name=name)(x)\n    return x\n\n\ndef InceptionV3(include_top=True,\n                weights='imagenet',\n                input_tensor=None,\n                input_shape=None,\n                pooling=None,\n                classes=1000):\n    \"\"\"Instantiates the Inception v3 architecture.\n\n    Optionally loads weights pre-trained\n    on ImageNet. Note that when using TensorFlow,\n    for best performance you should set\n    `image_data_format=\"channels_last\"` in your Keras config\n    at ~/.keras/keras.json.\n    The model and the weights are compatible with both\n    TensorFlow and Theano. The data format\n    convention used by the model is the one\n    specified in your Keras config file.\n    Note that the default input image size for this model is 299x299.\n\n    Arguments:\n        include_top: whether to include the fully-connected\n            layer at the top of the network.\n        weights: one of `None` (random initialization)\n            or \"imagenet\" (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. 
output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(299, 299, 3)` (with `channels_last` data format)\n            or `(3, 299, 299)` (with `channels_first` data format).\n            It should have exactly 3 inputs channels,\n            and width and height should be no smaller than 139.\n            E.g. `(150, 150, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    Returns:\n        A Keras model instance.\n\n    Raises:\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n    \"\"\"\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n\n    # Determine proper input shape\n    input_shape = _obtain_input_shape(\n    
    input_shape,\n        default_size=299,\n        min_size=139,\n        data_format=K.image_data_format(),\n        include_top=include_top)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        img_input = Input(tensor=input_tensor, shape=input_shape)\n\n    if K.image_data_format() == 'channels_first':\n        channel_axis = 1\n    else:\n        channel_axis = 3\n\n    x = conv2d_bn(img_input, 32, 3, 3, strides=(2, 2), padding='valid')\n    x = conv2d_bn(x, 32, 3, 3, padding='valid')\n    x = conv2d_bn(x, 64, 3, 3)\n    x = MaxPooling2D((3, 3), strides=(2, 2))(x)\n\n    x = conv2d_bn(x, 80, 1, 1, padding='valid')\n    x = conv2d_bn(x, 192, 3, 3, padding='valid')\n    x = MaxPooling2D((3, 3), strides=(2, 2))(x)\n\n    # mixed 0, 1, 2: 35 x 35 x 256\n    branch1x1 = conv2d_bn(x, 64, 1, 1)\n\n    branch5x5 = conv2d_bn(x, 48, 1, 1)\n    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)\n\n    branch3x3dbl = conv2d_bn(x, 64, 1, 1)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n\n    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 32, 1, 1)\n    x = layers.concatenate(\n        [branch1x1, branch5x5, branch3x3dbl, branch_pool],\n        axis=channel_axis,\n        name='mixed0')\n\n    # mixed 1: 35 x 35 x 256\n    branch1x1 = conv2d_bn(x, 64, 1, 1)\n\n    branch5x5 = conv2d_bn(x, 48, 1, 1)\n    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)\n\n    branch3x3dbl = conv2d_bn(x, 64, 1, 1)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n\n    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 64, 1, 1)\n    x = layers.concatenate(\n        [branch1x1, branch5x5, branch3x3dbl, branch_pool],\n        axis=channel_axis,\n        name='mixed1')\n\n    # mixed 2: 35 x 35 x 256\n    
branch1x1 = conv2d_bn(x, 64, 1, 1)\n\n    branch5x5 = conv2d_bn(x, 48, 1, 1)\n    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)\n\n    branch3x3dbl = conv2d_bn(x, 64, 1, 1)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n\n    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 64, 1, 1)\n    x = layers.concatenate(\n        [branch1x1, branch5x5, branch3x3dbl, branch_pool],\n        axis=channel_axis,\n        name='mixed2')\n\n    # mixed 3: 17 x 17 x 768\n    branch3x3 = conv2d_bn(x, 384, 3, 3, strides=(2, 2), padding='valid')\n\n    branch3x3dbl = conv2d_bn(x, 64, 1, 1)\n    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)\n    branch3x3dbl = conv2d_bn(\n        branch3x3dbl, 96, 3, 3, strides=(2, 2), padding='valid')\n\n    branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)\n    x = layers.concatenate(\n        [branch3x3, branch3x3dbl, branch_pool], axis=channel_axis, name='mixed3')\n\n    # mixed 4: 17 x 17 x 768\n    branch1x1 = conv2d_bn(x, 192, 1, 1)\n\n    branch7x7 = conv2d_bn(x, 128, 1, 1)\n    branch7x7 = conv2d_bn(branch7x7, 128, 1, 7)\n    branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)\n\n    branch7x7dbl = conv2d_bn(x, 128, 1, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 1, 7)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 128, 7, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)\n\n    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 192, 1, 1)\n    x = layers.concatenate(\n        [branch1x1, branch7x7, branch7x7dbl, branch_pool],\n        axis=channel_axis,\n        name='mixed4')\n\n    # mixed 5, 6: 17 x 17 x 768\n    for i in range(2):\n        branch1x1 = conv2d_bn(x, 192, 1, 1)\n\n        branch7x7 = conv2d_bn(x, 160, 1, 1)\n        branch7x7 = conv2d_bn(branch7x7, 160, 1, 7)\n        
branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)\n\n        branch7x7dbl = conv2d_bn(x, 160, 1, 1)\n        branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)\n        branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 1, 7)\n        branch7x7dbl = conv2d_bn(branch7x7dbl, 160, 7, 1)\n        branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)\n\n        branch_pool = AveragePooling2D(\n            (3, 3), strides=(1, 1), padding='same')(x)\n        branch_pool = conv2d_bn(branch_pool, 192, 1, 1)\n        x = layers.concatenate(\n            [branch1x1, branch7x7, branch7x7dbl, branch_pool],\n            axis=channel_axis,\n            name='mixed' + str(5 + i))\n\n    # mixed 7: 17 x 17 x 768\n    branch1x1 = conv2d_bn(x, 192, 1, 1)\n\n    branch7x7 = conv2d_bn(x, 192, 1, 1)\n    branch7x7 = conv2d_bn(branch7x7, 192, 1, 7)\n    branch7x7 = conv2d_bn(branch7x7, 192, 7, 1)\n\n    branch7x7dbl = conv2d_bn(x, 192, 1, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 7, 1)\n    branch7x7dbl = conv2d_bn(branch7x7dbl, 192, 1, 7)\n\n    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)\n    branch_pool = conv2d_bn(branch_pool, 192, 1, 1)\n    x = layers.concatenate(\n        [branch1x1, branch7x7, branch7x7dbl, branch_pool],\n        axis=channel_axis,\n        name='mixed7')\n\n    # mixed 8: 8 x 8 x 1280\n    branch3x3 = conv2d_bn(x, 192, 1, 1)\n    branch3x3 = conv2d_bn(branch3x3, 320, 3, 3,\n                          strides=(2, 2), padding='valid')\n\n    branch7x7x3 = conv2d_bn(x, 192, 1, 1)\n    branch7x7x3 = conv2d_bn(branch7x7x3, 192, 1, 7)\n    branch7x7x3 = conv2d_bn(branch7x7x3, 192, 7, 1)\n    branch7x7x3 = conv2d_bn(\n        branch7x7x3, 192, 3, 3, strides=(2, 2), padding='valid')\n\n    branch_pool = MaxPooling2D((3, 3), strides=(2, 2))(x)\n    x = layers.concatenate(\n        [branch3x3, branch7x7x3, branch_pool], axis=channel_axis, 
name='mixed8')\n\n    # mixed 9: 8 x 8 x 2048\n    for i in range(2):\n        branch1x1 = conv2d_bn(x, 320, 1, 1)\n\n        branch3x3 = conv2d_bn(x, 384, 1, 1)\n        branch3x3_1 = conv2d_bn(branch3x3, 384, 1, 3)\n        branch3x3_2 = conv2d_bn(branch3x3, 384, 3, 1)\n        branch3x3 = layers.concatenate(\n            [branch3x3_1, branch3x3_2], axis=channel_axis, name='mixed9_' + str(i))\n\n        branch3x3dbl = conv2d_bn(x, 448, 1, 1)\n        branch3x3dbl = conv2d_bn(branch3x3dbl, 384, 3, 3)\n        branch3x3dbl_1 = conv2d_bn(branch3x3dbl, 384, 1, 3)\n        branch3x3dbl_2 = conv2d_bn(branch3x3dbl, 384, 3, 1)\n        branch3x3dbl = layers.concatenate(\n            [branch3x3dbl_1, branch3x3dbl_2], axis=channel_axis)\n\n        branch_pool = AveragePooling2D(\n            (3, 3), strides=(1, 1), padding='same')(x)\n        branch_pool = conv2d_bn(branch_pool, 192, 1, 1)\n        x = layers.concatenate(\n            [branch1x1, branch3x3, branch3x3dbl, branch_pool],\n            axis=channel_axis,\n            name='mixed' + str(9 + i))\n    if include_top:\n        # Classification block\n        x = GlobalAveragePooling2D(name='avg_pool')(x)\n        x = Dense(classes, activation='softmax', name='predictions')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n    # Create model.\n    model = Model(inputs, x, name='inception_v3')\n\n    # load weights\n    if weights == 'imagenet':\n        if K.image_data_format() == 'channels_first':\n            if K.backend() == 'tensorflow':\n                warnings.warn('You are using the TensorFlow backend, yet you '\n                              'are using the Theano '\n        
                      'image data format convention '\n                              '(`image_data_format=\"channels_first\"`). '\n                              'For best performance, set '\n                              '`image_data_format=\"channels_last\"` in '\n                              'your Keras config '\n                              'at ~/.keras/keras.json.')\n        if include_top:\n            weights_path = get_file(\n                'inception_v3_weights_tf_dim_ordering_tf_kernels.h5',\n                WEIGHTS_PATH,\n                cache_subdir='models',\n                md5_hash='9a0d58056eeedaa3f26cb7ebd46da564')\n        else:\n            weights_path = get_file(\n                'inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5',\n                WEIGHTS_PATH_NO_TOP,\n                cache_subdir='models',\n                md5_hash='bcbd6486424b2319ff4ef7d526e38f63')\n        model.load_weights(weights_path)\n        if K.backend() == 'theano':\n            convert_all_kernels_in_model(model)\n    return model\n\n\ndef preprocess_input(x):\n    x /= 255.\n    x -= 0.5\n    x *= 2.\n    return x\n\n\nif __name__ == '__main__':\n    model = InceptionV3(include_top=True, weights='imagenet')\n\n    img_path = 'elephant.jpg'\n    img = image.load_img(img_path, target_size=(299, 299))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n\n    x = preprocess_input(x)\n\n    preds = model.predict(x)\n    print('Predicted:', decode_predictions(preds))\n"
  },
  {
    "path": "mobilenet.py",
    "content": "\"\"\"MobileNet v1 models for Keras.\n\nCode contributed by Somshubra Majumdar (@titu1994).\n\nMobileNet is a general architecture and can be used for multiple use cases.\nDepending on the use case, it can use different input layer size and\ndifferent width factors. This allows different width models to reduce\nthe number of multiply-adds and thereby\nreduce inference cost on mobile devices.\n\nMobileNets support any input size greater than 32 x 32, with larger image sizes\noffering better performance.\nThe number of parameters and number of multiply-adds\ncan be modified by using the `alpha` parameter,\nwhich increases/decreases the number of filters in each layer.\nBy altering the image size and `alpha` parameter,\nall 16 models from the paper can be built, with ImageNet weights provided.\n\nThe paper demonstrates the performance of MobileNets using `alpha` values of\n1.0 (also called 100 % MobileNet), 0.75, 0.5 and 0.25.\nFor each of these `alpha` values, weights for 4 different input image sizes\nare provided (224, 192, 160, 128).\n\nThe following table describes the size and accuracy of the 100% MobileNet\non size 224 x 224:\n----------------------------------------------------------------------------\nWidth Multiplier (alpha) | ImageNet Acc |  Multiply-Adds (M) |  Params (M)\n----------------------------------------------------------------------------\n|   1.0 MobileNet-224    |    70.6 %     |        529        |     4.2     |\n|   0.75 MobileNet-224   |    68.4 %     |        325        |     2.6     |\n|   0.50 MobileNet-224   |    63.7 %     |        149        |     1.3     |\n|   0.25 MobileNet-224   |    50.6 %     |        41         |     0.5     |\n----------------------------------------------------------------------------\n\nThe following table describes the performance of\nthe 100 % MobileNet on various input sizes:\n------------------------------------------------------------------------\n      Resolution      | ImageNet Acc | 
Multiply-Adds (M) | Params (M)\n------------------------------------------------------------------------\n|  1.0 MobileNet-224  |    70.6 %    |        529        |     4.2     |\n|  1.0 MobileNet-192  |    69.1 %    |        529        |     4.2     |\n|  1.0 MobileNet-160  |    67.2 %    |        529        |     4.2     |\n|  1.0 MobileNet-128  |    64.4 %    |        529        |     4.2     |\n------------------------------------------------------------------------\n\nThe weights for all 16 models are obtained and translated\nfrom Tensorflow checkpoints found at\nhttps://github.com/tensorflow/models/blob/master/slim/nets/mobilenet_v1.md\n\n# Reference\n- [MobileNets: Efficient Convolutional Neural Networks for\n   Mobile Vision Applications](https://arxiv.org/pdf/1704.04861.pdf))\n\"\"\"\nfrom __future__ import print_function\nfrom __future__ import absolute_import\nfrom __future__ import division\n\nimport warnings\nimport numpy as np\n\nfrom keras.preprocessing import image\n\nfrom keras.models import Model\nfrom keras.layers import Input\nfrom keras.layers import Activation\nfrom keras.layers import Dropout\nfrom keras.layers import Reshape\nfrom keras.layers import BatchNormalization\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.layers import Conv2D\nfrom keras import initializers\nfrom keras import regularizers\nfrom keras import constraints\nfrom keras.utils import conv_utils\nfrom keras.utils.data_utils import get_file\nfrom keras.engine.topology import get_source_inputs\nfrom keras.engine import InputSpec\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras import backend as K\n\n\nBASE_WEIGHT_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.6/'\n\n\ndef relu6(x):\n    return K.relu(x, max_value=6)\n\n\ndef preprocess_input(x):\n    x /= 255.\n    x -= 0.5\n    x *= 2.\n    
return x\n\n\nclass DepthwiseConv2D(Conv2D):\n    \"\"\"Depthwise separable 2D convolution.\n\n    Depthwise Separable convolutions consists in performing\n    just the first step in a depthwise spatial convolution\n    (which acts on each input channel separately).\n    The `depth_multiplier` argument controls how many\n    output channels are generated per input channel in the depthwise step.\n\n    # Arguments\n        kernel_size: An integer or tuple/list of 2 integers, specifying the\n            width and height of the 2D convolution window.\n            Can be a single integer to specify the same value for\n            all spatial dimensions.\n        strides: An integer or tuple/list of 2 integers,\n            specifying the strides of the convolution along the width and height.\n            Can be a single integer to specify the same value for\n            all spatial dimensions.\n            Specifying any stride value != 1 is incompatible with specifying\n            any `dilation_rate` value != 1.\n        padding: one of `\"valid\"` or `\"same\"` (case-insensitive).\n        depth_multiplier: The number of depthwise convolution output channels\n            for each input channel.\n            The total number of depthwise convolution output\n            channels will be equal to `filters_in * depth_multiplier`.\n        data_format: A string,\n            one of `channels_last` (default) or `channels_first`.\n            The ordering of the dimensions in the inputs.\n            `channels_last` corresponds to inputs with shape\n            `(batch, height, width, channels)` while `channels_first`\n            corresponds to inputs with shape\n            `(batch, channels, height, width)`.\n            It defaults to the `image_data_format` value found in your\n            Keras config file at `~/.keras/keras.json`.\n            If you never set it, then it will be \"channels_last\".\n        activation: Activation function to use\n            (see 
[activations](keras./activations.md)).\n            If you don't specify anything, no activation is applied\n            (ie. \"linear\" activation: `a(x) = x`).\n        use_bias: Boolean, whether the layer uses a bias vector.\n        depthwise_initializer: Initializer for the depthwise kernel matrix\n            (see [initializers](keras./initializers.md)).\n        bias_initializer: Initializer for the bias vector\n            (see [initializers](keras./initializers.md)).\n        depthwise_regularizer: Regularizer function applied to\n            the depthwise kernel matrix\n            (see [regularizer](keras./regularizers.md)).\n        bias_regularizer: Regularizer function applied to the bias vector\n            (see [regularizer](keras./regularizers.md)).\n        activity_regularizer: Regularizer function applied to\n            the output of the layer (its \"activation\").\n            (see [regularizer](keras./regularizers.md)).\n        depthwise_constraint: Constraint function applied to\n            the depthwise kernel matrix\n            (see [constraints](keras./constraints.md)).\n        bias_constraint: Constraint function applied to the bias vector\n            (see [constraints](keras./constraints.md)).\n\n    # Input shape\n        4D tensor with shape:\n        `[batch, channels, rows, cols]` if data_format='channels_first'\n        or 4D tensor with shape:\n        `[batch, rows, cols, channels]` if data_format='channels_last'.\n\n    # Output shape\n        4D tensor with shape:\n        `[batch, filters, new_rows, new_cols]` if data_format='channels_first'\n        or 4D tensor with shape:\n        `[batch, new_rows, new_cols, filters]` if data_format='channels_last'.\n        `rows` and `cols` values might have changed due to padding.\n    \"\"\"\n\n    def __init__(self,\n                 kernel_size,\n                 strides=(1, 1),\n                 padding='valid',\n                 depth_multiplier=1,\n                 
data_format=None,\n                 activation=None,\n                 use_bias=True,\n                 depthwise_initializer='glorot_uniform',\n                 bias_initializer='zeros',\n                 depthwise_regularizer=None,\n                 bias_regularizer=None,\n                 activity_regularizer=None,\n                 depthwise_constraint=None,\n                 bias_constraint=None,\n                 **kwargs):\n        super(DepthwiseConv2D, self).__init__(\n            filters=None,\n            kernel_size=kernel_size,\n            strides=strides,\n            padding=padding,\n            data_format=data_format,\n            activation=activation,\n            use_bias=use_bias,\n            bias_regularizer=bias_regularizer,\n            activity_regularizer=activity_regularizer,\n            bias_constraint=bias_constraint,\n            **kwargs)\n        self.depth_multiplier = depth_multiplier\n        self.depthwise_initializer = initializers.get(depthwise_initializer)\n        self.depthwise_regularizer = regularizers.get(depthwise_regularizer)\n        self.depthwise_constraint = constraints.get(depthwise_constraint)\n        self.bias_initializer = initializers.get(bias_initializer)\n\n    def build(self, input_shape):\n        if len(input_shape) < 4:\n            raise ValueError('Inputs to `DepthwiseConv2D` should have rank 4. '\n                             'Received input shape:', str(input_shape))\n        if self.data_format == 'channels_first':\n            channel_axis = 1\n        else:\n            channel_axis = 3\n        if input_shape[channel_axis] is None:\n            raise ValueError('The channel dimension of the inputs to '\n                             '`DepthwiseConv2D` '\n                             'should be defined. 
Found `None`.')\n        input_dim = int(input_shape[channel_axis])\n        depthwise_kernel_shape = (self.kernel_size[0],\n                                  self.kernel_size[1],\n                                  input_dim,\n                                  self.depth_multiplier)\n\n        self.depthwise_kernel = self.add_weight(\n            shape=depthwise_kernel_shape,\n            initializer=self.depthwise_initializer,\n            name='depthwise_kernel',\n            regularizer=self.depthwise_regularizer,\n            constraint=self.depthwise_constraint)\n\n        if self.use_bias:\n            self.bias = self.add_weight(shape=(input_dim * self.depth_multiplier,),\n                                        initializer=self.bias_initializer,\n                                        name='bias',\n                                        regularizer=self.bias_regularizer,\n                                        constraint=self.bias_constraint)\n        else:\n            self.bias = None\n        # Set input spec.\n        self.input_spec = InputSpec(ndim=4, axes={channel_axis: input_dim})\n        self.built = True\n\n    def call(self, inputs, training=None):\n        outputs = K.depthwise_conv2d(\n            inputs,\n            self.depthwise_kernel,\n            strides=self.strides,\n            padding=self.padding,\n            dilation_rate=self.dilation_rate,\n            data_format=self.data_format)\n\n        # Test the flag, not the bias tensor: the truth value of a\n        # backend tensor is not a reliable check.\n        if self.use_bias:\n            outputs = K.bias_add(\n                outputs,\n                self.bias,\n                data_format=self.data_format)\n\n        if self.activation is not None:\n            return self.activation(outputs)\n\n        return outputs\n\n    def compute_output_shape(self, input_shape):\n        if self.data_format == 'channels_first':\n            rows = input_shape[2]\n            cols = input_shape[3]\n            out_filters = input_shape[1] * self.depth_multiplier\n        elif self.data_format == 
'channels_last':\n            rows = input_shape[1]\n            cols = input_shape[2]\n            out_filters = input_shape[3] * self.depth_multiplier\n\n        rows = conv_utils.conv_output_length(rows, self.kernel_size[0],\n                                             self.padding,\n                                             self.strides[0])\n        cols = conv_utils.conv_output_length(cols, self.kernel_size[1],\n                                             self.padding,\n                                             self.strides[1])\n\n        if self.data_format == 'channels_first':\n            return (input_shape[0], out_filters, rows, cols)\n        elif self.data_format == 'channels_last':\n            return (input_shape[0], rows, cols, out_filters)\n\n    def get_config(self):\n        config = super(DepthwiseConv2D, self).get_config()\n        config.pop('filters')\n        config.pop('kernel_initializer')\n        config.pop('kernel_regularizer')\n        config.pop('kernel_constraint')\n        config['depth_multiplier'] = self.depth_multiplier\n        config['depthwise_initializer'] = initializers.serialize(self.depthwise_initializer)\n        config['depthwise_regularizer'] = regularizers.serialize(self.depthwise_regularizer)\n        config['depthwise_constraint'] = constraints.serialize(self.depthwise_constraint)\n        return config\n\n\ndef MobileNet(input_shape=None,\n              alpha=1.0,\n              depth_multiplier=1,\n              dropout=1e-3,\n              include_top=True,\n              weights='imagenet',\n              input_tensor=None,\n              pooling=None,\n              classes=1000):\n    \"\"\"Instantiates the MobileNet architecture.\n\n    Note that only TensorFlow is supported for now,\n    therefore it only works with the data format\n    `image_data_format='channels_last'` in your Keras config\n    at `~/.keras/keras.json`.\n\n    To load a MobileNet model via `load_model`, import the custom\n    
objects `relu6` and `DepthwiseConv2D` and pass them to the\n    `custom_objects` parameter.\n    E.g.\n    model = load_model('mobilenet.h5', custom_objects={\n                       'relu6': mobilenet.relu6,\n                       'DepthwiseConv2D': mobilenet.DepthwiseConv2D})\n\n    # Arguments\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(224, 224, 3)` (with `channels_last` data format)\n            or (3, 224, 224) (with `channels_first` data format).\n            It should have exactly 3 inputs channels,\n            and width and height should be no smaller than 32.\n            E.g. `(200, 200, 3)` would be one valid value.\n        alpha: controls the width of the network.\n            - If `alpha` < 1.0, proportionally decreases the number\n                of filters in each layer.\n            - If `alpha` > 1.0, proportionally increases the number\n                of filters in each layer.\n            - If `alpha` = 1, default number of filters from the paper\n                 are used at each layer.\n        depth_multiplier: depth multiplier for depthwise convolution\n            (also called the resolution multiplier)\n        dropout: dropout rate\n        include_top: whether to include the fully-connected\n            layer at the top of the network.\n        weights: `None` (random initialization) or\n            `imagenet` (ImageNet weights)\n        input_tensor: optional Keras tensor (i.e. 
output of\n            `layers.Input()`)\n            to use as image input for the model.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model\n                will be the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a\n                2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras model instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n        RuntimeError: If attempting to run this model with a\n            backend that does not support separable convolutions.\n    \"\"\"\n\n    if K.backend() != 'tensorflow':\n        raise RuntimeError('Only Tensorflow backend is currently supported, '\n                           'as other backends do not support '\n                           'depthwise convolution.')\n\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as ImageNet with `include_top` '\n                         'as true, `classes` should be 1000')\n\n    # Determine proper input shape.\n    input_shape = _obtain_input_shape(input_shape,\n                                      
default_size=224,\n                                      min_size=32,\n                                      data_format=K.image_data_format(),\n                                      include_top=include_top or weights)\n    if K.image_data_format() == 'channels_last':\n        row_axis, col_axis = (0, 1)\n    else:\n        row_axis, col_axis = (1, 2)\n    rows = input_shape[row_axis]\n    cols = input_shape[col_axis]\n\n    if weights == 'imagenet':\n        if depth_multiplier != 1:\n            raise ValueError('If imagenet weights are being loaded, '\n                             'depth multiplier must be 1')\n\n        if alpha not in [0.25, 0.50, 0.75, 1.0]:\n            raise ValueError('If imagenet weights are being loaded, '\n                             'alpha can be one of '\n                             '`0.25`, `0.50`, `0.75` or `1.0` only.')\n\n        if rows != cols or rows not in [128, 160, 192, 224]:\n            raise ValueError('If imagenet weights are being loaded, '\n                             'input must have a static square shape (one of '\n                             '(128,128), (160,160), (192,192), or (224, 224)).'\n                             ' Input shape provided = %s' % (input_shape,))\n\n    if K.image_data_format() != 'channels_last':\n        warnings.warn('The MobileNet family of models is only available '\n                      'for the input data format \"channels_last\" '\n                      '(width, height, channels). '\n                      'However your settings specify the default '\n                      'data format \"channels_first\" (channels, width, height).'\n                      ' You should set `image_data_format=\"channels_last\"` '\n                      'in your Keras config located at ~/.keras/keras.json. 
'\n                      'The model being returned right now will expect inputs '\n                      'to follow the \"channels_last\" data format.')\n        K.set_image_data_format('channels_last')\n        old_data_format = 'channels_first'\n    else:\n        old_data_format = None\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n\n    x = _conv_block(img_input, 32, alpha, strides=(2, 2))\n    x = _depthwise_conv_block(x, 64, alpha, depth_multiplier, block_id=1)\n\n    x = _depthwise_conv_block(x, 128, alpha, depth_multiplier,\n                              strides=(2, 2), block_id=2)\n    x = _depthwise_conv_block(x, 128, alpha, depth_multiplier, block_id=3)\n\n    x = _depthwise_conv_block(x, 256, alpha, depth_multiplier,\n                              strides=(2, 2), block_id=4)\n    x = _depthwise_conv_block(x, 256, alpha, depth_multiplier, block_id=5)\n\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier,\n                              strides=(2, 2), block_id=6)\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=7)\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=8)\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=9)\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=10)\n    x = _depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=11)\n\n    x = _depthwise_conv_block(x, 1024, alpha, depth_multiplier,\n                              strides=(2, 2), block_id=12)\n    x = _depthwise_conv_block(x, 1024, alpha, depth_multiplier, block_id=13)\n\n    if include_top:\n        if K.image_data_format() == 'channels_first':\n            shape = (int(1024 * alpha), 1, 1)\n        else:\n            shape = (1, 1, int(1024 * alpha))\n\n        
x = GlobalAveragePooling2D()(x)\n        x = Reshape(shape, name='reshape_1')(x)\n        x = Dropout(dropout, name='dropout')(x)\n        x = Conv2D(classes, (1, 1),\n                   padding='same', name='conv_preds')(x)\n        x = Activation('softmax', name='act_softmax')(x)\n        x = Reshape((classes,), name='reshape_2')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n\n    # Create model.\n    model = Model(inputs, x, name='mobilenet_%0.2f_%s' % (alpha, rows))\n\n    # load weights\n    if weights == 'imagenet':\n        if K.image_data_format() == 'channels_first':\n            raise ValueError('Weights for \"channels_first\" format '\n                             'are not available.')\n        if alpha == 1.0:\n            alpha_text = '1_0'\n        elif alpha == 0.75:\n            alpha_text = '7_5'\n        elif alpha == 0.50:\n            alpha_text = '5_0'\n        else:\n            alpha_text = '2_5'\n\n        if include_top:\n            model_name = 'mobilenet_%s_%d_tf.h5' % (alpha_text, rows)\n            weight_url = BASE_WEIGHT_PATH + model_name\n            weights_path = get_file(model_name,\n                                    weight_url,\n                                    cache_subdir='models')\n        else:\n            model_name = 'mobilenet_%s_%d_tf_no_top.h5' % (alpha_text, rows)\n            weight_url = BASE_WEIGHT_PATH + model_name\n            weights_path = get_file(model_name,\n                                    weight_url,\n                                    cache_subdir='models')\n        model.load_weights(weights_path)\n\n    if old_data_format:\n        
K.set_image_data_format(old_data_format)\n    return model\n\n\ndef _conv_block(inputs, filters, alpha, kernel=(3, 3), strides=(1, 1)):\n    \"\"\"Adds an initial convolution layer (with batch normalization and relu6).\n\n    # Arguments\n        inputs: Input tensor of shape `(rows, cols, 3)`\n            (with `channels_last` data format) or\n            (3, rows, cols) (with `channels_first` data format).\n            It should have exactly 3 inputs channels,\n            and width and height should be no smaller than 32.\n            E.g. `(224, 224, 3)` would be one valid value.\n        filters: Integer, the dimensionality of the output space\n            (i.e. the number output of filters in the convolution).\n        alpha: controls the width of the network.\n            - If `alpha` < 1.0, proportionally decreases the number\n                of filters in each layer.\n            - If `alpha` > 1.0, proportionally increases the number\n                of filters in each layer.\n            - If `alpha` = 1, default number of filters from the paper\n                 are used at each layer.\n        kernel: An integer or tuple/list of 2 integers, specifying the\n            width and height of the 2D convolution window.\n            Can be a single integer to specify the same value for\n            all spatial dimensions.\n        strides: An integer or tuple/list of 2 integers,\n            specifying the strides of the convolution along the width and height.\n            Can be a single integer to specify the same value for\n            all spatial dimensions.\n            Specifying any stride value != 1 is incompatible with specifying\n            any `dilation_rate` value != 1.\n\n    # Input shape\n        4D tensor with shape:\n        `(samples, channels, rows, cols)` if data_format='channels_first'\n        or 4D tensor with shape:\n        `(samples, rows, cols, channels)` if data_format='channels_last'.\n\n    # Output shape\n        4D tensor 
with shape:\n        `(samples, filters, new_rows, new_cols)` if data_format='channels_first'\n        or 4D tensor with shape:\n        `(samples, new_rows, new_cols, filters)` if data_format='channels_last'.\n        `rows` and `cols` values might have changed due to stride.\n\n    # Returns\n        Output tensor of block.\n    \"\"\"\n    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1\n    filters = int(filters * alpha)\n    x = Conv2D(filters, kernel,\n               padding='same',\n               use_bias=False,\n               strides=strides,\n               name='conv1')(inputs)\n    x = BatchNormalization(axis=channel_axis, name='conv1_bn')(x)\n    return Activation(relu6, name='conv1_relu')(x)\n\n\ndef _depthwise_conv_block(inputs, pointwise_conv_filters, alpha,\n                          depth_multiplier=1, strides=(1, 1), block_id=1):\n    \"\"\"Adds a depthwise convolution block.\n\n    A depthwise convolution block consists of a depthwise conv,\n    batch normalization, relu6, pointwise convolution,\n    batch normalization and relu6 activation.\n\n    # Arguments\n        inputs: Input tensor of shape `(rows, cols, channels)`\n            (with `channels_last` data format) or\n            (channels, rows, cols) (with `channels_first` data format).\n        pointwise_conv_filters: Integer, the dimensionality of the output space\n            (i.e. 
the number output of filters in the pointwise convolution).\n        alpha: controls the width of the network.\n            - If `alpha` < 1.0, proportionally decreases the number\n                of filters in each layer.\n            - If `alpha` > 1.0, proportionally increases the number\n                of filters in each layer.\n            - If `alpha` = 1, default number of filters from the paper\n                 are used at each layer.\n        depth_multiplier: The number of depthwise convolution output channels\n            for each input channel.\n            The total number of depthwise convolution output\n            channels will be equal to `filters_in * depth_multiplier`.\n        strides: An integer or tuple/list of 2 integers,\n            specifying the strides of the convolution along the width and height.\n            Can be a single integer to specify the same value for\n            all spatial dimensions.\n            Specifying any stride value != 1 is incompatible with specifying\n            any `dilation_rate` value != 1.\n        block_id: Integer, a unique identification designating the block number.\n\n    # Input shape\n        4D tensor with shape:\n        `(batch, channels, rows, cols)` if data_format='channels_first'\n        or 4D tensor with shape:\n        `(batch, rows, cols, channels)` if data_format='channels_last'.\n\n    # Output shape\n        4D tensor with shape:\n        `(batch, filters, new_rows, new_cols)` if data_format='channels_first'\n        or 4D tensor with shape:\n        `(batch, new_rows, new_cols, filters)` if data_format='channels_last'.\n        `rows` and `cols` values might have changed due to stride.\n\n    # Returns\n        Output tensor of block.\n    \"\"\"\n    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1\n    pointwise_conv_filters = int(pointwise_conv_filters * alpha)\n\n    x = DepthwiseConv2D((3, 3),\n                        padding='same',\n                        
depth_multiplier=depth_multiplier,\n                        strides=strides,\n                        use_bias=False,\n                        name='conv_dw_%d' % block_id)(inputs)\n    x = BatchNormalization(axis=channel_axis, name='conv_dw_%d_bn' % block_id)(x)\n    x = Activation(relu6, name='conv_dw_%d_relu' % block_id)(x)\n\n    x = Conv2D(pointwise_conv_filters, (1, 1),\n               padding='same',\n               use_bias=False,\n               strides=(1, 1),\n               name='conv_pw_%d' % block_id)(x)\n    x = BatchNormalization(axis=channel_axis, name='conv_pw_%d_bn' % block_id)(x)\n    return Activation(relu6, name='conv_pw_%d_relu' % block_id)(x)\n\n\nif __name__ == '__main__':\n    for r in [128, 160, 192, 224]:\n        for a in [0.25, 0.50, 0.75, 1.0]:\n            if r == 224:\n                model = MobileNet(include_top=True, weights='imagenet',\n                                  input_shape=(r, r, 3), alpha=a)\n\n                img_path = 'elephant.jpg'\n                img = image.load_img(img_path, target_size=(r, r))\n                x = image.img_to_array(img)\n                x = np.expand_dims(x, axis=0)\n                x = preprocess_input(x)\n                print('Input image shape:', x.shape)\n\n                preds = model.predict(x)\n                print(np.argmax(preds))\n                print('Predicted:', decode_predictions(preds, 1))\n\n            model = MobileNet(include_top=False, weights='imagenet')\n"
  },
  {
    "path": "music_tagger_crnn.py",
    "content": "# -*- coding: utf-8 -*-\n'''MusicTaggerCRNN model for Keras.\n\nCode by github.com/keunwoochoi.\n\n# Reference:\n\n- [Music-auto_tagging-keras](https://github.com/keunwoochoi/music-auto_tagging-keras)\n\n'''\nfrom __future__ import print_function\nfrom __future__ import absolute_import\n\nimport numpy as np\nfrom keras import backend as K\nfrom keras.layers import Input, Dense\nfrom keras.models import Model\nfrom keras.layers import Dense, Dropout, Reshape, Permute\nfrom keras.layers.convolutional import Convolution2D\nfrom keras.layers.convolutional import MaxPooling2D, ZeroPadding2D\nfrom keras.layers.normalization import BatchNormalization\nfrom keras.layers.advanced_activations import ELU\nfrom keras.layers.recurrent import GRU\nfrom keras.utils.data_utils import get_file\nfrom keras.utils.layer_utils import convert_all_kernels_in_model\nfrom audio_conv_utils import decode_predictions, preprocess_input\n\nTH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.3/music_tagger_crnn_weights_tf_kernels_th_dim_ordering.h5'\nTF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.3/music_tagger_crnn_weights_tf_kernels_tf_dim_ordering.h5'\n\n\ndef MusicTaggerCRNN(weights='msd', input_tensor=None,\n                    include_top=True):\n    '''Instantiate the MusicTaggerCRNN architecture,\n    optionally loading weights pre-trained\n    on Million Song Dataset. Note that when using TensorFlow,\n    for best performance you should set\n    `image_dim_ordering=\"tf\"` in your Keras config\n    at ~/.keras/keras.json.\n\n    The model and the weights are compatible with both\n    TensorFlow and Theano. 
The dimension ordering\n    convention used by the model is the one\n    specified in your Keras config file.\n\n    For preparing mel-spectrogram input, see\n    `audio_conv_utils.py` in [applications](https://github.com/fchollet/keras/tree/master/keras/applications).\n    You will need to install [Librosa](http://librosa.github.io/librosa/)\n    to use it.\n\n    # Arguments\n        weights: one of `None` (random initialization)\n            or \"msd\" (pre-training on Million Song Dataset).\n        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n            to use as mel-spectrogram input for the model.\n        include_top: whether to include the 1 fully-connected\n            layer (output layer) at the top of the network.\n            If False, the network outputs 32-dim features.\n\n\n    # Returns\n        A Keras model instance.\n    '''\n    if weights not in {'msd', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `msd` '\n                         '(pre-training on Million Song Dataset).')\n\n    # Determine proper input shape\n    if K.image_dim_ordering() == 'th':\n        input_shape = (1, 96, 1366)\n    else:\n        input_shape = (96, 1366, 1)\n\n    if input_tensor is None:\n        melgram_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            melgram_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            melgram_input = input_tensor\n\n    # Determine input axis\n    if K.image_dim_ordering() == 'th':\n        channel_axis = 1\n        freq_axis = 2\n        time_axis = 3\n    else:\n        channel_axis = 3\n        freq_axis = 1\n        time_axis = 2\n\n    # Input block\n    x = ZeroPadding2D(padding=(0, 37))(melgram_input)\n    x = BatchNormalization(axis=time_axis, name='bn_0_freq')(x)\n\n    # Conv block 1\n    x = Convolution2D(64, 3, 3, border_mode='same', 
name='conv1')(x)\n    x = BatchNormalization(axis=channel_axis, mode=0, name='bn1')(x)\n    x = ELU()(x)\n    x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='pool1')(x)\n\n    # Conv block 2\n    x = Convolution2D(128, 3, 3, border_mode='same', name='conv2')(x)\n    x = BatchNormalization(axis=channel_axis, mode=0, name='bn2')(x)\n    x = ELU()(x)\n    x = MaxPooling2D(pool_size=(3, 3), strides=(3, 3), name='pool2')(x)\n\n    # Conv block 3\n    x = Convolution2D(128, 3, 3, border_mode='same', name='conv3')(x)\n    x = BatchNormalization(axis=channel_axis, mode=0, name='bn3')(x)\n    x = ELU()(x)\n    x = MaxPooling2D(pool_size=(4, 4), strides=(4, 4), name='pool3')(x)\n\n    # Conv block 4\n    x = Convolution2D(128, 3, 3, border_mode='same', name='conv4')(x)\n    x = BatchNormalization(axis=channel_axis, mode=0, name='bn4')(x)\n    x = ELU()(x)\n    x = MaxPooling2D(pool_size=(4, 4), strides=(4, 4), name='pool4')(x)\n\n    # reshaping\n    if K.image_dim_ordering() == 'th':\n        x = Permute((3, 1, 2))(x)\n    x = Reshape((15, 128))(x)\n\n    # GRU block 1, 2, output\n    x = GRU(32, return_sequences=True, name='gru1')(x)\n    x = GRU(32, return_sequences=False, name='gru2')(x)\n\n    if include_top:\n        x = Dense(50, activation='sigmoid', name='output')(x)\n\n    # Create model\n    model = Model(melgram_input, x)\n    if weights is None:\n        return model\n    else:\n        # Load weights\n        if K.image_dim_ordering() == 'tf':\n            weights_path = get_file('music_tagger_crnn_weights_tf_kernels_tf_dim_ordering.h5',\n                                    TF_WEIGHTS_PATH,\n                                    cache_subdir='models')\n        else:\n            weights_path = get_file('music_tagger_crnn_weights_tf_kernels_th_dim_ordering.h5',\n                                    TH_WEIGHTS_PATH,\n                                    cache_subdir='models')\n        model.load_weights(weights_path, by_name=True)\n        if K.backend() == 
'theano':\n            convert_all_kernels_in_model(model)\n        return model\n\n\nif __name__ == '__main__':\n    model = MusicTaggerCRNN(weights='msd')\n\n    audio_path = 'audio_file.mp3'\n    melgram = preprocess_input(audio_path)\n    melgrams = np.expand_dims(melgram, axis=0)\n\n    preds = model.predict(melgrams)\n    print('Predicted:')\n    print(decode_predictions(preds))\n"
  },
  {
    "path": "resnet50.py",
    "content": "# -*- coding: utf-8 -*-\n'''ResNet50 model for Keras.\n\n# Reference:\n\n- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)\n\nAdapted from code contributed by BigMoyan.\n'''\nfrom __future__ import print_function\n\nimport numpy as np\nimport warnings\n\nfrom keras.layers import Input\nfrom keras import layers\nfrom keras.layers import Dense\nfrom keras.layers import Activation\nfrom keras.layers import Flatten\nfrom keras.layers import Conv2D\nfrom keras.layers import MaxPooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.layers import ZeroPadding2D\nfrom keras.layers import AveragePooling2D\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.layers import BatchNormalization\nfrom keras.models import Model\nfrom keras.preprocessing import image\nimport keras.backend as K\nfrom keras.utils import layer_utils\nfrom keras.utils.data_utils import get_file\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras.applications.imagenet_utils import preprocess_input\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.engine.topology import get_source_inputs\n\n\nWEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5'\nWEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'\n\n\ndef identity_block(input_tensor, kernel_size, filters, stage, block):\n    \"\"\"The identity block is the block that has no conv layer at shortcut.\n\n    # Arguments\n        input_tensor: input tensor\n        kernel_size: defualt 3, the kernel size of middle conv layer at main path\n        filters: list of integers, the filterss of 3 conv layer at main path\n        stage: integer, current stage label, used for generating layer names\n        block: 'a','b'..., current block label, used for generating 
layer names\n\n    # Returns\n        Output tensor for the block.\n    \"\"\"\n    filters1, filters2, filters3 = filters\n    if K.image_data_format() == 'channels_last':\n        bn_axis = 3\n    else:\n        bn_axis = 1\n    conv_name_base = 'res' + str(stage) + block + '_branch'\n    bn_name_base = 'bn' + str(stage) + block + '_branch'\n\n    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)\n    x = Activation('relu')(x)\n\n    x = Conv2D(filters2, kernel_size,\n               padding='same', name=conv_name_base + '2b')(x)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)\n    x = Activation('relu')(x)\n\n    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)\n\n    x = layers.add([x, input_tensor])\n    x = Activation('relu')(x)\n    return x\n\n\ndef conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):\n    \"\"\"conv_block is the block that has a conv layer at shortcut\n\n    # Arguments\n        input_tensor: input tensor\n        kernel_size: default 3, the kernel size of middle conv layer at main path\n        filters: list of integers, the filters of 3 conv layer at main path\n        stage: integer, current stage label, used for generating layer names\n        block: 'a','b'..., current block label, used for generating layer names\n\n    # Returns\n        Output tensor for the block.\n\n    Note that from stage 3, the first conv layer at main path is with strides=(2,2)\n    And the shortcut should have strides=(2,2) as well\n    \"\"\"\n    filters1, filters2, filters3 = filters\n    if K.image_data_format() == 'channels_last':\n        bn_axis = 3\n    else:\n        bn_axis = 1\n    conv_name_base = 'res' + str(stage) + block + '_branch'\n    bn_name_base = 'bn' + str(stage) + block + '_branch'\n\n    x = Conv2D(filters1, (1, 1), 
strides=strides,\n               name=conv_name_base + '2a')(input_tensor)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)\n    x = Activation('relu')(x)\n\n    x = Conv2D(filters2, kernel_size, padding='same',\n               name=conv_name_base + '2b')(x)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)\n    x = Activation('relu')(x)\n\n    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)\n    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)\n\n    shortcut = Conv2D(filters3, (1, 1), strides=strides,\n                      name=conv_name_base + '1')(input_tensor)\n    shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)\n\n    x = layers.add([x, shortcut])\n    x = Activation('relu')(x)\n    return x\n\n\ndef ResNet50(include_top=True, weights='imagenet',\n             input_tensor=None, input_shape=None,\n             pooling=None,\n             classes=1000):\n    \"\"\"Instantiates the ResNet50 architecture.\n\n    Optionally loads weights pre-trained\n    on ImageNet. Note that when using TensorFlow,\n    for best performance you should set\n    `image_data_format=\"channels_last\"` in your Keras config\n    at ~/.keras/keras.json.\n\n    The model and the weights are compatible with both\n    TensorFlow and Theano. The data format\n    convention used by the model is the one\n    specified in your Keras config file.\n\n    # Arguments\n        include_top: whether to include the fully-connected\n            layer at the top of the network.\n        weights: one of `None` (random initialization)\n            or \"imagenet\" (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. 
output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(224, 224, 3)` (with `channels_last` data format)\n            or `(3, 224, 224)` (with `channels_first` data format)).\n            It should have exactly 3 input channels,\n            and width and height should be no smaller than 197.\n            E.g. `(200, 200, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras model instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n    \"\"\"\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n\n    # Determine proper input shape\n    input_shape = 
_obtain_input_shape(input_shape,\n                                      default_size=224,\n                                      min_size=197,\n                                      data_format=K.image_data_format(),\n                                      include_top=include_top)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n    if K.image_data_format() == 'channels_last':\n        bn_axis = 3\n    else:\n        bn_axis = 1\n\n    x = ZeroPadding2D((3, 3))(img_input)\n    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)\n    x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)\n    x = Activation('relu')(x)\n    x = MaxPooling2D((3, 3), strides=(2, 2))(x)\n\n    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))\n    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')\n    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')\n\n    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')\n    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')\n    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')\n    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')\n\n    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')\n    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')\n    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')\n    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')\n    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')\n    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')\n\n    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')\n    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')\n    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')\n\n    x = 
AveragePooling2D((7, 7), name='avg_pool')(x)\n\n    if include_top:\n        x = Flatten()(x)\n        x = Dense(classes, activation='softmax', name='fc1000')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n    # Create model.\n    model = Model(inputs, x, name='resnet50')\n\n    # load weights\n    if weights == 'imagenet':\n        if include_top:\n            weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels.h5',\n                                    WEIGHTS_PATH,\n                                    cache_subdir='models',\n                                    md5_hash='a7b3fe01876f51b976af0dea6bc144eb')\n        else:\n            weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',\n                                    WEIGHTS_PATH_NO_TOP,\n                                    cache_subdir='models',\n                                    md5_hash='a268eb855778b3df3c7506639542a6af')\n        model.load_weights(weights_path)\n        if K.backend() == 'theano':\n            layer_utils.convert_all_kernels_in_model(model)\n\n        if K.image_data_format() == 'channels_first':\n            if include_top:\n                maxpool = model.get_layer(name='avg_pool')\n                shape = maxpool.output_shape[1:]\n                dense = model.get_layer(name='fc1000')\n                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')\n\n            if K.backend() == 'tensorflow':\n                warnings.warn('You are using the TensorFlow backend, yet you '\n                              'are using the Theano '\n                              'image data format 
convention '\n                              '(`image_data_format=\"channels_first\"`). '\n                              'For best performance, set '\n                              '`image_data_format=\"channels_last\"` in '\n                              'your Keras config '\n                              'at ~/.keras/keras.json.')\n    return model\n\n\nif __name__ == '__main__':\n    model = ResNet50(include_top=True, weights='imagenet')\n\n    img_path = 'elephant.jpg'\n    img = image.load_img(img_path, target_size=(224, 224))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n    x = preprocess_input(x)\n    print('Input image shape:', x.shape)\n\n    preds = model.predict(x)\n    print('Predicted:', decode_predictions(preds))\n"
  },
  {
    "path": "vgg16.py",
    "content": "# -*- coding: utf-8 -*-\n'''VGG16 model for Keras.\n\n# Reference:\n\n- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n\n'''\nfrom __future__ import print_function\n\nimport numpy as np\nimport warnings\n\nfrom keras.models import Model\nfrom keras.layers import Flatten\nfrom keras.layers import Dense\nfrom keras.layers import Input\nfrom keras.layers import Conv2D\nfrom keras.layers import MaxPooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.preprocessing import image\nfrom keras.utils import layer_utils\nfrom keras.utils.data_utils import get_file\nfrom keras import backend as K\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras.applications.imagenet_utils import preprocess_input\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.engine.topology import get_source_inputs\n\n\nWEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'\nWEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'\n\n\ndef VGG16(include_top=True, weights='imagenet',\n          input_tensor=None, input_shape=None,\n          pooling=None,\n          classes=1000):\n    \"\"\"Instantiates the VGG16 architecture.\n\n    Optionally loads weights pre-trained\n    on ImageNet. Note that when using TensorFlow,\n    for best performance you should set\n    `image_data_format=\"channels_last\"` in your Keras config\n    at ~/.keras/keras.json.\n\n    The model and the weights are compatible with both\n    TensorFlow and Theano. 
The data format\n    convention used by the model is the one\n    specified in your Keras config file.\n\n    # Arguments\n        include_top: whether to include the 3 fully-connected\n            layers at the top of the network.\n        weights: one of `None` (random initialization)\n            or \"imagenet\" (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(224, 224, 3)` (with `channels_last` data format)\n            or `(3, 224, 224)` (with `channels_first` data format)).\n            It should have exactly 3 input channels,\n            and width and height should be no smaller than 48.\n            E.g. `(200, 200, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras model instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n    \"\"\"\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         
'`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n    # Determine proper input shape\n    input_shape = _obtain_input_shape(input_shape,\n                                      default_size=224,\n                                      min_size=48,\n                                      data_format=K.image_data_format(),\n                                      include_top=include_top)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n    # Block 1\n    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)\n    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)\n\n    # Block 2\n    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)\n    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)\n\n    # Block 3\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)\n\n    # Block 4\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)\n    x = 
Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)\n\n    # Block 5\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)\n\n    if include_top:\n        # Classification block\n        x = Flatten(name='flatten')(x)\n        x = Dense(4096, activation='relu', name='fc1')(x)\n        x = Dense(4096, activation='relu', name='fc2')(x)\n        x = Dense(classes, activation='softmax', name='predictions')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n    # Create model.\n    model = Model(inputs, x, name='vgg16')\n\n    # load weights\n    if weights == 'imagenet':\n        if include_top:\n            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',\n                                    WEIGHTS_PATH,\n                                    cache_subdir='models')\n        else:\n            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',\n                                    WEIGHTS_PATH_NO_TOP,\n                                    cache_subdir='models')\n        model.load_weights(weights_path)\n        if K.backend() == 'theano':\n            layer_utils.convert_all_kernels_in_model(model)\n\n        if K.image_data_format() == 'channels_first':\n            if include_top:\n                maxpool = 
model.get_layer(name='block5_pool')\n                shape = maxpool.output_shape[1:]\n                dense = model.get_layer(name='fc1')\n                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')\n\n            if K.backend() == 'tensorflow':\n                warnings.warn('You are using the TensorFlow backend, yet you '\n                              'are using the Theano '\n                              'image data format convention '\n                              '(`image_data_format=\"channels_first\"`). '\n                              'For best performance, set '\n                              '`image_data_format=\"channels_last\"` in '\n                              'your Keras config '\n                              'at ~/.keras/keras.json.')\n    return model\n\n\nif __name__ == '__main__':\n    model = VGG16(include_top=True, weights='imagenet')\n\n    img_path = 'elephant.jpg'\n    img = image.load_img(img_path, target_size=(224, 224))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n    x = preprocess_input(x)\n    print('Input image shape:', x.shape)\n\n    preds = model.predict(x)\n    print('Predicted:', decode_predictions(preds))\n"
  },
  {
    "path": "vgg19.py",
    "content": "# -*- coding: utf-8 -*-\n'''VGG19 model for Keras.\n\n# Reference:\n\n- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n\n'''\nfrom __future__ import print_function\n\nimport numpy as np\nimport warnings\n\nfrom keras.models import Model\nfrom keras.layers import Flatten, Dense, Input\nfrom keras.layers import Conv2D\nfrom keras.layers import MaxPooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.preprocessing import image\nfrom keras.utils import layer_utils\nfrom keras.utils.data_utils import get_file\nfrom keras import backend as K\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras.applications.imagenet_utils import preprocess_input\nfrom keras.applications.imagenet_utils import _obtain_input_shape\nfrom keras.engine.topology import get_source_inputs\n\n\nWEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5'\nWEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5'\n\n\ndef VGG19(include_top=True, weights='imagenet',\n          input_tensor=None, input_shape=None,\n          pooling=None,\n          classes=1000):\n    \"\"\"Instantiates the VGG19 architecture.\n\n    Optionally loads weights pre-trained\n    on ImageNet. Note that when using TensorFlow,\n    for best performance you should set\n    `image_data_format=\"channels_last\"` in your Keras config\n    at ~/.keras/keras.json.\n\n    The model and the weights are compatible with both\n    TensorFlow and Theano. 
The data format\n    convention used by the model is the one\n    specified in your Keras config file.\n\n    # Arguments\n        include_top: whether to include the 3 fully-connected\n            layers at the top of the network.\n        weights: one of `None` (random initialization)\n            or \"imagenet\" (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(224, 224, 3)` (with `channels_last` data format)\n            or `(3, 224, 224)` (with `channels_first` data format)).\n            It should have exactly 3 input channels,\n            and width and height should be no smaller than 48.\n            E.g. `(200, 200, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras model instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n    \"\"\"\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         
'`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n    # Determine proper input shape\n    input_shape = _obtain_input_shape(input_shape,\n                                      default_size=224,\n                                      min_size=48,\n                                      data_format=K.image_data_format(),\n                                      include_top=include_top)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n    # Block 1\n    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)\n    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)\n\n    # Block 2\n    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)\n    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)\n\n    # Block 3\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)\n    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)\n\n    # Block 4\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)\n    x = 
Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)\n\n    # Block 5\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)\n    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(x)\n    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)\n\n    if include_top:\n        # Classification block\n        x = Flatten(name='flatten')(x)\n        x = Dense(4096, activation='relu', name='fc1')(x)\n        x = Dense(4096, activation='relu', name='fc2')(x)\n        x = Dense(classes, activation='softmax', name='predictions')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n    # Create model.\n    model = Model(inputs, x, name='vgg19')\n\n    # load weights\n    if weights == 'imagenet':\n        if include_top:\n            weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels.h5',\n                                    WEIGHTS_PATH,\n                                    cache_subdir='models')\n        else:\n            weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5',\n                                    WEIGHTS_PATH_NO_TOP,\n                                    cache_subdir='models')\n     
   model.load_weights(weights_path)\n        if K.backend() == 'theano':\n            layer_utils.convert_all_kernels_in_model(model)\n\n        if K.image_data_format() == 'channels_first':\n            if include_top:\n                maxpool = model.get_layer(name='block5_pool')\n                shape = maxpool.output_shape[1:]\n                dense = model.get_layer(name='fc1')\n                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')\n\n            if K.backend() == 'tensorflow':\n                warnings.warn('You are using the TensorFlow backend, yet you '\n                              'are using the Theano '\n                              'image data format convention '\n                              '(`image_data_format=\"channels_first\"`). '\n                              'For best performance, set '\n                              '`image_data_format=\"channels_last\"` in '\n                              'your Keras config '\n                              'at ~/.keras/keras.json.')\n    return model\n\n\nif __name__ == '__main__':\n    model = VGG19(include_top=True, weights='imagenet')\n\n    img_path = 'cat.jpg'\n    img = image.load_img(img_path, target_size=(224, 224))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n    x = preprocess_input(x)\n    print('Input image shape:', x.shape)\n\n    preds = model.predict(x)\n    print('Predicted:', decode_predictions(preds))\n"
  },
  {
    "path": "xception.py",
    "content": "# -*- coding: utf-8 -*-\n'''Xception V1 model for Keras.\n\nOn ImageNet, this model gets to a top-1 validation accuracy of 0.790\nand a top-5 validation accuracy of 0.945.\n\nDo note that the input image format for this model is different than for\nthe VGG16 and ResNet models (299x299 instead of 224x224),\nand that the input preprocessing function\nis also different (same as Inception V3).\n\nAlso do note that this model is only available for the TensorFlow backend,\ndue to its reliance on `SeparableConvolution` layers.\n\n# Reference:\n\n- [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/abs/1610.02357)\n\n'''\nfrom __future__ import print_function\nfrom __future__ import absolute_import\n\nimport warnings\nimport numpy as np\n\nfrom keras.preprocessing import image\n\nfrom keras.models import Model\nfrom keras import layers\nfrom keras.layers import Dense\nfrom keras.layers import Input\nfrom keras.layers import BatchNormalization\nfrom keras.layers import Activation\nfrom keras.layers import Conv2D\nfrom keras.layers import SeparableConv2D\nfrom keras.layers import MaxPooling2D\nfrom keras.layers import GlobalAveragePooling2D\nfrom keras.layers import GlobalMaxPooling2D\nfrom keras.engine.topology import get_source_inputs\nfrom keras.utils.data_utils import get_file\nfrom keras import backend as K\nfrom keras.applications.imagenet_utils import decode_predictions\nfrom keras.applications.imagenet_utils import _obtain_input_shape\n\n\nTF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels.h5'\nTF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels_notop.h5'\n\n\ndef Xception(include_top=True, weights='imagenet',\n             input_tensor=None, input_shape=None,\n             pooling=None,\n             classes=1000):\n    \"\"\"Instantiates the 
Xception architecture.\n\n    Optionally loads weights pre-trained\n    on ImageNet. This model is available for TensorFlow only,\n    and can only be used with inputs following the TensorFlow\n    data format `(width, height, channels)`.\n    You should set `image_data_format=\"channels_last\"` in your Keras config\n    located at ~/.keras/keras.json.\n\n    Note that the default input image size for this model is 299x299.\n\n    # Arguments\n        include_top: whether to include the fully-connected\n            layer at the top of the network.\n        weights: one of `None` (random initialization)\n            or \"imagenet\" (pre-training on ImageNet).\n        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n            to use as image input for the model.\n        input_shape: optional shape tuple, only to be specified\n            if `include_top` is False (otherwise the input shape\n            has to be `(299, 299, 3)`).\n            It should have exactly 3 input channels,\n            and width and height should be no smaller than 71.\n            E.g. 
`(150, 150, 3)` would be one valid value.\n        pooling: Optional pooling mode for feature extraction\n            when `include_top` is `False`.\n            - `None` means that the output of the model will be\n                the 4D tensor output of the\n                last convolutional layer.\n            - `avg` means that global average pooling\n                will be applied to the output of the\n                last convolutional layer, and thus\n                the output of the model will be a 2D tensor.\n            - `max` means that global max pooling will\n                be applied.\n        classes: optional number of classes to classify images\n            into, only to be specified if `include_top` is True, and\n            if no `weights` argument is specified.\n\n    # Returns\n        A Keras model instance.\n\n    # Raises\n        ValueError: in case of invalid argument for `weights`,\n            or invalid input shape.\n        RuntimeError: If attempting to run this model with a\n            backend that does not support separable convolutions.\n    \"\"\"\n    if weights not in {'imagenet', None}:\n        raise ValueError('The `weights` argument should be either '\n                         '`None` (random initialization) or `imagenet` '\n                         '(pre-training on ImageNet).')\n\n    if weights == 'imagenet' and include_top and classes != 1000:\n        raise ValueError('If using `weights` as imagenet with `include_top`'\n                         ' as true, `classes` should be 1000')\n\n    if K.backend() != 'tensorflow':\n        raise RuntimeError('The Xception model is only available with '\n                           'the TensorFlow backend.')\n    if K.image_data_format() != 'channels_last':\n        warnings.warn('The Xception model is only available for the '\n                      'input data format \"channels_last\" '\n                      '(width, height, channels). 
'\n                      'However your settings specify the default '\n                      'data format \"channels_first\" (channels, width, height). '\n                      'You should set `image_data_format=\"channels_last\"` in your Keras '\n                      'config located at ~/.keras/keras.json. '\n                      'The model being returned right now will expect inputs '\n                      'to follow the \"channels_last\" data format.')\n        K.set_image_data_format('channels_last')\n        old_data_format = 'channels_first'\n    else:\n        old_data_format = None\n\n    # Determine proper input shape\n    input_shape = _obtain_input_shape(input_shape,\n                                      default_size=299,\n                                      min_size=71,\n                                      data_format=K.image_data_format(),\n                                      include_top=include_top)\n\n    if input_tensor is None:\n        img_input = Input(shape=input_shape)\n    else:\n        if not K.is_keras_tensor(input_tensor):\n            img_input = Input(tensor=input_tensor, shape=input_shape)\n        else:\n            img_input = input_tensor\n\n    x = Conv2D(32, (3, 3), strides=(2, 2), use_bias=False, name='block1_conv1')(img_input)\n    x = BatchNormalization(name='block1_conv1_bn')(x)\n    x = Activation('relu', name='block1_conv1_act')(x)\n    x = Conv2D(64, (3, 3), use_bias=False, name='block1_conv2')(x)\n    x = BatchNormalization(name='block1_conv2_bn')(x)\n    x = Activation('relu', name='block1_conv2_act')(x)\n\n    residual = Conv2D(128, (1, 1), strides=(2, 2),\n                      padding='same', use_bias=False)(x)\n    residual = BatchNormalization()(residual)\n\n    x = SeparableConv2D(128, (3, 3), padding='same', use_bias=False, name='block2_sepconv1')(x)\n    x = BatchNormalization(name='block2_sepconv1_bn')(x)\n    x = Activation('relu', name='block2_sepconv2_act')(x)\n    x = SeparableConv2D(128, (3, 3), 
padding='same', use_bias=False, name='block2_sepconv2')(x)\n    x = BatchNormalization(name='block2_sepconv2_bn')(x)\n\n    x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block2_pool')(x)\n    x = layers.add([x, residual])\n\n    residual = Conv2D(256, (1, 1), strides=(2, 2),\n                      padding='same', use_bias=False)(x)\n    residual = BatchNormalization()(residual)\n\n    x = Activation('relu', name='block3_sepconv1_act')(x)\n    x = SeparableConv2D(256, (3, 3), padding='same', use_bias=False, name='block3_sepconv1')(x)\n    x = BatchNormalization(name='block3_sepconv1_bn')(x)\n    x = Activation('relu', name='block3_sepconv2_act')(x)\n    x = SeparableConv2D(256, (3, 3), padding='same', use_bias=False, name='block3_sepconv2')(x)\n    x = BatchNormalization(name='block3_sepconv2_bn')(x)\n\n    x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block3_pool')(x)\n    x = layers.add([x, residual])\n\n    residual = Conv2D(728, (1, 1), strides=(2, 2),\n                      padding='same', use_bias=False)(x)\n    residual = BatchNormalization()(residual)\n\n    x = Activation('relu', name='block4_sepconv1_act')(x)\n    x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block4_sepconv1')(x)\n    x = BatchNormalization(name='block4_sepconv1_bn')(x)\n    x = Activation('relu', name='block4_sepconv2_act')(x)\n    x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block4_sepconv2')(x)\n    x = BatchNormalization(name='block4_sepconv2_bn')(x)\n\n    x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block4_pool')(x)\n    x = layers.add([x, residual])\n\n    for i in range(8):\n        residual = x\n        prefix = 'block' + str(i + 5)\n\n        x = Activation('relu', name=prefix + '_sepconv1_act')(x)\n        x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv1')(x)\n        x = BatchNormalization(name=prefix + '_sepconv1_bn')(x)\n        x = 
Activation('relu', name=prefix + '_sepconv2_act')(x)\n        x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv2')(x)\n        x = BatchNormalization(name=prefix + '_sepconv2_bn')(x)\n        x = Activation('relu', name=prefix + '_sepconv3_act')(x)\n        x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name=prefix + '_sepconv3')(x)\n        x = BatchNormalization(name=prefix + '_sepconv3_bn')(x)\n\n        x = layers.add([x, residual])\n\n    residual = Conv2D(1024, (1, 1), strides=(2, 2),\n                      padding='same', use_bias=False)(x)\n    residual = BatchNormalization()(residual)\n\n    x = Activation('relu', name='block13_sepconv1_act')(x)\n    x = SeparableConv2D(728, (3, 3), padding='same', use_bias=False, name='block13_sepconv1')(x)\n    x = BatchNormalization(name='block13_sepconv1_bn')(x)\n    x = Activation('relu', name='block13_sepconv2_act')(x)\n    x = SeparableConv2D(1024, (3, 3), padding='same', use_bias=False, name='block13_sepconv2')(x)\n    x = BatchNormalization(name='block13_sepconv2_bn')(x)\n\n    x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='block13_pool')(x)\n    x = layers.add([x, residual])\n\n    x = SeparableConv2D(1536, (3, 3), padding='same', use_bias=False, name='block14_sepconv1')(x)\n    x = BatchNormalization(name='block14_sepconv1_bn')(x)\n    x = Activation('relu', name='block14_sepconv1_act')(x)\n\n    x = SeparableConv2D(2048, (3, 3), padding='same', use_bias=False, name='block14_sepconv2')(x)\n    x = BatchNormalization(name='block14_sepconv2_bn')(x)\n    x = Activation('relu', name='block14_sepconv2_act')(x)\n\n    if include_top:\n        x = GlobalAveragePooling2D(name='avg_pool')(x)\n        x = Dense(classes, activation='softmax', name='predictions')(x)\n    else:\n        if pooling == 'avg':\n            x = GlobalAveragePooling2D()(x)\n        elif pooling == 'max':\n            x = GlobalMaxPooling2D()(x)\n\n    # Ensure that the model 
takes into account\n    # any potential predecessors of `input_tensor`.\n    if input_tensor is not None:\n        inputs = get_source_inputs(input_tensor)\n    else:\n        inputs = img_input\n    # Create model.\n    model = Model(inputs, x, name='xception')\n\n    # load weights\n    if weights == 'imagenet':\n        if include_top:\n            weights_path = get_file('xception_weights_tf_dim_ordering_tf_kernels.h5',\n                                    TF_WEIGHTS_PATH,\n                                    cache_subdir='models')\n        else:\n            weights_path = get_file('xception_weights_tf_dim_ordering_tf_kernels_notop.h5',\n                                    TF_WEIGHTS_PATH_NO_TOP,\n                                    cache_subdir='models')\n        model.load_weights(weights_path)\n\n    if old_data_format:\n        K.set_image_data_format(old_data_format)\n    return model\n\n\ndef preprocess_input(x):\n    x /= 255.\n    x -= 0.5\n    x *= 2.\n    return x\n\n\nif __name__ == '__main__':\n    model = Xception(include_top=True, weights='imagenet')\n\n    img_path = 'elephant.jpg'\n    img = image.load_img(img_path, target_size=(299, 299))\n    x = image.img_to_array(img)\n    x = np.expand_dims(x, axis=0)\n    x = preprocess_input(x)\n    print('Input image shape:', x.shape)\n\n    preds = model.predict(x)\n    print(np.argmax(preds))\n    print('Predicted:', decode_predictions(preds, 1))\n"
  }
]