[
  {
    "path": ".gitignore",
    "content": "# Ignore VSCode configurations\n.vscode\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n.pytest_cache/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n\n# idea pycharm data\n.idea/\n\n# cython build result\nbuild/\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 Zhou Xuebin\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "Makefile",
    "content": "all:\n\tpython setup.py build_ext --inplace\n\trm -rf build\nclean:\n\trm -rf */*.pyc\n\trm -rf */*.so\n"
  },
  {
    "path": "README.md",
    "content": "# Anime-Face-Detector\nA Faster-RCNN based anime face detector.\n\nThis detector is trained on 6000 training samples and 641 testing samples, randomly selected from the dataset which is crawled from top 100 [pixiv daily ranking](https://www.pixiv.net/ranking.php?mode=daily).  \n\nThanks to [OpenCV based Anime face detector](https://github.com/nagadomi/lbpcascade_animeface) written by nagadomi, which helps labelling the data. \n\nThe original implementation of Faster-RCNN using Tensorflow can be found [here](https://github.com/endernewton/tf-faster-rcnn)\n\n## Dependencies\n- Python >= 3.6\n- `tensorflow` latest 1.x or 2.x\n- `opencv-python` (Will use other packages like `pillow` and `scikit-image` as backend in future version)\n- `cython` (optional, can be ignored with additional `-nms-type PY_NMS` argument)\n- Pre-trained ResNet101 model\n\n## Usage\n1. Clone this repository\n    ```bash\n    git clone https://github.com/qhgz2013/anime-face-detector.git\n    ```\n2. Download the pre-trained model  \n    Google Drive: [here](https://drive.google.com/open?id=1WjBgfOUqp4sdRd9BHs4TkdH2EcBtV5ri)    \n    Baidu Netdisk: [here](https://pan.baidu.com/s/1bvpCp1sbD7t9qnta8IhpmA)  \n3. Unzip the model file into `model` directory\n4. Build the CPU NMS model (skip this step if use PY_NMS with argument: `-nms-type PY_NMS`)\n    ```bash\n    make clean\n    make\n    ```\n   If using Windows Power Shell, type `cmd /C make.bat` to run build script.\n5. Run the demo as you want\n    - Visualize the result (without output path):\n        ```bash\n        python main.py -i /path/to/image.jpg\n        ```\n    - Save results to a json file\n        ```bash\n        python main.py -i /path/to/image.jpg -o /path/to/output.json\n        ```\n        Format: `{\"image_path\": [{\"score\": predicted_probability, \"bbox\": [min_x, min_y, max_x, max_y]}, ...], ...}`\n        Sample output file:\n        ```json\n        {\"/path/to/image.jpg\": [{\"score\": 0.9999708, \"bbox\": [551.3375, 314.50253, 729.2599, 485.25674]}]}\n        ```\n    - Detecting a whole directory with recursion\n        ```bash\n        python main.py -i /path/to/dir -o /path/to/output.json\n        ```\n    - Customize threshold\n        ```bash\n        python main.py -i /path/to/image.jpg -nms 0.3 -conf 0.8\n        ```\n    - Customize model path\n        ```bash\n        python main.py -i /path/to/image.jpg -model /path/to/model.ckpt\n        ```\n    - Customize nms type (supports CPU_NMS and PY_NMS, not supports GPU_NMS because of the complicated build process for Windows platform)\n        ```bash\n        python main.py -i /path/to/image.jpg -nms-type PY_NMS\n        ```\n    - Crop detected images and store them in a folder (start output is an integer to start naming the cropped images, default is 0)\n        ```bash\n        python main.py -i /path/to/image/or/folder -crop-location /path/to/store/cropped/images -start-output 1\n        ```\n    - Crop detected images and resizes them\n        ```bash\n        python main.py -i /path/to/image/or/folder -crop-location /path/to/store/cropped/images -crop-height 224 -crop-width 224\n        ```\n\n## Results\n**Mean AP for this model: 0.9086**\n\n![](./asset/sample1.png)\nCopyright info: [東方まとめ](https://www.pixiv.net/member_illust.php?mode=medium&illust_id=54275439) by [羽々斬](https://www.pixiv.net/member.php?id=2179695)\n\n![](./asset/sample2.png)\nCopyright info: [【C94】桜と刀](https://www.pixiv.net/member_illust.php?mode=medium&illust_id=69797346) by 
[幻像黒兎](https://www.pixiv.net/member.php?id=4462245)\n\n![](./asset/sample3.png)\nCopyright info: [アイドルマスター　シンデレラガールズ](https://www.pixiv.net/member_illust.php?mode=medium&illust_id=69753772) by [我美蘭＠１日目 東A-40a](https://www.pixiv.net/member.php?id=2003931)\n\n## About training\n\nThis model is directly trained by [Faster-RCNN](https://github.com/endernewton/tf-faster-rcnn), with following argument:\n```bash\npython tools/trainval_net.py --weight data/imagenet_weights/res101.ckpt --imdb voc_2007_trainval --imdbval voc_2007_test --iters 60000 --cfg experiments/cfgs/res101.yml --net res101 --set ANCHOR_SCALES \"[4,8,16,32]\" ANCHOR_RATIOS \"[1]\" TRAIN.STEPSIZE \"[50000]\"\n```\n\n## Dataset\n\nWe've uploaded the dataset to Google drive [here](https://drive.google.com/open?id=1nDPimhiwbAWc2diok-6davhubNVe82pr), dataset structure is similar to VOC2007 (used in original Faster-RCNN implementation).\n\n## Citation and declaration\n\nFeel free to cite this repo and dataset.  \nThis work is not related to my research team and lab, just my personal interest.\n"
  },
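  {
    "path": "examples/parse_detections_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): reads the JSON produced by `python main.py -i <dir> -o output.json`\nand crops every detected face above a score threshold, following the output\nformat documented in the README:\n{\"image_path\": [{\"score\": s, \"bbox\": [x1, y1, x2, y2]}, ...], ...}.\nThe paths and threshold below are example assumptions.\n'''\nimport json\nimport os\n\nimport cv2\n\nOUTPUT_JSON = 'output.json'   # produced by main.py -o\nCROP_DIR = 'crops'            # where face crops are written\nMIN_SCORE = 0.9               # keep only confident detections\n\nos.makedirs(CROP_DIR, exist_ok=True)\nwith open(OUTPUT_JSON) as f:\n    detections = json.load(f)\n\nn = 0\nfor image_path, faces in detections.items():\n    img = cv2.imread(image_path)\n    if img is None:\n        continue\n    for face in faces:\n        if face['score'] < MIN_SCORE:\n            continue\n        # bbox is [min_x, min_y, max_x, max_y] in original image coordinates\n        x1, y1, x2, y2 = (int(v) for v in face['bbox'])\n        cv2.imwrite(os.path.join(CROP_DIR, '%d.jpg' % n), img[y1:y2, x1:x2])\n        n += 1\nprint('saved %d face crops' % n)\n"
  },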
  {
    "path": "_tf_compat_import.py",
    "content": "__all__ = ['compat_tensorflow']\n\ndef _compat_tf_import(enable_gpu: bool = True):\n    if not enable_gpu:\n        import os\n        os.environ['CUDA_VISIBLE_DEVICES'] = '-1'\n    import tensorflow as tf\n    try:\n        tf_v1 = tf.compat.v1\n        tf_v1.disable_v2_behavior()\n        return tf_v1\n    except ImportError:\n        return tf\n\ncompat_tensorflow = _compat_tf_import()\n"
  },
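  {
    "path": "examples/compat_import_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): how the rest of the code base consumes the compat shim. Importing\n`compat_tensorflow` yields a graph-mode (v1-style) API on both TensorFlow 1.x\nand 2.x, so placeholders and sessions keep working unchanged.\n'''\nfrom _tf_compat_import import compat_tensorflow as tf\n\nx = tf.placeholder(tf.float32, shape=[None])\ny = x * 2.0\nwith tf.Session() as sess:\n    print(sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))  # [2. 4. 6.]\n"
  },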
  {
    "path": "faster_rcnn_wrapper.py",
    "content": "from _tf_compat_import import compat_tensorflow as tf\nfrom tf_contrib.resnet_v1 import resnet_v1_block, resnet_v1\nimport tf_contrib.slim as slim\nfrom tf_contrib.resnet_utils import arg_scope, conv2d_same\nimport numpy as np\n\n\nclass FasterRCNNSlim:\n\n    def __init__(self):\n        self._blocks = [resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),\n                        resnet_v1_block('block2', base_depth=128, num_units=4, stride=2),\n                        resnet_v1_block('block3', base_depth=256, num_units=23, stride=1),\n                        resnet_v1_block('block4', base_depth=512, num_units=3, stride=1)]\n        self._image = tf.placeholder(tf.float32, shape=[1, None, None, 3])\n        self._im_info = tf.placeholder(tf.float32, shape=[3])\n\n        self._anchor_scales = [4, 8, 16, 32]\n        self._num_scales = len(self._anchor_scales)\n\n        self._anchor_ratios = [1]\n        self._num_ratios = len(self._anchor_ratios)\n\n        self._num_anchors = self._num_scales * self._num_ratios\n        self._scope = 'resnet_v1_101'\n\n        with arg_scope([slim.conv2d, slim.conv2d_in_plane, slim.conv2d_transpose, slim.separable_conv2d,\n                        slim.fully_connected],\n                       weights_regularizer=slim.l2_regularizer(0.0001),\n                       biases_regularizer=tf.no_regularizer,\n                       biases_initializer=tf.constant_initializer(0.0)):\n            # in _build_network\n            initializer = tf.random_normal_initializer(stddev=0.01)\n            initializer_bbox = tf.random_normal_initializer(stddev=0.001)\n            # in _image_to_head\n            with slim.arg_scope(self._resnet_arg_scope()):\n                # in _build_base\n                with tf.variable_scope(self._scope, self._scope):\n                    net_conv = conv2d_same(self._image, 64, 7, stride=2, scope='conv1')\n                    net_conv = tf.pad(net_conv, [[0, 0], [1, 1], [1, 1], [0, 0]])\n                    net_conv = slim.max_pool2d(net_conv, [3, 3], stride=2, padding='VALID', scope='pool1')\n                net_conv, _ = resnet_v1(net_conv, self._blocks[:-1], global_pool=False, include_root_block=False,\n                                        scope=self._scope)\n            with tf.variable_scope(self._scope, self._scope):\n                # in _anchor_component\n                with tf.variable_scope('ANCHOR-default'):\n                    height = tf.cast(tf.ceil(self._im_info[0] / 16.0), dtype=tf.int32)\n                    width = tf.cast(tf.ceil(self._im_info[1] / 16.0), dtype=tf.int32)\n\n                    shift_x = tf.range(width) * 16\n                    shift_y = tf.range(height) * 16\n                    shift_x, shift_y = tf.meshgrid(shift_x, shift_y)\n                    sx = tf.reshape(shift_x, [-1])\n                    sy = tf.reshape(shift_y, [-1])\n                    shifts = tf.transpose(tf.stack([sx, sy, sx, sy]))\n                    k = width * height\n                    shifts = tf.transpose(tf.reshape(shifts, [1, k, 4]), perm=[1, 0, 2])\n\n                    anchors = np.array([[-24, -24, 39, 39], [-56, -56, 71, 71],\n                                        [-120, -120, 135, 135], [-248, -248, 263, 263]], dtype=np.int32)\n\n                    a = anchors.shape[0]\n                    anchor_constant = tf.constant(anchors.reshape([1, a, 4]), dtype=tf.int32)\n                    length = k * a\n                    anchors_tf = tf.reshape(anchor_constant + shifts, 
shape=[length, 4])\n                    anchors = tf.cast(anchors_tf, dtype=tf.float32)\n                    self._anchors = anchors\n                    self._anchor_length = length\n\n                # in _region_proposal\n                rpn = slim.conv2d(net_conv, 512, [3, 3], trainable=False, weights_initializer=initializer,\n                                  scope='rpn_conv/3x3')\n                rpn_cls_score = slim.conv2d(rpn, self._num_anchors * 2, [1, 1], trainable=False,\n                                            weights_initializer=initializer, padding='VALID', activation_fn=None,\n                                            scope='rpn_cls_score')\n                rpn_cls_score_reshape = self._reshape(rpn_cls_score, 2, 'rpn_cls_score_reshape')\n                rpn_cls_prob_reshape = self._softmax(rpn_cls_score_reshape, 'rpn_cls_prob_reshape')\n                # rpn_cls_pred = tf.argmax(tf.reshape(rpn_cls_score_reshape, [-1, 2]), axis=1, name='rpn_cls_pred')\n                rpn_cls_prob = self._reshape(rpn_cls_prob_reshape, self._num_anchors * 2, 'rpn_cls_prob')\n                rpn_bbox_pred = slim.conv2d(rpn, self._num_anchors * 4, [1, 1], trainable=False,\n                                            weights_initializer=initializer, padding='VALID', activation_fn=None,\n                                            scope='rpn_bbox_pred')\n\n                # in _proposal_layer\n                with tf.variable_scope('rois'):\n                    post_nms_topn = 300\n                    nms_thresh = 0.7\n                    scores = rpn_cls_prob[:, :, :, self._num_anchors:]\n                    scores = tf.reshape(scores, [-1])\n                    rpn_bbox_pred = tf.reshape(rpn_bbox_pred, [-1, 4])\n\n                    boxes = tf.cast(self._anchors, rpn_bbox_pred.dtype)\n                    widths = boxes[:, 2] - boxes[:, 0] + 1.0\n                    heights = boxes[:, 3] - boxes[:, 1] + 1.0\n                    ctr_x = boxes[:, 0] + widths * 0.5\n                    ctr_y = boxes[:, 1] + heights * 0.5\n\n                    dx = rpn_bbox_pred[:, 0]\n                    dy = rpn_bbox_pred[:, 1]\n                    dw = rpn_bbox_pred[:, 2]\n                    dh = rpn_bbox_pred[:, 3]\n\n                    pred_ctr_x = dx * widths + ctr_x\n                    pred_ctr_y = dy * heights + ctr_y\n                    pred_w = tf.exp(dw) * widths\n                    pred_h = tf.exp(dh) * heights\n\n                    pred_boxes0 = pred_ctr_x - pred_w * 0.5\n                    pred_boxes1 = pred_ctr_y - pred_h * 0.5\n                    pred_boxes2 = pred_ctr_x + pred_w * 0.5\n                    pred_boxes3 = pred_ctr_y + pred_h * 0.5\n\n                    b0 = tf.clip_by_value(pred_boxes0, 0, self._im_info[1] - 1)\n                    b1 = tf.clip_by_value(pred_boxes1, 0, self._im_info[0] - 1)\n                    b2 = tf.clip_by_value(pred_boxes2, 0, self._im_info[1] - 1)\n                    b3 = tf.clip_by_value(pred_boxes3, 0, self._im_info[0] - 1)\n\n                    proposals = tf.stack([b0, b1, b2, b3], axis=1)\n                    indices = tf.image.non_max_suppression(proposals, scores, max_output_size=post_nms_topn,\n                                                           iou_threshold=nms_thresh)\n                    boxes = tf.cast(tf.gather(proposals, indices), dtype=tf.float32)\n                    # rpn_scores = tf.reshape(tf.gather(scores, indices), [-1, 1])\n\n                    batch_inds = tf.zeros([tf.shape(indices)[0], 1], dtype=tf.float32)\n    
                rois = tf.concat([batch_inds, boxes], 1)\n\n                # in _crop_pool_layer\n                with tf.variable_scope('pool5'):\n                    batch_ids = tf.squeeze(tf.slice(rois, [0, 0], [-1, 1], name='bath_id'), [1])\n                    bottom_shape = tf.shape(net_conv)\n                    height = (tf.cast(bottom_shape[1], dtype=tf.float32) - 1) * 16.0\n                    width = (tf.cast(bottom_shape[2], dtype=tf.float32) - 1) * 16.0\n                    x1 = tf.slice(rois, [0, 1], [-1, 1], name='x1') / width\n                    y1 = tf.slice(rois, [0, 2], [-1, 1], name='y1') / height\n                    x2 = tf.slice(rois, [0, 3], [-1, 1], name='x2') / width\n                    y2 = tf.slice(rois, [0, 4], [-1, 1], name='y2') / height\n                    bboxes = tf.stop_gradient(tf.concat([y1, x1, y2, x2], 1))\n                    pool5 = tf.image.crop_and_resize(net_conv, bboxes, tf.cast(batch_ids, dtype=tf.int32), [7, 7], \n                                                     name='crops')\n            # in _head_to_tail\n            with slim.arg_scope(self._resnet_arg_scope()):\n                fc7, _ = resnet_v1(pool5, self._blocks[-1:], global_pool=False, include_root_block=False,\n                                   scope=self._scope)\n                fc7 = tf.reduce_mean(fc7, axis=[1, 2])\n            with tf.variable_scope(self._scope, self._scope):\n                # in _region_classification\n                cls_score = slim.fully_connected(fc7, 2, weights_initializer=initializer, trainable=False,\n                                                 activation_fn=None, scope='cls_score')\n                cls_prob = self._softmax(cls_score, 'cls_prob')\n                # cls_pred = tf.argmax(cls_score, 'cls_pred')\n                bbox_pred = slim.fully_connected(fc7, 2*4, weights_initializer=initializer_bbox, trainable=False,\n                                                 activation_fn=None, scope='bbox_pred')\n        self._cls_score = cls_score\n        self._cls_prob = cls_prob\n        self._bbox_pred = bbox_pred\n        self._rois = rois\n\n        stds = np.tile(np.array([0.1, 0.1, 0.2, 0.2]), 2)\n        means = np.tile(np.array([0.0, 0.0, 0.0, 0.0]), 2)\n        self._bbox_pred *= stds\n        self._bbox_pred += means\n\n    @staticmethod\n    def _resnet_arg_scope():\n        batch_norm_params = {\n            'is_training': False,\n            'decay': 0.997,\n            'epsilon': 1e-5,\n            'scale': True,\n            'trainable': False,\n            'updates_collections': tf.GraphKeys.UPDATE_OPS\n        }\n        with arg_scope([slim.conv2d],\n                       weights_regularizer=slim.l2_regularizer(0.0001),\n                       weights_initializer=slim.variance_scaling_initializer(),\n                       trainable=False,\n                       activation_fn=tf.nn.relu,\n                       normalizer_fn=slim.batch_norm,\n                       normalizer_params=batch_norm_params):\n            with arg_scope([slim.batch_norm], **batch_norm_params) as arg_sc:\n                return arg_sc\n\n    @staticmethod\n    def _reshape(bottom, num_dim, name):\n        input_shape = tf.shape(bottom)\n        with tf.variable_scope(name):\n            to_caffe = tf.transpose(bottom, [0, 3, 1, 2])\n            reshaped = tf.reshape(to_caffe, [1, num_dim, -1, input_shape[2]])\n            to_tf = tf.transpose(reshaped, [0, 2, 3, 1])\n        return to_tf\n\n    @staticmethod\n    def _softmax(bottom, name):\n    
    if name.startswith('rpn_cls_prob_reshape'):\n            input_shape = tf.shape(bottom)\n            bottom_reshaped = tf.reshape(bottom, [-1, input_shape[-1]])\n            reshaped_score = tf.nn.softmax(bottom_reshaped, name=name)\n            return tf.reshape(reshaped_score, input_shape)\n        return tf.nn.softmax(bottom, name=name)\n\n    def test_image(self, sess, image, im_info):\n        return sess.run([self._cls_score, self._cls_prob, self._bbox_pred, self._rois], feed_dict={\n            self._image: image,\n            self._im_info: im_info\n        })\n"
  },
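  {
    "path": "examples/anchor_grid_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): the anchor grid built inside FasterRCNNSlim, reproduced in NumPy.\nThe four hard-coded base anchors correspond to ANCHOR_SCALES [4, 8, 16, 32]\nwith ratio 1 on a stride-16 feature map; one shifted copy of each base anchor\nis placed at every feature-map cell, giving K*A anchors for a K-cell map.\n'''\nimport numpy as np\n\ndef make_anchors(im_height, im_width, stride=16):\n    base_anchors = np.array([[-24, -24, 39, 39], [-56, -56, 71, 71],\n                             [-120, -120, 135, 135], [-248, -248, 263, 263]])\n    fh = int(np.ceil(im_height / stride))\n    fw = int(np.ceil(im_width / stride))\n    shift_x, shift_y = np.meshgrid(np.arange(fw) * stride, np.arange(fh) * stride)\n    # one (x1, y1, x2, y2) shift per feature-map cell\n    shifts = np.stack([shift_x.ravel(), shift_y.ravel(),\n                       shift_x.ravel(), shift_y.ravel()], axis=1)\n    # broadcast: (K, 1, 4) + (1, A, 4) -> (K, A, 4)\n    anchors = shifts[:, np.newaxis, :] + base_anchors[np.newaxis, :, :]\n    return anchors.reshape(-1, 4).astype(np.float32)\n\nprint(make_anchors(600, 800).shape)  # (38 * 50 * 4, 4) = (7600, 4)\n"
  },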
  {
    "path": "main.py",
    "content": "import numpy as np\nimport cv2\nfrom faster_rcnn_wrapper import FasterRCNNSlim\nfrom _tf_compat_import import compat_tensorflow as tf\nimport argparse\nimport os\nimport json\nimport time\nfrom nms_wrapper import NMSType, NMSWrapper\n\n\ndef detect(sess, rcnn_cls, image):\n    # pre-processing image for Faster-RCNN\n    img_origin = image.astype(np.float32, copy=True)\n    img_origin -= np.array([[[102.9801, 115.9465, 112.7717]]])\n\n    img_shape = img_origin.shape\n    img_size_min = np.min(img_shape[:2])\n    img_size_max = np.max(img_shape[:2])\n\n    img_scale = 600 / img_size_min\n    if np.round(img_scale * img_size_max) > 1000:\n        img_scale = 1000 / img_size_max\n    img = cv2.resize(img_origin, None, None, img_scale, img_scale, cv2.INTER_LINEAR)\n    img_info = np.array([img.shape[0], img.shape[1], img_scale], dtype=np.float32)\n    img = np.expand_dims(img, 0)\n\n    # test image\n    _, scores, bbox_pred, rois = rcnn_cls.test_image(sess, img, img_info)\n\n    # bbox transform\n    boxes = rois[:, 1:] / img_scale\n\n    boxes = boxes.astype(bbox_pred.dtype, copy=False)\n    widths = boxes[:, 2] - boxes[:, 0] + 1\n    heights = boxes[:, 3] - boxes[:, 1] + 1\n    ctr_x = boxes[:, 0] + 0.5 * widths\n    ctr_y = boxes[:, 1] + 0.5 * heights\n    dx = bbox_pred[:, 0::4]\n    dy = bbox_pred[:, 1::4]\n    dw = bbox_pred[:, 2::4]\n    dh = bbox_pred[:, 3::4]\n    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]\n    pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]\n    pred_w = np.exp(dw) * widths[:, np.newaxis]\n    pred_h = np.exp(dh) * heights[:, np.newaxis]\n    pred_boxes = np.zeros_like(bbox_pred, dtype=bbox_pred.dtype)\n    pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w\n    pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h\n    pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w\n    pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h\n    # clipping edge\n    pred_boxes[:, 0::4] = np.maximum(pred_boxes[:, 0::4], 0)\n    pred_boxes[:, 1::4] = np.maximum(pred_boxes[:, 1::4], 0)\n    pred_boxes[:, 2::4] = np.minimum(pred_boxes[:, 2::4], img_shape[1] - 1)\n    pred_boxes[:, 3::4] = np.minimum(pred_boxes[:, 3::4], img_shape[0] - 1)\n    return scores, pred_boxes\n\n\ndef load_file_from_dir(dir_path):\n    ret = []\n    for file in os.listdir(dir_path):\n        path_comb = os.path.join(dir_path, file)\n        if os.path.isdir(path_comb):\n            ret += load_file_from_dir(path_comb)\n        else:\n            ret.append(path_comb)\n    return ret\n\n\ndef fmt_time(dtime):\n    if dtime <= 0:\n        return '0:00.000'\n    elif dtime < 60:\n        return '0:%02d.%03d' % (int(dtime), int(dtime * 1000) % 1000)\n    elif dtime < 3600:\n        return '%d:%02d.%03d' % (int(dtime / 60), int(dtime) % 60, int(dtime * 1000) % 1000)\n    else:\n        return '%d:%02d:%02d.%03d' % (int(dtime / 3600), int((dtime % 3600) / 60), int(dtime) % 60,\n                                      int(dtime * 1000) % 1000)\n\n\ndef main():\n    parser = argparse.ArgumentParser(description='Anime face detector demo')\n    parser.add_argument('-i', help='The input path of an image or directory', required=True, dest='input', type=str)\n    parser.add_argument('-o', help='The output json path of the detection result', dest='output')\n    parser.add_argument('-nms', help='Change the threshold for non maximum suppression',\n                        dest='nms_thresh', default=0.3, type=float)\n    parser.add_argument('-conf', help='Change the threshold for class 
regression', dest='conf_thresh',\n                        default=0.8, type=float)\n    parser.add_argument('-model', help='Specify a new path for model', dest='model', type=str,\n                        default='model/res101_faster_rcnn_iter_60000.ckpt')\n    parser.add_argument('-nms-type', help='Type of nms', choices=['PY_NMS', 'CPU_NMS', 'GPU_NMS'], dest='nms_type',\n                        default='CPU_NMS')\n    parser.add_argument('-crop-location', help='The output folder to place the cropped images', dest='crop_output_image_location')\n    parser.add_argument('-start-output', help='Start the numbering of the cropped images filename', dest='start_output_number', \n                        default=0, type=int)\n    parser.add_argument('-crop-width', help='The width of images to crop', dest='crop_width', type=int)\n    parser.add_argument('-crop-height', help='The height of images to crop', dest='crop_height', type=int)\n\n    args = parser.parse_args()\n\n    assert os.path.exists(args.input), 'The input path does not exists'\n\n    if os.path.isdir(args.input):\n        files = load_file_from_dir(args.input)\n    else:\n        files = [args.input]\n    file_len = len(files)\n\n    if args.nms_type == 'PY_NMS':\n        nms_type = NMSType.PY_NMS\n    elif args.nms_type == 'CPU_NMS':\n        nms_type = NMSType.CPU_NMS\n    elif args.nms_type == 'GPU_NMS':\n        nms_type = NMSType.GPU_NMS\n    else:\n        raise ValueError('Incorrect NMS Type, not supported yet')\n\n    nms = NMSWrapper(nms_type)\n\n    cfg = tf.ConfigProto()\n    cfg.gpu_options.allow_growth = True\n    sess = tf.Session(config=cfg)\n\n    net = FasterRCNNSlim()\n    saver = tf.train.Saver()\n\n    saver.restore(sess, args.model)\n\n    result = {}\n\n    time_start = time.time()\n\n    for idx, file in enumerate(files):\n        elapsed = time.time() - time_start\n        eta = (file_len - idx) * elapsed / idx if idx > 0 else 0\n        print('[%d/%d] Elapsed: %s, ETA: %s >> %s' % (idx+1, file_len, fmt_time(elapsed), fmt_time(eta), file))\n        img = cv2.imread(file)\n        if img is None:\n            continue\n        scores, boxes = detect(sess, net, img)\n        boxes = boxes[:, 4:8]\n        scores = scores[:, 1]\n        keep = nms(np.hstack([boxes, scores[:, np.newaxis]]).astype(np.float32), args.nms_thresh)\n        boxes = boxes[keep, :]\n        scores = scores[keep]\n        inds = np.where(scores >= args.conf_thresh)[0]\n        scores = scores[inds]\n        boxes = boxes[inds, :]\n\n        result[file] = []\n        for i in range(scores.shape[0]):\n            x1, y1, x2, y2 = boxes[i, :].tolist()\n            new_result = {'score': float(scores[i]),\n                          'bbox': [x1, y1, x2, y2]}\n            result[file].append(new_result)\n\n            if args.output is None and args.crop_output_image_location is None:\n                cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)\n\n            if args.crop_output_image_location:\n                cropped_image = img[int(y1):int(y2), int(x1):int(x2)]\n\n                if args.crop_width and args.crop_height:\n                    cropped_image = cv2.resize(cropped_image, \n                                              (args.crop_width, args.crop_height), \n                                              interpolation = cv2.INTER_AREA)\n\n                cv2.imwrite(args.crop_output_image_location + str(args.start_output_number) + \".jpg\", cropped_image)\n                args.start_output_number += 1\n\n 
       if args.output:\n            if ((idx+1) % 1000) == 0:\n                # saving the temporary result\n                with open(args.output, 'w') as f:\n                    json.dump(result, f)\n        elif args.crop_output_image_location is None:\n            cv2.imshow(file, img)\n\n    if args.output:\n        with open(args.output, 'w') as f:\n            json.dump(result, f)\n    else:\n        cv2.waitKey()\n\n\nif __name__ == '__main__':\n    main()\n"
  },
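  {
    "path": "examples/rescale_rule_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): the image rescaling rule used in detect() in main.py. The shorter\nside is scaled to 600 pixels unless that would push the longer side past 1000,\nin which case the longer side is capped at 1000 instead.\n'''\ndef rescale_factor(height, width, target=600, max_size=1000):\n    short, long = min(height, width), max(height, width)\n    scale = target / short\n    if round(scale * long) > max_size:\n        scale = max_size / long\n    return scale\n\nprint(rescale_factor(480, 640))    # 1.25   (shorter side -> 600, longer side 800)\nprint(rescale_factor(600, 1600))   # 0.625  (longer side capped at 1000)\n"
  },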
  {
    "path": "make.bat",
    "content": "@echo off\nif /i \"%1\" == \"clean\" goto clean\ngoto all\n\n:all\npython setup.py build_ext --inplace\nrd /s /q build\n\ngoto exit\n\n\n\n:clean\ndel /f /s /q *.cpp\ndel /f /s /q *.c\ndel /f /s /q *.pyd\n\ngoto exit\n\n:exit\n"
  },
  {
    "path": "model/.gitignore",
    "content": "# all pre-trained models\n*.index\n*.data-00000-of-00001\n*.meta\n*.pkl\n"
  },
  {
    "path": "nms/.gitignore",
    "content": "*.c\n*.cpp\n"
  },
  {
    "path": "nms/__init__.py",
    "content": ""
  },
  {
    "path": "nms/cpu_nms.pyx",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\ncimport numpy as np\n\ncdef inline np.float32_t max(np.float32_t a, np.float32_t b):\n    return a if a >= b else b\n\ncdef inline np.float32_t min(np.float32_t a, np.float32_t b):\n    return a if a <= b else b\n\ndef cpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):\n    cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]\n    cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]\n    cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]\n    cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]\n    cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]\n\n    cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)\n    cdef np.ndarray[np.int64_t, ndim=1] order = scores.argsort()[::-1]\n\n    cdef int ndets = dets.shape[0]\n    cdef np.ndarray[np.int_t, ndim=1] suppressed = \\\n            np.zeros((ndets), dtype=np.int)\n\n    # nominal indices\n    cdef int _i, _j\n    # sorted indices\n    cdef int i, j\n    # temp variables for box i's (the box currently under consideration)\n    cdef np.float32_t ix1, iy1, ix2, iy2, iarea\n    # variables for computing overlap with box j (lower scoring box)\n    cdef np.float32_t xx1, yy1, xx2, yy2\n    cdef np.float32_t w, h\n    cdef np.float32_t inter, ovr\n\n    keep = []\n    for _i in range(ndets):\n        i = order[_i]\n        if suppressed[i] == 1:\n            continue\n        keep.append(i)\n        ix1 = x1[i]\n        iy1 = y1[i]\n        ix2 = x2[i]\n        iy2 = y2[i]\n        iarea = areas[i]\n        for _j in range(_i + 1, ndets):\n            j = order[_j]\n            if suppressed[j] == 1:\n                continue\n            xx1 = max(ix1, x1[j])\n            yy1 = max(iy1, y1[j])\n            xx2 = min(ix2, x2[j])\n            yy2 = min(iy2, y2[j])\n            w = max(0.0, xx2 - xx1 + 1)\n            h = max(0.0, yy2 - yy1 + 1)\n            inter = w * h\n            ovr = inter / (iarea + areas[j] - inter)\n            if ovr >= thresh:\n                suppressed[j] = 1\n\n    return keep\n"
  },
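  {
    "path": "examples/nms_consistency_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): a quick consistency check between the compiled Cython NMS and the\npure-Python baseline. Requires `make` (or `python setup.py build_ext\n--inplace`) to have produced nms/cpu_nms first. The two implementations differ\nonly when an overlap lands exactly on the threshold, which does not occur with\nrandom floats; dets rows are [x1, y1, x2, y2, score].\n'''\nimport numpy as np\n\nfrom nms.cpu_nms import cpu_nms\nfrom nms.py_cpu_nms import py_cpu_nms\n\nrng = np.random.RandomState(0)\nxy = rng.uniform(0, 200, size=(100, 2))\nwh = rng.uniform(10, 80, size=(100, 2))\nscores = rng.uniform(size=(100, 1))\ndets = np.hstack([xy, xy + wh, scores]).astype(np.float32)\n\nassert cpu_nms(dets, 0.3) == py_cpu_nms(dets, 0.3)\nprint('cython and python NMS agree')\n"
  },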
  {
    "path": "nms/gpu_nms.hpp",
    "content": "void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,\n          int boxes_dim, float nms_overlap_thresh, int device_id);\n"
  },
  {
    "path": "nms/gpu_nms.pyx",
    "content": "# --------------------------------------------------------\n# Faster R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\ncimport numpy as np\n\nassert sizeof(int) == sizeof(np.int32_t)\n\ncdef extern from \"gpu_nms.hpp\":\n    void _nms(np.int32_t*, int*, np.float32_t*, int, int, float, int)\n\ndef gpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh,\n            np.int32_t device_id=0):\n    cdef int boxes_num = dets.shape[0]\n    cdef int boxes_dim = dets.shape[1]\n    cdef int num_out\n    cdef np.ndarray[np.int32_t, ndim=1] \\\n        keep = np.zeros(boxes_num, dtype=np.int32)\n    cdef np.ndarray[np.float32_t, ndim=1] \\\n        scores = dets[:, 4]\n    cdef np.ndarray[np.int64_t, ndim=1] \\\n        order = scores.argsort()[::-1]\n    cdef np.ndarray[np.float32_t, ndim=2] \\\n        sorted_dets = dets[order, :]\n    _nms(&keep[0], &num_out, &sorted_dets[0, 0], boxes_num, boxes_dim, thresh, device_id)\n    keep = keep[:num_out]\n    return list(order[keep])\n"
  },
  {
    "path": "nms/nms_kernel.cu",
    "content": "// ------------------------------------------------------------------\n// Faster R-CNN\n// Copyright (c) 2015 Microsoft\n// Licensed under The MIT License [see fast-rcnn/LICENSE for details]\n// Written by Shaoqing Ren\n// ------------------------------------------------------------------\n\n#include \"gpu_nms.hpp\"\n#include <vector>\n#include <iostream>\n\n#define CUDA_CHECK(condition) \\\n  /* Code block avoids redefinition of cudaError_t error */ \\\n  do { \\\n    cudaError_t error = condition; \\\n    if (error != cudaSuccess) { \\\n      std::cout << cudaGetErrorString(error) << std::endl; \\\n    } \\\n  } while (0)\n\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\nint const threadsPerBlock = sizeof(unsigned long long) * 8;\n\n__device__ inline float devIoU(float const * const a, float const * const b) {\n  float left = max(a[0], b[0]), right = min(a[2], b[2]);\n  float top = max(a[1], b[1]), bottom = min(a[3], b[3]);\n  float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);\n  float interS = width * height;\n  float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);\n  float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);\n  return interS / (Sa + Sb - interS);\n}\n\n__global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,\n                           const float *dev_boxes, unsigned long long *dev_mask) {\n  const int row_start = blockIdx.y;\n  const int col_start = blockIdx.x;\n\n  // if (row_start > col_start) return;\n\n  const int row_size =\n        min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);\n  const int col_size =\n        min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);\n\n  __shared__ float block_boxes[threadsPerBlock * 5];\n  if (threadIdx.x < col_size) {\n    block_boxes[threadIdx.x * 5 + 0] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];\n    block_boxes[threadIdx.x * 5 + 1] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];\n    block_boxes[threadIdx.x * 5 + 2] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];\n    block_boxes[threadIdx.x * 5 + 3] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];\n    block_boxes[threadIdx.x * 5 + 4] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];\n  }\n  __syncthreads();\n\n  if (threadIdx.x < row_size) {\n    const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;\n    const float *cur_box = dev_boxes + cur_box_idx * 5;\n    int i = 0;\n    unsigned long long t = 0;\n    int start = 0;\n    if (row_start == col_start) {\n      start = threadIdx.x + 1;\n    }\n    for (i = start; i < col_size; i++) {\n      if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {\n        t |= 1ULL << i;\n      }\n    }\n    const int col_blocks = DIVUP(n_boxes, threadsPerBlock);\n    dev_mask[cur_box_idx * col_blocks + col_start] = t;\n  }\n}\n\nvoid _set_device(int device_id) {\n  int current_device;\n  CUDA_CHECK(cudaGetDevice(&current_device));\n  if (current_device == device_id) {\n    return;\n  }\n  // The call to cudaSetDevice must come before any calls to Get, which\n  // may perform initialization using the GPU.\n  CUDA_CHECK(cudaSetDevice(device_id));\n}\n\nvoid _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,\n          int boxes_dim, float nms_overlap_thresh, int device_id) {\n  _set_device(device_id);\n\n  float* boxes_dev = NULL;\n  unsigned long long* mask_dev = NULL;\n\n  const int 
col_blocks = DIVUP(boxes_num, threadsPerBlock);\n\n  CUDA_CHECK(cudaMalloc(&boxes_dev,\n                        boxes_num * boxes_dim * sizeof(float)));\n  CUDA_CHECK(cudaMemcpy(boxes_dev,\n                        boxes_host,\n                        boxes_num * boxes_dim * sizeof(float),\n                        cudaMemcpyHostToDevice));\n\n  CUDA_CHECK(cudaMalloc(&mask_dev,\n                        boxes_num * col_blocks * sizeof(unsigned long long)));\n\n  dim3 blocks(DIVUP(boxes_num, threadsPerBlock),\n              DIVUP(boxes_num, threadsPerBlock));\n  dim3 threads(threadsPerBlock);\n  nms_kernel<<<blocks, threads>>>(boxes_num,\n                                  nms_overlap_thresh,\n                                  boxes_dev,\n                                  mask_dev);\n\n  std::vector<unsigned long long> mask_host(boxes_num * col_blocks);\n  CUDA_CHECK(cudaMemcpy(&mask_host[0],\n                        mask_dev,\n                        sizeof(unsigned long long) * boxes_num * col_blocks,\n                        cudaMemcpyDeviceToHost));\n\n  std::vector<unsigned long long> remv(col_blocks);\n  memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);\n\n  int num_to_keep = 0;\n  for (int i = 0; i < boxes_num; i++) {\n    int nblock = i / threadsPerBlock;\n    int inblock = i % threadsPerBlock;\n\n    if (!(remv[nblock] & (1ULL << inblock))) {\n      keep_out[num_to_keep++] = i;\n      unsigned long long *p = &mask_host[0] + i * col_blocks;\n      for (int j = nblock; j < col_blocks; j++) {\n        remv[j] |= p[j];\n      }\n    }\n  }\n  *num_out = num_to_keep;\n\n  CUDA_CHECK(cudaFree(boxes_dev));\n  CUDA_CHECK(cudaFree(mask_dev));\n}\n"
  },
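  {
    "path": "examples/bitmask_nms_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): the 64-bit bitmask scheme from nms_kernel.cu, emulated in NumPy.\nEach thread block compares one box against a 64-box column and packs the\n\"suppresses\" decisions into one unsigned long long; the host pass then walks\nboxes in score order and ORs the masks of kept boxes into a running\nsuppression bitmap.\n'''\nimport numpy as np\n\ndef area(r):\n    return (r[2] - r[0] + 1) * (r[3] - r[1] + 1)\n\ndef iou(a, b):\n    # +1 convention matches devIoU in the kernel\n    left, top = max(a[0], b[0]), max(a[1], b[1])\n    right, bottom = min(a[2], b[2]), min(a[3], b[3])\n    inter = max(right - left + 1, 0) * max(bottom - top + 1, 0)\n    return inter / (area(a) + area(b) - inter)\n\ndef bitmask_nms(boxes, thresh, lane=64):\n    n = len(boxes)\n    cols = (n + lane - 1) // lane\n    mask = np.zeros((n, cols), dtype=np.uint64)   # dev_mask equivalent\n    for i in range(n):\n        for j in range(i + 1, n):                 # only higher-index boxes\n            if iou(boxes[i], boxes[j]) > thresh:\n                mask[i, j // lane] |= np.uint64(1) << np.uint64(j % lane)\n    remv = np.zeros(cols, dtype=np.uint64)        # running suppression bitmap\n    keep = []\n    for i in range(n):                            # boxes assumed score-sorted\n        if not (remv[i // lane] >> np.uint64(i % lane)) & np.uint64(1):\n            keep.append(i)\n            remv |= mask[i]\n    return keep\n\nboxes = [[0, 0, 10, 10], [1, 1, 11, 11], [100, 100, 110, 110]]\nprint(bitmask_nms(boxes, 0.5))  # [0, 2]\n"
  },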
  {
    "path": "nms/py_cpu_nms.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport numpy as np\n\ndef py_cpu_nms(dets, thresh):\n    \"\"\"Pure Python NMS baseline.\"\"\"\n    x1 = dets[:, 0]\n    y1 = dets[:, 1]\n    x2 = dets[:, 2]\n    y2 = dets[:, 3]\n    scores = dets[:, 4]\n\n    areas = (x2 - x1 + 1) * (y2 - y1 + 1)\n    order = scores.argsort()[::-1]\n\n    keep = []\n    while order.size > 0:\n        i = order[0]\n        keep.append(i)\n        xx1 = np.maximum(x1[i], x1[order[1:]])\n        yy1 = np.maximum(y1[i], y1[order[1:]])\n        xx2 = np.minimum(x2[i], x2[order[1:]])\n        yy2 = np.minimum(y2[i], y2[order[1:]])\n\n        w = np.maximum(0.0, xx2 - xx1 + 1)\n        h = np.maximum(0.0, yy2 - yy1 + 1)\n        inter = w * h\n        ovr = inter / (areas[i] + areas[order[1:]] - inter)\n\n        inds = np.where(ovr <= thresh)[0]\n        order = order[inds + 1]\n\n    return keep\n"
  },
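  {
    "path": "examples/py_nms_demo_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): py_cpu_nms on a tiny hand-made example. The second box heavily\noverlaps the first (IoU ~ 0.68) and is suppressed; the third is disjoint and\nsurvives.\n'''\nimport numpy as np\n\nfrom nms.py_cpu_nms import py_cpu_nms\n\ndets = np.array([[ 10,  10, 110, 110, 0.9],   # best box\n                 [ 20,  20, 120, 120, 0.8],   # heavy overlap with the first\n                 [200, 200, 260, 260, 0.7]],  # far away\n                dtype=np.float32)\nprint(py_cpu_nms(dets, thresh=0.3))  # [0, 2]\n"
  },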
  {
    "path": "nms_wrapper.py",
    "content": "from enum import Enum\n\n\nclass NMSType(Enum):\n    PY_NMS = 1\n    CPU_NMS = 2\n    GPU_NMS = 3\n\n\ndefault_nms_type = NMSType.PY_NMS\n\n\nclass NMSWrapper:\n    def __init__(self, nms_type=default_nms_type):\n        assert type(nms_type) == NMSType\n        if nms_type == NMSType.PY_NMS:\n            from nms.py_cpu_nms import py_cpu_nms\n            self._nms = py_cpu_nms\n        elif nms_type == NMSType.CPU_NMS:\n            from nms.cpu_nms import cpu_nms\n            self._nms = cpu_nms\n        elif nms_type == NMSType.GPU_NMS:\n            from nms.gpu_nms import gpu_nms\n            self._nms = gpu_nms\n        else:\n            raise ValueError('current nms type is not implemented yet')\n\n    def __call__(self, *args, **kwargs):\n        return self._nms(*args, **kwargs)\n"
  },
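  {
    "path": "examples/nms_wrapper_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): choosing an NMS backend through NMSWrapper, falling back to the\npure-Python implementation when the Cython extension has not been built.\n'''\nimport numpy as np\n\nfrom nms_wrapper import NMSType, NMSWrapper\n\ntry:\n    nms = NMSWrapper(NMSType.CPU_NMS)   # needs `make` to have been run\nexcept ImportError:\n    nms = NMSWrapper(NMSType.PY_NMS)    # always available\n\ndets = np.array([[10, 10, 110, 110, 0.9],\n                 [20, 20, 120, 120, 0.8]], dtype=np.float32)\nkeep = nms(dets, 0.3)\nprint(keep)  # [0]\n"
  },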
  {
    "path": "setup.py",
    "content": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under The MIT License [see LICENSE for details]\n# Written by Ross Girshick\n# --------------------------------------------------------\n\nimport os\nfrom os.path import join as pjoin\nimport numpy as np\nfrom distutils.core import setup\nfrom distutils.extension import Extension\nfrom Cython.Distutils import build_ext\nimport sys\n\n\n# Obtain the numpy include directory.  This logic works across numpy versions.\ntry:\n    numpy_include = np.get_include()\nexcept AttributeError:\n    numpy_include = np.get_numpy_include()\n\n# run the customize_compiler\nclass custom_build_ext(build_ext):\n    def build_extensions(self):\n        build_ext.build_extensions(self)\n\next_modules = [\n    Extension(\n        \"nms.cpu_nms\",\n        [\"nms/cpu_nms.pyx\"],\n        extra_compile_args=[\"-Wno-cpp\", \"-Wno-unused-function\"] if sys.platform == 'linux' else [],\n        include_dirs = [numpy_include]\n    )\n]\n\nsetup(\n    name='tf_faster_rcnn',\n    ext_modules=ext_modules,\n    # inject our custom trigger\n    cmdclass={'build_ext': custom_build_ext},\n)\n"
  },
  {
    "path": "tf_contrib/README.md",
    "content": "Since tensorflow 2.0 no longer supports contrib module, I decided to freeze the related contrib code segments into this directory.\nThe files in this directory are extracted from tensorflow.contrib module of version 1.15.3 (with imports fixed, and some unused codes that causes some errors are removed).\n"
  },
  {
    "path": "tf_contrib/arg_scope.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains the arg_scope used for scoping layers arguments.\n\n  Allows one to define models much more compactly by eliminating boilerplate\n  code. This is accomplished through the use of argument scoping (arg_scope).\n\n  Example of how to use tf.contrib.framework.arg_scope:\n\n  ```\n  from third_party.tensorflow.contrib.layers.python import layers\n\n  arg_scope = tf.contrib.framework.arg_scope\n\n  with arg_scope([layers.conv2d], padding='SAME',\n                 initializer=layers.variance_scaling_initializer(),\n                 regularizer=layers.l2_regularizer(0.05)):\n    net = layers.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')\n    net = layers.conv2d(net, 256, [5, 5], scope='conv2')\n  ```\n  The first call to conv2d will behave as follows:\n    layers.conv2d(inputs, 64, [11, 11], 4, padding='VALID',\n                  initializer=layers.variance_scaling_initializer(),\n                  regularizer=layers.l2_regularizer(0.05), scope='conv1')\n\n  The second call to conv2d will also use the arg_scope's default for padding:\n    layers.conv2d(inputs, 256, [5, 5], padding='SAME',\n                  initializer=layers.variance_scaling_initializer(),\n                  regularizer=layers.l2_regularizer(0.05), scope='conv2')\n\n  Example of how to reuse an arg_scope:\n\n  ```\n  with arg_scope([layers.conv2d], padding='SAME',\n                 initializer=layers.variance_scaling_initializer(),\n                 regularizer=layers.l2_regularizer(0.05)) as sc:\n    net = layers.conv2d(net, 256, [5, 5], scope='conv1')\n    ....\n\n  with arg_scope(sc):\n    net = layers.conv2d(net, 256, [5, 5], scope='conv2')\n  ```\n\n  Example of how to use tf.contrib.framework.add_arg_scope to enable your\n  function to be called within an arg_scope later:\n\n  @tf.contrib.framework.add_arg_scope\n  def conv2d(*args, **kwargs)\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom tensorflow.python.util import tf_contextlib\nfrom tensorflow.python.util import tf_decorator\n\n__all__ = [\n    'arg_scope', 'add_arg_scope', 'current_arg_scope', 'has_arg_scope',\n    'arg_scoped_arguments', 'arg_scope_func_key'\n]\n\n_ARGSTACK = [{}]\n\n_DECORATED_OPS = {}\n\n\ndef _get_arg_stack():\n  if _ARGSTACK:\n    return _ARGSTACK\n  else:\n    _ARGSTACK.append({})\n    return _ARGSTACK\n\n\ndef current_arg_scope():\n  stack = _get_arg_stack()\n  return stack[-1]\n\n\ndef arg_scope_func_key(op):\n  return getattr(op, '_key_op', str(op))\n\n\ndef _name_op(op):\n  return (op.__module__, op.__name__)\n\n\ndef _kwarg_names(func):\n  kwargs_length = len(func.__defaults__) if func.__defaults__ else 0\n  return func.__code__.co_varnames[-kwargs_length:func.__code__.co_argcount]\n\n\ndef _add_op(op):\n  key_op = 
arg_scope_func_key(op)\n  _DECORATED_OPS[key_op] = _kwarg_names(op)\n\n\n@tf_contextlib.contextmanager\ndef arg_scope(list_ops_or_scope, **kwargs):\n  \"\"\"Stores the default arguments for the given set of list_ops.\n\n  For usage, please see examples at top of the file.\n\n  Args:\n    list_ops_or_scope: List or tuple of operations to set argument scope for or\n      a dictionary containing the current scope. When list_ops_or_scope is a\n      dict, kwargs must be empty. When list_ops_or_scope is a list or tuple,\n      then every op in it need to be decorated with @add_arg_scope to work.\n    **kwargs: keyword=value that will define the defaults for each op in\n              list_ops. All the ops need to accept the given set of arguments.\n\n  Yields:\n    the current_scope, which is a dictionary of {op: {arg: value}}\n  Raises:\n    TypeError: if list_ops is not a list or a tuple.\n    ValueError: if any op in list_ops has not be decorated with @add_arg_scope.\n  \"\"\"\n  if isinstance(list_ops_or_scope, dict):\n    # Assumes that list_ops_or_scope is a scope that is being reused.\n    if kwargs:\n      raise ValueError('When attempting to re-use a scope by suppling a'\n                       'dictionary, kwargs must be empty.')\n    current_scope = list_ops_or_scope.copy()\n    try:\n      _get_arg_stack().append(current_scope)\n      yield current_scope\n    finally:\n      _get_arg_stack().pop()\n  else:\n    # Assumes that list_ops_or_scope is a list/tuple of ops with kwargs.\n    if not isinstance(list_ops_or_scope, (list, tuple)):\n      raise TypeError('list_ops_or_scope must either be a list/tuple or reused '\n                      'scope (i.e. dict)')\n    try:\n      current_scope = current_arg_scope().copy()\n      for op in list_ops_or_scope:\n        key = arg_scope_func_key(op)\n        if not has_arg_scope(op):\n          raise ValueError('%s is not decorated with @add_arg_scope',\n                           _name_op(op))\n        if key in current_scope:\n          current_kwargs = current_scope[key].copy()\n          current_kwargs.update(kwargs)\n          current_scope[key] = current_kwargs\n        else:\n          current_scope[key] = kwargs.copy()\n      _get_arg_stack().append(current_scope)\n      yield current_scope\n    finally:\n      _get_arg_stack().pop()\n\n\ndef add_arg_scope(func):\n  \"\"\"Decorates a function with args so it can be used within an arg_scope.\n\n  Args:\n    func: function to decorate.\n\n  Returns:\n    A tuple with the decorated function func_with_args().\n  \"\"\"\n\n  def func_with_args(*args, **kwargs):\n    current_scope = current_arg_scope()\n    current_args = kwargs\n    key_func = arg_scope_func_key(func)\n    if key_func in current_scope:\n      current_args = current_scope[key_func].copy()\n      current_args.update(kwargs)\n    return func(*args, **current_args)\n\n  _add_op(func)\n  setattr(func_with_args, '_key_op', arg_scope_func_key(func))\n  return tf_decorator.make_decorator(func, func_with_args)\n\n\ndef has_arg_scope(func):\n  \"\"\"Checks whether a func has been decorated with @add_arg_scope or not.\n\n  Args:\n    func: function to check.\n\n  Returns:\n    a boolean.\n  \"\"\"\n  return arg_scope_func_key(func) in _DECORATED_OPS\n\n\ndef arg_scoped_arguments(func):\n  \"\"\"Returns the list kwargs that arg_scope can set for a func.\n\n  Args:\n    func: function which has been decorated with @add_arg_scope.\n\n  Returns:\n    a list of kwargs names.\n  \"\"\"\n  assert has_arg_scope(func)\n  return 
_DECORATED_OPS[arg_scope_func_key(func)]\n"
  },
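  {
    "path": "examples/arg_scope_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): how arg_scope and add_arg_scope from tf_contrib cooperate.\nDefaults set in the scope are injected into every decorated call unless\noverridden at the call site.\n'''\nfrom tf_contrib.arg_scope import add_arg_scope, arg_scope\n\n@add_arg_scope\ndef conv(inputs, num_outputs, padding='VALID', scope=None):\n    # toy stand-in for a layer function; just echoes its arguments\n    return (inputs, num_outputs, padding, scope)\n\nwith arg_scope([conv], padding='SAME'):\n    print(conv('x', 64, scope='conv1'))                   # padding='SAME' from the scope\n    print(conv('x', 64, padding='VALID', scope='conv2'))  # the call site wins\n"
  },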
  {
    "path": "tf_contrib/initializers.py",
    "content": "# Copyright 2015 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Weight initializers for use with layers.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport math\n\nfrom tensorflow.python.framework import dtypes\nfrom tensorflow.python.ops import random_ops\n\n\n__all__ = ['xavier_initializer', 'xavier_initializer_conv2d',\n           'variance_scaling_initializer']\n\n\ndef xavier_initializer(uniform=True, seed=None, dtype=dtypes.float32):\n  \"\"\"Returns an initializer performing \"Xavier\" initialization for weights.\n\n  This function implements the weight initialization from:\n\n  Xavier Glorot and Yoshua Bengio (2010):\n           [Understanding the difficulty of training deep feedforward neural\n           networks. International conference on artificial intelligence and\n           statistics.](\n           http://www.jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf)\n\n  This initializer is designed to keep the scale of the gradients roughly the\n  same in all layers. In uniform distribution this ends up being the range:\n  `x = sqrt(6. / (in + out)); [-x, x]` and for normal distribution a standard\n  deviation of `sqrt(2. / (in + out))` is used.\n\n  Args:\n    uniform: Whether to use uniform or normal distributed random initialization.\n    seed: A Python integer. Used to create random seeds. See\n          `tf.compat.v1.set_random_seed` for behavior.\n    dtype: The data type. Only floating point types are supported.\n\n  Returns:\n    An initializer for a weight matrix.\n  \"\"\"\n  return variance_scaling_initializer(factor=1.0, mode='FAN_AVG',\n                                      uniform=uniform, seed=seed, dtype=dtype)\n\nxavier_initializer_conv2d = xavier_initializer\n\n\ndef variance_scaling_initializer(factor=2.0, mode='FAN_IN', uniform=False,\n                                 seed=None, dtype=dtypes.float32):\n  \"\"\"Returns an initializer that generates tensors without scaling variance.\n\n  When initializing a deep network, it is in principle advantageous to keep\n  the scale of the input variance constant, so it does not explode or diminish\n  by reaching the final layer. 
This initializer use the following formula:\n\n  ```python\n    if mode='FAN_IN': # Count only number of input connections.\n      n = fan_in\n    elif mode='FAN_OUT': # Count only number of output connections.\n      n = fan_out\n    elif mode='FAN_AVG': # Average number of inputs and output connections.\n      n = (fan_in + fan_out)/2.0\n\n      truncated_normal(shape, 0.0, stddev=sqrt(factor / n))\n  ```\n\n  * To get [Delving Deep into Rectifiers](\n     http://arxiv.org/pdf/1502.01852v1.pdf) (also know as the \"MSRA \n     initialization\"), use (Default):<br/>\n    `factor=2.0 mode='FAN_IN' uniform=False`\n  * To get [Convolutional Architecture for Fast Feature Embedding](\n     http://arxiv.org/abs/1408.5093), use:<br/>\n    `factor=1.0 mode='FAN_IN' uniform=True`\n  * To get [Understanding the difficulty of training deep feedforward neural\n    networks](http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf),\n    use:<br/>\n    `factor=1.0 mode='FAN_AVG' uniform=True.`\n  * To get `xavier_initializer` use either:<br/>\n    `factor=1.0 mode='FAN_AVG' uniform=True`, or<br/>\n    `factor=1.0 mode='FAN_AVG' uniform=False`.\n\n  Args:\n    factor: Float.  A multiplicative factor.\n    mode: String.  'FAN_IN', 'FAN_OUT', 'FAN_AVG'.\n    uniform: Whether to use uniform or normal distributed random initialization.\n    seed: A Python integer. Used to create random seeds. See\n          `tf.compat.v1.set_random_seed` for behavior.\n    dtype: The data type. Only floating point types are supported.\n\n  Returns:\n    An initializer that generates tensors with unit variance.\n\n  Raises:\n    ValueError: if `dtype` is not a floating point type.\n    TypeError: if `mode` is not in ['FAN_IN', 'FAN_OUT', 'FAN_AVG'].\n  \"\"\"\n  if not dtype.is_floating:\n    raise TypeError('Cannot create initializer for non-floating point type.')\n  if mode not in ['FAN_IN', 'FAN_OUT', 'FAN_AVG']:\n    raise TypeError('Unknown mode %s [FAN_IN, FAN_OUT, FAN_AVG]', mode)\n\n  # pylint: disable=unused-argument\n  def _initializer(shape, dtype=dtype, partition_info=None):\n    \"\"\"Initializer function.\"\"\"\n    if not dtype.is_floating:\n      raise TypeError('Cannot create initializer for non-floating point type.')\n    # Estimating fan_in and fan_out is not possible to do perfectly, but we try.\n    # This is the right thing for matrix multiply and convolutions.\n    if shape:\n      fan_in = float(shape[-2]) if len(shape) > 1 else float(shape[-1])\n      fan_out = float(shape[-1])\n    else:\n      fan_in = 1.0\n      fan_out = 1.0\n    for dim in shape[:-2]:\n      fan_in *= float(dim)\n      fan_out *= float(dim)\n    if mode == 'FAN_IN':\n      # Count only number of input connections.\n      n = fan_in\n    elif mode == 'FAN_OUT':\n      # Count only number of output connections.\n      n = fan_out\n    elif mode == 'FAN_AVG':\n      # Average number of inputs and output connections.\n      n = (fan_in + fan_out) / 2.0\n    if uniform:\n      # To get stddev = math.sqrt(factor / n) need to adjust for uniform.\n      limit = math.sqrt(3.0 * factor / n)\n      return random_ops.random_uniform(shape, -limit, limit,\n                                       dtype, seed=seed)\n    else:\n      # To get stddev = math.sqrt(factor / n) need to adjust for truncated.\n      trunc_stddev = math.sqrt(1.3 * factor / n)\n      return random_ops.truncated_normal(shape, 0.0, trunc_stddev, dtype,\n                                         seed=seed)\n  # pylint: enable=unused-argument\n\n  return _initializer\n"
  },
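  {
    "path": "examples/variance_scaling_sketch.py",
    "content": "'''Illustrative sketch (hypothetical example file, not part of the original\nrepository): the fan counting behind variance_scaling_initializer for a 3x3\nconv with 64 input and 128 output channels (weight shape [3, 3, 64, 128]).\nWith the defaults factor=2.0, mode='FAN_IN', uniform=False, samples come from\na truncated normal whose stddev targets sqrt(factor / fan_in).\n'''\nimport math\n\nshape = [3, 3, 64, 128]\nfan_in = shape[-2] * shape[0] * shape[1]      # 64 * 9 = 576\nfan_out = shape[-1] * shape[0] * shape[1]     # 128 * 9 = 1152\n\nstddev = math.sqrt(2.0 / fan_in)              # target stddev ~ 0.0589\ntrunc_stddev = math.sqrt(1.3 * 2.0 / fan_in)  # 1.3 corrects for truncation\nprint(fan_in, fan_out, round(stddev, 4), round(trunc_stddev, 4))\n"
  },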
  {
    "path": "tf_contrib/layers.py",
    "content": "# -*- coding: utf-8 -*-\n# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\n# pylint: disable=g-short-docstring-punctuation\n\"\"\"Higher level ops for building layers.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport functools\nimport six\n\nfrom .arg_scope import add_arg_scope\nfrom . import variables\nfrom . import initializers\nfrom . import utils\nfrom tensorflow.python.eager import context\nfrom tensorflow.python.framework import constant_op\nfrom tensorflow.python.framework import dtypes\nfrom tensorflow.python.framework import function\nfrom tensorflow.python.framework import ops\nfrom tensorflow.python.framework import sparse_tensor\nfrom tensorflow.python.framework import tensor_shape\nfrom tensorflow.python.keras.engine import input_spec\nfrom tensorflow.python.layers import base\nfrom tensorflow.python.layers import convolutional as convolutional_layers\nfrom tensorflow.python.layers import core as core_layers\nfrom tensorflow.python.layers import normalization as normalization_layers\nfrom tensorflow.python.layers import pooling as pooling_layers\nfrom tensorflow.python.ops import array_ops\nfrom tensorflow.python.ops import check_ops\nfrom tensorflow.python.ops import init_ops\nfrom tensorflow.python.ops import linalg_ops\nfrom tensorflow.python.ops import math_ops\nfrom tensorflow.python.ops import nn\nfrom tensorflow.python.ops import sparse_ops\nfrom tensorflow.python.ops import standard_ops\nfrom tensorflow.python.ops import variable_scope\nfrom tensorflow.python.ops import variables as tf_variables\nfrom tensorflow.python.training import moving_averages\n\n# TODO(b/28426988): Replace legacy_* fns migrated from slim.\n# TODO(b/28426988): Remove legacy_* when all uses have migrated to new API.\n__all__ = [\n    'avg_pool2d', 'avg_pool3d', 'batch_norm', 'bias_add', 'conv1d', 'conv2d',\n    'conv3d', 'conv2d_in_plane', 'conv2d_transpose', 'conv3d_transpose',\n    'convolution', 'convolution1d', 'convolution2d', 'convolution2d_in_plane',\n    'convolution2d_transpose', 'convolution3d', 'convolution3d_transpose',\n    'dense_to_sparse', 'dropout', 'elu', 'flatten', 'fully_connected',\n    'images_to_sequence', 'layer_norm', 'linear', 'pool', 'max_pool2d',\n    'max_pool3d', 'one_hot_encoding', 'relu', 'relu6', 'repeat',\n    'scale_gradient', 'separable_conv2d', 'separable_convolution2d',\n    'sequence_to_images', 'softmax', 'spatial_softmax', 'stack', 'unit_norm',\n    'legacy_fully_connected', 'legacy_linear', 'legacy_relu', 'maxout'\n]\n\nDATA_FORMAT_NCHW = 'NCHW'\nDATA_FORMAT_NHWC = 'NHWC'\nDATA_FORMAT_NCDHW = 'NCDHW'\nDATA_FORMAT_NDHWC = 'NDHWC'\n\n\n@add_arg_scope\ndef avg_pool2d(inputs,\n               kernel_size,\n               stride=2,\n               padding='VALID',\n               data_format=DATA_FORMAT_NHWC,\n          
     outputs_collections=None,\n               scope=None):\n  """Adds a 2D average pooling op.\n\n  Pooling is applied per image over the spatial dimensions, not across the\n  batch or channel dimensions.\n\n  Args:\n    inputs: A 4-D tensor of shape `[batch_size, height, width, channels]` if\n      `data_format` is `NHWC`, and `[batch_size, channels, height, width]` if\n      `data_format` is `NCHW`.\n    kernel_size: A list of length 2: [kernel_height, kernel_width] of the\n      pooling kernel over which the op is computed. Can be an int if both values\n      are the same.\n    stride: A list of length 2: [stride_height, stride_width]. Can be an int if\n      both strides are the same. Note that presently both strides must have the\n      same value.\n    padding: The padding method, either 'VALID' or 'SAME'.\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n  """\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  with ops.name_scope(scope, 'AvgPool2D', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = pooling_layers.AveragePooling2D(\n        pool_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        _scope=sc)\n    outputs = layer.apply(inputs)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\n@add_arg_scope\ndef avg_pool3d(inputs,\n               kernel_size,\n               stride=2,\n               padding='VALID',\n               data_format=DATA_FORMAT_NDHWC,\n               outputs_collections=None,\n               scope=None):\n  """Adds a 3D average pooling op.\n\n  Pooling is applied per image over the spatial dimensions, not across the\n  batch or channel dimensions.\n\n  Args:\n    inputs: A 5-D tensor of shape `[batch_size, depth, height, width, channels]`\n      if `data_format` is `NDHWC`, and `[batch_size, channels, depth, height,\n      width]` if `data_format` is `NCDHW`.\n    kernel_size: A list of length 3: [kernel_depth, kernel_height, kernel_width]\n      of the pooling kernel over which the op is computed. Can be an int if all\n      values are the same.\n    stride: A list of length 3: [stride_depth, stride_height, stride_width]. Can\n      be an int if all strides are the same. Note that presently all strides\n      must have the same value.\n    padding: The padding method, either 'VALID' or 'SAME'.\n    data_format: A string. 
`NDHWC` (default) and `NCDHW` are supported.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NDHWC` nor `NCDHW`.\n  """\n  if data_format not in (DATA_FORMAT_NCDHW, DATA_FORMAT_NDHWC):\n    raise ValueError('data_format has to be either NCDHW or NDHWC.')\n  with ops.name_scope(scope, 'AvgPool3D', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = pooling_layers.AveragePooling3D(\n        pool_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        _scope=sc)\n    outputs = layer.apply(inputs)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\ndef _fused_batch_norm(inputs,\n                      decay=0.999,\n                      center=True,\n                      scale=False,\n                      epsilon=0.001,\n                      activation_fn=None,\n                      param_initializers=None,\n                      param_regularizers=None,\n                      updates_collections=ops.GraphKeys.UPDATE_OPS,\n                      is_training=True,\n                      reuse=None,\n                      variables_collections=None,\n                      outputs_collections=None,\n                      trainable=True,\n                      data_format=DATA_FORMAT_NHWC,\n                      zero_debias_moving_mean=False,\n                      scope=None):\n  """Adds a Batch Normalization layer from http://arxiv.org/abs/1502.03167.\n\n    \"Batch Normalization: Accelerating Deep Network Training by Reducing\n    Internal Covariate Shift\"\n\n    Sergey Ioffe, Christian Szegedy\n\n  Can be used as a normalizer function for conv2d and fully_connected.\n\n  Note: when training, the moving_mean and moving_variance need to be updated.\n  By default the update ops are placed in `tf.GraphKeys.UPDATE_OPS`, so they\n  need to be added as a dependency to the `train_op`. For example:\n\n  ```python\n    update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)\n    with tf.control_dependencies(update_ops):\n      train_op = optimizer.minimize(loss)\n  ```\n\n  One can set updates_collections=None to force the updates in place, but that\n  can have a speed penalty, especially in distributed settings.\n\n  Args:\n    inputs: A tensor with 2 or more dimensions, where the first dimension has\n      `batch_size`. The normalization is over all but the last dimension if\n      `data_format` is `NHWC` and the second dimension if `data_format` is\n      `NCHW`.\n    decay: Decay for the moving average. Reasonable values for `decay` are close\n      to 1.0, typically in the multiple-nines range: 0.999, 0.99, 0.9, etc.\n      Lower the `decay` value (recommend trying `decay`=0.9) if the model\n      experiences reasonably good training performance but poor validation\n      and/or test performance.\n    center: If True, add offset of `beta` to normalized tensor.  If False,\n      `beta` is ignored.\n    scale: If True, multiply by `gamma`. If False, `gamma` is not used. When the\n      next layer is linear (also e.g. 
`nn.relu`), this can be disabled since the\n      scaling can be done by the next layer.\n    epsilon: Small float added to variance to avoid dividing by zero.\n    activation_fn: Activation function, default set to None to skip it and\n      maintain a linear activation.\n    param_initializers: Optional initializers for beta, gamma, moving mean and\n      moving variance.\n    param_regularizers: Optional regularizer for beta and gamma.\n    updates_collections: Collections to collect the update ops for computation.\n      The updates_ops need to be executed with the train_op. If None, a control\n      dependency would be added to make sure the updates are computed in place.\n    is_training: Whether or not the layer is in training mode. In training mode\n      it would accumulate the statistics of the moments into `moving_mean` and\n      `moving_variance` using an exponential moving average with the given\n      `decay`. When it is not in training mode then it would use the values of\n      the `moving_mean` and the `moving_variance`.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional collections for the variables.\n    outputs_collections: Collections to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n    zero_debias_moving_mean: Use zero_debias for moving_mean.\n    scope: Optional scope for `variable_scope`.\n\n  Returns:\n    A `Tensor` representing the output of the operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n    ValueError: If the rank of `inputs` is undefined.\n    ValueError: If the rank of `inputs` is neither 2 nor 4.\n    ValueError: If rank or `C` dimension of `inputs` is undefined.\n  """\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  with variable_scope.variable_scope(\n      scope, 'BatchNorm', [inputs], reuse=reuse) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    original_shape = inputs.get_shape()\n    original_inputs = inputs\n    original_rank = original_shape.ndims\n    if original_rank is None:\n      raise ValueError('Inputs %s has undefined rank' % inputs.name)\n    elif original_rank not in [2, 4]:\n      raise ValueError('Inputs %s has unsupported rank.'\n                       ' Expected 2 or 4 but got %d' %\n                       (inputs.name, original_rank))\n    if original_rank == 2:\n      channels = inputs.get_shape().dims[-1].value\n      if channels is None:\n        raise ValueError('`C` dimension must be known but is None')\n      new_shape = [-1, 1, 1, channels]\n      if data_format == DATA_FORMAT_NCHW:\n        new_shape = [-1, channels, 1, 1]\n      inputs = array_ops.reshape(inputs, new_shape)\n    inputs_shape = inputs.get_shape()\n    if data_format == DATA_FORMAT_NHWC:\n      params_shape = inputs_shape[-1:]\n    else:\n      params_shape = inputs_shape[1:2]\n    if not params_shape.is_fully_defined():\n      raise ValueError('Inputs %s has undefined `C` dimension %s.' 
%\n                       (inputs.name, params_shape))\n\n    # Allocate parameters for the beta and gamma of the normalization.\n    beta_collections = utils.get_variable_collections(variables_collections,\n                                                      'beta')\n    # Float32 required to avoid precision-loss when using fp16 input/output\n    variable_dtype = dtypes.float32\n    if not param_initializers:\n      param_initializers = {}\n    if not param_regularizers:\n      param_regularizers = {}\n    beta_regularizer = param_regularizers.get('beta')\n    gamma_regularizer = param_regularizers.get('gamma')\n\n    if center:\n      beta_initializer = param_initializers.get('beta',\n                                                init_ops.zeros_initializer())\n      beta = variables.model_variable(\n          'beta',\n          shape=params_shape,\n          dtype=variable_dtype,\n          initializer=beta_initializer,\n          regularizer=beta_regularizer,\n          collections=beta_collections,\n          trainable=trainable)\n    else:\n      beta = array_ops.constant(0.0, dtype=variable_dtype, shape=params_shape)\n\n    if scale:\n      gamma_collections = utils.get_variable_collections(\n          variables_collections, 'gamma')\n      gamma_initializer = param_initializers.get('gamma',\n                                                 init_ops.ones_initializer())\n      gamma = variables.model_variable(\n          'gamma',\n          shape=params_shape,\n          dtype=variable_dtype,\n          initializer=gamma_initializer,\n          regularizer=gamma_regularizer,\n          collections=gamma_collections,\n          trainable=trainable)\n    else:\n      gamma = array_ops.constant(1.0, dtype=variable_dtype, shape=params_shape)\n\n    # Create moving_mean and moving_variance variables and add them to the\n    # appropriate collections. 
We disable variable partitioning while creating\n    # them, because assign_moving_average is not yet supported for partitioned\n    # variables (this needs to be handled carefully, as it may break\n    # the checkpoint backward compatibility).\n    with variable_scope.variable_scope(\n        variable_scope.get_variable_scope()) as local_scope:\n      local_scope.set_partitioner(None)\n      moving_mean_collections = utils.get_variable_collections(\n          variables_collections, 'moving_mean')\n      moving_mean_initializer = param_initializers.get(\n          'moving_mean', init_ops.zeros_initializer())\n      moving_mean = variables.model_variable(\n          'moving_mean',\n          shape=params_shape,\n          dtype=variable_dtype,\n          initializer=moving_mean_initializer,\n          trainable=False,\n          collections=moving_mean_collections)\n      moving_variance_collections = utils.get_variable_collections(\n          variables_collections, 'moving_variance')\n      moving_variance_initializer = param_initializers.get(\n          'moving_variance', init_ops.ones_initializer())\n      moving_variance = variables.model_variable(\n          'moving_variance',\n          shape=params_shape,\n          dtype=variable_dtype,\n          initializer=moving_variance_initializer,\n          trainable=False,\n          collections=moving_variance_collections)\n\n    def _fused_batch_norm_training():\n      return nn.fused_batch_norm(\n          inputs, gamma, beta, epsilon=epsilon, data_format=data_format)\n\n    def _fused_batch_norm_inference():\n      return nn.fused_batch_norm(\n          inputs,\n          gamma,\n          beta,\n          mean=moving_mean,\n          variance=moving_variance,\n          epsilon=epsilon,\n          is_training=False,\n          data_format=data_format)\n\n    outputs, mean, variance = utils.smart_cond(is_training,\n                                               _fused_batch_norm_training,\n                                               _fused_batch_norm_inference)\n\n    # If `is_training` doesn't have a constant value (because it is a `Tensor`,\n    # a `Variable` or a `Placeholder`), then is_training_value will be None and\n    # `need_updates` will be true.\n    is_training_value = utils.constant_value(is_training)\n    need_updates = is_training_value is None or is_training_value\n    if need_updates:\n      if updates_collections is None:\n        no_updates = lambda: outputs\n\n        def _force_updates():\n          """Internal function that forces updates of moving_vars if is_training."""\n          update_moving_mean = moving_averages.assign_moving_average(\n              moving_mean, mean, decay, zero_debias=zero_debias_moving_mean)\n          update_moving_variance = moving_averages.assign_moving_average(\n              moving_variance, variance, decay, zero_debias=False)\n          with ops.control_dependencies(\n              [update_moving_mean, update_moving_variance]):\n            return array_ops.identity(outputs)\n\n        outputs = utils.smart_cond(is_training, _force_updates, no_updates)\n      else:\n        moving_vars_fn = lambda: (moving_mean, moving_variance)\n\n        def _delay_updates():\n          """Internal function that delays updates of moving_vars if is_training."""\n          update_moving_mean = moving_averages.assign_moving_average(\n              moving_mean, mean, decay, zero_debias=zero_debias_moving_mean)\n          update_moving_variance = moving_averages.assign_moving_average(\n              
moving_variance, variance, decay, zero_debias=False)\n          return update_moving_mean, update_moving_variance\n\n        update_mean, update_variance = utils.smart_cond(is_training,\n                                                        _delay_updates,\n                                                        moving_vars_fn)\n        ops.add_to_collections(updates_collections, update_mean)\n        ops.add_to_collections(updates_collections, update_variance)\n\n    outputs.set_shape(inputs_shape)\n    if original_shape.ndims == 2:\n      outputs = array_ops.reshape(outputs, array_ops.shape(original_inputs))\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef batch_norm(inputs,\n               decay=0.999,\n               center=True,\n               scale=False,\n               epsilon=0.001,\n               activation_fn=None,\n               param_initializers=None,\n               param_regularizers=None,\n               updates_collections=ops.GraphKeys.UPDATE_OPS,\n               is_training=True,\n               reuse=None,\n               variables_collections=None,\n               outputs_collections=None,\n               trainable=True,\n               batch_weights=None,\n               fused=None,\n               data_format=DATA_FORMAT_NHWC,\n               zero_debias_moving_mean=False,\n               scope=None,\n               renorm=False,\n               renorm_clipping=None,\n               renorm_decay=0.99,\n               adjustment=None):\n  """Adds a Batch Normalization layer from http://arxiv.org/abs/1502.03167.\n\n    \"Batch Normalization: Accelerating Deep Network Training by Reducing\n    Internal Covariate Shift\"\n\n    Sergey Ioffe, Christian Szegedy\n\n  Can be used as a normalizer function for conv2d and fully_connected. The\n  normalization is over all but the last dimension if `data_format` is `NHWC`\n  and all but the second dimension if `data_format` is `NCHW`.  In case of a 2D\n  tensor this corresponds to the batch dimension, while in case of a 4D tensor\n  this corresponds to the batch and space dimensions.\n\n  Note: when training, the moving_mean and moving_variance need to be updated.\n  By default the update ops are placed in `tf.GraphKeys.UPDATE_OPS`, so they\n  need to be added as a dependency to the `train_op`. For example:\n\n  ```python\n    update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)\n    with tf.control_dependencies(update_ops):\n      train_op = optimizer.minimize(loss)\n  ```\n\n  One can set updates_collections=None to force the updates in place, but that\n  can have a speed penalty, especially in distributed settings.\n\n  Args:\n    inputs: A tensor with 2 or more dimensions, where the first dimension has\n      `batch_size`. The normalization is over all but the last dimension if\n      `data_format` is `NHWC` and the second dimension if `data_format` is\n      `NCHW`.\n    decay: Decay for the moving average. Reasonable values for `decay` are close\n      to 1.0, typically in the multiple-nines range: 0.999, 0.99, 0.9, etc.\n      Lower the `decay` value (recommend trying `decay`=0.9) if the model\n      experiences reasonably good training performance but poor validation\n      and/or test performance. Try zero_debias_moving_mean=True for improved\n      stability.\n    center: If True, add offset of `beta` to normalized tensor. 
If False, `beta`\n      is ignored.\n    scale: If True, multiply by `gamma`. If False, `gamma` is not used. When the\n      next layer is linear (also e.g. `nn.relu`), this can be disabled since the\n      scaling can be done by the next layer.\n    epsilon: Small float added to variance to avoid dividing by zero.\n    activation_fn: Activation function, default set to None to skip it and\n      maintain a linear activation.\n    param_initializers: Optional initializers for beta, gamma, moving mean and\n      moving variance.\n    param_regularizers: Optional regularizer for beta and gamma.\n    updates_collections: Collections to collect the update ops for computation.\n      The updates_ops need to be executed with the train_op. If None, a control\n      dependency would be added to make sure the updates are computed in place.\n    is_training: Whether or not the layer is in training mode. In training mode\n      it would accumulate the statistics of the moments into `moving_mean` and\n      `moving_variance` using an exponential moving average with the given\n      `decay`. When it is not in training mode then it would use the values of\n      the `moving_mean` and the `moving_variance`.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional collections for the variables.\n    outputs_collections: Collections to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n    batch_weights: An optional tensor of shape `[batch_size]`, containing a\n      frequency weight for each batch item. If present, then the batch\n      normalization uses weighted mean and variance. (This can be used to\n      correct for bias in training example selection.)\n    fused: if `None` or `True`, use a faster, fused implementation if possible.\n      If `False`, use the system recommended implementation.\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n    zero_debias_moving_mean: Use zero_debias for moving_mean. It creates a new\n      pair of variables 'moving_mean/biased' and 'moving_mean/local_step'.\n    scope: Optional scope for `variable_scope`.\n    renorm: Whether to use Batch Renormalization\n      (https://arxiv.org/abs/1702.03275). This adds extra variables during\n        training. The inference is the same for either value of this parameter.\n    renorm_clipping: A dictionary that may map keys 'rmax', 'rmin', 'dmax' to\n      scalar `Tensors` used to clip the renorm correction. The correction `(r,\n      d)` is used as `corrected_value = normalized_value * r + d`, with `r`\n      clipped to [rmin, rmax], and `d` to [-dmax, dmax]. Missing rmax, rmin,\n      dmax are set to inf, 0, inf, respectively.\n    renorm_decay: Momentum used to update the moving means and standard\n      deviations with renorm. Unlike `momentum`, this affects training and\n      should be neither too small (which would add noise) nor too large (which\n      would give stale estimates). Note that `decay` is still applied to get the\n      means and variances for inference.\n    adjustment: A function taking the `Tensor` containing the (dynamic) shape of\n      the input tensor and returning a pair (scale, bias) to apply to the\n      normalized values (before gamma and beta), only during training. 
For\n      example,\n        `adjustment = lambda shape: (\n          tf.random.uniform(shape[-1:], 0.93, 1.07),\n          tf.random.uniform(shape[-1:], -0.1, 0.1))` will scale the normalized\n            value by up to 7% up or down, then shift the result by up to 0.1\n            (with independent scaling and bias for each feature but shared\n            across all examples), and finally apply gamma and/or beta. If\n            `None`, no adjustment is applied.\n\n  Returns:\n    A `Tensor` representing the output of the operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n    ValueError: If the rank of `inputs` is undefined.\n    ValueError: If rank or channels dimension of `inputs` is undefined.\n  \"\"\"\n  if fused is None:\n    fused = True\n\n  # Only use _fused_batch_norm if all of the following three\n  # conditions are true:\n  # (1) fused is set True;\n  # (2) it is possible to use (currently it doesn't support batch weights,\n  #   renorm, and the case when rank is neither 2 nor 4);\n  # (3) it is used with zero_debias_moving_mean, or an input shape of rank 2,\n  #   or non-default updates_collections (not implemented in\n  #   normalization_layers.BatchNormalization yet); otherwise use the fused\n  #   implementation in normalization_layers.BatchNormalization.\n  inputs = ops.convert_to_tensor(inputs)\n  rank = inputs.get_shape().ndims\n  possible_to_fuse = (\n      batch_weights is None and not renorm and rank in [2, 4] and\n      adjustment is None)\n  if fused and possible_to_fuse and (\n      zero_debias_moving_mean or rank == 2 or\n      updates_collections is not ops.GraphKeys.UPDATE_OPS):\n    return _fused_batch_norm(\n        inputs,\n        decay=decay,\n        center=center,\n        scale=scale,\n        epsilon=epsilon,\n        activation_fn=activation_fn,\n        param_initializers=param_initializers,\n        param_regularizers=param_regularizers,\n        updates_collections=updates_collections,\n        is_training=is_training,\n        reuse=reuse,\n        variables_collections=variables_collections,\n        outputs_collections=outputs_collections,\n        trainable=trainable,\n        data_format=data_format,\n        zero_debias_moving_mean=zero_debias_moving_mean,\n        scope=scope)\n\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n\n  layer_variable_getter = _build_variable_getter()\n  with variable_scope.variable_scope(\n      scope,\n      'BatchNorm', [inputs],\n      reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n\n    # Determine whether we can use the core layer class.\n    if (batch_weights is None and\n        updates_collections is ops.GraphKeys.UPDATE_OPS and\n        not zero_debias_moving_mean):\n      # Use the core layer class.\n      axis = 1 if data_format == DATA_FORMAT_NCHW else -1\n      if not param_initializers:\n        param_initializers = {}\n      beta_initializer = param_initializers.get('beta',\n                                                init_ops.zeros_initializer())\n      gamma_initializer = param_initializers.get('gamma',\n                                                 init_ops.ones_initializer())\n      moving_mean_initializer = param_initializers.get(\n          'moving_mean', init_ops.zeros_initializer())\n      moving_variance_initializer = param_initializers.get(\n          'moving_variance', init_ops.ones_initializer())\n  
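    # As with the initializers above, beta/gamma regularizers default to\n      # None when they are not supplied in param_regularizers.\n  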
    if not param_regularizers:\n        param_regularizers = {}\n      beta_regularizer = param_regularizers.get('beta')\n      gamma_regularizer = param_regularizers.get('gamma')\n      layer = normalization_layers.BatchNormalization(\n          axis=axis,\n          momentum=decay,\n          epsilon=epsilon,\n          center=center,\n          scale=scale,\n          beta_initializer=beta_initializer,\n          gamma_initializer=gamma_initializer,\n          moving_mean_initializer=moving_mean_initializer,\n          moving_variance_initializer=moving_variance_initializer,\n          beta_regularizer=beta_regularizer,\n          gamma_regularizer=gamma_regularizer,\n          trainable=trainable,\n          renorm=renorm,\n          renorm_clipping=renorm_clipping,\n          renorm_momentum=renorm_decay,\n          adjustment=adjustment,\n          name=sc.name,\n          _scope=sc,\n          _reuse=reuse,\n          fused=fused)\n      outputs = layer.apply(inputs, training=is_training)\n\n      # Add variables to collections.\n      _add_variable_to_collections(layer.moving_mean, variables_collections,\n                                   'moving_mean')\n      _add_variable_to_collections(layer.moving_variance, variables_collections,\n                                   'moving_variance')\n      if layer.beta is not None:\n        _add_variable_to_collections(layer.beta, variables_collections, 'beta')\n      if layer.gamma is not None:\n        _add_variable_to_collections(layer.gamma, variables_collections,\n                                     'gamma')\n\n      if activation_fn is not None:\n        outputs = activation_fn(outputs)\n      return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n    # Not supported by layer class: batch_weights argument,\n    # and custom updates_collections. In that case, use the legacy BN\n    # implementation.\n    # Custom updates collections are not supported because the update logic\n    # is different in this case, in particular w.r.t. \"forced updates\" and\n    # update op reuse.\n    if renorm:\n      raise ValueError('renorm is not supported with batch_weights, '\n                       'updates_collections or zero_debias_moving_mean')\n    inputs_shape = inputs.get_shape()\n    inputs_rank = inputs_shape.ndims\n    if inputs_rank is None:\n      raise ValueError('Inputs %s has undefined rank.' 
% inputs.name)\n    dtype = inputs.dtype.base_dtype\n    if batch_weights is not None:\n      batch_weights = ops.convert_to_tensor(batch_weights)\n      inputs_shape[0:1].assert_is_compatible_with(batch_weights.get_shape())\n      # Reshape batch weight values so they broadcast across inputs.\n      nshape = [-1] + [1 for _ in range(inputs_rank - 1)]\n      batch_weights = array_ops.reshape(batch_weights, nshape)\n\n    if data_format == DATA_FORMAT_NCHW:\n      moments_axes = [0] + list(range(2, inputs_rank))\n      params_shape = inputs_shape[1:2]\n      # For NCHW format, rather than relying on implicit broadcasting, we\n      # explicitly reshape the params to params_shape_broadcast when computing\n      # the moments and the batch normalization.\n      params_shape_broadcast = list([1, inputs_shape.dims[1].value] +\n                                    [1 for _ in range(2, inputs_rank)])\n    else:\n      moments_axes = list(range(inputs_rank - 1))\n      params_shape = inputs_shape[-1:]\n      params_shape_broadcast = None\n    if not params_shape.is_fully_defined():\n      raise ValueError('Inputs %s has undefined channels dimension %s.' %\n                       (inputs.name, params_shape))\n\n    # Allocate parameters for the beta and gamma of the normalization.\n    beta, gamma = None, None\n    if not param_initializers:\n      param_initializers = {}\n    if center:\n      beta_collections = utils.get_variable_collections(variables_collections,\n                                                        'beta')\n      beta_initializer = param_initializers.get('beta',\n                                                init_ops.zeros_initializer())\n      beta = variables.model_variable(\n          'beta',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=beta_initializer,\n          collections=beta_collections,\n          trainable=trainable)\n    if scale:\n      gamma_collections = utils.get_variable_collections(\n          variables_collections, 'gamma')\n      gamma_initializer = param_initializers.get('gamma',\n                                                 init_ops.ones_initializer())\n      gamma = variables.model_variable(\n          'gamma',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=gamma_initializer,\n          collections=gamma_collections,\n          trainable=trainable)\n\n    # Create moving_mean and moving_variance variables and add them to the\n    # appropriate collections. 
We disable variable partitioning while creating\n    # them, because assign_moving_average is not yet supported for partitioned\n    # variables (this needs to be handled carefully, as it may break\n    # the checkpoint backward compatibility).\n    with variable_scope.variable_scope(\n        variable_scope.get_variable_scope()) as local_scope:\n      local_scope.set_partitioner(None)\n      moving_mean_collections = utils.get_variable_collections(\n          variables_collections, 'moving_mean')\n      moving_mean_initializer = param_initializers.get(\n          'moving_mean', init_ops.zeros_initializer())\n      moving_mean = variables.model_variable(\n          'moving_mean',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=moving_mean_initializer,\n          trainable=False,\n          collections=moving_mean_collections)\n      moving_variance_collections = utils.get_variable_collections(\n          variables_collections, 'moving_variance')\n      moving_variance_initializer = param_initializers.get(\n          'moving_variance', init_ops.ones_initializer())\n      moving_variance = variables.model_variable(\n          'moving_variance',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=moving_variance_initializer,\n          trainable=False,\n          collections=moving_variance_collections)\n\n    # If `is_training` doesn't have a constant value (because it is a `Tensor`,\n    # a `Variable` or a `Placeholder`), then is_training_value will be None and\n    # `need_moments` will be true.\n    is_training_value = utils.constant_value(is_training)\n    need_moments = is_training_value is None or is_training_value\n    if need_moments:\n      # Calculate the moments based on the individual batch.\n      if batch_weights is None:\n        if data_format == DATA_FORMAT_NCHW:\n          mean, variance = nn.moments(inputs, moments_axes, keep_dims=True)\n          mean = array_ops.reshape(mean, [-1])\n          variance = array_ops.reshape(variance, [-1])\n        else:\n          mean, variance = nn.moments(inputs, moments_axes)\n      else:\n        if data_format == DATA_FORMAT_NCHW:\n          mean, variance = nn.weighted_moments(\n              inputs, moments_axes, batch_weights, keepdims=True)\n          mean = array_ops.reshape(mean, [-1])\n          variance = array_ops.reshape(variance, [-1])\n        else:\n          mean, variance = nn.weighted_moments(inputs, moments_axes,\n                                               batch_weights)\n\n      moving_vars_fn = lambda: (moving_mean, moving_variance)\n      if updates_collections is None:\n\n        def _force_updates():\n          """Internal function that forces updates of moving_vars if is_training."""\n          update_moving_mean = moving_averages.assign_moving_average(\n              moving_mean, mean, decay, zero_debias=zero_debias_moving_mean)\n          update_moving_variance = moving_averages.assign_moving_average(\n              moving_variance, variance, decay, zero_debias=False)\n          with ops.control_dependencies(\n              [update_moving_mean, update_moving_variance]):\n            return array_ops.identity(mean), array_ops.identity(variance)\n\n        mean, variance = utils.smart_cond(is_training, _force_updates,\n                                          moving_vars_fn)\n      else:\n\n        def _delay_updates():\n          """Internal function that delays updates of moving_vars if is_training."""\n          update_moving_mean = 
moving_averages.assign_moving_average(\n              moving_mean, mean, decay, zero_debias=zero_debias_moving_mean)\n          update_moving_variance = moving_averages.assign_moving_average(\n              moving_variance, variance, decay, zero_debias=False)\n          return update_moving_mean, update_moving_variance\n\n        update_mean, update_variance = utils.smart_cond(is_training,\n                                                        _delay_updates,\n                                                        moving_vars_fn)\n        ops.add_to_collections(updates_collections, update_mean)\n        ops.add_to_collections(updates_collections, update_variance)\n        # Use computed moments during training and moving_vars otherwise.\n        vars_fn = lambda: (mean, variance)\n        mean, variance = utils.smart_cond(is_training, vars_fn, moving_vars_fn)\n    else:\n      mean, variance = moving_mean, moving_variance\n    if data_format == DATA_FORMAT_NCHW:\n      mean = array_ops.reshape(mean, params_shape_broadcast)\n      variance = array_ops.reshape(variance, params_shape_broadcast)\n      if beta is not None:\n        beta = array_ops.reshape(beta, params_shape_broadcast)\n      if gamma is not None:\n        gamma = array_ops.reshape(gamma, params_shape_broadcast)\n\n    # Compute batch_normalization.\n    outputs = nn.batch_normalization(inputs, mean, variance, beta, gamma,\n                                     epsilon)\n    outputs.set_shape(inputs_shape)\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef bias_add(inputs,\n             activation_fn=None,\n             initializer=init_ops.zeros_initializer(),\n             regularizer=None,\n             reuse=None,\n             variables_collections=None,\n             outputs_collections=None,\n             trainable=True,\n             data_format=DATA_FORMAT_NHWC,\n             scope=None):\n  """Adds a bias to the inputs.\n\n  Can be used as a normalizer function for conv2d and fully_connected.\n\n  Args:\n    inputs: A tensor with at least rank 2 and a known value for the last\n      dimension, e.g. `[batch_size, depth]`, `[None, None, None, depth]`.\n    activation_fn: Activation function, default set to None to skip it and\n      maintain a linear activation.\n    initializer: An initializer for the bias, defaults to 0.\n    regularizer: A regularizer like the result of `l1_regularizer` or\n      `l2_regularizer`.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional collections for the variables.\n    outputs_collections: Collections to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    data_format: A string. 
'NHWC' and 'NCHW' are supported.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A tensor representing the result of adding biases to the inputs.\n\n  Raises:\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n    ValueError: If `data_format` is `NCHW` and rank of `inputs` is not 4.\n    ValueError: If the rank of `inputs` is undefined.\n    ValueError: If rank or `C` dimension of `inputs` is undefined.\n  """\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  with variable_scope.variable_scope(\n      scope, 'BiasAdd', [inputs], reuse=reuse) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    dtype = inputs.dtype.base_dtype\n    inputs_shape = inputs.get_shape()\n    inputs_rank = inputs_shape.ndims\n    if inputs_rank is None:\n      raise ValueError('Dims of shape must be known but is None')\n    elif inputs_rank != 4 and data_format == DATA_FORMAT_NCHW:\n      raise ValueError('Data format NCHW only supports 4D Tensor')\n    axis = 1 if data_format == DATA_FORMAT_NCHW else -1\n    num_features = inputs_shape.dims[axis].value\n    if num_features is None:\n      raise ValueError('`C` dimension must be known but is None')\n    biases_collections = utils.get_variable_collections(variables_collections,\n                                                        'biases')\n    biases = variables.model_variable(\n        'biases',\n        shape=[\n            num_features,\n        ],\n        dtype=dtype,\n        initializer=initializer,\n        regularizer=regularizer,\n        collections=biases_collections,\n        trainable=trainable)\n    outputs = nn.bias_add(inputs, biases, data_format=data_format)\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n# TODO(jbms): change `rate` parameter to `dilation_rate` for consistency with\n# underlying op.\n@add_arg_scope\ndef convolution(inputs,\n                num_outputs,\n                kernel_size,\n                stride=1,\n                padding='SAME',\n                data_format=None,\n                rate=1,\n                activation_fn=nn.relu,\n                normalizer_fn=None,\n                normalizer_params=None,\n                weights_initializer=initializers.xavier_initializer(),\n                weights_regularizer=None,\n                biases_initializer=init_ops.zeros_initializer(),\n                biases_regularizer=None,\n                reuse=None,\n                variables_collections=None,\n                outputs_collections=None,\n                trainable=True,\n                scope=None,\n                conv_dims=None):\n  """Adds an N-D convolution followed by an optional batch_norm layer.\n\n  It is required that 1 <= N <= 3.\n\n  `convolution` creates a variable called `weights`, representing the\n  convolutional kernel, that is convolved (actually cross-correlated) with the\n  `inputs` to produce a `Tensor` of activations. If a `normalizer_fn` is\n  provided (such as `batch_norm`), it is then applied. Otherwise, if\n  `normalizer_fn` is None and a `biases_initializer` is provided then a `biases`\n  variable would be created and added to the activations. 
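For example (a usage\n  sketch, where `images` stands for any 4-D `NHWC` input tensor):\n\n  ```python\n    # 64 filters of size 3x3, followed by the default ReLU activation.\n    net = convolution(images, 64, [3, 3], scope='conv1')\n  ```\n\n  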
Finally, if `activation_fn` is not `None`, it is applied to the activations\n  as well.\n\n  Performs atrous convolution with input stride/dilation rate equal to `rate`\n  if a value > 1 for any dimension of `rate` is specified.  In this case\n  `stride` values != 1 are not supported.\n\n  Args:\n    inputs: A Tensor of rank N+2 of shape `[batch_size] + input_spatial_shape +\n      [in_channels]` if data_format does not start with \"NC\" (default), or\n      `[batch_size, in_channels] + input_spatial_shape` if data_format starts\n      with \"NC\".\n    num_outputs: Integer, the number of output filters.\n    kernel_size: A sequence of N positive integers specifying the spatial\n      dimensions of the filters.  Can be a single integer to specify the same\n      value for all spatial dimensions.\n    stride: A sequence of N positive integers specifying the stride at which to\n      compute output.  Can be a single integer to specify the same value for all\n      spatial dimensions.  Specifying any `stride` value != 1 is incompatible\n      with specifying any `rate` value != 1.\n    padding: One of `\"VALID\"` or `\"SAME\"`.\n    data_format: A string or None.  Specifies whether the channel dimension of\n      the `input` and output is the last dimension (default, or if `data_format`\n      does not start with \"NC\"), or the second dimension (if `data_format`\n      starts with \"NC\").  For N=1, the valid values are \"NWC\" (default) and\n      \"NCW\".  For N=2, the valid values are \"NHWC\" (default) and \"NCHW\". For\n      N=3, the valid values are \"NDHWC\" (default) and \"NCDHW\".\n    rate: A sequence of N positive integers specifying the dilation rate to use\n      for atrous convolution.  Can be a single integer to specify the same value\n      for all spatial dimensions.  Specifying any `rate` value != 1 is\n      incompatible with specifying any `stride` value != 1.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      default set to None for no normalizer function\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the weights.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collection per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    scope: Optional scope for `variable_scope`.\n    conv_dims: Optional convolution dimensionality, when set, it uses the\n      corresponding convolution (e.g. 2 for Conv 2D, 3 for Conv 3D, ...). When\n      left as None, it selects the convolution dimensionality based on the\n      input rank (i.e. 
Conv ND, with N = input_rank - 2).\n\n  Returns:\n    A tensor representing the output of the operation.\n\n  Raises:\n    ValueError: If `data_format` is invalid.\n    ValueError: If both `rate` and `stride` are not uniformly 1.\n  """\n  if data_format not in [None, 'NWC', 'NCW', 'NHWC', 'NCHW', 'NDHWC', 'NCDHW']:\n    raise ValueError('Invalid data_format: %r' % (data_format,))\n\n  layer_variable_getter = _build_variable_getter({\n      'bias': 'biases',\n      'kernel': 'weights'\n  })\n\n  with variable_scope.variable_scope(\n      scope, 'Conv', [inputs], reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    input_rank = inputs.get_shape().ndims\n\n    if conv_dims is not None and conv_dims + 2 != input_rank:\n      raise ValueError('Convolution expects input with rank %d, got %d' %\n                       (conv_dims + 2, input_rank))\n    if input_rank == 3:\n      layer_class = convolutional_layers.Convolution1D\n    elif input_rank == 4:\n      layer_class = convolutional_layers.Convolution2D\n    elif input_rank == 5:\n      layer_class = convolutional_layers.Convolution3D\n    else:\n      raise ValueError('Convolution not supported for input with rank',\n                       input_rank)\n\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = layer_class(\n        filters=num_outputs,\n        kernel_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        dilation_rate=rate,\n        activation=None,\n        use_bias=not normalizer_fn and biases_initializer,\n        kernel_initializer=weights_initializer,\n        bias_initializer=biases_initializer,\n        kernel_regularizer=weights_regularizer,\n        bias_regularizer=biases_regularizer,\n        activity_regularizer=None,\n        trainable=trainable,\n        name=sc.name,\n        dtype=inputs.dtype.base_dtype,\n        _scope=sc,\n        _reuse=reuse)\n    outputs = layer.apply(inputs)\n\n    # Add variables to collections.\n    _add_variable_to_collections(layer.kernel, variables_collections, 'weights')\n    if layer.use_bias:\n      _add_variable_to_collections(layer.bias, variables_collections, 'biases')\n\n    if normalizer_fn is not None:\n      normalizer_params = normalizer_params or {}\n      outputs = normalizer_fn(outputs, **normalizer_params)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef convolution1d(inputs,\n                  num_outputs,\n                  kernel_size,\n                  stride=1,\n                  padding='SAME',\n                  data_format=None,\n                  rate=1,\n                  activation_fn=nn.relu,\n                  normalizer_fn=None,\n                  normalizer_params=None,\n                  weights_initializer=initializers.xavier_initializer(),\n                  weights_regularizer=None,\n                  biases_initializer=init_ops.zeros_initializer(),\n                  biases_regularizer=None,\n                  reuse=None,\n                  variables_collections=None,\n                  outputs_collections=None,\n                  trainable=True,\n                  scope=None):\n  return convolution(\n      inputs,\n      num_outputs,\n      kernel_size,\n      stride,\n      padding,\n      data_format,\n      rate,\n      activation_fn,\n      
normalizer_fn,\n      normalizer_params,\n      weights_initializer,\n      weights_regularizer,\n      biases_initializer,\n      biases_regularizer,\n      reuse,\n      variables_collections,\n      outputs_collections,\n      trainable,\n      scope,\n      conv_dims=1)\n\n\nconvolution1d.__doc__ = convolution.__doc__\n\n\n@add_arg_scope\ndef convolution2d(inputs,\n                  num_outputs,\n                  kernel_size,\n                  stride=1,\n                  padding='SAME',\n                  data_format=None,\n                  rate=1,\n                  activation_fn=nn.relu,\n                  normalizer_fn=None,\n                  normalizer_params=None,\n                  weights_initializer=initializers.xavier_initializer(),\n                  weights_regularizer=None,\n                  biases_initializer=init_ops.zeros_initializer(),\n                  biases_regularizer=None,\n                  reuse=None,\n                  variables_collections=None,\n                  outputs_collections=None,\n                  trainable=True,\n                  scope=None):\n  return convolution(\n      inputs,\n      num_outputs,\n      kernel_size,\n      stride,\n      padding,\n      data_format,\n      rate,\n      activation_fn,\n      normalizer_fn,\n      normalizer_params,\n      weights_initializer,\n      weights_regularizer,\n      biases_initializer,\n      biases_regularizer,\n      reuse,\n      variables_collections,\n      outputs_collections,\n      trainable,\n      scope,\n      conv_dims=2)\n\n\nconvolution2d.__doc__ = convolution.__doc__\n\n\n@add_arg_scope\ndef convolution3d(inputs,\n                  num_outputs,\n                  kernel_size,\n                  stride=1,\n                  padding='SAME',\n                  data_format=None,\n                  rate=1,\n                  activation_fn=nn.relu,\n                  normalizer_fn=None,\n                  normalizer_params=None,\n                  weights_initializer=initializers.xavier_initializer(),\n                  weights_regularizer=None,\n                  biases_initializer=init_ops.zeros_initializer(),\n                  biases_regularizer=None,\n                  reuse=None,\n                  variables_collections=None,\n                  outputs_collections=None,\n                  trainable=True,\n                  scope=None):\n  return convolution(\n      inputs,\n      num_outputs,\n      kernel_size,\n      stride,\n      padding,\n      data_format,\n      rate,\n      activation_fn,\n      normalizer_fn,\n      normalizer_params,\n      weights_initializer,\n      weights_regularizer,\n      biases_initializer,\n      biases_regularizer,\n      reuse,\n      variables_collections,\n      outputs_collections,\n      trainable,\n      scope,\n      conv_dims=3)\n\n\nconvolution3d.__doc__ = convolution.__doc__\n\n\n@add_arg_scope\ndef convolution2d_in_plane(\n    inputs,\n    kernel_size,\n    stride=1,\n    padding='SAME',\n    activation_fn=nn.relu,\n    normalizer_fn=None,\n    normalizer_params=None,\n    weights_initializer=initializers.xavier_initializer(),\n    weights_regularizer=None,\n    biases_initializer=init_ops.zeros_initializer(),\n    biases_regularizer=None,\n    reuse=None,\n    variables_collections=None,\n    outputs_collections=None,\n    trainable=True,\n    scope=None):\n  \"\"\"Performs the same in-plane convolution to each channel independently.\n\n  This is useful for performing various simple channel-independent convolution\n  operations such 
as image gradients:\n\n    image = tf.constant(..., shape=(16, 240, 320, 3))\n    vert_gradients = layers.conv2d_in_plane(image,\n                                            kernel_size=[2, 1])\n    horz_gradients = layers.conv2d_in_plane(image,\n                                            kernel_size=[1, 2])\n\n  Args:\n    inputs: A 4-D tensor with dimensions [batch_size, height, width, channels].\n    kernel_size: A list of length 2 holding the [kernel_height, kernel_width]\n      of the filters. Can be an int if both values are the same.\n    stride: A list of length 2 `[stride_height, stride_width]`. Can be an int if\n      both strides are the same. Note that presently both strides must have the\n      same value.\n    padding: The padding type to use, either 'SAME' or 'VALID'.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      default set to None for no normalizer function\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the weights.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collection per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    scope: Optional scope for `variable_scope`.\n\n  Returns:\n    A `Tensor` representing the output of the operation.\n  """\n  with variable_scope.variable_scope(\n      scope, 'ConvInPlane', [inputs], reuse=reuse) as sc:\n    dtype = inputs.dtype.base_dtype\n    kernel_h, kernel_w = utils.two_element_tuple(kernel_size)\n    stride_h, stride_w = utils.two_element_tuple(stride)\n    num_filters_in = utils.last_dimension(inputs.get_shape(), min_rank=4)\n    weights_shape = [kernel_h, kernel_w, 1, 1]\n    weights_collections = utils.get_variable_collections(\n        variables_collections, 'weights')\n    weights = variables.model_variable(\n        'weights',\n        shape=weights_shape,\n        dtype=dtype,\n        initializer=weights_initializer,\n        regularizer=weights_regularizer,\n        collections=weights_collections,\n        trainable=trainable)\n    depthwise_weights = array_ops.tile(weights, [1, 1, num_filters_in, 1])\n    outputs = nn.depthwise_conv2d(inputs, depthwise_weights,\n                                  [1, stride_h, stride_w, 1], padding)\n    if normalizer_fn is not None:\n      normalizer_params = normalizer_params or {}\n      outputs = normalizer_fn(outputs, **normalizer_params)\n    else:\n      if biases_initializer is not None:\n        biases_collections = utils.get_variable_collections(\n            variables_collections, 'biases')\n        
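# Without a normalizer_fn, a per-channel bias is created and added to\n        # the depthwise convolution output.\n        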
biases = variables.model_variable(\n            'biases',\n            shape=[\n                num_filters_in,\n            ],\n            dtype=dtype,\n            initializer=biases_initializer,\n            regularizer=biases_regularizer,\n            collections=biases_collections,\n            trainable=trainable)\n        outputs = nn.bias_add(outputs, biases)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef convolution2d_transpose(\n    inputs,\n    num_outputs,\n    kernel_size,\n    stride=1,\n    padding='SAME',\n    data_format=DATA_FORMAT_NHWC,\n    activation_fn=nn.relu,\n    normalizer_fn=None,\n    normalizer_params=None,\n    weights_initializer=initializers.xavier_initializer(),\n    weights_regularizer=None,\n    biases_initializer=init_ops.zeros_initializer(),\n    biases_regularizer=None,\n    reuse=None,\n    variables_collections=None,\n    outputs_collections=None,\n    trainable=True,\n    scope=None):\n  """Adds a convolution2d_transpose with an optional batch normalization layer.\n\n  The function creates a variable called `weights`, representing the\n  kernel, that is convolved with the input. If `normalizer_fn` is `None`, a\n  second variable called 'biases' is added to the result of the operation.\n\n  Args:\n    inputs: A 4-D `Tensor` of type `float` and shape `[batch, height, width,\n      in_channels]` for `NHWC` data format or `[batch, in_channels, height,\n      width]` for `NCHW` data format.\n    num_outputs: Integer, the number of output filters.\n    kernel_size: A list of length 2 holding the [kernel_height, kernel_width]\n      of the filters. Can be an int if both values are the same.\n    stride: A list of length 2: [stride_height, stride_width]. Can be an int if\n      both strides are the same.  Note that presently both strides must have the\n      same value.\n    padding: One of 'VALID' or 'SAME'.\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      default set to None for no normalizer function\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the weights.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n    reuse: Whether or not the layer and its variables should be reused. 
    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer, the scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collection per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: Whether or not the variables should be trainable.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A tensor representing the output of the operation.\n\n  Raises:\n    ValueError: If 'kernel_size' is not a list of length 2.\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n    ValueError: If `C` dimension of `inputs` is None.\n  \"\"\"\n  layer_variable_getter = _build_variable_getter({\n      'bias': 'biases',\n      'kernel': 'weights'\n  })\n\n  with variable_scope.variable_scope(\n      scope,\n      'Conv2d_transpose', [inputs],\n      reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n      raise ValueError('data_format has to be either NCHW or NHWC.')\n\n    inputs = ops.convert_to_tensor(inputs)\n\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = convolutional_layers.Convolution2DTranspose(\n        filters=num_outputs,\n        kernel_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        activation=None,\n        use_bias=not normalizer_fn and biases_initializer,\n        kernel_initializer=weights_initializer,\n        bias_initializer=biases_initializer,\n        kernel_regularizer=weights_regularizer,\n        bias_regularizer=biases_regularizer,\n        activity_regularizer=None,\n        trainable=trainable,\n        name=sc.name,\n        dtype=inputs.dtype.base_dtype,\n        _scope=sc,\n        _reuse=reuse)\n    outputs = layer.apply(inputs)\n\n    # Add variables to collections.\n    _add_variable_to_collections(layer.kernel, variables_collections, 'weights')\n    if layer.bias is not None:\n      _add_variable_to_collections(layer.bias, variables_collections, 'biases')\n\n    if normalizer_fn is not None:\n      normalizer_params = normalizer_params or {}\n      outputs = normalizer_fn(outputs, **normalizer_params)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef convolution3d_transpose(\n    inputs,\n    num_outputs,\n    kernel_size,\n    stride=1,\n    padding='SAME',\n    data_format=DATA_FORMAT_NDHWC,\n    activation_fn=nn.relu,\n    normalizer_fn=None,\n    normalizer_params=None,\n    weights_initializer=initializers.xavier_initializer(),\n    weights_regularizer=None,\n    biases_initializer=init_ops.zeros_initializer(),\n    biases_regularizer=None,\n    reuse=None,\n    variables_collections=None,\n    outputs_collections=None,\n    trainable=True,\n    scope=None):\n  \"\"\"Adds a convolution3d_transpose with an optional batch normalization layer.\n\n  The function creates a variable called `weights`, representing the\n
  kernel, that is convolved with the input. If `normalizer_fn` is `None`, a\n  second variable called 'biases' is added to the result of the operation.\n\n  Args:\n    inputs: A 5-D `Tensor` of type `float` and shape `[batch, depth, height,\n      width, in_channels]` for `NDHWC` data format or `[batch, in_channels,\n      depth, height, width]` for `NCDHW` data format.\n    num_outputs: Integer, the number of output filters.\n    kernel_size: A list of length 3 holding the [kernel_depth, kernel_height,\n      kernel_width] of the filters. Can be an int if all values are the same.\n    stride: A list of length 3: [stride_depth, stride_height, stride_width]. Can\n      be an int if all strides are the same.  Note that presently all strides\n      must have the same value.\n    padding: One of 'VALID' or 'SAME'.\n    data_format: A string. `NDHWC` (default) and `NCDHW` are supported.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      Default is None for no normalizer function.\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the weights.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer, the scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collection per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: Whether or not the variables should be trainable.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A tensor representing the output of the operation.\n\n  Raises:\n    ValueError: If 'kernel_size' is not a list of length 3.\n    ValueError: If `data_format` is neither `NDHWC` nor `NCDHW`.\n    ValueError: If `C` dimension of `inputs` is None.\n  \"\"\"\n  layer_variable_getter = _build_variable_getter({\n      'bias': 'biases',\n      'kernel': 'weights'\n  })\n\n  with variable_scope.variable_scope(\n      scope,\n      'Conv3d_transpose', [inputs],\n      reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    if data_format not in (DATA_FORMAT_NCDHW, DATA_FORMAT_NDHWC):\n      raise ValueError('data_format has to be either NCDHW or NDHWC.')\n\n    inputs = ops.convert_to_tensor(inputs)\n\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = convolutional_layers.Convolution3DTranspose(\n        filters=num_outputs,\n        kernel_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        activation=None,\n        use_bias=not normalizer_fn and biases_initializer,\n        kernel_initializer=weights_initializer,\n        bias_initializer=biases_initializer,\n        kernel_regularizer=weights_regularizer,\n        bias_regularizer=biases_regularizer,\n        activity_regularizer=None,\n        trainable=trainable,\n        name=sc.name,\n        dtype=inputs.dtype.base_dtype,\n        _scope=sc,\n
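        # _scope and _reuse are private constructor arguments of the core\n        # tf.layers classes; the contrib wrappers pass them so the layer\n        # object binds to this variable_scope.\n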
        _reuse=reuse)\n    outputs = layer.apply(inputs)\n\n    # Add variables to collections.\n    _add_variable_to_collections(layer.kernel, variables_collections, 'weights')\n    if layer.bias is not None:\n      _add_variable_to_collections(layer.bias, variables_collections, 'biases')\n\n    if normalizer_fn is not None:\n      normalizer_params = normalizer_params or {}\n      outputs = normalizer_fn(outputs, **normalizer_params)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef dense_to_sparse(tensor, eos_token=0, outputs_collections=None, scope=None):\n  \"\"\"Converts a dense tensor into a sparse tensor.\n\n  An example use would be to convert dense labels to sparse ones\n  so that they can be fed to the ctc_loss.\n\n  Args:\n     tensor: An `int` `Tensor` to be converted to a `Sparse`.\n     eos_token: An integer. It is part of the target label that signifies the\n       end of a sentence.\n     outputs_collections: Collection to add the outputs.\n     scope: Optional scope for name_scope.\n\n  Returns:\n     A `SparseTensor` holding the entries of `tensor` that are not equal to\n       `eos_token`.\n  \"\"\"\n  with variable_scope.variable_scope(scope, 'dense_to_sparse', [tensor]) as sc:\n    tensor = ops.convert_to_tensor(tensor)\n    indices = array_ops.where(\n        math_ops.not_equal(tensor, constant_op.constant(eos_token,\n                                                        tensor.dtype)))\n    values = array_ops.gather_nd(tensor, indices)\n    shape = array_ops.shape(tensor, out_type=dtypes.int64)\n    outputs = sparse_tensor.SparseTensor(indices, values, shape)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef dropout(inputs,\n            keep_prob=0.5,\n            noise_shape=None,\n            is_training=True,\n            outputs_collections=None,\n            scope=None,\n            seed=None):\n  \"\"\"Returns a dropout op applied to the input.\n\n  With probability `keep_prob`, outputs the input element scaled up by\n  `1 / keep_prob`, otherwise outputs `0`.  The scaling is so that the expected\n  sum is unchanged.\n\n  Args:\n    inputs: The tensor to pass to the nn.dropout op.\n    keep_prob: A scalar `Tensor` with the same type as `inputs`. The probability\n      that each element is kept.\n    noise_shape: A 1-D `Tensor` of type `int32`, representing the shape for\n      randomly generated keep/drop flags.\n    is_training: A bool `Tensor` indicating whether or not the model is in\n      training mode. If so, dropout is applied and values scaled. Otherwise,\n      inputs is returned.\n    outputs_collections: Collection to add the outputs.\n    scope: Optional scope for name_scope.\n    seed: A Python integer. Used to create random seeds. See\n      `tf.compat.v1.set_random_seed` for behavior.\n\n  Returns:\n    A tensor representing the output of the operation.\n  \"\"\"\n  with variable_scope.variable_scope(\n      scope, 'Dropout', [inputs], custom_getter=_model_variable_getter) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    layer = core_layers.Dropout(\n        rate=1 - keep_prob,\n        noise_shape=noise_shape,\n        seed=seed,\n        name=sc.name,\n        _scope=sc)\n    outputs = layer.apply(inputs, training=is_training)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n
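# Illustrative sketch (not part of the original library): inverted dropout\n# keeps the expected activation sum constant, e.g. with keep_prob=0.5 the\n# surviving units are scaled by 2:\n#\n#   net = dropout(net, keep_prob=0.5, is_training=is_training)\n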
@add_arg_scope\ndef flatten(inputs, outputs_collections=None, scope=None):\n  \"\"\"Flattens the input while maintaining the batch_size.\n\n    Assumes that the first dimension represents the batch.\n\n  Args:\n    inputs: A tensor of size [batch_size, ...].\n    outputs_collections: Collection to add the outputs.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A flattened tensor with shape [batch_size, k].\n\n  Raises:\n    ValueError: If inputs rank is unknown or less than 2.\n  \"\"\"\n  with ops.name_scope(scope, 'Flatten', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    outputs = core_layers.flatten(inputs)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\ndef _sparse_inner_flatten(inputs, new_rank):\n  \"\"\"Helper function for `inner_flatten`.\"\"\"\n  inputs_rank = inputs.dense_shape.get_shape().as_list()[0]\n  if inputs_rank < new_rank:\n    raise ValueError(\n        'Inputs has rank less than new_rank. {} must have rank at least'\n        ' {}. Received rank {}, shape {}'.format(inputs, new_rank, inputs_rank,\n                                                 inputs.get_shape()))\n\n  outer_dimensions = inputs.dense_shape[:new_rank - 1]\n  inner_dimensions = inputs.dense_shape[new_rank - 1:]\n  new_shape = array_ops.concat(\n      (outer_dimensions, [math_ops.reduce_prod(inner_dimensions)]), 0)\n  flattened = sparse_ops.sparse_reshape(inputs, new_shape)\n  return flattened\n\n\ndef _dense_inner_flatten(inputs, new_rank):\n  \"\"\"Helper function for `inner_flatten`.\"\"\"\n  rank_assertion = check_ops.assert_rank_at_least(\n      inputs, new_rank, message='inputs has rank less than new_rank')\n  with ops.control_dependencies([rank_assertion]):\n    outer_dimensions = array_ops.strided_slice(\n        array_ops.shape(inputs), [0], [new_rank - 1])\n    new_shape = array_ops.concat((outer_dimensions, [-1]), 0)\n    reshaped = array_ops.reshape(inputs, new_shape)\n\n  # If `new_rank` is an integer, try to calculate the new static shape.\n  if isinstance(new_rank, six.integer_types):\n    static_shape = inputs.get_shape()\n    if static_shape is not None and static_shape.dims is not None:\n      static_shape = static_shape.as_list()\n      static_outer_dims = static_shape[:new_rank - 1]\n      static_inner_dims = static_shape[new_rank - 1:]\n      flattened_dimension = 1\n      for inner_dim in static_inner_dims:\n        if inner_dim is None:\n          flattened_dimension = None\n          break\n        flattened_dimension *= inner_dim\n      reshaped.set_shape(static_outer_dims + [flattened_dimension])\n  return reshaped\n\n\n@add_arg_scope\ndef _inner_flatten(inputs, new_rank, output_collections=None, scope=None):\n  \"\"\"Flattens inner dimensions of `inputs`, returns a Tensor with `new_rank`.\n\n  For example:\n  '''\n      x = tf.random.uniform(shape=[1, 2, 3, 4, 5, 6])\n      y = _inner_flatten(x, 4)\n      assert y.get_shape().as_list() == 
[1, 2, 3, (4 * 5 * 6)]\n  '''\n  This layer will fail at run time if `new_rank` is greater than the current\n  rank of `inputs`.\n\n  Args:\n    inputs: A `Tensor` or `SparseTensor`.\n    new_rank: The desired rank of the returned `Tensor` or `SparseTensor`.\n    output_collections: Collection to which the outputs will be added.\n    scope: Optional scope for `name_scope`.\n\n  Returns:\n    A `Tensor` or `SparseTensor` containing the same values as `inputs`, but\n    with innermost dimensions flattened to obtain rank `new_rank`.\n\n  Raises:\n    TypeError: `inputs` is not a `Tensor` or `SparseTensor`.\n  \"\"\"\n  with ops.name_scope(scope, 'InnerFlatten', [inputs, new_rank]) as sc:\n    if isinstance(inputs, sparse_tensor.SparseTensor):\n      flattened = _sparse_inner_flatten(inputs, new_rank)\n    else:\n      inputs = ops.convert_to_tensor(inputs)\n      flattened = _dense_inner_flatten(inputs, new_rank)\n  return utils.collect_named_outputs(output_collections, sc, flattened)\n\n\ndef _model_variable_getter(\n    getter,\n    name,\n    shape=None,\n    dtype=None,\n    initializer=None,\n    regularizer=None,\n    trainable=True,\n    collections=None,\n    caching_device=None,\n    partitioner=None,\n    rename=None,\n    use_resource=None,\n    synchronization=tf_variables.VariableSynchronization.AUTO,\n    aggregation=tf_variables.VariableAggregation.NONE,\n    **_):\n  \"\"\"Getter that uses model_variable for compatibility with core layers.\"\"\"\n  short_name = name.split('/')[-1]\n  if rename and short_name in rename:\n    name_components = name.split('/')\n    name_components[-1] = rename[short_name]\n    name = '/'.join(name_components)\n  return variables.model_variable(\n      name,\n      shape=shape,\n      dtype=dtype,\n      initializer=initializer,\n      regularizer=regularizer,\n      collections=collections,\n      trainable=trainable,\n      caching_device=caching_device,\n      partitioner=partitioner,\n      custom_getter=getter,\n      use_resource=use_resource,\n      synchronization=synchronization,\n      aggregation=aggregation)\n\n\ndef _build_variable_getter(rename=None):\n  \"\"\"Build a model variable getter that respects scope getter and renames.\"\"\"\n\n  # VariableScope will nest the getters\n  def layer_variable_getter(getter, *args, **kwargs):\n    kwargs['rename'] = rename\n    return _model_variable_getter(getter, *args, **kwargs)\n\n  return layer_variable_getter\n\n\ndef _add_variable_to_collections(variable, collections_set, collections_name):\n  \"\"\"Adds variable (or all its parts) to all collections with that name.\"\"\"\n  collections = utils.get_variable_collections(collections_set,\n                                               collections_name) or []\n  variables_list = [variable]\n  if isinstance(variable, tf_variables.PartitionedVariable):\n    variables_list = [v for v in variable]\n  for collection in collections:\n    for var in variables_list:\n      if var not in ops.get_collection(collection):\n        ops.add_to_collection(collection, var)\n\n\n@add_arg_scope\ndef fully_connected(inputs,\n                    num_outputs,\n                    activation_fn=nn.relu,\n                    normalizer_fn=None,\n                    normalizer_params=None,\n                    weights_initializer=initializers.xavier_initializer(),\n                    weights_regularizer=None,\n                    biases_initializer=init_ops.zeros_initializer(),\n                    biases_regularizer=None,\n                    reuse=None,\n        
            variables_collections=None,\n                    outputs_collections=None,\n                    trainable=True,\n                    scope=None):\n  \"\"\"Adds a fully connected layer.\n\n  `fully_connected` creates a variable called `weights`, representing a fully\n  connected weight matrix, which is multiplied by the `inputs` to produce a\n  `Tensor` of hidden units. If a `normalizer_fn` is provided (such as\n  `batch_norm`), it is then applied. Otherwise, if `normalizer_fn` is\n  None and a `biases_initializer` is provided then a `biases` variable would be\n  created and added to the hidden units. Finally, if `activation_fn` is not\n  `None`, it is applied to the hidden units as well.\n\n  Note: if `inputs` has rank greater than 2, then `inputs` is flattened\n  prior to the initial matrix multiply by `weights`.\n\n  Args:\n    inputs: A tensor of at least rank 2 and static value for the last dimension;\n      i.e. `[batch_size, depth]`, `[None, None, None, channels]`.\n    num_outputs: Integer or long, the number of output units in the layer.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      Default is None for no normalizer function.\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the weights.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer, the scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collections per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    scope: Optional scope for variable_scope.\n\n  Returns:\n     The tensor variable representing the result of the series of operations.\n\n  Raises:\n    ValueError: If `inputs` has rank less than 2 or if its last dimension is\n      not set.\n  \"\"\"\n  if not isinstance(num_outputs, six.integer_types):\n    raise ValueError('num_outputs type should be one of %s, got %s.' 
%\n                     (list(six.integer_types), type(num_outputs)))\n\n  layer_variable_getter = _build_variable_getter({\n      'bias': 'biases',\n      'kernel': 'weights'\n  })\n\n  with variable_scope.variable_scope(\n      scope,\n      'fully_connected', [inputs],\n      reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    layer = core_layers.Dense(\n        units=num_outputs,\n        activation=None,\n        use_bias=not normalizer_fn and biases_initializer,\n        kernel_initializer=weights_initializer,\n        bias_initializer=biases_initializer,\n        kernel_regularizer=weights_regularizer,\n        bias_regularizer=biases_regularizer,\n        activity_regularizer=None,\n        trainable=trainable,\n        name=sc.name,\n        dtype=inputs.dtype.base_dtype,\n        _scope=sc,\n        _reuse=reuse)\n    outputs = layer.apply(inputs)\n\n    # Add variables to collections.\n    _add_variable_to_collections(layer.kernel, variables_collections, 'weights')\n    if layer.bias is not None:\n      _add_variable_to_collections(layer.bias, variables_collections, 'biases')\n\n    # Apply normalizer function / layer.\n    if normalizer_fn is not None:\n      if not normalizer_params:\n        normalizer_params = {}\n      outputs = normalizer_fn(outputs, **normalizer_params)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n# class GDN(base.Layer):\n#   \"\"\"Generalized divisive normalization layer.\n\n#   Based on the papers:\n\n#     \"Density Modeling of Images using a Generalized Normalization\n#     Transformation\"\n\n#     Johannes Ballé, Valero Laparra, Eero P. Simoncelli\n\n#     https://arxiv.org/abs/1511.06281\n\n#     \"End-to-end Optimized Image Compression\"\n\n#     Johannes Ballé, Valero Laparra, Eero P. Simoncelli\n\n#     https://arxiv.org/abs/1611.01704\n\n#   Implements an activation function that is essentially a multivariate\n#   generalization of a particular sigmoid-type function:\n\n#   ```\n#   y[i] = x[i] / sqrt(beta[i] + sum_j(gamma[j, i] * x[j]))\n#   ```\n\n#   where `i` and `j` run over channels. This implementation never sums across\n#   spatial dimensions. It is similar to local response normalization, but much\n#   more flexible, as `beta` and `gamma` are trainable parameters.\n\n#   Arguments:\n#     inverse: If `False` (default), compute GDN response. If `True`, compute IGDN\n#       response (one step of fixed point iteration to invert GDN; the division is\n#       replaced by multiplication).\n#     beta_min: Lower bound for beta, to prevent numerical error from causing\n#       square root of zero or negative values.\n#     gamma_init: The gamma matrix will be initialized as the identity matrix\n#       multiplied with this value. If set to zero, the layer is effectively\n#       initialized to the identity operation, since beta is initialized as one. A\n#       good default setting is somewhere between 0 and 0.5.\n#     reparam_offset: Offset added to the reparameterization of beta and gamma.\n#       The reparameterization of beta and gamma as their square roots lets the\n#       training slow down when their values are close to zero, which is desirable\n#       as small values in the denominator can lead to a situation where gradient\n#       noise on beta/gamma leads to extreme amounts of noise in the GDN\n#       activations. 
However, without the offset, we would get zero gradients if\n#       any elements of beta or gamma were exactly zero, and thus the training\n#       could get stuck. To prevent this, we add this small constant. The default\n#       value was empirically determined as a good starting point. Making it\n#       bigger potentially leads to more gradient noise on the activations, making\n#       it too small may lead to numerical precision issues.\n#     data_format: Format of input tensor. Currently supports `'channels_first'`\n#       and `'channels_last'`.\n#     activity_regularizer: Regularizer function for the output.\n#     trainable: Boolean, if `True`, also add variables to the graph collection\n#       `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n#     name: String, the name of the layer. Layers with the same name will share\n#       weights, but to avoid mistakes we require `reuse=True` in such cases.\n#   Properties:\n#     inverse: Boolean, whether GDN is computed (`True`) or IGDN (`False`).\n#     data_format: Format of input tensor. Currently supports `'channels_first'`\n#       and `'channels_last'`.\n#     beta: The beta parameter as defined above (1D `Tensor`).\n#     gamma: The gamma parameter as defined above (2D `Tensor`).\n#   \"\"\"\n\n#   def __init__(self,\n#                inverse=False,\n#                beta_min=1e-6,\n#                gamma_init=.1,\n#                reparam_offset=2**-18,\n#                data_format='channels_last',\n#                activity_regularizer=None,\n#                trainable=True,\n#                name=None,\n#                **kwargs):\n#     super(GDN, self).__init__(\n#         trainable=trainable,\n#         name=name,\n#         activity_regularizer=activity_regularizer,\n#         **kwargs)\n#     self.inverse = inverse\n#     self._beta_min = beta_min\n#     self._gamma_init = gamma_init\n#     self._reparam_offset = reparam_offset\n#     self.data_format = data_format\n#     self._channel_axis()  # trigger ValueError early\n#     self.input_spec = input_spec.InputSpec(min_ndim=3, max_ndim=5)\n\n#   def _channel_axis(self):\n#     try:\n#       return {'channels_first': 1, 'channels_last': -1}[self.data_format]\n#     except KeyError:\n#       raise ValueError('Unsupported `data_format` for GDN layer: {}.'.format(\n#           self.data_format))\n\n#   @staticmethod\n#   def _lower_bound(inputs, bound, name=None):\n#     \"\"\"Same as tf.maximum, but with helpful gradient for inputs < bound.\n\n#     The gradient is overwritten so that it is passed through if the input is not\n#     hitting the bound. If it is, only gradients that push `inputs` higher than\n#     the bound are passed through. 
No gradients are passed through to the bound.\n\n#     Args:\n#       inputs: input tensor\n#       bound: lower bound for the input tensor\n#       name: name for this op\n\n#     Returns:\n#       tf.maximum(inputs, bound)\n#     \"\"\"\n#     with ops.name_scope(name, 'GDNLowerBound', [inputs, bound]) as scope:\n#       inputs = ops.convert_to_tensor(inputs, name='inputs')\n#       bound = ops.convert_to_tensor(bound, name='bound')\n#       with ops.get_default_graph().gradient_override_map(\n#           {'Maximum': 'GDNLowerBound'}):\n#         return math_ops.maximum(inputs, bound, name=scope)\n\n#   @staticmethod\n#   def _lower_bound_grad(op, grad):\n#     \"\"\"Gradient for `_lower_bound`.\n\n#     Args:\n#       op: the tensorflow op for which to calculate a gradient\n#       grad: gradient with respect to the output of the op\n\n#     Returns:\n#       gradients with respect to the inputs of the op\n#     \"\"\"\n#     inputs = op.inputs[0]\n#     bound = op.inputs[1]\n#     pass_through_if = math_ops.logical_or(inputs >= bound, grad < 0)\n#     return [math_ops.cast(pass_through_if, grad.dtype) * grad, None]\n\n#   def build(self, input_shape):\n#     channel_axis = self._channel_axis()\n#     input_shape = tensor_shape.TensorShape(input_shape)\n#     num_channels = input_shape.dims[channel_axis].value\n#     if num_channels is None:\n#       raise ValueError('The channel dimension of the inputs to `GDN` '\n#                        'must be defined.')\n#     self._input_rank = input_shape.ndims\n#     self.input_spec = input_spec.InputSpec(\n#         ndim=input_shape.ndims, axes={channel_axis: num_channels})\n\n#     pedestal = array_ops.constant(self._reparam_offset**2, dtype=self.dtype)\n#     beta_bound = array_ops.constant(\n#         (self._beta_min + self._reparam_offset**2)**.5, dtype=self.dtype)\n#     gamma_bound = array_ops.constant(self._reparam_offset, dtype=self.dtype)\n\n#     def beta_initializer(shape, dtype=None, partition_info=None):\n#       del partition_info  # unused\n#       pedestal = array_ops.constant(self._reparam_offset**2, dtype=self.dtype)\n#       return math_ops.sqrt(array_ops.ones(shape, dtype=dtype) + pedestal)\n\n#     def gamma_initializer(shape, dtype=None, partition_info=None):\n#       del partition_info  # unused\n#       assert len(shape) == 2\n#       assert shape[0] == shape[1]\n#       eye = linalg_ops.eye(shape[0], dtype=dtype)\n#       pedestal = array_ops.constant(self._reparam_offset**2, dtype=self.dtype)\n#       return math_ops.sqrt(self._gamma_init * eye + pedestal)\n\n#     beta = self.add_variable(\n#         'reparam_beta',\n#         shape=[num_channels],\n#         initializer=beta_initializer,\n#         dtype=self.dtype,\n#         trainable=True)\n#     beta = self._lower_bound(beta, beta_bound)\n#     self.beta = math_ops.square(beta) - pedestal\n\n#     gamma = self.add_variable(\n#         'reparam_gamma',\n#         shape=[num_channels, num_channels],\n#         initializer=gamma_initializer,\n#         dtype=self.dtype,\n#         trainable=True)\n#     gamma = self._lower_bound(gamma, gamma_bound)\n#     self.gamma = math_ops.square(gamma) - pedestal\n\n#     self.built = True\n\n#   def call(self, inputs):\n#     inputs = ops.convert_to_tensor(inputs, dtype=self.dtype)\n#     ndim = self._input_rank\n\n#     shape = self.gamma.get_shape().as_list()\n#     gamma = array_ops.reshape(self.gamma, (ndim - 2) * [1] + shape)\n\n#     # Compute normalization pool.\n#     if self.data_format == 'channels_first':\n#       
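# NCHW path: convolve the squared inputs with gamma over the\n#       # spatial dims, then add beta via bias_add (with reshapes to\n#       # handle the 3-D and 5-D cases).\n#       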
norm_pool = nn.convolution(\n#           math_ops.square(inputs),\n#           gamma,\n#           'VALID',\n#           data_format='NC' + 'DHW' [-(ndim - 2):])\n#       if ndim == 3:\n#         norm_pool = array_ops.expand_dims(norm_pool, 2)\n#         norm_pool = nn.bias_add(norm_pool, self.beta, data_format='NCHW')\n#         norm_pool = array_ops.squeeze(norm_pool, [2])\n#       elif ndim == 5:\n#         shape = array_ops.shape(norm_pool)\n#         norm_pool = array_ops.reshape(norm_pool, shape[:3] + [-1])\n#         norm_pool = nn.bias_add(norm_pool, self.beta, data_format='NCHW')\n#         norm_pool = array_ops.reshape(norm_pool, shape)\n#       else:  # ndim == 4\n#         norm_pool = nn.bias_add(norm_pool, self.beta, data_format='NCHW')\n#     else:  # channels_last\n#       norm_pool = nn.convolution(math_ops.square(inputs), gamma, 'VALID')\n#       norm_pool = nn.bias_add(norm_pool, self.beta, data_format='NHWC')\n#     norm_pool = math_ops.sqrt(norm_pool)\n\n#     if self.inverse:\n#       outputs = inputs * norm_pool\n#     else:\n#       outputs = inputs / norm_pool\n#     outputs.set_shape(inputs.get_shape())\n#     return outputs\n\n#   def compute_output_shape(self, input_shape):\n#     channel_axis = self._channel_axis()\n#     input_shape = tensor_shape.TensorShape(input_shape)\n#     if not 3 <= input_shape.ndim <= 5:\n#       raise ValueError('`input_shape` must be of rank 3 to 5, inclusive.')\n#     if input_shape.dims[channel_axis].value is None:\n#       raise ValueError(\n#           'The channel dimension of `input_shape` must be defined.')\n#     return input_shape\n\n\n# ops.RegisterGradient('GDNLowerBound')(GDN._lower_bound_grad)  # pylint:disable=protected-access\n\n\n# def gdn(inputs,\n#         inverse=False,\n#         beta_min=1e-6,\n#         gamma_init=.1,\n#         reparam_offset=2**-18,\n#         data_format='channels_last',\n#         activity_regularizer=None,\n#         trainable=True,\n#         name=None,\n#         reuse=None):\n#   \"\"\"Functional interface for GDN layer.\n\n#   Based on the papers:\n\n#     \"Density Modeling of Images using a Generalized Normalization\n#     Transformation\"\n#     Johannes Ballé, Valero Laparra, Eero P. Simoncelli\n#     https://arxiv.org/abs/1511.06281\n\n#     \"End-to-end Optimized Image Compression\"\n#     Johannes Ballé, Valero Laparra, Eero P. Simoncelli\n#     https://arxiv.org/abs/1611.01704\n\n#   Implements an activation function that is essentially a multivariate\n#   generalization of a particular sigmoid-type function:\n\n#   ```\n#   y[i] = x[i] / sqrt(beta[i] + sum_j(gamma[j, i] * x[j]))\n#   ```\n\n#   where `i` and `j` run over channels. This implementation never sums across\n#   spatial dimensions. It is similar to local response normalization, but much\n#   more flexible, as `beta` and `gamma` are trainable parameters.\n\n#   Args:\n#     inputs: Tensor input.\n#     inverse: If `False` (default), compute GDN response. If `True`, compute IGDN\n#       response (one step of fixed point iteration to invert GDN; the division is\n#       replaced by multiplication).\n#     beta_min: Lower bound for beta, to prevent numerical error from causing\n#       square root of zero or negative values.\n#     gamma_init: The gamma matrix will be initialized as the identity matrix\n#       multiplied with this value. If set to zero, the layer is effectively\n#       initialized to the identity operation, since beta is initialized as one. 
A\n#       good default setting is somewhere between 0 and 0.5.\n#     reparam_offset: Offset added to the reparameterization of beta and gamma.\n#       The reparameterization of beta and gamma as their square roots lets the\n#       training slow down when their values are close to zero, which is desirable\n#       as small values in the denominator can lead to a situation where gradient\n#       noise on beta/gamma leads to extreme amounts of noise in the GDN\n#       activations. However, without the offset, we would get zero gradients if\n#       any elements of beta or gamma were exactly zero, and thus the training\n#       could get stuck. To prevent this, we add this small constant. The default\n#       value was empirically determined as a good starting point. Making it\n#       bigger potentially leads to more gradient noise on the activations, making\n#       it too small may lead to numerical precision issues.\n#     data_format: Format of input tensor. Currently supports `'channels_first'`\n#       and `'channels_last'`.\n#     activity_regularizer: Regularizer function for the output.\n#     trainable: Boolean, if `True`, also add variables to the graph collection\n#       `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n#     name: String, the name of the layer. Layers with the same name will share\n#       weights, but to avoid mistakes we require `reuse=True` in such cases.\n#     reuse: Boolean, whether to reuse the weights of a previous layer by the same\n#       name.\n\n#   Returns:\n#     Output tensor.\n#   \"\"\"\n#   layer = GDN(\n#       inverse=inverse,\n#       beta_min=beta_min,\n#       gamma_init=gamma_init,\n#       reparam_offset=reparam_offset,\n#       data_format=data_format,\n#       activity_regularizer=activity_regularizer,\n#       trainable=trainable,\n#       name=name,\n#       dtype=inputs.dtype.base_dtype,\n#       _scope=name,\n#       _reuse=reuse)\n#   return layer.apply(inputs)\n\n\n@add_arg_scope\ndef layer_norm(inputs,\n               center=True,\n               scale=True,\n               activation_fn=None,\n               reuse=None,\n               variables_collections=None,\n               outputs_collections=None,\n               trainable=True,\n               begin_norm_axis=1,\n               begin_params_axis=-1,\n               scope=None):\n  \"\"\"Adds a Layer Normalization layer.\n\n  Based on the paper:\n\n    \"Layer Normalization\"\n\n    Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton\n\n    https://arxiv.org/abs/1607.06450.\n\n  Can be used as a normalizer function for conv2d and fully_connected.\n\n  Given a tensor `inputs` of rank `R`, moments are calculated and normalization\n  is performed over axes `begin_norm_axis ... R - 1`.  Scaling and centering,\n  if requested, is performed over axes `begin_params_axis .. R - 1`.\n\n  By default, `begin_norm_axis = 1` and `begin_params_axis = -1`,\n  meaning that normalization is performed over all but the first axis\n  (the `HWC` if `inputs` is `NHWC`), while the `beta` and `gamma` trainable\n  parameters are calculated for the rightmost axis (the `C` if `inputs` is\n  `NHWC`).  Scaling and recentering is performed via broadcast of the\n  `beta` and `gamma` parameters with the normalized tensor.\n\n  The shapes of `beta` and `gamma` are `inputs.shape[begin_params_axis:]`,\n  and this part of the inputs' shape must be fully defined.\n\n  Args:\n    inputs: A tensor having rank `R`. The normalization is performed over axes\n      `begin_norm_axis ... 
R - 1` and centering and scaling parameters are\n      calculated over `begin_params_axis ... R - 1`.\n    center: If True, add offset of `beta` to normalized tensor. If False, `beta`\n      is ignored.\n    scale: If True, multiply by `gamma`. If False, `gamma` is not used. When the\n      next layer is linear (also e.g. `nn.relu`), this can be disabled since the\n      scaling can be done by the next layer.\n    activation_fn: Activation function, default set to None to skip it and\n      maintain a linear activation.\n    reuse: Whether or not the layer and its variables should be reused. To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional collections for the variables.\n    outputs_collections: Collections to add the outputs.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    begin_norm_axis: The first normalization dimension: normalization will be\n      performed along dimensions `begin_norm_axis : rank(inputs)`\n    begin_params_axis: The first parameter (beta, gamma) dimension: scale and\n      centering parameters will have dimensions\n      `begin_params_axis : rank(inputs)` and will be broadcast with the\n        normalized inputs accordingly.\n    scope: Optional scope for `variable_scope`.\n\n  Returns:\n    A `Tensor` representing the output of the operation, having the same\n    shape and dtype as `inputs`.\n\n  Raises:\n    ValueError: If the rank of `inputs` is not known at graph build time,\n      or if `inputs.shape[begin_params_axis:]` is not fully defined at\n      graph build time.\n  \"\"\"\n  with variable_scope.variable_scope(\n      scope, 'LayerNorm', [inputs], reuse=reuse) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    inputs_shape = inputs.shape\n    inputs_rank = inputs_shape.ndims\n    if inputs_rank is None:\n      raise ValueError('Inputs %s has undefined rank.' 
% inputs.name)\n    dtype = inputs.dtype.base_dtype\n    if begin_norm_axis < 0:\n      begin_norm_axis = inputs_rank + begin_norm_axis\n    if begin_params_axis >= inputs_rank or begin_norm_axis >= inputs_rank:\n      raise ValueError('begin_params_axis (%d) and begin_norm_axis (%d) '\n                       'must be < rank(inputs) (%d)' %\n                       (begin_params_axis, begin_norm_axis, inputs_rank))\n    params_shape = inputs_shape[begin_params_axis:]\n    if not params_shape.is_fully_defined():\n      raise ValueError(\n          'Inputs %s: shape(inputs)[%s:] is not fully defined: %s' %\n          (inputs.name, begin_params_axis, inputs_shape))\n    # Allocate parameters for the beta and gamma of the normalization.\n    beta, gamma = None, None\n    if center:\n      beta_collections = utils.get_variable_collections(variables_collections,\n                                                        'beta')\n      beta = variables.model_variable(\n          'beta',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=init_ops.zeros_initializer(),\n          collections=beta_collections,\n          trainable=trainable)\n    if scale:\n      gamma_collections = utils.get_variable_collections(\n          variables_collections, 'gamma')\n      gamma = variables.model_variable(\n          'gamma',\n          shape=params_shape,\n          dtype=dtype,\n          initializer=init_ops.ones_initializer(),\n          collections=gamma_collections,\n          trainable=trainable)\n    # By default, compute the moments across all the dimensions except the one with index 0.\n    norm_axes = list(range(begin_norm_axis, inputs_rank))\n    mean, variance = nn.moments(inputs, norm_axes, keep_dims=True)\n    # Compute layer normalization using the batch_normalization function.\n    # Note that epsilon must be increased for float16 due to the limited\n    # representable range.\n    variance_epsilon = 1e-12 if dtype != dtypes.float16 else 1e-3\n    outputs = nn.batch_normalization(\n        inputs,\n        mean,\n        variance,\n        offset=beta,\n        scale=gamma,\n        variance_epsilon=variance_epsilon)\n    outputs.set_shape(inputs_shape)\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef images_to_sequence(inputs,\n                       data_format=DATA_FORMAT_NHWC,\n                       outputs_collections=None,\n                       scope=None):\n  \"\"\"Convert a batch of images into a batch of sequences.\n\n  Args:\n    inputs: a (num_images, height, width, depth) tensor\n    data_format: A string. 
`NHWC` (default) and `NCHW` are supported.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Raises:\n     ValueError: If `data_format` is not either NCHW or NHWC.\n\n  Returns:\n    (width, num_images*height, depth) sequence tensor\n  \"\"\"\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  with ops.name_scope(scope, 'ImagesToSequence', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    if df == 'channels_first':\n      inputs = array_ops.transpose(inputs, [0, 2, 3, 1])\n    _, _, width, depth = inputs.get_shape().as_list()\n    s = array_ops.shape(inputs)\n    batch_size, height = s[0], s[1]\n    transposed = array_ops.transpose(inputs, [2, 0, 1, 3])\n    outputs = array_ops.reshape(transposed, [width, batch_size * height, depth])\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\n@add_arg_scope\ndef max_pool2d(inputs,\n               kernel_size,\n               stride=2,\n               padding='VALID',\n               data_format=DATA_FORMAT_NHWC,\n               outputs_collections=None,\n               scope=None):\n  \"\"\"Adds a 2D Max Pooling op.\n\n  It is assumed that the pooling is done per image but not in batch or channels.\n\n  Args:\n    inputs: A 4-D tensor of shape `[batch_size, height, width, channels]` if\n      `data_format` is `NHWC`, and `[batch_size, channels, height, width]` if\n      `data_format` is `NCHW`.\n    kernel_size: A list of length 2: [kernel_height, kernel_width] of the\n      pooling kernel over which the op is computed. Can be an int if both values\n      are the same.\n    stride: A list of length 2: [stride_height, stride_width]. Can be an int if\n      both strides are the same. Note that presently both strides must have the\n      same value.\n    padding: The padding method, either 'VALID' or 'SAME'.\n    data_format: A string. 
`NHWC` (default) and `NCHW` are supported.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NHWC` nor `NCHW`.\n    ValueError: If 'kernel_size' is not a 2-D list\n  \"\"\"\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  with ops.name_scope(scope, 'MaxPool2D', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = pooling_layers.MaxPooling2D(\n        pool_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        _scope=sc)\n    outputs = layer.apply(inputs)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\n@add_arg_scope\ndef max_pool3d(inputs,\n               kernel_size,\n               stride=2,\n               padding='VALID',\n               data_format=DATA_FORMAT_NDHWC,\n               outputs_collections=None,\n               scope=None):\n  \"\"\"Adds a 3D Max Pooling op.\n\n  It is assumed that the pooling is done per image but not in batch or channels.\n\n  Args:\n    inputs: A 5-D tensor of shape `[batch_size, depth, height, width, channels]`\n      if `data_format` is `NDHWC`, and `[batch_size, channels, depth, height,\n      width]` if `data_format` is `NCDHW`.\n    kernel_size: A list of length 3: [kernel_depth, kernel_height, kernel_width]\n      of the pooling kernel over which the op is computed. Can be an int if both\n      values are the same.\n    stride: A list of length 3: [stride_depth, stride_height, stride_width]. Can\n      be an int if both strides are the same. Note that presently both strides\n      must have the same value.\n    padding: The padding method, either 'VALID' or 'SAME'.\n    data_format: A string. 
`NDHWC` (default) and `NCDHW` are supported.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: If `data_format` is neither `NDHWC` nor `NCDHW`.\n    ValueError: If 'kernel_size' is not a 3-D list\n  \"\"\"\n  if data_format not in (DATA_FORMAT_NCDHW, DATA_FORMAT_NDHWC):\n    raise ValueError('data_format has to be either NCDHW or NDHWC.')\n  with ops.name_scope(scope, 'MaxPool3D', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    layer = pooling_layers.MaxPooling3D(\n        pool_size=kernel_size,\n        strides=stride,\n        padding=padding,\n        data_format=df,\n        _scope=sc)\n    outputs = layer.apply(inputs)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\n@add_arg_scope\ndef pool(inputs,\n         kernel_size,\n         pooling_type,\n         padding='VALID',\n         data_format=None,\n         dilation_rate=1,\n         stride=1,\n         outputs_collections=None,\n         scope=None):\n  # pylint: disable=line-too-long\n  \"\"\"Adds a pooling op.\n\n\n  Args:\n    inputs: Tensor of rank N+2, of shape `[batch_size] + input_spatial_shape +\n      [num_channels]` if data_format does not start with \"NC\" (default), or\n      `[batch_size, num_channels] + input_spatial_shape` if data_format starts\n      with \"NC\".  Pooling happens over the spatial dimensions only.\n    kernel_size: Sequence of N ints >= 1.  Can also be a single integer to\n      specify the same value for all spatial dimensions.\n    pooling_type: Specifies pooling operation, must be \"AVG\" or \"MAX\".\n    padding: The padding algorithm, must be \"SAME\" or \"VALID\".\n    data_format: A string or None.  Specifies whether the channel dimension of\n      the `input` and output is the last dimension (default, or if `data_format`\n      does not start with \"NC\"), or the second dimension (if `data_format`\n      starts with \"NC\").  For N=1, the valid values are \"NWC\" (default) and\n      \"NCW\".  For N=2, the valid values are \"NHWC\" (default) and \"NCHW\". For\n      N=3, the valid values are \"NDHWC\" (default) and \"NCDHW\".\n    dilation_rate: Optional.  Dilation rate.  Sequence of N ints >= 1.  Defaults\n      to [1]*N.  Can also be a single integer to specify the same value for all\n      spatial dimensions.  If any value of dilation_rate is > 1, then all values\n      of stride must be 1.\n    stride: Optional.  Sequence of N ints >= 1.  Defaults to [1]*N.  Can also be\n      a single integer to specify the same value for all spatial dimensions.  
If\n      any value of stride is > 1, then all values of dilation_rate must be 1.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: If arguments are invalid.\n\n  \"\"\"\n  # pylint: enable=line-too-long\n  with ops.name_scope(scope, '%s_pool' % (pooling_type.lower()),\n                      [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    input_rank = inputs.get_shape().ndims\n    if input_rank is None:\n      raise ValueError('Rank of inputs must be known')\n    if input_rank < 3:\n      raise ValueError('Rank of inputs must be >= 3')\n    num_spatial_dims = input_rank - 2\n    output = nn.pool(\n        input=inputs,\n        window_shape=utils.n_positive_integers(num_spatial_dims, kernel_size),\n        pooling_type=pooling_type,\n        padding=padding,\n        data_format=data_format,\n        dilation_rate=utils.n_positive_integers(num_spatial_dims,\n                                                dilation_rate),\n        strides=utils.n_positive_integers(num_spatial_dims, stride),\n        name=sc)\n    return utils.collect_named_outputs(outputs_collections, sc, output)\n\n\n@add_arg_scope\ndef one_hot_encoding(labels,\n                     num_classes,\n                     on_value=1.0,\n                     off_value=0.0,\n                     outputs_collections=None,\n                     scope=None):\n  \"\"\"Transform numeric labels into onehot_labels using `tf.one_hot`.\n\n  Args:\n    labels: [batch_size] target labels.\n    num_classes: Total number of classes.\n    on_value: A scalar defining the on-value.\n    off_value: A scalar defining the off-value.\n    outputs_collections: Collection to add the outputs.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    One-hot encoding of the labels.\n  \"\"\"\n  with ops.name_scope(scope, 'OneHotEncoding', [labels, num_classes]) as sc:\n    labels = ops.convert_to_tensor(labels)\n    if labels.dtype == dtypes.int32:\n      labels = standard_ops.to_int64(labels)\n    outputs = standard_ops.one_hot(\n        labels, num_classes, on_value=on_value, off_value=off_value)\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\ndef _apply_activation(y, activation_fn, output_collections):\n  if activation_fn is not None:\n    y = activation_fn(y)\n  ops.add_to_collections(\n      list(output_collections or []) + [ops.GraphKeys.ACTIVATIONS], y)\n  return y\n\n\ndef repeat(inputs, repetitions, layer, *args, **kwargs):\n  \"\"\"Applies the same layer with the same arguments repeatedly.\n\n  ```python\n    y = repeat(x, 3, conv2d, 64, [3, 3], scope='conv1')\n    # It is equivalent to:\n\n    x = conv2d(x, 64, [3, 3], scope='conv1/conv1_1')\n    x = conv2d(x, 64, [3, 3], scope='conv1/conv1_2')\n    y = conv2d(x, 64, [3, 3], scope='conv1/conv1_3')\n  ```\n\n  If the `scope` argument is not given in `kwargs`, it is set to\n  `layer.__name__`, or `layer.func.__name__` (for `functools.partial`\n  objects). 
If neither `__name__` nor `func.__name__` is available, the\n  layers are called with `scope='repeat'`.\n\n  Args:\n    inputs: A `Tensor` suitable for layer.\n    repetitions: Int, number of repetitions.\n    layer: A layer with arguments `(inputs, *args, **kwargs)`\n    *args: Extra args for the layer.\n    **kwargs: Extra kwargs for the layer.\n\n  Returns:\n    A tensor result of applying the layer, repetitions times.\n\n  Raises:\n    ValueError: If the op is unknown or wrong.\n  \"\"\"\n  scope = kwargs.pop('scope', None)\n  with variable_scope.variable_scope(scope, 'Repeat', [inputs]):\n    inputs = ops.convert_to_tensor(inputs)\n    if scope is None:\n      if hasattr(layer, '__name__'):\n        scope = layer.__name__\n      elif hasattr(layer, 'func') and hasattr(layer.func, '__name__'):\n        scope = layer.func.__name__  # In case layer is a functools.partial.\n      else:\n        scope = 'repeat'\n    outputs = inputs\n    for i in range(repetitions):\n      kwargs['scope'] = scope + '_' + str(i + 1)\n      outputs = layer(outputs, *args, **kwargs)\n    return outputs\n\n\ndef _scale_gradient_shape(op):\n  \"\"\"Shape helper function for scale_gradient function below.\"\"\"\n  return [op.inputs[0].shape]\n\n\ndef _scale_gradient_grad(op, grad):\n  \"\"\"Python gradient helper function for scale_gradient function below.\"\"\"\n  return [grad * op.inputs[1], None]\n\n\n@function.Defun(\n    python_grad_func=_scale_gradient_grad, shape_func=_scale_gradient_shape)\ndef scale_gradient(inputs, gradient_multiplier):\n  \"\"\"Identity operation, but with the gradient multiplied by a tensor.\n\n  The TensorFlow gradient system will compute the gradient with respect to\n  `inputs` as the product of the gradient with respect to the `output`\n  multiplied by a specified `gradient_multiplier` tensor.  If\n  `gradient_multiplier` is equal to 1, then this results in the true gradient.\n  Otherwise, it results in a scaled gradient.\n\n  This can be useful for adjusting the relative learning rate of different\n  parameter tensors when performing gradient descent, and because this rescaling\n  can be inserted at arbitrary locations within a graph, it is often more\n  convenient to apply than simply rescaling the final computed gradients.\n\n  Args:\n    inputs: Tensor to be output.\n    gradient_multiplier: Tensor by which to multiply the gradient with respect\n      to `output` to compute the gradient with respect to `inputs`.  Its shape\n      must be broadcastable to the shape of `inputs`.\n\n  Returns:\n    output Tensor, equal to `inputs`.\n  \"\"\"\n  # gradient_multiplier is implicitly saved by decorator, and only used for\n  # gradient computation.\n  del gradient_multiplier\n\n  return inputs\n\n\n
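# Illustrative sketch (not part of the original library): the forward value\n# passes through unchanged while the backward signal is rescaled; a\n# multiplier of -1.0 yields a gradient-reversal layer:\n#\n#   features = scale_gradient(features, tf.constant(0.1))  # grads shrunk 10x\n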
@add_arg_scope\ndef separable_convolution2d(\n    inputs,\n    num_outputs,\n    kernel_size,\n    depth_multiplier=1,\n    stride=1,\n    padding='SAME',\n    data_format=DATA_FORMAT_NHWC,\n    rate=1,\n    activation_fn=nn.relu,\n    normalizer_fn=None,\n    normalizer_params=None,\n    weights_initializer=initializers.xavier_initializer(),\n    pointwise_initializer=None,\n    weights_regularizer=None,\n    biases_initializer=init_ops.zeros_initializer(),\n    biases_regularizer=None,\n    reuse=None,\n    variables_collections=None,\n    outputs_collections=None,\n    trainable=True,\n    scope=None):\n  \"\"\"Adds a depth-separable 2D convolution with optional batch_norm layer.\n\n  This op first performs a depthwise convolution that acts separately on\n  channels, creating a variable called `depthwise_weights`. If `num_outputs`\n  is not None, it adds a pointwise convolution that mixes channels, creating a\n  variable called `pointwise_weights`. Then, if `normalizer_fn` is None,\n  it adds bias to the result, creating a variable called 'biases', otherwise,\n  the `normalizer_fn` is applied. It finally applies an activation function\n  to produce the end result.\n\n  Args:\n    inputs: A tensor of size [batch_size, height, width, channels].\n    num_outputs: The number of pointwise convolution output filters. If it is\n      None, then we skip the pointwise convolution stage.\n    kernel_size: A list of length 2: [kernel_height, kernel_width] of the\n      filters. Can be an int if both values are the same.\n    depth_multiplier: The number of depthwise convolution output channels for\n      each input channel. The total number of depthwise convolution output\n      channels will be equal to `num_filters_in * depth_multiplier`.\n    stride: A list of length 2: [stride_height, stride_width], specifying the\n      depthwise convolution stride. Can be an int if both strides are the same.\n    padding: One of 'VALID' or 'SAME'.\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n    rate: A list of length 2: [rate_height, rate_width], specifying the dilation\n      rates for atrous convolution. Can be an int if both rates are the same. If\n      any value is larger than one, then both stride values need to be one.\n    activation_fn: Activation function. The default value is a ReLU function.\n      Explicitly set it to None to skip it and maintain a linear activation.\n    normalizer_fn: Normalization function to use instead of `biases`. If\n      `normalizer_fn` is provided then `biases_initializer` and\n      `biases_regularizer` are ignored and `biases` are not created nor added.\n      Default is None for no normalizer function.\n    normalizer_params: Normalization function parameters.\n    weights_initializer: An initializer for the depthwise weights.\n    pointwise_initializer: An initializer for the pointwise weights. Default is\n      None, in which case `weights_initializer` is used.\n    weights_regularizer: Optional regularizer for the weights.\n    biases_initializer: An initializer for the biases. If None skip biases.\n    biases_regularizer: Optional regularizer for the biases.\n
To be\n      able to reuse the layer scope must be given.\n    variables_collections: Optional list of collections for all the variables or\n      a dictionary containing a different list of collection per variable.\n    outputs_collections: Collection to add the outputs.\n    trainable: Whether or not the variables should be trainable or not.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A `Tensor` representing the output of the operation.\n  Raises:\n    ValueError: If `data_format` is invalid.\n  \"\"\"\n  if data_format not in (DATA_FORMAT_NCHW, DATA_FORMAT_NHWC):\n    raise ValueError('data_format has to be either NCHW or NHWC.')\n  layer_variable_getter = _build_variable_getter({\n      'bias': 'biases',\n      'depthwise_kernel': 'depthwise_weights',\n      'pointwise_kernel': 'pointwise_weights'\n  })\n\n  with variable_scope.variable_scope(\n      scope,\n      'SeparableConv2d', [inputs],\n      reuse=reuse,\n      custom_getter=layer_variable_getter) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n\n    if pointwise_initializer is None:\n      pointwise_initializer = weights_initializer\n\n    df = ('channels_first'\n          if data_format and data_format.startswith('NC') else 'channels_last')\n    if num_outputs is not None:\n      # Apply separable conv using the SeparableConvolution2D layer.\n      layer = convolutional_layers.SeparableConvolution2D(\n          filters=num_outputs,\n          kernel_size=kernel_size,\n          strides=stride,\n          padding=padding,\n          data_format=df,\n          dilation_rate=utils.two_element_tuple(rate),\n          activation=None,\n          depth_multiplier=depth_multiplier,\n          use_bias=not normalizer_fn and biases_initializer,\n          depthwise_initializer=weights_initializer,\n          pointwise_initializer=pointwise_initializer,\n          bias_initializer=biases_initializer,\n          depthwise_regularizer=weights_regularizer,\n          pointwise_regularizer=weights_regularizer,\n          bias_regularizer=biases_regularizer,\n          activity_regularizer=None,\n          trainable=trainable,\n          name=sc.name,\n          dtype=inputs.dtype.base_dtype,\n          _scope=sc,\n          _reuse=reuse)\n      outputs = layer.apply(inputs)\n\n      # Add variables to collections.\n      _add_variable_to_collections(layer.depthwise_kernel,\n                                   variables_collections, 'weights')\n      _add_variable_to_collections(layer.pointwise_kernel,\n                                   variables_collections, 'weights')\n      if layer.bias is not None:\n        _add_variable_to_collections(layer.bias, variables_collections,\n                                     'biases')\n\n      if normalizer_fn is not None:\n        normalizer_params = normalizer_params or {}\n        outputs = normalizer_fn(outputs, **normalizer_params)\n    else:\n      # Actually apply depthwise conv instead of separable conv.\n      dtype = inputs.dtype.base_dtype\n      kernel_h, kernel_w = utils.two_element_tuple(kernel_size)\n      stride_h, stride_w = utils.two_element_tuple(stride)\n      num_filters_in = utils.channel_dimension(\n          inputs.get_shape(), df, min_rank=4)\n      weights_collections = utils.get_variable_collections(\n          variables_collections, 'weights')\n\n      depthwise_shape = [kernel_h, kernel_w, num_filters_in, depth_multiplier]\n      depthwise_weights = variables.model_variable(\n          'depthwise_weights',\n          shape=depthwise_shape,\n         
 dtype=dtype,\n          initializer=weights_initializer,\n          regularizer=weights_regularizer,\n          trainable=trainable,\n          collections=weights_collections)\n      strides = [\n          1, 1, stride_h, stride_w\n      ] if data_format.startswith('NC') else [1, stride_h, stride_w, 1]\n\n      outputs = nn.depthwise_conv2d(\n          inputs,\n          depthwise_weights,\n          strides,\n          padding,\n          rate=utils.two_element_tuple(rate),\n          data_format=data_format)\n      num_outputs = depth_multiplier * num_filters_in\n\n      if normalizer_fn is not None:\n        normalizer_params = normalizer_params or {}\n        outputs = normalizer_fn(outputs, **normalizer_params)\n      else:\n        if biases_initializer is not None:\n          biases_collections = utils.get_variable_collections(\n              variables_collections, 'biases')\n          biases = variables.model_variable(\n              'biases',\n              shape=[\n                  num_outputs,\n              ],\n              dtype=dtype,\n              initializer=biases_initializer,\n              regularizer=biases_regularizer,\n              trainable=trainable,\n              collections=biases_collections)\n          outputs = nn.bias_add(outputs, biases, data_format=data_format)\n\n    if activation_fn is not None:\n      outputs = activation_fn(outputs)\n    return utils.collect_named_outputs(outputs_collections, sc.name, outputs)\n\n\n@add_arg_scope\ndef sequence_to_images(inputs,\n                       height,\n                       output_data_format='channels_last',\n                       outputs_collections=None,\n                       scope=None):\n  \"\"\"Convert a batch of sequences into a batch of images.\n\n  Args:\n    inputs: (num_steps, num_batches, depth) sequence tensor\n    height: the height of the images\n    output_data_format: Format of output tensor. Currently supports\n      `'channels_first'` and `'channels_last'`.\n    outputs_collections: The collections to which the outputs are added.\n    scope: Optional scope for name_scope.\n\n  Returns:\n    A tensor representing the output of the operation.\n  \"\"\"\n  with ops.name_scope(scope, 'SequenceToImages', [inputs]) as sc:\n    inputs = ops.convert_to_tensor(inputs)\n    width, num_batches, depth = inputs.get_shape().as_list()\n    if num_batches is None:\n      num_batches = -1\n    else:\n      num_batches //= height\n    reshaped = array_ops.reshape(inputs, [width, num_batches, height, depth])\n    if output_data_format == 'channels_first':\n      outputs = array_ops.transpose(reshaped, [1, 3, 2, 0])\n    else:\n      outputs = array_ops.transpose(reshaped, [1, 2, 0, 3])\n    return utils.collect_named_outputs(outputs_collections, sc, outputs)\n\n\n@add_arg_scope\ndef softmax(logits, scope=None):\n  \"\"\"Performs softmax on Nth dimension of N-dimensional logit tensor.\n\n  For two-dimensional logits this reduces to tf.nn.softmax. 
The N-th dimension\n  needs to have a specified number of elements (number of classes).\n\n  Args:\n    logits: N-dimensional `Tensor` with logits, where N > 1.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A `Tensor` with same shape and type as logits.\n  \"\"\"\n  # TODO(jrru): Add axis argument which defaults to last dimension.\n  with variable_scope.variable_scope(scope, 'softmax', [logits]):\n    num_logits = utils.last_dimension(logits.get_shape(), min_rank=2)\n    logits_2d = array_ops.reshape(logits, [-1, num_logits])\n    predictions = nn.softmax(logits_2d)\n    predictions = array_ops.reshape(predictions, array_ops.shape(logits))\n    if not context.executing_eagerly():\n      predictions.set_shape(logits.get_shape())\n    return predictions\n\n\n@add_arg_scope\ndef spatial_softmax(features,\n                    temperature=None,\n                    name=None,\n                    variables_collections=None,\n                    trainable=True,\n                    data_format='NHWC'):\n  \"\"\"Computes the spatial softmax of a convolutional feature map.\n\n  First computes the softmax over the spatial extent of each channel of a\n  convolutional feature map. Then computes the expected 2D position of the\n  points of maximal activation for each channel, resulting in a set of\n  feature keypoints [i1, j1, ... iN, jN] for all N channels.\n\n  Read more here:\n  \"Learning visual feature spaces for robotic manipulation with\n  deep spatial autoencoders.\" Finn et al., http://arxiv.org/abs/1509.06113.\n\n  Args:\n    features: A `Tensor` of size [batch_size, W, H, num_channels]; the\n      convolutional feature map.\n    temperature: Softmax temperature (optional). If None, a learnable\n      temperature is created.\n    name: A name for this operation (optional).\n    variables_collections: Collections for the temperature variable.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n    data_format: A string. `NHWC` (default) and `NCHW` are supported.\n\n  Returns:\n    feature_keypoints: A `Tensor` with size [batch_size, num_channels * 2];\n      the expected 2D locations of each channel's feature keypoint (normalized\n      to the range (-1,1)). The inner dimension is arranged as\n      [i1, j1, ... iN, jN].\n  Raises:\n    ValueError: If unexpected data_format specified.\n    ValueError: If num_channels dimension is unspecified.\n  \"\"\"\n  with variable_scope.variable_scope(name, 'spatial_softmax'):\n    shape = array_ops.shape(features)\n    static_shape = features.shape\n    if data_format == DATA_FORMAT_NHWC:\n      height, width, num_channels = shape[1], shape[2], static_shape[3]\n    elif data_format == DATA_FORMAT_NCHW:\n      num_channels, height, width = static_shape[1], shape[2], shape[3]\n    else:\n      raise ValueError('data_format has to be either NCHW or NHWC.')\n    if tensor_shape.dimension_value(num_channels) is None:\n      raise ValueError('The num_channels dimension of the inputs to '\n                       '`spatial_softmax` should be defined. 
Found `None`.')\n\n    with ops.name_scope('spatial_softmax_op', 'spatial_softmax_op', [features]):\n      # Create tensors for x and y coordinate values, scaled to range [-1, 1].\n      pos_x, pos_y = array_ops.meshgrid(\n          math_ops.lin_space(-1., 1., num=height),\n          math_ops.lin_space(-1., 1., num=width),\n          indexing='ij')\n      pos_x = array_ops.reshape(pos_x, [height * width])\n      pos_y = array_ops.reshape(pos_y, [height * width])\n\n      if temperature is None:\n        temp_initializer = init_ops.ones_initializer()\n      else:\n        temp_initializer = init_ops.constant_initializer(temperature)\n\n      if not trainable:\n        temp_collections = None\n      else:\n        temp_collections = utils.get_variable_collections(\n            variables_collections, 'temperature')\n\n      temperature = variables.model_variable(\n          'temperature',\n          shape=(),\n          dtype=dtypes.float32,\n          initializer=temp_initializer,\n          collections=temp_collections,\n          trainable=trainable)\n      if data_format == 'NCHW':\n        features = array_ops.reshape(features, [-1, height * width])\n      else:\n        features = array_ops.reshape(\n            array_ops.transpose(features, [0, 3, 1, 2]), [-1, height * width])\n\n      softmax_attention = nn.softmax(features / temperature)\n      expected_x = math_ops.reduce_sum(\n          pos_x * softmax_attention, [1], keepdims=True)\n      expected_y = math_ops.reduce_sum(\n          pos_y * softmax_attention, [1], keepdims=True)\n      expected_xy = array_ops.concat([expected_x, expected_y], 1)\n      feature_keypoints = array_ops.reshape(\n          expected_xy, [-1, tensor_shape.dimension_value(num_channels) * 2])\n      feature_keypoints.set_shape(\n          [None, tensor_shape.dimension_value(num_channels) * 2])\n  return feature_keypoints\n\n\ndef stack(inputs, layer, stack_args, **kwargs):\n  \"\"\"Builds a stack of layers by applying layer repeatedly using stack_args.\n\n  `stack` allows you to repeatedly apply the same operation with different\n  arguments `stack_args[i]`. For each application of the layer, `stack` creates\n  a new scope appended with an increasing number. For example:\n\n  ```python\n    y = stack(x, fully_connected, [32, 64, 128], scope='fc')\n    # It is equivalent to:\n\n    x = fully_connected(x, 32, scope='fc/fc_1')\n    x = fully_connected(x, 64, scope='fc/fc_2')\n    y = fully_connected(x, 128, scope='fc/fc_3')\n  ```\n\n  If the `scope` argument is not given in `kwargs`, it is set to\n  `layer.__name__`, or `layer.func.__name__` (for `functools.partial`\n  objects). 
If neither `__name__` nor `func.__name__` is available, the\n  layers are called with `scope='stack'`.\n\n  Args:\n    inputs: A `Tensor` suitable for layer.\n    layer: A layer with arguments `(inputs, *args, **kwargs)`\n    stack_args: A list/tuple of parameters for each call of layer.\n    **kwargs: Extra kwargs for the layer.\n\n  Returns:\n    A `Tensor` result of applying the stacked layers.\n\n  Raises:\n    ValueError: If the op is unknown or wrong.\n  \"\"\"\n  scope = kwargs.pop('scope', None)\n  if not isinstance(stack_args, (list, tuple)):\n    raise ValueError('stack_args need to be a list or tuple')\n  with variable_scope.variable_scope(scope, 'Stack', [inputs]):\n    inputs = ops.convert_to_tensor(inputs)\n    if scope is None:\n      if hasattr(layer, '__name__'):\n        scope = layer.__name__\n      elif hasattr(layer, 'func') and hasattr(layer.func, '__name__'):\n        scope = layer.func.__name__  # In case layer is a functools.partial.\n      else:\n        scope = 'stack'\n    outputs = inputs\n    for i in range(len(stack_args)):\n      kwargs['scope'] = scope + '_' + str(i + 1)\n      layer_args = stack_args[i]\n      if not isinstance(layer_args, (list, tuple)):\n        layer_args = [layer_args]\n      outputs = layer(outputs, *layer_args, **kwargs)\n    return outputs\n\n\n@add_arg_scope\ndef unit_norm(inputs, dim, epsilon=1e-7, scope=None):\n  \"\"\"Normalizes the given input across the specified dimension to unit length.\n\n  Note that the rank of `input` must be known.\n\n  Args:\n    inputs: A `Tensor` of arbitrary size.\n    dim: The dimension along which the input is normalized.\n    epsilon: A small value to add to the inputs to avoid dividing by zero.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    The normalized `Tensor`.\n\n  Raises:\n    ValueError: If dim is smaller than the number of dimensions in 'inputs'.\n  \"\"\"\n  with variable_scope.variable_scope(scope, 'UnitNorm', [inputs]):\n    if not inputs.get_shape():\n      raise ValueError('The input rank must be known.')\n    input_rank = len(inputs.get_shape().as_list())\n    if dim < 0 or dim >= input_rank:\n      raise ValueError('dim must be positive but smaller than the input rank.')\n\n    lengths = math_ops.sqrt(\n        epsilon + math_ops.reduce_sum(math_ops.square(inputs), dim, True))\n    multiples = []\n    if dim > 0:\n      multiples.append(array_ops.ones([dim], dtypes.int32))\n    multiples.append(\n        array_ops.strided_slice(array_ops.shape(inputs), [dim], [dim + 1]))\n    if dim < (input_rank - 1):\n      multiples.append(array_ops.ones([input_rank - 1 - dim], dtypes.int32))\n    multiples = array_ops.concat(multiples, 0)\n    return math_ops.div(inputs, array_ops.tile(lengths, multiples))\n\n\n@add_arg_scope\ndef maxout(inputs, num_units, axis=-1, scope=None):\n  \"\"\"Adds a maxout op from https://arxiv.org/abs/1302.4389\n\n  \"Maxout Networks\" Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron\n  Courville,\n   Yoshua Bengio\n\n  Usually the operation is performed in the filter/channel dimension. This can\n  also be\n  used after fully-connected layers to reduce number of features.\n\n  Arguments:\n    inputs: Tensor input\n    num_units: Specifies how many features will remain after maxout in the\n      `axis` dimension (usually channel). This must be a factor of number of\n      features.\n    axis: The dimension where max pooling will be performed. 
Default is the last\n      dimension.\n    scope: Optional scope for variable_scope.\n\n  Returns:\n    A `Tensor` representing the results of the pooling operation.\n\n  Raises:\n    ValueError: if num_units is not multiple of number of features.\n  \"\"\"\n  with variable_scope.variable_scope(scope, 'MaxOut', [inputs]):\n    inputs = ops.convert_to_tensor(inputs)\n    shape = inputs.get_shape().as_list()\n    num_channels = shape[axis]\n    if num_channels % num_units:\n      raise ValueError('number of features({}) is not '\n                       'a multiple of num_units({})'.format(\n                           num_channels, num_units))\n    shape[axis] = num_units\n    shape += [num_channels // num_units]\n\n    # Dealing with batches with arbitrary sizes\n    for i in range(len(shape)):\n      if shape[i] is None:\n        shape[i] = array_ops.shape(inputs)[i]\n    outputs = math_ops.reduce_max(\n        array_ops.reshape(inputs, shape), -1, keepdims=False)\n    return outputs\n\n\ndef poincare_normalize(x, axis=1, epsilon=1e-5, name=None):\n  \"\"\"Project into the Poincare ball with norm <= 1.0 - epsilon.\n\n  https://en.wikipedia.org/wiki/Poincare_ball_model\n\n  Used in\n  Poincare Embeddings for Learning Hierarchical Representations\n  Maximilian Nickel, Douwe Kiela\n  https://arxiv.org/pdf/1705.08039.pdf\n\n  For a 1-D tensor with `axis = 0`, computes\n\n                (x * (1 - epsilon)) / ||x||     if ||x|| > 1 - epsilon\n      output =\n                 x                              otherwise\n\n  For `x` with more dimensions, independently normalizes each 1-D slice along\n  dimension `axis`.\n\n  Args:\n    x: A `Tensor`.\n    axis: Axis along which to normalize.  A scalar or a vector of integers.\n    epsilon: A small deviation from the edge of the unit sphere for numerical\n      stability.\n    name: A name for this operation (optional).\n\n  Returns:\n    A `Tensor` with the same shape as `x`.\n  \"\"\"\n  with ops.name_scope(name, 'poincare_normalize', [x]) as name:\n    x = ops.convert_to_tensor(x, name='x')\n    square_sum = math_ops.reduce_sum(math_ops.square(x), axis, keepdims=True)\n    x_inv_norm = math_ops.rsqrt(square_sum)\n    x_inv_norm = math_ops.minimum((1. - epsilon) * x_inv_norm, 1.)\n    return math_ops.multiply(x, x_inv_norm, name=name)\n\n\ndef legacy_fully_connected(x,\n                           num_output_units,\n                           activation_fn=None,\n                           weight_init=initializers.xavier_initializer(),\n                           bias_init=init_ops.zeros_initializer(),\n                           name=None,\n                           weight_collections=(ops.GraphKeys.WEIGHTS,),\n                           bias_collections=(ops.GraphKeys.BIASES,),\n                           output_collections=(ops.GraphKeys.ACTIVATIONS,),\n                           trainable=True,\n                           weight_regularizer=None,\n                           bias_regularizer=None):\n  # pylint: disable=anomalous-backslash-in-string\n  r\"\"\"Adds the parameters for a fully connected layer and returns the output.\n\n  A fully connected layer is generally defined as a matrix multiply:\n  `y = f(w * x + b)` where `f` is given by `activation_fn`. If\n  `activation_fn` is `None`, the result of `y = w * x + b` is\n  returned.\n\n  If `x` has shape [\\\\(\\text{dim}_0, \\text{dim}_1, ..., \\text{dim}_n\\\\)]\n  with more than 2 dimensions (\\\\(n > 1\\\\)), then we repeat the matrix\n  multiply along the first dimensions. 
The result r is a tensor of shape\n  [\\\\(\\text{dim}_0, ..., \\text{dim}_{n-1},\\\\) `num_output_units`],\n  where \\\\( r_{i_0, ..., i_{n-1}, k} =\n  \\sum_{0 \\leq j < \\text{dim}_n} x_{i_0, ... i_{n-1}, j} \\cdot w_{j, k}\\\\).\n  This is accomplished by reshaping `x` to 2-D\n  [\\\\(\\text{dim}_0 \\cdot ... \\cdot \\text{dim}_{n-1}, \\text{dim}_n\\\\)]\n  before the matrix multiply and afterwards reshaping it to\n  [\\\\(\\text{dim}_0, ..., \\text{dim}_{n-1},\\\\) `num_output_units`].\n\n  This op creates `w` and optionally `b`. Bias (`b`) can be disabled by setting\n  `bias_init` to `None`.\n\n  The variable creation is compatible with `tf.compat.v1.variable_scope` and so\n  can be\n  reused with `tf.compat.v1.variable_scope` or `tf.compat.v1.make_template`.\n\n  Most of the details of variable creation can be controlled by specifying the\n  initializers (`weight_init` and `bias_init`) and in which collections to place\n  the created variables (`weight_collections` and `bias_collections`; note that\n  the variables are always added to the `VARIABLES` collection). The output of\n  the layer can be placed in custom collections using `output_collections`.\n  The collections arguments default to `WEIGHTS`, `BIASES` and `ACTIVATIONS`,\n  respectively.\n\n  A per layer regularization can be specified by setting `weight_regularizer`\n  and `bias_regularizer`, which are applied to the weights and biases\n  respectively, and whose output is added to the `REGULARIZATION_LOSSES`\n  collection.\n\n  Args:\n    x: The input `Tensor`.\n    num_output_units: The size of the output.\n    activation_fn: Activation function, default set to None to skip it and\n      maintain a linear activation.\n    weight_init: An optional weight initialization, defaults to\n      `xavier_initializer`.\n    bias_init: An initializer for the bias, defaults to 0. Set to `None` in\n      order to disable bias.\n    name: The name for this operation is used to name operations and to find\n      variables. If specified it must be unique for this scope, otherwise a\n      unique name starting with \"fully_connected\" will be created.  See\n      `tf.compat.v1.variable_scope` for details.\n    weight_collections: List of graph collections to which weights are added.\n    bias_collections: List of graph collections to which biases are added.\n    output_collections: List of graph collections to which outputs are added.\n    trainable: If `True` also add variables to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).\n    weight_regularizer: A regularizer like the result of `l1_regularizer` or\n      `l2_regularizer`. Used for weights.\n    bias_regularizer: A regularizer like the result of `l1_regularizer` or\n      `l2_regularizer`. 
Used for biases.\n\n  Returns:\n    The output of the fully connected layer.\n\n  Raises:\n    ValueError: If x has rank less than 2 or if its last dimension is not set.\n  \"\"\"\n  with variable_scope.variable_scope(name, 'fully_connected', [x]):\n    x = ops.convert_to_tensor(x)\n    dims = x.get_shape().dims\n    if dims is None:\n      raise ValueError('dims of x must be known but is None')\n    if len(dims) < 2:\n      raise ValueError('rank of x must be at least 2, not: %d' % len(dims))\n    num_input_units = dims[-1].value\n    if num_input_units is None:\n      raise ValueError('last dimension of x must be known but is None')\n    dtype = x.dtype.base_dtype\n\n    weight_collections = set(\n        list(weight_collections or []) + [ops.GraphKeys.GLOBAL_VARIABLES])\n    w = variable_scope.get_variable(\n        'weights',\n        shape=[num_input_units, num_output_units],\n        dtype=dtype,\n        initializer=weight_init,\n        collections=weight_collections,\n        regularizer=weight_regularizer,\n        trainable=trainable)\n    x_2_dim = x if len(dims) <= 2 else array_ops.reshape(\n        x, [-1, num_input_units])\n    y = standard_ops.matmul(x_2_dim, w)\n\n    if bias_init is not None:\n      bias_collections = set(\n          list(bias_collections or []) + [ops.GraphKeys.GLOBAL_VARIABLES])\n      b = variable_scope.get_variable(\n          'bias',\n          shape=[num_output_units],\n          dtype=dtype,\n          initializer=bias_init,\n          collections=bias_collections,\n          regularizer=bias_regularizer,\n          trainable=trainable)\n\n      y = nn.bias_add(y, b)\n\n    if len(dims) > 2:\n      out_shape = array_ops.unstack(array_ops.shape(x))\n      out_shape[-1] = num_output_units\n\n      y = array_ops.reshape(y, array_ops.stack(out_shape))\n\n      static_shape = x.get_shape().as_list()\n      static_shape[-1] = num_output_units\n      y.set_shape(static_shape)\n\n    return _apply_activation(y, activation_fn, output_collections)\n\n\n# TODO(eiderm): Verify and fix autocomplete in colab (also relu6).\n# Simple aliases which remove the activation_fn parameter.\nelu = functools.partial(fully_connected, activation_fn=nn.elu)\nlegacy_relu = functools.partial(legacy_fully_connected, activation_fn=nn.relu)\nlegacy_linear = functools.partial(legacy_fully_connected, activation_fn=None)\nrelu = functools.partial(fully_connected, activation_fn=nn.relu)\nrelu6 = functools.partial(fully_connected, activation_fn=nn.relu6)\nlinear = functools.partial(fully_connected, activation_fn=None)\n\n# Simple aliases.\nconv1d = convolution1d\nconv2d = convolution2d\nconv3d = convolution3d\nconv2d_transpose = convolution2d_transpose\nconv3d_transpose = convolution3d_transpose\nconv2d_in_plane = convolution2d_in_plane\nseparable_conv2d = separable_convolution2d\n
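\n\n# --- Usage sketch (editor's illustration; not part of the original module) ---\n# A couple of the utility layers above in action, mirroring the `stack`\n# docstring example. The import path assumes this repo's `tf_contrib` package\n# layout; shapes are illustrative only.\n#\n#   import tensorflow.compat.v1 as tf\n#   from tf_contrib import layers\n#\n#   x = tf.placeholder(tf.float32, [None, 64])\n#   # Three fully connected layers of widths 32, 64 and 128 under scope 'fc'.\n#   y = layers.stack(x, layers.fully_connected, [32, 64, 128], scope='fc')\n#   # Maxout over the feature dimension: 128 features -> 32 units.\n#   y = layers.maxout(y, num_units=32)\n"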
  },
  {
    "path": "tf_contrib/loader.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Utilities for loading op libraries.\n\n@@load_op_library\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport os\nimport re\n\nfrom tensorflow.python.framework import load_library\nfrom tensorflow.python.platform import resource_loader\n\n\ndef load_op_library(path):\n  \"\"\"Loads a contrib op library from the given path.\n\n  NOTE(mrry): On Windows, we currently assume that some contrib op\n  libraries are statically linked into the main TensorFlow Python\n  extension DLL - use dynamically linked ops if the .so is present.\n\n  Args:\n    path: An absolute path to a shared object file.\n\n  Returns:\n    A Python module containing the Python wrappers for Ops defined in the\n    plugin.\n  \"\"\"\n  if os.name == 'nt':\n    # To avoid making every user_ops aware of windows, re-write\n    # the file extension from .so to .dll if .so file doesn't exist.\n    if not os.path.exists(path):\n      path = re.sub(r'\\.so$', '.dll', path)\n\n    # Currently we have only some user_ops as dlls on windows - don't try\n    # to load them if the dll is not found.\n    # TODO(mrry): Once we have all of them this check should be removed.\n    if not os.path.exists(path):\n      return None\n  path = resource_loader.get_path_to_datafile(path)\n  ret = load_library.load_op_library(path)\n  assert ret, 'Could not load %s' % path\n  return ret\n"
  },
  {
    "path": "tf_contrib/regularizers.py",
    "content": "# Copyright 2015 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Regularizers for use with layers.\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport numbers\n\nfrom tensorflow.python.framework import constant_op\nfrom tensorflow.python.framework import ops\nfrom tensorflow.python.ops import math_ops\nfrom tensorflow.python.ops import nn\nfrom tensorflow.python.ops import standard_ops\nfrom tensorflow.python.platform import tf_logging as logging\n\n__all__ = ['l1_regularizer',\n           'l2_regularizer',\n           'l1_l2_regularizer',\n           'sum_regularizer',\n           'apply_regularization']\n\n\ndef l1_regularizer(scale, scope=None):\n  \"\"\"Returns a function that can be used to apply L1 regularization to weights.\n\n  L1 regularization encourages sparsity.\n\n  Args:\n    scale: A scalar multiplier `Tensor`. 0.0 disables the regularizer.\n    scope: An optional scope name.\n\n  Returns:\n    A function with signature `l1(weights)` that apply L1 regularization.\n\n  Raises:\n    ValueError: If scale is negative or if scale is not a float.\n  \"\"\"\n  if isinstance(scale, numbers.Integral):\n    raise ValueError('scale cannot be an integer: %s' % scale)\n  if isinstance(scale, numbers.Real):\n    if scale < 0.:\n      raise ValueError('Setting a scale less than 0 on a regularizer: %g' %\n                       scale)\n    if scale == 0.:\n      logging.info('Scale of 0 disables regularizer.')\n      return lambda _: None\n\n  def l1(weights, name=None):\n    \"\"\"Applies L1 regularization to weights.\"\"\"\n    with ops.name_scope(scope, 'l1_regularizer', [weights]) as name:\n      my_scale = ops.convert_to_tensor(scale,\n                                       dtype=weights.dtype.base_dtype,\n                                       name='scale')\n      return standard_ops.multiply(\n          my_scale,\n          standard_ops.reduce_sum(standard_ops.abs(weights)),\n          name=name)\n\n  return l1\n\n\ndef l2_regularizer(scale, scope=None):\n  \"\"\"Returns a function that can be used to apply L2 regularization to weights.\n\n  Small values of L2 can help prevent overfitting the training data.\n\n  Args:\n    scale: A scalar multiplier `Tensor`. 0.0 disables the regularizer.\n    scope: An optional scope name.\n\n  Returns:\n    A function with signature `l2(weights)` that applies L2 regularization.\n\n  Raises:\n    ValueError: If scale is negative or if scale is not a float.\n  \"\"\"\n  if isinstance(scale, numbers.Integral):\n    raise ValueError('scale cannot be an integer: %s' % (scale,))\n  if isinstance(scale, numbers.Real):\n    if scale < 0.:\n      raise ValueError('Setting a scale less than 0 on a regularizer: %g.' 
%\n                       scale)\n    if scale == 0.:\n      logging.info('Scale of 0 disables regularizer.')\n      return lambda _: None\n\n  def l2(weights):\n    \"\"\"Applies l2 regularization to weights.\"\"\"\n    with ops.name_scope(scope, 'l2_regularizer', [weights]) as name:\n      my_scale = ops.convert_to_tensor(scale,\n                                       dtype=weights.dtype.base_dtype,\n                                       name='scale')\n      return standard_ops.multiply(my_scale, nn.l2_loss(weights), name=name)\n\n  return l2\n\n\ndef l1_l2_regularizer(scale_l1=1.0, scale_l2=1.0, scope=None):\n  \"\"\"Returns a function that can be used to apply L1 L2 regularizations.\n\n  Args:\n    scale_l1: A scalar multiplier `Tensor` for L1 regularization.\n    scale_l2: A scalar multiplier `Tensor` for L2 regularization.\n    scope: An optional scope name.\n\n  Returns:\n    A function with signature `l1_l2(weights)` that applies a weighted sum of\n    L1 L2 regularization.\n\n  Raises:\n    ValueError: If scale is negative or if scale is not a float.\n  \"\"\"\n  if isinstance(scale_l1, numbers.Integral):\n    raise ValueError('scale_l1 cannot be an integer: %s' % (scale_l1,))\n  if isinstance(scale_l2, numbers.Integral):\n    raise ValueError('scale_l2 cannot be an integer: %s' % (scale_l2,))\n  scope = scope or 'l1_l2_regularizer'\n  if scale_l1 == 0.:\n    return l2_regularizer(scale_l2, scope)\n  if scale_l2 == 0.:\n    return l1_regularizer(scale_l1, scope)\n  return sum_regularizer([l1_regularizer(scale_l1),\n                          l2_regularizer(scale_l2)],\n                         scope=scope)\n\n\ndef sum_regularizer(regularizer_list, scope=None):\n  \"\"\"Returns a function that applies the sum of multiple regularizers.\n\n  Args:\n    regularizer_list: A list of regularizers to apply.\n    scope: An optional scope name\n\n  Returns:\n    A function with signature `sum_reg(weights)` that applies the\n    sum of all the input regularizers.\n  \"\"\"\n  regularizer_list = [reg for reg in regularizer_list if reg is not None]\n  if not regularizer_list:\n    return None\n\n  def sum_reg(weights):\n    \"\"\"Applies the sum of all the input regularizers.\"\"\"\n    with ops.name_scope(scope, 'sum_regularizer', [weights]) as name:\n      regularizer_tensors = []\n      for reg in regularizer_list:\n        tensor = reg(weights)\n        if tensor is not None:\n          regularizer_tensors.append(tensor)\n      return math_ops.add_n(\n          regularizer_tensors, name=name) if regularizer_tensors else None\n\n  return sum_reg\n\n\ndef apply_regularization(regularizer, weights_list=None):\n  \"\"\"Returns the summed penalty by applying `regularizer` to the `weights_list`.\n\n  Adding a regularization penalty over the layer weights and embedding weights\n  can help prevent overfitting the training data. Regularization over layer\n  biases is less common/useful, but assuming proper data preprocessing/mean\n  subtraction, it usually shouldn't hurt much either.\n\n  Args:\n    regularizer: A function that takes a single `Tensor` argument and returns\n      a scalar `Tensor` output.\n    weights_list: List of weights `Tensors` or `Variables` to apply\n      `regularizer` over. 
Defaults to the `GraphKeys.WEIGHTS` collection if\n      `None`.\n\n  Returns:\n    A scalar representing the overall regularization penalty.\n\n  Raises:\n    ValueError: If `regularizer` does not return a scalar output, or if we find\n        no weights.\n  \"\"\"\n  if not weights_list:\n    weights_list = ops.get_collection(ops.GraphKeys.WEIGHTS)\n  if not weights_list:\n    raise ValueError('No weights to regularize.')\n  with ops.name_scope('get_regularization_penalty',\n                      values=weights_list) as scope:\n    penalties = [regularizer(w) for w in weights_list]\n    penalties = [\n        p if p is not None else constant_op.constant(0.0) for p in penalties\n    ]\n    for p in penalties:\n      if p.get_shape().ndims != 0:\n        raise ValueError('regularizer must return a scalar Tensor instead of a '\n                         'Tensor with rank %d.' % p.get_shape().ndims)\n\n    summed_penalty = math_ops.add_n(penalties, name=scope)\n    ops.add_to_collection(ops.GraphKeys.REGULARIZATION_LOSSES, summed_penalty)\n    return summed_penalty\n
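\n\n# --- Usage sketch (editor's illustration; not part of the original module) ---\n# How the regularizer factories above compose with apply_regularization().\n# Import paths assume this repo's `tf_contrib` package layout and a TF1-style\n# graph; the variable name 'w' is purely illustrative.\n#\n#   import tensorflow.compat.v1 as tf\n#   from tf_contrib import regularizers\n#\n#   w = tf.get_variable('w', shape=[10, 10])\n#   reg = regularizers.l1_l2_regularizer(scale_l1=1e-5, scale_l2=1e-4)\n#   penalty = regularizers.apply_regularization(reg, [w])\n#   # `penalty` is a scalar Tensor; it is also added to the\n#   # GraphKeys.REGULARIZATION_LOSSES collection.\n"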
  },
  {
    "path": "tf_contrib/resnet_utils.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains building blocks for various versions of Residual Networks.\n\nResidual networks (ResNets) were proposed in:\n  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n  Deep Residual Learning for Image Recognition. arXiv:1512.03385, 2015\n\nMore variants were introduced in:\n  Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n  Identity Mappings in Deep Residual Networks. arXiv: 1603.05027, 2016\n\nWe can obtain different ResNet variants by changing the network depth, width,\nand form of residual unit. This module implements the infrastructure for\nbuilding them. Concrete ResNet units and full ResNet networks are implemented in\nthe accompanying resnet_v1.py and resnet_v2.py modules.\n\nCompared to https://github.com/KaimingHe/deep-residual-networks, in the current\nimplementation we subsample the output activations in the last residual unit of\neach block, instead of subsampling the input activations in the first residual\nunit of each block. The two implementations give identical results but our\nimplementation is more memory efficient.\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport collections\n\nfrom . import layers as layers_lib\nfrom .arg_scope import arg_scope, add_arg_scope\nfrom . import initializers\nfrom . import layers\nfrom . import regularizers\nfrom . import utils\nfrom tensorflow.python.framework import ops\nfrom tensorflow.python.ops import array_ops\nfrom tensorflow.python.ops import nn_ops\nfrom tensorflow.python.ops import variable_scope\n\n\nclass Block(collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])):\n  \"\"\"A named tuple describing a ResNet block.\n\n  Its parts are:\n    scope: The scope of the `Block`.\n    unit_fn: The ResNet unit function which takes as input a `Tensor` and\n      returns another `Tensor` with the output of the ResNet unit.\n    args: A list of length equal to the number of units in the `Block`. 
The list\n      contains one (depth, depth_bottleneck, stride) tuple for each unit in the\n      block to serve as argument to unit_fn.\n  \"\"\"\n\n\ndef subsample(inputs, factor, scope=None):\n  \"\"\"Subsamples the input along the spatial dimensions.\n\n  Args:\n    inputs: A `Tensor` of size [batch, height_in, width_in, channels].\n    factor: The subsampling factor.\n    scope: Optional variable_scope.\n\n  Returns:\n    output: A `Tensor` of size [batch, height_out, width_out, channels] with the\n      input, either intact (if factor == 1) or subsampled (if factor > 1).\n  \"\"\"\n  if factor == 1:\n    return inputs\n  else:\n    return layers.max_pool2d(inputs, [1, 1], stride=factor, scope=scope)\n\n\ndef conv2d_same(inputs, num_outputs, kernel_size, stride, rate=1, scope=None):\n  \"\"\"Strided 2-D convolution with 'SAME' padding.\n\n  When stride > 1, then we do explicit zero-padding, followed by conv2d with\n  'VALID' padding.\n\n  Note that\n\n     net = conv2d_same(inputs, num_outputs, 3, stride=stride)\n\n  is equivalent to\n\n     net = tf.contrib.layers.conv2d(inputs, num_outputs, 3, stride=1,\n     padding='SAME')\n     net = subsample(net, factor=stride)\n\n  whereas\n\n     net = tf.contrib.layers.conv2d(inputs, num_outputs, 3, stride=stride,\n     padding='SAME')\n\n  is different when the input's height or width is even, which is why we add the\n  current function. For more details, see ResnetUtilsTest.testConv2DSameEven().\n\n  Args:\n    inputs: A 4-D tensor of size [batch, height_in, width_in, channels].\n    num_outputs: An integer, the number of output filters.\n    kernel_size: An int with the kernel_size of the filters.\n    stride: An integer, the output stride.\n    rate: An integer, rate for atrous convolution.\n    scope: Scope.\n\n  Returns:\n    output: A 4-D tensor of size [batch, height_out, width_out, channels] with\n      the convolution output.\n  \"\"\"\n  if stride == 1:\n    return layers_lib.conv2d(\n        inputs,\n        num_outputs,\n        kernel_size,\n        stride=1,\n        rate=rate,\n        padding='SAME',\n        scope=scope)\n  else:\n    kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)\n    pad_total = kernel_size_effective - 1\n    pad_beg = pad_total // 2\n    pad_end = pad_total - pad_beg\n    inputs = array_ops.pad(\n        inputs, [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])\n    return layers_lib.conv2d(\n        inputs,\n        num_outputs,\n        kernel_size,\n        stride=stride,\n        rate=rate,\n        padding='VALID',\n        scope=scope)\n\n\n@add_arg_scope\ndef stack_blocks_dense(net,\n                       blocks,\n                       output_stride=None,\n                       outputs_collections=None):\n  \"\"\"Stacks ResNet `Blocks` and controls output feature density.\n\n  First, this function creates scopes for the ResNet in the form of\n  'block_name/unit_1', 'block_name/unit_2', etc.\n\n  Second, this function allows the user to explicitly control the ResNet\n  output_stride, which is the ratio of the input to output spatial resolution.\n  This is useful for dense prediction tasks such as semantic segmentation or\n  object detection.\n\n  Most ResNets consist of 4 ResNet blocks and subsample the activations by a\n  factor of 2 when transitioning between consecutive ResNet blocks. This results\n  in a nominal ResNet output_stride equal to 8. 
If we set the output_stride to\n  half the nominal network stride (e.g., output_stride=4), then we compute\n  responses twice.\n\n  Control of the output feature density is implemented by atrous convolution.\n\n  Args:\n    net: A `Tensor` of size [batch, height, width, channels].\n    blocks: A list of length equal to the number of ResNet `Blocks`. Each\n      element is a ResNet `Block` object describing the units in the `Block`.\n    output_stride: If `None`, then the output will be computed at the nominal\n      network stride. If output_stride is not `None`, it specifies the requested\n      ratio of input to output spatial resolution, which needs to be equal to\n      the product of unit strides from the start up to some level of the ResNet.\n      For example, if the ResNet employs units with strides 1, 2, 1, 3, 4, 1,\n      then valid values for the output_stride are 1, 2, 6, 24 or None (which\n      is equivalent to output_stride=24).\n    outputs_collections: Collection to add the ResNet block outputs.\n\n  Returns:\n    net: Output tensor with stride equal to the specified output_stride.\n\n  Raises:\n    ValueError: If the target output_stride is not valid.\n  \"\"\"\n  # The current_stride variable keeps track of the effective stride of the\n  # activations. This allows us to invoke atrous convolution whenever applying\n  # the next residual unit would result in the activations having stride larger\n  # than the target output_stride.\n  current_stride = 1\n\n  # The atrous convolution rate parameter.\n  rate = 1\n\n  for block in blocks:\n    with variable_scope.variable_scope(block.scope, 'block', [net]) as sc:\n      for i, unit in enumerate(block.args):\n        if output_stride is not None and current_stride > output_stride:\n          raise ValueError('The target output_stride cannot be reached.')\n\n        with variable_scope.variable_scope('unit_%d' % (i + 1), values=[net]):\n          # If we have reached the target output_stride, then we need to employ\n          # atrous convolution with stride=1 and multiply the atrous rate by the\n          # current unit's stride for use in subsequent layers.\n          if output_stride is not None and current_stride == output_stride:\n            net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))\n            rate *= unit.get('stride', 1)\n\n          else:\n            net = block.unit_fn(net, rate=1, **unit)\n            current_stride *= unit.get('stride', 1)\n      net = utils.collect_named_outputs(outputs_collections, sc.name, net)\n\n  if output_stride is not None and current_stride != output_stride:\n    raise ValueError('The target output_stride cannot be reached.')\n\n  return net\n\n\ndef resnet_arg_scope(weight_decay=0.0001,\n                     batch_norm_decay=0.997,\n                     batch_norm_epsilon=1e-5,\n                     batch_norm_scale=True):\n  \"\"\"Defines the default ResNet arg scope.\n\n  TODO(gpapan): The batch-normalization related default values above are\n    appropriate for use in conjunction with the reference ResNet models\n    released at https://github.com/KaimingHe/deep-residual-networks. 
When\n    training ResNets from scratch, they might need to be tuned.\n\n  Args:\n    weight_decay: The weight decay to use for regularizing the model.\n    batch_norm_decay: The moving average decay when estimating layer activation\n      statistics in batch normalization.\n    batch_norm_epsilon: Small constant to prevent division by zero when\n      normalizing activations by their variance in batch normalization.\n    batch_norm_scale: If True, uses an explicit `gamma` multiplier to scale the\n      activations in the batch normalization layer.\n\n  Returns:\n    An `arg_scope` to use for the resnet models.\n  \"\"\"\n  batch_norm_params = {\n      'decay': batch_norm_decay,\n      'epsilon': batch_norm_epsilon,\n      'scale': batch_norm_scale,\n      'updates_collections': ops.GraphKeys.UPDATE_OPS,\n  }\n\n  with arg_scope(\n      [layers_lib.conv2d],\n      weights_regularizer=regularizers.l2_regularizer(weight_decay),\n      weights_initializer=initializers.variance_scaling_initializer(),\n      activation_fn=nn_ops.relu,\n      normalizer_fn=layers.batch_norm,\n      normalizer_params=batch_norm_params):\n    with arg_scope([layers.batch_norm], **batch_norm_params):\n      # The following implies padding='SAME' for pool1, which makes feature\n      # alignment easier for dense prediction tasks. This is also used in\n      # https://github.com/facebook/fb.resnet.torch. However, the accompanying\n      # code of 'Deep Residual Learning for Image Recognition' uses\n      # padding='VALID' for pool1. You can switch to that choice by setting\n      # tf.contrib.framework.arg_scope([tf.contrib.layers.max_pool2d], padding='VALID').\n      with arg_scope([layers.max_pool2d], padding='SAME') as arg_sc:\n        return arg_sc\n
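\n\n# --- Usage sketch (editor's illustration; not part of the original module) ---\n# How a ResNet `Block` is described and consumed by stack_blocks_dense().\n# `my_unit_fn` is a hypothetical unit function; real unit functions (e.g. the\n# bottleneck unit in resnet_v1.py) accept the same keyword arguments.\n#\n#   def my_unit_fn(net, depth, depth_bottleneck, stride, rate=1):\n#     ...  # build one residual unit and return its output\n#\n#   block = Block('block1', my_unit_fn, [\n#       {'depth': 256, 'depth_bottleneck': 64, 'stride': 1},\n#       {'depth': 256, 'depth_bottleneck': 64, 'stride': 2},\n#   ])\n#   net = stack_blocks_dense(net, [block], output_stride=None)\n"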
  },
  {
    "path": "tf_contrib/resnet_v1.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Contains definitions for the original form of Residual Networks.\n\nThe 'v1' residual networks (ResNets) implemented in this module were proposed\nby:\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Deep Residual Learning for Image Recognition. arXiv:1512.03385\n\nOther variants were introduced in:\n[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun\n    Identity Mappings in Deep Residual Networks. arXiv: 1603.05027\n\nThe networks defined in this module utilize the bottleneck building block of\n[1] with projection shortcuts only for increasing depths. They employ batch\nnormalization *after* every weight layer. This is the architecture used by\nMSRA in the Imagenet and MSCOCO 2016 competition models ResNet-101 and\nResNet-152. See [2; Fig. 1a] for a comparison between the current 'v1'\narchitecture and the alternative 'v2' architecture of [2] which uses batch\nnormalization *before* every weight layer in the so-called full pre-activation\nunits.\n\nTypical use:\n\n   from tensorflow.contrib.slim.python.slim.nets import\n   resnet_v1\n\nResNet-101 for image classification into 1000 classes:\n\n   # inputs has shape [batch, 224, 224, 3]\n   with slim.arg_scope(resnet_v1.resnet_arg_scope()):\n      net, end_points = resnet_v1.resnet_v1_101(inputs, 1000, is_training=False)\n\nResNet-101 for semantic segmentation into 21 classes:\n\n   # inputs has shape [batch, 513, 513, 3]\n   with slim.arg_scope(resnet_v1.resnet_arg_scope()):\n      net, end_points = resnet_v1.resnet_v1_101(inputs,\n                                                21,\n                                                is_training=False,\n                                                global_pool=False,\n                                                output_stride=16)\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom . import layers\nfrom .arg_scope import add_arg_scope, arg_scope\n# from . import layers as layers_lib\nfrom . import utils\nfrom . import resnet_utils\nfrom tensorflow.python.ops import math_ops\nfrom tensorflow.python.ops import nn_ops\nfrom tensorflow.python.ops import variable_scope\n\nresnet_arg_scope = resnet_utils.resnet_arg_scope\n\n\n@add_arg_scope\ndef bottleneck(inputs,\n               depth,\n               depth_bottleneck,\n               stride,\n               rate=1,\n               outputs_collections=None,\n               scope=None):\n  \"\"\"Bottleneck residual unit variant with BN after convolutions.\n\n  This is the original residual unit proposed in [1]. See Fig. 1(a) of [2] for\n  its definition. 
Note that we use here the bottleneck variant which has an\n  extra bottleneck layer.\n\n  When putting together two consecutive ResNet blocks that use this unit, one\n  should use stride = 2 in the last unit of the first block.\n\n  Args:\n    inputs: A tensor of size [batch, height, width, channels].\n    depth: The depth of the ResNet unit output.\n    depth_bottleneck: The depth of the bottleneck layers.\n    stride: The ResNet unit's stride. Determines the amount of downsampling of\n      the units output compared to its input.\n    rate: An integer, rate for atrous convolution.\n    outputs_collections: Collection to add the ResNet unit output.\n    scope: Optional variable_scope.\n\n  Returns:\n    The ResNet unit's output.\n  \"\"\"\n  with variable_scope.variable_scope(scope, 'bottleneck_v1', [inputs]) as sc:\n    depth_in = utils.last_dimension(inputs.get_shape(), min_rank=4)\n    if depth == depth_in:\n      shortcut = resnet_utils.subsample(inputs, stride, 'shortcut')\n    else:\n      shortcut = layers.conv2d(\n          inputs,\n          depth, [1, 1],\n          stride=stride,\n          activation_fn=None,\n          scope='shortcut')\n\n    residual = layers.conv2d(\n        inputs, depth_bottleneck, [1, 1], stride=1, scope='conv1')\n    residual = resnet_utils.conv2d_same(\n        residual, depth_bottleneck, 3, stride, rate=rate, scope='conv2')\n    residual = layers.conv2d(\n        residual, depth, [1, 1], stride=1, activation_fn=None, scope='conv3')\n\n    output = nn_ops.relu(shortcut + residual)\n\n    return utils.collect_named_outputs(outputs_collections, sc.name, output)\n\n\ndef resnet_v1(inputs,\n              blocks,\n              num_classes=None,\n              is_training=True,\n              global_pool=True,\n              output_stride=None,\n              include_root_block=True,\n              reuse=None,\n              scope=None):\n  \"\"\"Generator for v1 ResNet models.\n\n  This function generates a family of ResNet v1 models. See the resnet_v1_*()\n  methods for specific model instantiations, obtained by selecting different\n  block instantiations that produce ResNets of various depths.\n\n  Training for image classification on Imagenet is usually done with [224, 224]\n  inputs, resulting in [7, 7] feature maps at the output of the last ResNet\n  block for the ResNets defined in [1] that have nominal stride equal to 32.\n  However, for dense prediction tasks we advise that one uses inputs with\n  spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In\n  this case the feature maps at the ResNet output will have spatial shape\n  [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1]\n  and corners exactly aligned with the input image corners, which greatly\n  facilitates alignment of the features to the image. Using as input [225, 225]\n  images results in [8, 8] feature maps at the output of the last ResNet block.\n\n  For dense prediction tasks, the ResNet needs to run in fully-convolutional\n  (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all\n  have nominal stride equal to 32 and a good choice in FCN mode is to use\n  output_stride=16 in order to increase the density of the computed features at\n  small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915.\n\n  Args:\n    inputs: A tensor of size [batch, height_in, width_in, channels].\n    blocks: A list of length equal to the number of ResNet blocks. 
Each element\n      is a resnet_utils.Block object describing the units in the block.\n    num_classes: Number of predicted classes for classification tasks. If None\n      we return the features before the logit layer.\n    is_training: whether batch_norm layers are in training mode.\n    global_pool: If True, we perform global average pooling before computing the\n      logits. Set to True for image classification, False for dense prediction.\n    output_stride: If None, then the output will be computed at the nominal\n      network stride. If output_stride is not None, it specifies the requested\n      ratio of input to output spatial resolution.\n    include_root_block: If True, include the initial convolution followed by\n      max-pooling, if False excludes it.\n    reuse: whether or not the network and its variables should be reused. To be\n      able to reuse 'scope' must be given.\n    scope: Optional variable_scope.\n\n  Returns:\n    net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].\n      If global_pool is False, then height_out and width_out are reduced by a\n      factor of output_stride compared to the respective height_in and width_in,\n      else both height_out and width_out equal one. If num_classes is None, then\n      net is the output of the last ResNet block, potentially after global\n      average pooling. If num_classes is not None, net contains the pre-softmax\n      activations.\n    end_points: A dictionary from components of the network to the corresponding\n      activation.\n\n  Raises:\n    ValueError: If the target output_stride is not valid.\n  \"\"\"\n  with variable_scope.variable_scope(\n      scope, 'resnet_v1', [inputs], reuse=reuse) as sc:\n    end_points_collection = sc.original_name_scope + '_end_points'\n    with arg_scope(\n        [layers.conv2d, bottleneck, resnet_utils.stack_blocks_dense],\n        outputs_collections=end_points_collection):\n      with arg_scope([layers.batch_norm], is_training=is_training):\n        net = inputs\n        if include_root_block:\n          if output_stride is not None:\n            if output_stride % 4 != 0:\n              raise ValueError('The output_stride needs to be a multiple of 4.')\n            output_stride /= 4\n          net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')\n          net = layers.max_pool2d(net, [3, 3], stride=2, scope='pool1')\n        net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)\n        if global_pool:\n          # Global average pooling.\n          net = math_ops.reduce_mean(net, [1, 2], name='pool5', keepdims=True)\n        if num_classes is not None:\n          net = layers.conv2d(\n              net,\n              num_classes, [1, 1],\n              activation_fn=None,\n              normalizer_fn=None,\n              scope='logits')\n        # Convert end_points_collection into a dictionary of end_points.\n        end_points = utils.convert_collection_to_dict(end_points_collection)\n        if num_classes is not None:\n          end_points['predictions'] = layers.softmax(\n              net, scope='predictions')\n        return net, end_points\nresnet_v1.default_image_size = 224\n\n\ndef resnet_v1_block(scope, base_depth, num_units, stride):\n  \"\"\"Helper function for creating a resnet_v1 bottleneck block.\n\n  Args:\n    scope: The scope of the block.\n    base_depth: The depth of the bottleneck layer for each unit.\n    num_units: The number of units in the block.\n    stride: The stride of the block, 
implemented as a stride in the last unit.\n      All other units have stride=1.\n\n  Returns:\n    A resnet_v1 bottleneck block.\n  \"\"\"\n  return resnet_utils.Block(scope, bottleneck, [{\n      'depth': base_depth * 4,\n      'depth_bottleneck': base_depth,\n      'stride': 1\n  }] * (num_units - 1) + [{\n      'depth': base_depth * 4,\n      'depth_bottleneck': base_depth,\n      'stride': stride\n  }])\n\n\ndef resnet_v1_50(inputs,\n                 num_classes=None,\n                 is_training=True,\n                 global_pool=True,\n                 output_stride=None,\n                 reuse=None,\n                 scope='resnet_v1_50'):\n  \"\"\"ResNet-50 model of [1]. See resnet_v1() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v1_block('block2', base_depth=128, num_units=4, stride=2),\n      resnet_v1_block('block3', base_depth=256, num_units=6, stride=2),\n      resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v1(\n      inputs,\n      blocks,\n      num_classes,\n      is_training,\n      global_pool,\n      output_stride,\n      include_root_block=True,\n      reuse=reuse,\n      scope=scope)\n\n\ndef resnet_v1_101(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  reuse=None,\n                  scope='resnet_v1_101'):\n  \"\"\"ResNet-101 model of [1]. See resnet_v1() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v1_block('block2', base_depth=128, num_units=4, stride=2),\n      resnet_v1_block('block3', base_depth=256, num_units=23, stride=2),\n      resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v1(\n      inputs,\n      blocks,\n      num_classes,\n      is_training,\n      global_pool,\n      output_stride,\n      include_root_block=True,\n      reuse=reuse,\n      scope=scope)\n\n\ndef resnet_v1_152(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  reuse=None,\n                  scope='resnet_v1_152'):\n  \"\"\"ResNet-152 model of [1]. See resnet_v1() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v1_block('block2', base_depth=128, num_units=8, stride=2),\n      resnet_v1_block('block3', base_depth=256, num_units=36, stride=2),\n      resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v1(\n      inputs,\n      blocks,\n      num_classes,\n      is_training,\n      global_pool,\n      output_stride,\n      include_root_block=True,\n      reuse=reuse,\n      scope=scope)\n\n\ndef resnet_v1_200(inputs,\n                  num_classes=None,\n                  is_training=True,\n                  global_pool=True,\n                  output_stride=None,\n                  reuse=None,\n                  scope='resnet_v1_200'):\n  \"\"\"ResNet-200 model of [2]. 
See resnet_v1() for arg and return description.\"\"\"\n  blocks = [\n      resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),\n      resnet_v1_block('block2', base_depth=128, num_units=24, stride=2),\n      resnet_v1_block('block3', base_depth=256, num_units=36, stride=2),\n      resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),\n  ]\n  return resnet_v1(\n      inputs,\n      blocks,\n      num_classes,\n      is_training,\n      global_pool,\n      output_stride,\n      include_root_block=True,\n      reuse=reuse,\n      scope=scope)\n"
  },
  {
    "path": "tf_contrib/slim.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Slim is an interface to contrib functions, examples and models.\n\nTODO(nsilberman): flesh out documentation.\n\"\"\"\n\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\n# pylint: disable=unused-import,line-too-long,g-importing-member,wildcard-import\n# TODO(jart): Delete non-slim imports\nfrom .arg_scope import *\n# from .variables import *\nfrom .layers import *\nfrom .initializers import *\nfrom .regularizers import *\n# from tensorflow.contrib.slim.python.slim import evaluation\n# from tensorflow.contrib.slim.python.slim import learning\n# from tensorflow.contrib.slim.python.slim import model_analyzer\n# from tensorflow.contrib.slim.python.slim import queues\n# from tensorflow.contrib.slim.python.slim import summaries\n# from tensorflow.contrib.slim.python.slim.data import data_decoder\n# from tensorflow.contrib.slim.python.slim.data import data_provider\n# from tensorflow.contrib.slim.python.slim.data import dataset\n# from tensorflow.contrib.slim.python.slim.data import dataset_data_provider\n# from tensorflow.contrib.slim.python.slim.data import parallel_reader\n# from tensorflow.contrib.slim.python.slim.data import prefetch_queue\n# from tensorflow.contrib.slim.python.slim.data import tfexample_decoder\nfrom tensorflow.python.util.all_util import make_all\n# pylint: enable=unused-import,line-too-long,g-importing-member,wildcard-import\n\n__all__ = make_all(__name__)\n"
  },
  {
    "path": "tf_contrib/utils.py",
    "content": "# Copyright 2015 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Common util functions used by layers.\n\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nfrom collections import namedtuple\nfrom collections import OrderedDict\nfrom tensorflow.python.framework import ops\nfrom tensorflow.python.framework import tensor_shape\nfrom tensorflow.python.framework import tensor_util\nfrom tensorflow.python.ops import control_flow_ops\nfrom tensorflow.python.ops import variables\n\n__all__ = ['collect_named_outputs',\n           'constant_value',\n           'static_cond',\n           'smart_cond',\n           'get_variable_collections',\n           'two_element_tuple',\n           'n_positive_integers',\n           'channel_dimension',\n           'last_dimension']\n\nNamedOutputs = namedtuple('NamedOutputs', ['name', 'outputs'])\n\n\ndef collect_named_outputs(collections, alias, outputs):\n  \"\"\"Add `Tensor` outputs tagged with alias to collections.\n\n  It is useful to collect end-points or tags for summaries. Example of usage:\n\n  logits = collect_named_outputs('end_points', 'inception_v3/logits', logits)\n  assert 'inception_v3/logits' in logits.aliases\n\n  Args:\n    collections: A collection or list of collections. 
If None, skip collection.\n    alias: String to append to the list of aliases of outputs, for example,\n           'inception_v3/conv1'.\n    outputs: Tensor, an output tensor to collect.\n\n  Returns:\n    The outputs Tensor to allow inline call.\n  \"\"\"\n  if collections:\n    append_tensor_alias(outputs, alias)\n    ops.add_to_collections(collections, outputs)\n  return outputs\n\n\ndef append_tensor_alias(tensor, alias):\n  \"\"\"Append an alias to the list of aliases of the tensor.\n\n  Args:\n    tensor: A `Tensor`.\n    alias: String, to add to the list of aliases of the tensor.\n\n  Returns:\n    The tensor with a new alias appended to its list of aliases.\n  \"\"\"\n  # Remove ending '/' if present.\n  if alias[-1] == '/':\n    alias = alias[:-1]\n  if hasattr(tensor, 'aliases'):\n    tensor.aliases.append(alias)\n  else:\n    tensor.aliases = [alias]\n  return tensor\n\n\ndef gather_tensors_aliases(tensors):\n  \"\"\"Given a list of tensors, gather their aliases.\n\n  Args:\n    tensors: A list of `Tensors`.\n\n  Returns:\n    A list of strings with the aliases of all tensors.\n  \"\"\"\n  aliases = []\n  for tensor in tensors:\n    aliases += get_tensor_aliases(tensor)\n  return aliases\n\n\ndef get_tensor_aliases(tensor):\n  \"\"\"Get a list with the aliases of the input tensor.\n\n  If the tensor does not have any alias, it would default to its op.name or\n  its name.\n\n  Args:\n    tensor: A `Tensor`.\n\n  Returns:\n    A list of strings with the aliases of the tensor.\n  \"\"\"\n  if hasattr(tensor, 'aliases'):\n    aliases = tensor.aliases\n  else:\n    if tensor.name[-2:] == ':0':\n      # Use op.name for tensor ending in :0\n      aliases = [tensor.op.name]\n    else:\n      aliases = [tensor.name]\n  return aliases\n\n\ndef convert_collection_to_dict(collection, clear_collection=False):\n  \"\"\"Returns an OrderedDict of Tensors with their aliases as keys.\n\n  Args:\n    collection: A collection.\n    clear_collection: When True, it clears the collection after converting to\n      OrderedDict.\n\n  Returns:\n    An OrderedDict of {alias: tensor}\n  \"\"\"\n  output = OrderedDict((alias, tensor)\n                       for tensor in ops.get_collection(collection)\n                       for alias in get_tensor_aliases(tensor))\n  if clear_collection:\n    ops.get_default_graph().clear_collection(collection)\n  return output\n\n\ndef constant_value(value_or_tensor_or_var, dtype=None):\n  \"\"\"Returns value if value_or_tensor_or_var has a constant value.\n\n  Args:\n    value_or_tensor_or_var: A value, a `Tensor` or a `Variable`.\n    dtype: Optional `tf.dtype`, if set it would check it has the right\n      dtype.\n\n  Returns:\n    The constant value, or None if it is not constant.\n\n  Raises:\n    ValueError: if value_or_tensor_or_var is None or the tensor_variable has the\n    wrong dtype.\n  \"\"\"\n  if value_or_tensor_or_var is None:\n    raise ValueError('value_or_tensor_or_var cannot be None')\n  value = value_or_tensor_or_var\n  if isinstance(value_or_tensor_or_var, (ops.Tensor, variables.Variable)):\n    if dtype and value_or_tensor_or_var.dtype != dtype:\n      raise ValueError('It has the wrong type %s instead of %s' % (\n          value_or_tensor_or_var.dtype, dtype))\n    if isinstance(value_or_tensor_or_var, variables.Variable):\n      value = None\n    else:\n      value = tensor_util.constant_value(value_or_tensor_or_var)\n  return value\n\n\ndef static_cond(pred, fn1, fn2):\n  \"\"\"Return either fn1() or fn2() based on the boolean value of `pred`.\n\n 
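 Example (an illustrative sketch, not part of the original docs): `static_cond(True, lambda: a, lambda: b)`\n  simply calls and returns `a` at graph-construction time; no `cond` op is added to the graph.\n\n 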
 Same signature as `control_flow_ops.cond()` but requires pred to be a bool.\n\n  Args:\n    pred: A value determining whether to return the result of `fn1` or `fn2`.\n    fn1: The callable to be performed if pred is true.\n    fn2: The callable to be performed if pred is false.\n\n  Returns:\n    Tensors returned by the call to either `fn1` or `fn2`.\n\n  Raises:\n    TypeError: if `fn1` or `fn2` is not callable.\n  \"\"\"\n  if not callable(fn1):\n    raise TypeError('fn1 must be callable.')\n  if not callable(fn2):\n    raise TypeError('fn2 must be callable.')\n  if pred:\n    return fn1()\n  else:\n    return fn2()\n\n\ndef smart_cond(pred, fn1, fn2, name=None):\n  \"\"\"Return either fn1() or fn2() based on the boolean predicate/value `pred`.\n\n  If `pred` is bool or has a constant value it would use `static_cond`,\n  otherwise it would use `tf.cond`.\n\n  Args:\n    pred: A scalar determining whether to return the result of `fn1` or `fn2`.\n    fn1: The callable to be performed if pred is true.\n    fn2: The callable to be performed if pred is false.\n    name: Optional name prefix when using tf.cond.\n\n  Returns:\n    Tensors returned by the call to either `fn1` or `fn2`.\n  \"\"\"\n  pred_value = constant_value(pred)\n  if pred_value is not None:\n    # Use static_cond if pred has a constant value.\n    return static_cond(pred_value, fn1, fn2)\n  else:\n    # Use dynamic cond otherwise.\n    return control_flow_ops.cond(pred, fn1, fn2, name)\n\n\ndef get_variable_collections(variables_collections, name):\n  if isinstance(variables_collections, dict):\n    variable_collections = variables_collections.get(name, None)\n  else:\n    variable_collections = variables_collections\n  return variable_collections\n\n\ndef _get_dimension(shape, dim, min_rank=1):\n  \"\"\"Returns the `dim` dimension of `shape`, while checking it has `min_rank`.\n\n  Args:\n    shape: A `TensorShape`.\n    dim: Integer, which dimension to return.\n    min_rank: Integer, minimum rank of shape.\n\n  Returns:\n    The value of the `dim` dimension.\n\n  Raises:\n    ValueError: if inputs don't have at least min_rank dimensions, or if the\n      `dim` dimension value is not defined.\n  \"\"\"\n  dims = shape.dims\n  if dims is None:\n    raise ValueError('dims of shape must be known but is None')\n  if len(dims) < min_rank:\n    raise ValueError('rank of shape must be at least %d not: %d' % (min_rank,\n                                                                    len(dims)))\n  value = dims[dim].value\n  if value is None:\n    raise ValueError(\n        'dimension %d of shape must be known but is None: %s' % (dim, shape))\n  return value\n\n\ndef channel_dimension(shape, data_format, min_rank=1):\n  \"\"\"Returns the channel dimension of shape, while checking it has min_rank.\n\n  Args:\n    shape: A `TensorShape`.\n    data_format: `channels_first` or `channels_last`.\n    min_rank: Integer, minimum rank of shape.\n\n  Returns:\n    The value of the channel dimension.\n\n  Raises:\n    ValueError: if inputs don't have at least min_rank dimensions, or if the\n      channel dimension value is not defined.\n  \"\"\"\n  return _get_dimension(shape, 1 if data_format == 'channels_first' else -1,\n                        min_rank=min_rank)\n\n\ndef last_dimension(shape, min_rank=1):\n  \"\"\"Returns the last dimension of shape while checking it has min_rank.\n\n  Args:\n    shape: A `TensorShape`.\n    min_rank: Integer, minimum rank of shape.\n\n  Returns:\n    The value of the last dimension.\n\n  Raises:\n    
ValueError: if inputs don't have at least min_rank dimensions, or if the\n      last dimension value is not defined.\n  \"\"\"\n  return _get_dimension(shape, -1, min_rank=min_rank)\n\n\ndef two_element_tuple(int_or_tuple):\n  \"\"\"Converts `int_or_tuple` to height, width.\n\n  Several of the functions that follow accept arguments as either\n  a tuple of 2 integers or a single integer.  A single integer\n  indicates that the 2 values of the tuple are the same.\n\n  This function normalizes the input value by always returning a tuple.\n\n  Args:\n    int_or_tuple: A list of 2 ints, a single int or a `TensorShape`.\n\n  Returns:\n    A tuple with 2 values.\n\n  Raises:\n    ValueError: If `int_or_tuple` is not well formed.\n  \"\"\"\n  if isinstance(int_or_tuple, (list, tuple)):\n    if len(int_or_tuple) != 2:\n      raise ValueError('Must be a list with 2 elements: %s' % int_or_tuple)\n    return int(int_or_tuple[0]), int(int_or_tuple[1])\n  if isinstance(int_or_tuple, int):\n    return int(int_or_tuple), int(int_or_tuple)\n  if isinstance(int_or_tuple, tensor_shape.TensorShape):\n    if len(int_or_tuple) == 2:\n      return int_or_tuple[0], int_or_tuple[1]\n  raise ValueError('Must be an int, a list with 2 elements or a TensorShape of '\n                   'length 2')\n\n\ndef n_positive_integers(n, value):\n  \"\"\"Converts `value` to a sequence of `n` positive integers.\n\n  `value` may either be a sequence of values convertible to `int`, or a\n  single value convertible to `int`, in which case the resulting integer is\n  duplicated `n` times.  It may also be a TensorShape of rank `n`.\n\n  Args:\n    n: Length of sequence to return.\n    value: Either a single value convertible to a positive `int` or an\n      `n`-element sequence of values convertible to a positive `int`.\n\n  Returns:\n    A tuple of `n` positive integers.\n\n  Raises:\n    TypeError: If `n` is not convertible to an integer.\n    ValueError: If `n` or `value` are invalid.\n  \"\"\"\n\n  n_orig = n\n  n = int(n)\n  if n < 1 or n != n_orig:\n    raise ValueError('n must be a positive integer')\n\n  try:\n    value = int(value)\n  except (TypeError, ValueError):\n    sequence_len = len(value)\n    if sequence_len != n:\n      raise ValueError(\n          'Expected sequence of %d positive integers, but received %r' %\n          (n, value))\n    try:\n      values = tuple(int(x) for x in value)\n    except (TypeError, ValueError):\n      # Only conversion failures should be reported here; avoid a bare except.\n      raise ValueError(\n          'Expected sequence of %d positive integers, but received %r' %\n          (n, value))\n    for x in values:\n      if x < 1:\n        raise ValueError('expected positive integer, but received %d' % x)\n    return values\n\n  if value < 1:\n    raise ValueError('expected positive integer, but received %d' % value)\n  return (value,) * n\n"
  },
  {
    "path": "tf_contrib/variables.py",
    "content": "# Copyright 2015 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\"\"\"Variable functions.\"\"\"\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\n\nimport functools\nimport re\n\nfrom .arg_scope import add_arg_scope as contrib_add_arg_scope\n# from . import gen_variable_ops\nfrom . import loader\nfrom tensorflow.core.protobuf import saver_pb2\nfrom tensorflow.python import pywrap_tensorflow\nfrom tensorflow.python.framework import device as tf_device\nfrom tensorflow.python.framework import dtypes\nfrom tensorflow.python.framework import ops\nfrom tensorflow.python.ops import array_ops\nfrom tensorflow.python.ops import control_flow_ops\nfrom tensorflow.python.ops import resource_variable_ops\nfrom tensorflow.python.ops import variable_scope\nfrom tensorflow.python.ops import variables\nfrom tensorflow.python.platform import resource_loader\nfrom tensorflow.python.platform import tf_logging as logging\nfrom tensorflow.python.training import saver as tf_saver\nfrom tensorflow.python.training import training_util\nfrom tensorflow.python.util.deprecation import deprecated\n\n\n__all__ = ['add_model_variable',\n           'assert_global_step',\n           'assert_or_get_global_step',\n           'assign_from_checkpoint',\n           'assign_from_checkpoint_fn',\n           'assign_from_values',\n           'assign_from_values_fn',\n           'create_global_step',\n           'filter_variables',\n           'get_global_step',\n           'get_or_create_global_step',\n           'get_local_variables',\n           'get_model_variables',\n           'get_trainable_variables',\n           'get_unique_variable',\n           'get_variables_by_name',\n           'get_variables_by_suffix',\n           'get_variable_full_name',\n           'get_variables_to_restore',\n           'get_variables',\n           'global_variable',\n           'local_variable',\n           'model_variable',\n           'variable',\n           'VariableDeviceChooser',\n           'zero_initializer']\n\n\ndef zero_initializer(ref, use_locking=True, name=\"zero_initializer\"):\n  \"\"\"Initialize 'ref' with all zeros, ref tensor should be uninitialized.\n\n  If already initialized, you will get ValueError. 
This op is intended to\n  save memory during initialization.\n\n  Args:\n    ref: ref of the tensor that needs to be zero initialized.\n    name: optional name for this operation.\n\n  Returns:\n    ref that is initialized.\n\n  Raises:\n    ValueError: If ref tensor is initialized.\n  \"\"\"\n  loader.load_op_library(\n      resource_loader.get_path_to_datafile('_variable_ops.so'))\n  if resource_variable_ops.is_resource_variable(ref):\n    # gen_variable_ops is not bundled with this simplified version, so the\n    # resource-variable path is unavailable as well (previously this raised\n    # a NameError because the import is commented out).\n    # return gen_variable_ops.zero_var_initializer(\n    #     ref.handle, shape=ref.shape, dtype=ref.dtype, name=name)\n    raise RuntimeError('gen_variable_ops is not implemented in this simplified version')\n  else:\n    # return gen_variable_ops.zero_initializer(ref, name=name)\n    raise RuntimeError('gen_variable_ops is not implemented in this simplified version')\n\n\n@deprecated(None, 'Please switch to tf.train.assert_global_step')\ndef assert_global_step(global_step_tensor):\n  training_util.assert_global_step(global_step_tensor)\n\n\ndef assert_or_get_global_step(graph=None, global_step_tensor=None):\n  \"\"\"Verifies that a global step tensor is valid or gets one if None is given.\n\n  If `global_step_tensor` is not None, check that it is a valid global step\n  tensor (using `assert_global_step`). Otherwise find a global step tensor using\n  `get_global_step` and return it.\n\n  Args:\n    graph: The graph to find the global step tensor for.\n    global_step_tensor: The tensor to check for suitability as a global step. If\n      None is given (the default), find a global step tensor.\n\n  Returns:\n    A tensor suitable as a global step, or `None` if none was provided and none\n    was found.\n  \"\"\"\n  if global_step_tensor is None:\n    # Get the global step tensor the same way the supervisor would.\n    global_step_tensor = get_global_step(graph)\n  else:\n    assert_global_step(global_step_tensor)\n  return global_step_tensor\n\n\n@deprecated(None, 'Please switch to tf.train.get_global_step')\ndef get_global_step(graph=None):\n  return training_util.get_global_step(graph)\n\n\n@deprecated(None, 'Please switch to tf.train.create_global_step')\ndef create_global_step(graph=None):\n  \"\"\"Create global step tensor in graph.\n\n  This API is deprecated. Use core framework training version instead.\n\n  Args:\n    graph: The graph in which to create the global step tensor. If missing, use\n      default graph.\n\n  Returns:\n    Global step tensor.\n\n  Raises:\n    ValueError: if global step tensor is already defined.\n  \"\"\"\n  return training_util.create_global_step(graph)\n\n\n@deprecated(None, 'Please switch to tf.train.get_or_create_global_step')\ndef get_or_create_global_step(graph=None):\n  \"\"\"Returns and creates (if necessary) the global step tensor.\n\n  Args:\n    graph: The graph in which to create the global step tensor. 
If missing, use\n      default graph.\n\n  Returns:\n    The global step tensor.\n  \"\"\"\n  return training_util.get_or_create_global_step(graph)\n\n\ndef local_variable(initial_value,\n                   validate_shape=True,\n                   name=None,\n                   use_resource=None):\n  \"\"\"Create a variable with a value and add it to `GraphKeys.LOCAL_VARIABLES`.\n\n  Args:\n    initial_value: See variables.Variable.__init__.\n    validate_shape: See variables.Variable.__init__.\n    name: See variables.Variable.__init__.\n    use_resource: If `True` use a ResourceVariable instead of a Variable.\n\n  Returns:\n    New variable.\n  \"\"\"\n  return variable_scope.variable(\n      initial_value,\n      trainable=False,\n      collections=[ops.GraphKeys.LOCAL_VARIABLES],\n      validate_shape=validate_shape,\n      use_resource=use_resource,\n      name=name)\n\n\ndef global_variable(initial_value,\n                    validate_shape=True,\n                    name=None,\n                    use_resource=None):\n  \"\"\"Create a variable with a value and add it to `GraphKeys.GLOBAL_VARIABLES`.\n\n  Args:\n    initial_value: See variables.Variable.__init__.\n    validate_shape: See variables.Variable.__init__.\n    name: See variables.Variable.__init__.\n    use_resource: If `True` use a ResourceVariable instead of a Variable.\n\n  Returns:\n    New variable.\n  \"\"\"\n  return variable_scope.variable(\n      initial_value,\n      trainable=False,\n      collections=[ops.GraphKeys.GLOBAL_VARIABLES],\n      validate_shape=validate_shape,\n      use_resource=use_resource,\n      name=name)\n\n\n@contrib_add_arg_scope\ndef variable(name,\n             shape=None,\n             dtype=None,\n             initializer=None,\n             regularizer=None,\n             trainable=True,\n             collections=None,\n             caching_device=None,\n             device=None,\n             partitioner=None,\n             custom_getter=None,\n             use_resource=None,\n             synchronization=variables.VariableSynchronization.AUTO,\n             aggregation=variables.VariableAggregation.NONE):\n  \"\"\"Gets an existing variable with these parameters or creates a new one.\n\n  Args:\n    name: the name of the new or existing variable.\n    shape: shape of the new or existing variable.\n    dtype: type of the new or existing variable (defaults to `DT_FLOAT`).\n    initializer: initializer for the variable if one is created.\n    regularizer: a (Tensor -> Tensor or None) function; the result of applying\n      it on a newly created variable will be added to the collection\n      GraphKeys.REGULARIZATION_LOSSES and can be used for regularization.\n    trainable: If `True` also add the variable to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n    collections: A list of collection names to which the Variable will be added.\n      If None it would default to `tf.GraphKeys.GLOBAL_VARIABLES`.\n    caching_device: Optional device string or function describing where the\n      Variable should be cached for reading.  Defaults to the Variable's device.\n    device: Optional device to place the variable. 
It can be a string or a\n      function that is called to get the device for the variable.\n    partitioner: Optional callable that accepts a fully defined `TensorShape`\n      and dtype of the `Variable` to be created, and returns a list of\n      partitions for each axis (currently only one axis can be partitioned).\n    custom_getter: Callable that allows overwriting the internal get_variable\n      method and has to have the same signature.\n    use_resource: If `True` use a ResourceVariable instead of a Variable.\n    synchronization: Indicates when a distributed variable will be aggregated.\n      Accepted values are constants defined in the class\n      `tf.VariableSynchronization`. By default the synchronization is set to\n      `AUTO` and the current `DistributionStrategy` chooses when to synchronize.\n    aggregation: Indicates how a distributed variable will be aggregated.\n      Accepted values are constants defined in the class\n      `tf.VariableAggregation`.\n\n  Returns:\n    The created or existing variable.\n  \"\"\"\n  collections = list(collections if collections is not None else\n                     [ops.GraphKeys.GLOBAL_VARIABLES])\n\n  # Remove duplicates\n  collections = list(set(collections))\n  getter = variable_scope.get_variable\n  if custom_getter is not None:\n    getter = functools.partial(\n        custom_getter, reuse=variable_scope.get_variable_scope().reuse)\n  with ops.device(device or ''):\n    return getter(\n        name,\n        shape=shape,\n        dtype=dtype,\n        initializer=initializer,\n        regularizer=regularizer,\n        trainable=trainable,\n        collections=collections,\n        caching_device=caching_device,\n        partitioner=partitioner,\n        use_resource=use_resource,\n        synchronization=synchronization,\n        aggregation=aggregation)\n\n\n@contrib_add_arg_scope\ndef model_variable(name,\n                   shape=None,\n                   dtype=dtypes.float32,\n                   initializer=None,\n                   regularizer=None,\n                   trainable=True,\n                   collections=None,\n                   caching_device=None,\n                   device=None,\n                   partitioner=None,\n                   custom_getter=None,\n                   use_resource=None,\n                   synchronization=variables.VariableSynchronization.AUTO,\n                   aggregation=variables.VariableAggregation.NONE):\n  \"\"\"Gets an existing model variable with these parameters or creates a new one.\n\n  Args:\n    name: the name of the new or existing variable.\n    shape: shape of the new or existing variable.\n    dtype: type of the new or existing variable (defaults to `DT_FLOAT`).\n    initializer: initializer for the variable if one is created.\n    regularizer: a (Tensor -> Tensor or None) function; the result of applying\n      it on a newly created variable will be added to the collection\n      GraphKeys.REGULARIZATION_LOSSES and can be used for regularization.\n    trainable: If `True` also add the variable to the graph collection\n      `GraphKeys.TRAINABLE_VARIABLES` (see `tf.Variable`).\n    collections: A list of collection names to which the Variable will be added.\n      Note that the variable is always also added to the\n      `GraphKeys.GLOBAL_VARIABLES` and `GraphKeys.MODEL_VARIABLES` collections.\n    caching_device: Optional device string or function describing where the\n      Variable should be cached for reading.  
Defaults to the Variable's device.\n    device: Optional device to place the variable. It can be a string or a\n      function that is called to get the device for the variable.\n    partitioner: Optional callable that accepts a fully defined `TensorShape`\n      and dtype of the `Variable` to be created, and returns a list of\n      partitions for each axis (currently only one axis can be partitioned).\n    custom_getter: Callable that allows overwriting the internal get_variable\n      method and has to have the same signature.\n    use_resource: If `True` use a ResourceVariable instead of a Variable.\n    synchronization: Indicates when a distributed variable will be aggregated.\n      Accepted values are constants defined in the class\n      `tf.VariableSynchronization`. By default the synchronization is set to\n      `AUTO` and the current `DistributionStrategy` chooses when to synchronize.\n    aggregation: Indicates how a distributed variable will be aggregated.\n      Accepted values are constants defined in the class\n      `tf.VariableAggregation`.\n\n  Returns:\n    The created or existing variable.\n  \"\"\"\n  collections = list(collections or [])\n  collections += [ops.GraphKeys.GLOBAL_VARIABLES, ops.GraphKeys.MODEL_VARIABLES]\n  var = variable(\n      name,\n      shape=shape,\n      dtype=dtype,\n      initializer=initializer,\n      regularizer=regularizer,\n      trainable=trainable,\n      collections=collections,\n      caching_device=caching_device,\n      device=device,\n      partitioner=partitioner,\n      custom_getter=custom_getter,\n      use_resource=use_resource,\n      synchronization=synchronization,\n      aggregation=aggregation)\n  return var\n\n\ndef add_model_variable(var):\n  \"\"\"Adds a variable to the `GraphKeys.MODEL_VARIABLES` collection.\n\n  Args:\n    var: a variable.\n  \"\"\"\n  if var not in ops.get_collection(ops.GraphKeys.MODEL_VARIABLES):\n    ops.add_to_collection(ops.GraphKeys.MODEL_VARIABLES, var)\n\n\ndef get_variables(scope=None,\n                  suffix=None,\n                  collection=ops.GraphKeys.GLOBAL_VARIABLES):\n  \"\"\"Gets the list of variables, filtered by scope and/or suffix.\n\n  Args:\n    scope: an optional scope for filtering the variables to return. Can be a\n      variable scope or a string.\n    suffix: an optional suffix for filtering the variables to return.\n    collection: the collection to search in. 
Defaults to\n      `GraphKeys.GLOBAL_VARIABLES`.\n\n  Returns:\n    a list of variables in collection with scope and suffix.\n  \"\"\"\n  if isinstance(scope, variable_scope.VariableScope):\n    scope = scope.name\n  if suffix is not None:\n    if ':' not in suffix:\n      suffix += ':'\n    scope = (scope or '') + '.*' + suffix\n  return ops.get_collection(collection, scope)\n\n\ndef get_model_variables(scope=None, suffix=None):\n  \"\"\"Gets the list of model variables, filtered by scope and/or suffix.\n\n  Args:\n    scope: an optional scope for filtering the variables to return.\n    suffix: an optional suffix for filtering the variables to return.\n\n  Returns:\n    a list of variables in collection with scope and suffix.\n  \"\"\"\n  return get_variables(scope, suffix, ops.GraphKeys.MODEL_VARIABLES)\n\n\ndef get_local_variables(scope=None, suffix=None):\n  \"\"\"Gets the list of local variables, filtered by scope and/or suffix.\n\n  Args:\n    scope: an optional scope for filtering the variables to return.\n    suffix: an optional suffix for filtering the variables to return.\n\n  Returns:\n    a list of variables in collection with scope and suffix.\n  \"\"\"\n  return get_variables(scope, suffix, ops.GraphKeys.LOCAL_VARIABLES)\n\n\ndef get_trainable_variables(scope=None, suffix=None):\n  \"\"\"Gets the list of trainable variables, filtered by scope and/or suffix.\n\n  Args:\n    scope: an optional scope for filtering the variables to return.\n    suffix: an optional suffix for filtering the variables to return.\n\n  Returns:\n    a list of variables in the trainable collection with scope and suffix.\n  \"\"\"\n  return get_variables(scope, suffix, ops.GraphKeys.TRAINABLE_VARIABLES)\n\n\ndef get_variables_to_restore(include=None, exclude=None):\n  \"\"\"Gets the list of the variables to restore.\n\n  Args:\n    include: an optional list/tuple of scope strings for filtering which\n      variables from the VARIABLES collection to include. None would include all\n      the variables.\n    exclude: an optional list/tuple of scope strings for filtering which\n      variables from the VARIABLES collection to exclude. 
If None, it would not\n      exclude any variables.\n\n  Returns:\n    a list of variables to restore.\n\n  Raises:\n    TypeError: include or exclude is provided but is not a list or a tuple.\n  \"\"\"\n  if include is None:\n    # Include all variables.\n    vars_to_include = get_variables()\n  else:\n    if not isinstance(include, (list, tuple)):\n      raise TypeError('include is provided but is not a list or a tuple.')\n    vars_to_include = []\n    for scope in include:\n      vars_to_include += get_variables(scope)\n  vars_to_exclude = set()\n  if exclude is not None:\n    if not isinstance(exclude, (list, tuple)):\n      raise TypeError('exclude is provided but is not a list or a tuple.')\n    for scope in exclude:\n      vars_to_exclude |= set(get_variables(scope))\n  # Exclude the variables in vars_to_exclude\n  return [v for v in vars_to_include if v not in vars_to_exclude]\n\n\ndef get_variables_by_suffix(suffix, scope=None):\n  \"\"\"Gets the list of variables that end with the given suffix.\n\n  Args:\n    suffix: suffix for filtering the variables to return.\n    scope: an optional scope for filtering the variables to return.\n\n  Returns:\n    a copied list of variables with the given suffix and scope.\n  \"\"\"\n  return get_variables(scope=scope, suffix=suffix)\n\n\ndef get_variables_by_name(given_name, scope=None):\n  \"\"\"Gets the list of variables that were given that name.\n\n  Args:\n    given_name: name given to the variable without any scope.\n    scope: an optional scope for filtering the variables to return.\n\n  Returns:\n    a copied list of variables with the given name and scope.\n  \"\"\"\n  suffix = '/' + given_name + ':|^' + given_name + ':'\n  return get_variables(scope=scope, suffix=suffix)\n\n\ndef get_unique_variable(var_op_name):\n  \"\"\"Gets the variable uniquely identified by that var_op_name.\n\n  Args:\n    var_op_name: the full name of the variable op, including the scope.\n\n  Returns:\n    a tensorflow variable.\n\n  Raises:\n    ValueError: if no variable uniquely identified by the name exists.\n  \"\"\"\n  candidates = get_variables(scope=var_op_name)\n  if not candidates:\n    raise ValueError('Couldn\\'t find variable %s' % var_op_name)\n\n  for candidate in candidates:\n    if candidate.op.name == var_op_name:\n      return candidate\n  raise ValueError('Variable %s does not uniquely identify a variable' %\n                   var_op_name)\n\n\ndef assign_from_values(var_names_to_values):\n  \"\"\"Creates an assignment operation from a given mapping.\n\n  This function provides a mechanism for performing assignment of variables\n  to values in a way that does not fill the graph with large assignment values.\n\n  Args:\n    var_names_to_values: A map from variable names to values.\n\n  Returns:\n    assign_op: An `Operation` that assigns each of the given variables to the\n      requested values.\n    feed_dict: The feed dictionary to use when evaluating `assign_op`.\n\n  Raises:\n    ValueError: if any of the given variable names were not found.\n  \"\"\"\n  feed_dict = {}\n  assign_ops = []\n\n  for var_name in var_names_to_values:\n    var_value = var_names_to_values[var_name]\n    var = ops.get_collection(ops.GraphKeys.GLOBAL_VARIABLES, var_name)\n    if not var:\n      raise ValueError('Variable %s wasn\\'t found' % var_name)\n    elif len(var) > 1:\n      # tf.compat.v1.get_collection is just a filter on the prefix: find the exact match:\n      found = False\n      for v in var:\n        if v.op.name == var_name:\n          var = v\n          found 
= True\n          break\n\n      if not found:\n        raise ValueError('Variable %s doesn\\'t uniquely identify a variable' %\n                         var_name)\n    else:\n      var = var[0]\n\n    # TODO(nsilberman): ensure placeholder and assign are on the same device.\n    # Assign a placeholder to the value that will be filled later.\n    placeholder_name = 'placeholder/' + var.op.name\n    placeholder_value = array_ops.placeholder(\n        dtype=var.dtype.base_dtype,\n        shape=var.get_shape(),\n        name=placeholder_name)\n    assign_ops.append(var.assign(placeholder_value))\n\n    feed_dict[placeholder_value] = var_value.reshape(var.get_shape())\n\n  assign_op = control_flow_ops.group(*assign_ops)\n  return assign_op, feed_dict\n\n\ndef assign_from_values_fn(var_names_to_values):\n  \"\"\"Returns a function that assigns specific variables from the given values.\n\n  This function provides a mechanism for performing assignment of variables\n  to values in a way that does not fill the graph with large assignment values.\n\n  Args:\n    var_names_to_values: A map from variable names to values.\n\n  Returns:\n    A function that takes a single argument, a `tf.compat.v1.Session`, and\n    applies the assignment operation.\n\n  Raises:\n    ValueError: if any of the given variable names were not found.\n  \"\"\"\n  assign_op, feed_dict = assign_from_values(var_names_to_values)\n\n  def callback(session):\n    return session.run(assign_op, feed_dict)\n\n  return callback\n\n\n# pylint: disable=protected-access\n# Currently variable_scope doesn't provide very good APIs to access\n# all variables under scope and retrieve and check existing scopes.\ndef get_variable_full_name(var):\n  \"\"\"Returns the full name of a variable.\n\n  For normal Variables, this is the same as the var.op.name.  For\n  sliced or PartitionedVariables, this name is the same for all the\n  slices/partitions. In both cases, this is normally the name used in\n  a checkpoint file.\n\n  Args:\n    var: A `Variable` object.\n\n  Returns:\n    A string that is the full name.\n  \"\"\"\n  if var._save_slice_info:\n    return var._save_slice_info.full_name\n  else:\n    return var.op.name\n\n\n# TODO(nsilberman): add flag to load exponential moving averages instead\n#\n# TODO(sguada): Update docs in slim/g3doc/index.md to describe\n# the new feature where the var_list dictionary can have values that\n# are each a list of Variables.\ndef assign_from_checkpoint(model_path, var_list, ignore_missing_vars=False):\n  \"\"\"Creates an operation to assign specific variables from a checkpoint.\n\n  Args:\n    model_path: The full path to the model checkpoint. To get latest checkpoint\n      use `model_path = tf.train.latest_checkpoint(checkpoint_dir)`\n    var_list: A list of (possibly partitioned) `Variable` objects or a\n      dictionary mapping names in the checkpoint to the corresponding variables\n      or list of variables to initialize from that checkpoint value. For\n      partitioned Variables, the name in the checkpoint must be the full\n      variable, not the name of the partitioned variable, e.g. \"my_var\" rather\n      than \"my_var/part_4\". 
If empty, returns no_op(), {}.\n    ignore_missing_vars: Boolean, if True ignore variables missing in the\n      checkpoint with a warning instead of failing.\n\n  Returns:\n    the restore_op and the feed_dict that need to be run to restore var_list.\n\n  Raises:\n    ValueError: If `ignore_missing_vars` is False and the checkpoint specified\n        at `model_path` is missing one of the variables in `var_list`.\n  \"\"\"\n  # Normalize var_list into a dictionary mapping names in the\n  # checkpoint to the list of variables to initialize from that\n  # checkpoint variable. Sliced (including partitioned) variables will\n  # end up under the same key.\n  grouped_vars = {}\n  if isinstance(var_list, (tuple, list)):\n    for var in var_list:\n      ckpt_name = get_variable_full_name(var)\n      if ckpt_name not in grouped_vars:\n        grouped_vars[ckpt_name] = []\n      grouped_vars[ckpt_name].append(var)\n\n  else:\n    for ckpt_name, value in var_list.items():\n      if isinstance(value, (tuple, list)):\n        grouped_vars[ckpt_name] = value\n      else:\n        grouped_vars[ckpt_name] = [value]\n\n  # Read each checkpoint entry. Create a placeholder variable and\n  # add the (possibly sliced) data from the checkpoint to the feed_dict.\n  reader = pywrap_tensorflow.NewCheckpointReader(model_path)\n  feed_dict = {}\n  assign_ops = []\n  for ckpt_name in grouped_vars:\n    if not reader.has_tensor(ckpt_name):\n      log_str = 'Checkpoint is missing variable [%s]' % ckpt_name\n      if ignore_missing_vars:\n        logging.warning(log_str)\n        continue\n      else:\n        raise ValueError(log_str)\n    ckpt_value = reader.get_tensor(ckpt_name)\n\n    for var in grouped_vars[ckpt_name]:\n      placeholder_tensor = array_ops.placeholder(\n          dtype=var.dtype.base_dtype,\n          shape=var.get_shape(),\n          name='placeholder/' + var.op.name)\n      assign_ops.append(var.assign(placeholder_tensor))\n\n      if not var._save_slice_info:\n        if var.get_shape() != ckpt_value.shape:\n          raise ValueError(\n              'Total size of new array must be unchanged for %s '\n              'lh_shape: [%s], rh_shape: [%s]' %\n              (ckpt_name, str(ckpt_value.shape), str(var.get_shape())))\n\n        feed_dict[placeholder_tensor] = ckpt_value.reshape(ckpt_value.shape)\n      else:\n        slice_dims = zip(var._save_slice_info.var_offset,\n                         var._save_slice_info.var_shape)\n        slice_dims = [(start, start + size) for (start, size) in slice_dims]\n        slice_dims = [slice(*x) for x in slice_dims]\n        # NumPy requires a tuple of slices (not a list) for\n        # multi-dimensional indexing.\n        slice_value = ckpt_value[tuple(slice_dims)]\n        slice_value = slice_value.reshape(var._save_slice_info.var_shape)\n        feed_dict[placeholder_tensor] = slice_value\n\n  assign_op = control_flow_ops.group(*assign_ops)\n  return assign_op, feed_dict\n\n\n# pylint: enable=protected-access\n\n\ndef assign_from_checkpoint_fn(model_path,\n                              var_list,\n                              ignore_missing_vars=False,\n                              reshape_variables=False):\n  \"\"\"Returns a function that assigns specific variables from a checkpoint.\n\n  If ignore_missing_vars is True and no variables are found in the checkpoint\n  it returns None.\n\n  Args:\n    model_path: The full path to the model checkpoint. 
To get latest checkpoint\n      use `model_path = tf.train.latest_checkpoint(checkpoint_dir)`\n    var_list: A list of `Variable` objects or a dictionary mapping names in the\n      checkpoint to the corresponding variables to initialize. If empty or\n      `None`, a `ValueError` is raised.\n    ignore_missing_vars: Boolean, if True it would ignore variables missing in\n      the checkpoint with a warning instead of failing.\n    reshape_variables: Boolean, if True it would automatically reshape variables\n      which are of different shape than the ones stored in the checkpoint but\n      which have the same number of elements.\n\n  Returns:\n    A function that takes a single argument, a `tf.compat.v1.Session`, and\n    applies the assignment operation. If no matching variables were found in the\n    checkpoint then `None` is returned.\n\n  Raises:\n    ValueError: If var_list is empty.\n  \"\"\"\n  if not var_list:\n    raise ValueError('var_list cannot be empty')\n  if ignore_missing_vars:\n    reader = pywrap_tensorflow.NewCheckpointReader(model_path)\n    if isinstance(var_list, dict):\n      var_dict = var_list\n    else:\n      var_dict = {var.op.name: var for var in var_list}\n    available_vars = {}\n    for var in var_dict:\n      if reader.has_tensor(var):\n        available_vars[var] = var_dict[var]\n      else:\n        logging.warning('Variable %s missing in checkpoint %s', var, model_path)\n    var_list = available_vars\n  if var_list:\n    saver = tf_saver.Saver(\n        var_list,\n        reshape=reshape_variables,\n        write_version=saver_pb2.SaverDef.V1)\n\n    def callback(session):\n      saver.restore(session, model_path)\n\n    return callback\n  else:\n    logging.warning('No Variables to restore')\n    return None\n\n\nclass VariableDeviceChooser(object):\n  \"\"\"Device chooser for variables.\n\n  When using a parameter server it will assign them in a round-robin fashion.\n  When not using a parameter server it allows GPU or CPU placement.\n  \"\"\"\n\n  def __init__(self,\n               num_tasks=0,\n               job_name='ps',\n               device_type='CPU',\n               device_index=0,\n               replica=None):\n    \"\"\"Initialize VariableDeviceChooser.\n\n    Usage:\n      To use with 2 parameter servers:\n        VariableDeviceChooser(2)\n\n      To use without parameter servers:\n        VariableDeviceChooser()\n        VariableDeviceChooser(device_type='GPU') # For GPU placement\n\n    Args:\n      num_tasks: number of tasks.\n      job_name: String, a name for the parameter server job.\n      device_type: Optional device type string (e.g. \"CPU\" or \"GPU\")\n      device_index: int.  Optional device index.  
If left unspecified, device\n        represents 'any' device_index.\n      replica: Optional replica index; if set, it is included in the generated\n        `DeviceSpec`.\n    \"\"\"\n    self._job_name = job_name\n    self._device_type = device_type\n    self._device_index = device_index\n    self._replica = replica\n    self._num_tasks = num_tasks\n    self._next_task_id = 0\n\n  def __call__(self, op):\n    device_spec = tf_device.DeviceSpec(\n        replica=self._replica,\n        device_type=self._device_type,\n        device_index=self._device_index)\n    if self._num_tasks > 0:\n      task_id = self._next_task_id\n      self._next_task_id = (self._next_task_id + 1) % self._num_tasks\n      device_spec.job = self._job_name\n      device_spec.task = task_id\n    return device_spec.to_string()\n\n\ndef filter_variables(var_list,\n                     include_patterns=None,\n                     exclude_patterns=None,\n                     reg_search=True):\n  \"\"\"Filter a list of variables using regular expressions.\n\n  First includes variables according to the list of include_patterns.\n  Afterwards, eliminates variables according to the list of exclude_patterns.\n\n  For example, one can obtain a list of variables with the weights of all\n  convolutional layers (depending on the network definition) by:\n\n  ```python\n  variables = tf.contrib.framework.get_model_variables()\n  conv_weight_variables = tf.contrib.framework.filter_variables(\n      variables,\n      include_patterns=['Conv'],\n      exclude_patterns=['biases', 'Logits'])\n  ```\n\n  Args:\n    var_list: list of variables.\n    include_patterns: list of regular expressions to include. Defaults to None,\n      which means all variables are selected according to the include rules. A\n      variable is included if it matches any of the include_patterns.\n    exclude_patterns: list of regular expressions to exclude. Defaults to None,\n      which means all variables are selected according to the exclude rules. A\n      variable is excluded if it matches any of the exclude_patterns.\n    reg_search: boolean. If True (default), performs re.search to find matches\n      (i.e. pattern can match any substring of the variable name). If False,\n      performs re.match (i.e. regexp should match from the beginning of the\n      variable name).\n\n  Returns:\n    filtered list of variables.\n  \"\"\"\n  if reg_search:\n    reg_exp_func = re.search\n  else:\n    reg_exp_func = re.match\n\n  # First include variables.\n  if include_patterns is None:\n    included_variables = list(var_list)\n  else:\n    included_variables = []\n    for var in var_list:\n      if any(reg_exp_func(ptrn, var.name) for ptrn in include_patterns):\n        included_variables.append(var)\n\n  # Afterwards, exclude variables.\n  if exclude_patterns is None:\n    filtered_variables = included_variables\n  else:\n    filtered_variables = []\n    for var in included_variables:\n      if not any(reg_exp_func(ptrn, var.name) for ptrn in exclude_patterns):\n        filtered_variables.append(var)\n\n  return filtered_variables\n"
  }
]