[
  {
    "path": ".gitmodules",
    "content": "[submodule \"face_renderer\"]\n\tpath = face_renderer\n\turl = https://github.com/iacopomasi/face_specific_augm/\n"
  },
  {
    "path": "BFM/README",
    "content": "\n"
  },
  {
    "path": "README.md",
    "content": "# Face-Pose-Net\n\n![Teaser](./teasers/extreme_cases.jpg)\n<sub>**Extreme face alignment examples:** Faces rendered to a 45 degree yaw angle (aligned to half profile) using our FacePoseNet. Images were taken from the IJB-A collection and represent extreme viewing conditions, including near-profile views, occlusions, and low resolution. Such conditions are often too hard for existing face landmark detection methods to handle, yet they are easily aligned with our FacePoseNet.</sub>\n<br/>\n<br/>\nThis page contains a DCNN model and Python code to robustly estimate the 6 degrees of freedom (6DoF) 3D face pose from an unconstrained image, without the use of face landmark detectors. The method is described in the paper:\n\n_F.-J. Chang, A. Tran, T. Hassner, I. Masi, R. Nevatia, G. Medioni, \"[FacePoseNet: Making a Case for Landmark-Free Face Alignment](https://arxiv.org/abs/1708.07517)\", in 7th IEEE International Workshop on Analysis and Modeling of Faces and Gestures, ICCV Workshops, 2017_ [1].\n\nThis release bundles our **FacePoseNet** (FPN) with the **Face Renderer** from Masi _et al._ [2,5], which is available separately from [this project page](https://github.com/iacopomasi/face_specific_augm).\n\nThe result is an end-to-end pipeline that seamlessly estimates facial pose and produces multiple rendered views to be used for face alignment and data augmentation.\n\n![Teaser](./teasers/diagram.png)\n\n## Updates (Modified and New Features, 12/20/2018)\n* The FPN structure has been changed to ResNet-101 for better pose prediction: [fpn-resnet101](./ResNet/ThreeDMM_shape.py)\n* **Two versions of FPN (both assuming a weak perspective transformation) are added**: \n* (1) **Predict the 6DoF head pose** (scale, pitch, yaw, roll, translation_x, translation_y):  [main_predict_6DoF.py](./main_predict_6DoF.py)\n* (2) **Predict the 11 parameters of the 3x4 projection matrix**:  [main_predict_ProjMat.py](./main_predict_ProjMat.py)\n* The code to convert a 6DoF head pose to a 3x4 projection 
matrix is [here](https://github.com/fengju514/Face-Pose-Net/blob/fb733f358d9f633f6525a41f3a7a0a99e5c71647/main_predict_6DoF.py#L263-L268) \n* The code to convert the 11 parameters / 3x4 projection matrix back to a 6DoF head pose is [here](https://github.com/fengju514/Face-Pose-Net/blob/92bd65fa056d17065890e186ca2f2b376a5ab135/main_predict_ProjMat.py#L306-L308)\n* The corresponding 3D shape and landmarks can be obtained from the predicted 6DoF head pose ([3D shape from 6DoF](https://github.com/fengju514/Face-Pose-Net/blob/92bd65fa056d17065890e186ca2f2b376a5ab135/main_predict_6DoF.py#L271-L297)) or from the predicted 11 parameters ([3D shape from 11 parameters](https://github.com/fengju514/Face-Pose-Net/blob/92bd65fa056d17065890e186ca2f2b376a5ab135/main_predict_ProjMat.py#L272-L297))\n* Download the new FPN models: please put all the model files from [here](https://www.dropbox.com/sh/lr9u4my1qrhmgik/AADQVUIHSJIUXqUAj1AoZMIGa?dl=0) in the folder `models`\n* Download the BFM models: please put the BFM shape and expression files from [here](https://www.dropbox.com/sh/ru7ierl9516a9az/AABTP9hJj3dJnapicFFgHmOna?dl=0) in the folder `BFM`\n* Run the new FPN to predict the 6DoF head pose:\n```bash\n$ python main_predict_6DoF.py <gpu_id> <input-list-path>\n```\n* Run the new FPN to predict the 11 parameters of the projection matrix:\n```bash\n$ python main_predict_ProjMat.py <gpu_id> <input-list-path>\n```\nWe provide a sample input list [here](./input_list.txt). Each line has the format\n```bash\n<FILE_NAME, FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT>\n```\nwhere `<FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT>` are the x, y coordinates of the upper-left corner and the width and height of the tight face bounding box, obtained manually, from a face detector, or from a landmark detector. 
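As a rough illustration of the 6DoF-to-projection-matrix conversion linked above, here is a minimal sketch. The Euler angle convention and matrix layout below are assumptions for illustration only; the authoritative code is the linked section of `main_predict_6DoF.py`.

```python
import numpy as np

def pose_to_proj_matrix(scale, pitch, yaw, roll, tx, ty):
    # Per-axis rotations from Euler angles (convention assumed for illustration).
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    R = Rz.dot(Ry).dot(Rx)
    # Weak perspective: scale the top two rows of R and append the 2D translation.
    P = np.zeros((3, 4))
    P[:2, :3] = scale * R[:2, :3]
    P[0, 3], P[1, 3], P[2, 3] = tx, ty, 1.0
    return P
```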
The predicted 6DoF and 11-parameter results are saved in the [output_6DoF folder](https://github.com/fengju514/Face-Pose-Net/blob/a7923b764f92892021297fd046065c22a41dc519/main_predict_6DoF.py#L232-L236) and the [output_ProjMat folder](https://github.com/fengju514/Face-Pose-Net/blob/a7923b764f92892021297fd046065c22a41dc519/main_predict_ProjMat.py#L235-L239) respectively. The output 3D shapes and landmarks are saved in the [output_6DoF folder](https://github.com/fengju514/Face-Pose-Net/blob/a7923b764f92892021297fd046065c22a41dc519/main_predict_6DoF.py#L301) and the [output_ProjMat folder](https://github.com/fengju514/Face-Pose-Net/blob/a7923b764f92892021297fd046065c22a41dc519/main_predict_ProjMat.py#L301) respectively. You can visualize the 3D shapes and landmarks in MATLAB.\n\n\n* The same renderer can be used: instead of feeding it the 6DoF pose, feed it the landmarks predicted either from the 6DoF head pose or from the 3x4 projection matrix. Please see the example in demo.py of [this project page](https://github.com/iacopomasi/face_specific_augm)\n\n## Features\n* **6DoF 3D head pose estimation** + **3D rendered facial views**\n* Does not use **fragile** landmark detectors\n* Robust on images landmark detectors struggle with (low resolution, occlusions, etc.)\n* Extremely fast pose estimation\n* Both CPU and GPU are supported\n* Provides better face recognition through better alignment than state-of-the-art landmark detectors [1]\n\n## Dependencies\n\n* [TensorFlow](https://www.tensorflow.org/)\n* [OpenCV Python Wrapper](http://opencv.org/)\n* [NumPy](http://www.numpy.org/)\n* [Python 2.7](https://www.python.org/download/releases/2.7/)\n\nThe code has been tested on Linux only. You can either rely on the system Python and install the required packages from your package manager, or use Anaconda Python and install them through `conda`. 
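For example, with Anaconda (the environment name and package pins below are assumptions, not the authors' prescription; any TensorFlow 1.x build compatible with Python 2.7 should work):

```bash
# Hypothetical environment setup; adjust names and versions to your system.
conda create -n fpn python=2.7
conda activate fpn
conda install numpy
pip install 'tensorflow<2.0' opencv-python
```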
\n\n**Note:** no landmarks are used in our method, although you can still project landmarks onto the input image using the estimated pose. See the paper for further details. \n\n## Usage\n\n* **Important:** To download **both** the FPN code and the renderer, use `git clone --recursive`\n* **Important:** Please download the learned models from https://www.dropbox.com/s/r38psbq55y2yj4f/fpn_new_model.tar.gz?dl=0 and make sure that the FPN models are stored in the folder `fpn_new_model`.\n\n### Run it\n\nAlignment and rendering can be run from the command line as follows.\n\nTo run directly on a list of images (the software runs FPN to estimate the pose and then renders novel views based on the estimated pose):\n\n```bash\n$ python main_fpn.py <input-list-path>\n```\n\nWe provide a sample input list [here](input.csv). Each line has the format\n```bash\n<ID, FILE, FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT>\n```\nwhere `<FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT>` is the face bounding box, obtained either manually or from a face detector. \n\n## Sample Results\nPlease see the input images [here](images) and the rendered outputs [here](output_render).\n\n### Input ###\n![sbj10](./images/input10.jpg)\n### Rendering ###\n![sbj10](./output_render/subject10/subject10_a_rendered_aug_-00_00_10.jpg)\n![sbj10](./output_render/subject10/subject10_a_rendered_aug_-22_00_10.jpg)\n![sbj10](./output_render/subject10/subject10_a_rendered_aug_-40_00_10.jpg)\n![sbj10](./output_render/subject10/subject10_a_rendered_aug_-55_00_10.jpg)\n![sbj10](./output_render/subject10/subject10_a_rendered_aug_-75_00_10.jpg)\n\n\n\n## Current Limitations\nFPN is currently trained with a single generic 3D shape, without accounting for facial expressions. 
Addressing these is planned as future work.\n\n## Citation\n\nPlease cite our paper with the following BibTeX if you use our FacePoseNet:\n\n```latex\n@inproceedings{chang17fpn,\n      title={{F}ace{P}ose{N}et: Making a Case for Landmark-Free Face Alignment},\n      booktitle = {7th IEEE International Workshop on Analysis and Modeling of Faces and Gestures, ICCV Workshops},\n      author={\n      Feng-ju Chang\n      and Anh Tran \n      and Tal Hassner \n      and Iacopo Masi \n      and Ram Nevatia\n      and G\\'{e}rard Medioni},\n      year={2017},\n    }\n```\n\n## References\n[1] F.-J. Chang, A. Tran, T. Hassner, I. Masi, R. Nevatia, G. Medioni, \"[FacePoseNet: Making a Case for Landmark-Free Face Alignment](https://arxiv.org/abs/1708.07517)\", in 7th IEEE International Workshop on Analysis and Modeling of Faces and Gestures, ICCV Workshops, 2017\n\n[2] I. Masi\\*, A. Tran\\*, T. Hassner\\*, J. Leksut, G. Medioni, \"Do We Really Need to Collect Millions of Faces for Effective Face Recognition?\", ECCV 2016, \n    \\* denotes equal authorship\n\n[3] I. Masi, S. Rawls, G. Medioni, P. Natarajan, \"Pose-Aware Face Recognition in the Wild\", CVPR 2016\n\n[4] T. Hassner, S. Harel, E. Paz, and R. Enbar, \"Effective Face Frontalization in Unconstrained Images\", CVPR 2015\n\n[5] I. Masi, T. Hassner, A. Tran, and G. Medioni, \"Rapid Synthesis of Massive Face Sets for Improved Face Recognition\", FG 2017\n\n## Changelog\n- August 2017, First Release \n\n## Disclaimer\n\n_The SOFTWARE PACKAGE provided in this page is provided \"as is\", without any guarantee made as to its suitability or fitness for any particular use. It may contain bugs, so use of this tool is at your own risk. 
We take no responsibility for any damage of any sort that may unintentionally be caused through its use._\n\n## Contacts\n\nIf you have any questions, drop an email to _fengjuch@usc.edu_, _anhttran@usc.edu_, _iacopo.masi@usc.edu_ or _hassner@isi.edu_, or leave a message below on GitHub (log-in required).\n"
  },
  {
    "path": "ResNet/ThreeDMM_shape.py",
    "content": "import sys\nsys.path.append('./kaffe')\nsys.path.append('./kaffe/tensorflow')\n#from kaffe.tensorflow.network_allNonTrain import Network\nfrom network_shape import Network_Shape\n\n\n\nclass ResNet_101(Network_Shape):\n    def setup(self):\n        (self.feed('input')\n             .conv(7, 7, 64, 2, 2, biased=False, relu=False, name='conv1')\n             .batch_normalization(relu=True, name='bn_conv1')\n             .max_pool(3, 3, 2, 2, name='pool1')\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch1')\n             .batch_normalization(name='bn2a_branch1'))\n\n        (self.feed('pool1')\n             .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2a_branch2a')\n             .batch_normalization(relu=True, name='bn2a_branch2a')\n             .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2a_branch2b')\n             .batch_normalization(relu=True, name='bn2a_branch2b')\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch2c')\n             .batch_normalization(name='bn2a_branch2c'))\n\n        (self.feed('bn2a_branch1', \n                   'bn2a_branch2c')\n             .add(name='res2a')\n             .relu(name='res2a_relu') # batch_size x 56 x 56 x 256\n             .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2b_branch2a') # batch_size x 56 x 56 x 64\n             .batch_normalization(relu=True, name='bn2b_branch2a') # batch_size x 56 x 56 x 64\n             .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2b_branch2b') # batch_size x 56 x 56 x 64\n             .batch_normalization(relu=True, name='bn2b_branch2b') # batch_size x 56 x 56 x 64\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2b_branch2c') # batch_size x 56 x 56 x 256\n             .batch_normalization(name='bn2b_branch2c')) # batch_size x 56 x 56 x 256\n\n        (self.feed('res2a_relu', # batch_size x 56 x 56 x 256\n                   'bn2b_branch2c') # 
batch_size x 56 x 56 x 256\n             .add(name='res2b')\n             .relu(name='res2b_relu') # batch_size x 56 x 56 x 256\n             .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2c_branch2a')\n             .batch_normalization(relu=True, name='bn2c_branch2a')\n             .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2c_branch2b')\n             .batch_normalization(relu=True, name='bn2c_branch2b')\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2c_branch2c')\n             .batch_normalization(name='bn2c_branch2c'))\n\n        (self.feed('res2b_relu', \n                   'bn2c_branch2c')\n             .add(name='res2c')\n             .relu(name='res2c_relu') # batch_size x 56 x 56 x 256\n             .conv(1, 1, 512, 2, 2, biased=False, relu=False, name='res3a_branch1')\n             .batch_normalization(name='bn3a_branch1'))\n\n        (self.feed('res2c_relu') # batch_size x 56 x 56 x 256\n             .conv(1, 1, 128, 2, 2, biased=False, relu=False, name='res3a_branch2a')\n             .batch_normalization(relu=True, name='bn3a_branch2a')\n             .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3a_branch2b')\n             .batch_normalization(relu=True, name='bn3a_branch2b')\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3a_branch2c')\n             .batch_normalization(name='bn3a_branch2c'))\n\n        (self.feed('bn3a_branch1', \n                   'bn3a_branch2c')\n             .add(name='res3a')\n             .relu(name='res3a_relu') # batch_size x 28 x 28 x 512\n             .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3b1_branch2a')\n             .batch_normalization(relu=True, name='bn3b1_branch2a')\n             .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3b1_branch2b')\n             .batch_normalization(relu=True, name='bn3b1_branch2b')\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3b1_branch2c')\n           
  .batch_normalization(name='bn3b1_branch2c'))\n\n        (self.feed('res3a_relu', \n                   'bn3b1_branch2c')\n             .add(name='res3b1')\n             .relu(name='res3b1_relu') # batch_size x 28 x 28 x 512\n             .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3b2_branch2a')\n             .batch_normalization(relu=True, name='bn3b2_branch2a')\n             .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3b2_branch2b')\n             .batch_normalization(relu=True, name='bn3b2_branch2b')\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3b2_branch2c')\n             .batch_normalization(name='bn3b2_branch2c'))\n\n        (self.feed('res3b1_relu', \n                   'bn3b2_branch2c')\n             .add(name='res3b2')\n             .relu(name='res3b2_relu') # batch_size x 28 x 28 x 512\n             .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3b3_branch2a')\n             .batch_normalization(relu=True, name='bn3b3_branch2a')\n             .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3b3_branch2b')\n             .batch_normalization(relu=True, name='bn3b3_branch2b')\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3b3_branch2c')\n             .batch_normalization(name='bn3b3_branch2c'))\n\n        (self.feed('res3b2_relu', \n                   'bn3b3_branch2c')\n             .add(name='res3b3')\n             .relu(name='res3b3_relu') # batch_size x 28 x 28 x 512\n             .conv(1, 1, 1024, 2, 2, biased=False, relu=False, name='res4a_branch1')\n             .batch_normalization(name='bn4a_branch1'))\n\n        (self.feed('res3b3_relu') # batch_size x 28 x 28 x 512\n             .conv(1, 1, 256, 2, 2, biased=False, relu=False, name='res4a_branch2a')\n             .batch_normalization(relu=True, name='bn4a_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4a_branch2b')\n             .batch_normalization(relu=True, 
name='bn4a_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4a_branch2c')\n             .batch_normalization(name='bn4a_branch2c'))\n\n        (self.feed('bn4a_branch1', \n                   'bn4a_branch2c')\n             .add(name='res4a')\n             .relu(name='res4a_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b1_branch2a')\n             .batch_normalization(relu=True, name='bn4b1_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b1_branch2b')\n             .batch_normalization(relu=True, name='bn4b1_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b1_branch2c')\n             .batch_normalization(name='bn4b1_branch2c'))\n\n        (self.feed('res4a_relu', \n                   'bn4b1_branch2c')\n             .add(name='res4b1')\n             .relu(name='res4b1_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b2_branch2a')\n             .batch_normalization(relu=True, name='bn4b2_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b2_branch2b')\n             .batch_normalization(relu=True, name='bn4b2_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b2_branch2c')\n             .batch_normalization(name='bn4b2_branch2c'))\n\n        (self.feed('res4b1_relu', \n                   'bn4b2_branch2c')\n             .add(name='res4b2')\n             .relu(name='res4b2_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b3_branch2a')\n             .batch_normalization(relu=True, name='bn4b3_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b3_branch2b')\n             .batch_normalization(relu=True, name='bn4b3_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, 
name='res4b3_branch2c')\n             .batch_normalization(name='bn4b3_branch2c'))\n\n        (self.feed('res4b2_relu', \n                   'bn4b3_branch2c')\n             .add(name='res4b3')\n             .relu(name='res4b3_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b4_branch2a')\n             .batch_normalization(relu=True, name='bn4b4_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b4_branch2b')\n             .batch_normalization(relu=True, name='bn4b4_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b4_branch2c')\n             .batch_normalization(name='bn4b4_branch2c'))\n\n        (self.feed('res4b3_relu', \n                   'bn4b4_branch2c')\n             .add(name='res4b4')\n             .relu(name='res4b4_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b5_branch2a')\n             .batch_normalization(relu=True, name='bn4b5_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b5_branch2b')\n             .batch_normalization(relu=True, name='bn4b5_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b5_branch2c')\n             .batch_normalization(name='bn4b5_branch2c'))\n\n        (self.feed('res4b4_relu', \n                   'bn4b5_branch2c')\n             .add(name='res4b5')\n             .relu(name='res4b5_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b6_branch2a')\n             .batch_normalization(relu=True, name='bn4b6_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b6_branch2b')\n             .batch_normalization(relu=True, name='bn4b6_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b6_branch2c')\n             
.batch_normalization(name='bn4b6_branch2c'))\n\n        (self.feed('res4b5_relu', \n                   'bn4b6_branch2c')\n             .add(name='res4b6')\n             .relu(name='res4b6_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b7_branch2a')\n             .batch_normalization(relu=True, name='bn4b7_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b7_branch2b')\n             .batch_normalization(relu=True, name='bn4b7_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b7_branch2c')\n             .batch_normalization(name='bn4b7_branch2c'))\n\n        (self.feed('res4b6_relu', \n                   'bn4b7_branch2c')\n             .add(name='res4b7')\n             .relu(name='res4b7_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b8_branch2a')\n             .batch_normalization(relu=True, name='bn4b8_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b8_branch2b')\n             .batch_normalization(relu=True, name='bn4b8_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b8_branch2c')\n             .batch_normalization(name='bn4b8_branch2c'))\n\n        (self.feed('res4b7_relu', \n                   'bn4b8_branch2c')\n             .add(name='res4b8')\n             .relu(name='res4b8_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b9_branch2a')\n             .batch_normalization(relu=True, name='bn4b9_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b9_branch2b')\n             .batch_normalization(relu=True, name='bn4b9_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b9_branch2c')\n             .batch_normalization(name='bn4b9_branch2c'))\n\n        (self.feed('res4b8_relu', 
\n                   'bn4b9_branch2c')\n             .add(name='res4b9')\n             .relu(name='res4b9_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b10_branch2a')\n             .batch_normalization(relu=True, name='bn4b10_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b10_branch2b')\n             .batch_normalization(relu=True, name='bn4b10_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b10_branch2c')\n             .batch_normalization(name='bn4b10_branch2c'))\n\n        (self.feed('res4b9_relu', \n                   'bn4b10_branch2c')\n             .add(name='res4b10')\n             .relu(name='res4b10_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b11_branch2a')\n             .batch_normalization(relu=True, name='bn4b11_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b11_branch2b')\n             .batch_normalization(relu=True, name='bn4b11_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b11_branch2c')\n             .batch_normalization(name='bn4b11_branch2c'))\n\n        (self.feed('res4b10_relu', \n                   'bn4b11_branch2c')\n             .add(name='res4b11')\n             .relu(name='res4b11_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b12_branch2a')\n             .batch_normalization(relu=True, name='bn4b12_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b12_branch2b')\n             .batch_normalization(relu=True, name='bn4b12_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b12_branch2c')\n             .batch_normalization(name='bn4b12_branch2c'))\n\n        (self.feed('res4b11_relu', \n                   'bn4b12_branch2c')\n             
.add(name='res4b12')\n             .relu(name='res4b12_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b13_branch2a')\n             .batch_normalization(relu=True, name='bn4b13_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b13_branch2b')\n             .batch_normalization(relu=True, name='bn4b13_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b13_branch2c')\n             .batch_normalization(name='bn4b13_branch2c'))\n\n        (self.feed('res4b12_relu', \n                   'bn4b13_branch2c')\n             .add(name='res4b13')\n             .relu(name='res4b13_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b14_branch2a')\n             .batch_normalization(relu=True, name='bn4b14_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b14_branch2b')\n             .batch_normalization(relu=True, name='bn4b14_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b14_branch2c')\n             .batch_normalization(name='bn4b14_branch2c'))\n\n        (self.feed('res4b13_relu', \n                   'bn4b14_branch2c')\n             .add(name='res4b14')\n             .relu(name='res4b14_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b15_branch2a')\n             .batch_normalization(relu=True, name='bn4b15_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b15_branch2b')\n             .batch_normalization(relu=True, name='bn4b15_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b15_branch2c')\n             .batch_normalization(name='bn4b15_branch2c'))\n\n        (self.feed('res4b14_relu', \n                   'bn4b15_branch2c')\n             .add(name='res4b15')\n             
.relu(name='res4b15_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b16_branch2a')\n             .batch_normalization(relu=True, name='bn4b16_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b16_branch2b')\n             .batch_normalization(relu=True, name='bn4b16_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b16_branch2c')\n             .batch_normalization(name='bn4b16_branch2c'))\n\n        (self.feed('res4b15_relu', \n                   'bn4b16_branch2c')\n             .add(name='res4b16')\n             .relu(name='res4b16_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b17_branch2a')\n             .batch_normalization(relu=True, name='bn4b17_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b17_branch2b')\n             .batch_normalization(relu=True, name='bn4b17_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b17_branch2c')\n             .batch_normalization(name='bn4b17_branch2c'))\n\n        (self.feed('res4b16_relu', \n                   'bn4b17_branch2c')\n             .add(name='res4b17')\n             .relu(name='res4b17_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b18_branch2a')\n             .batch_normalization(relu=True, name='bn4b18_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b18_branch2b')\n             .batch_normalization(relu=True, name='bn4b18_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b18_branch2c')\n             .batch_normalization(name='bn4b18_branch2c'))\n\n        (self.feed('res4b17_relu', \n                   'bn4b18_branch2c')\n             .add(name='res4b18')\n             .relu(name='res4b18_relu') # batch_size x 14 x 14 x 
1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b19_branch2a')\n             .batch_normalization(relu=True, name='bn4b19_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b19_branch2b')\n             .batch_normalization(relu=True, name='bn4b19_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b19_branch2c')\n             .batch_normalization(name='bn4b19_branch2c'))\n\n        (self.feed('res4b18_relu', \n                   'bn4b19_branch2c')\n             .add(name='res4b19')\n             .relu(name='res4b19_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b20_branch2a')\n             .batch_normalization(relu=True, name='bn4b20_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b20_branch2b')\n             .batch_normalization(relu=True, name='bn4b20_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b20_branch2c')\n             .batch_normalization(name='bn4b20_branch2c'))\n\n        (self.feed('res4b19_relu', \n                   'bn4b20_branch2c')\n             .add(name='res4b20')\n             .relu(name='res4b20_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b21_branch2a')\n             .batch_normalization(relu=True, name='bn4b21_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b21_branch2b')\n             .batch_normalization(relu=True, name='bn4b21_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b21_branch2c')\n             .batch_normalization(name='bn4b21_branch2c'))\n\n        (self.feed('res4b20_relu', \n                   'bn4b21_branch2c')\n             .add(name='res4b21')\n             .relu(name='res4b21_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 256, 1, 1, 
biased=False, relu=False, name='res4b22_branch2a')\n             .batch_normalization(relu=True, name='bn4b22_branch2a')\n             .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b22_branch2b')\n             .batch_normalization(relu=True, name='bn4b22_branch2b')\n             .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b22_branch2c')\n             .batch_normalization(name='bn4b22_branch2c'))\n\n        (self.feed('res4b21_relu', \n                   'bn4b22_branch2c')\n             .add(name='res4b22')\n             .relu(name='res4b22_relu') # batch_size x 14 x 14 x 1024\n             .conv(1, 1, 2048, 2, 2, biased=False, relu=False, name='res5a_branch1')\n             .batch_normalization(name='bn5a_branch1'))\n\n        (self.feed('res4b22_relu')\n             .conv(1, 1, 512, 2, 2, biased=False, relu=False, name='res5a_branch2a')\n             .batch_normalization(relu=True, name='bn5a_branch2a')\n             .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5a_branch2b')\n             .batch_normalization(relu=True, name='bn5a_branch2b')\n             .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5a_branch2c')\n             .batch_normalization(name='bn5a_branch2c'))\n\n        (self.feed('bn5a_branch1', \n                   'bn5a_branch2c')\n             .add(name='res5a')\n             .relu(name='res5a_relu') # batch_size x 7 x 7 x 2048\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res5b_branch2a')\n             .batch_normalization(relu=True, name='bn5b_branch2a')\n             .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5b_branch2b')\n             .batch_normalization(relu=True, name='bn5b_branch2b')\n             .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5b_branch2c')\n             .batch_normalization(name='bn5b_branch2c'))\n\n        (self.feed('res5a_relu', \n                   'bn5b_branch2c')\n             .add(name='res5b')\n             
.relu(name='res5b_relu') # batch_size x 7 x 7 x 2048\n             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res5c_branch2a')\n             .batch_normalization(relu=True, name='bn5c_branch2a')\n             .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5c_branch2b')\n             .batch_normalization(relu=True, name='bn5c_branch2b')\n             .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5c_branch2c')\n             .batch_normalization(name='bn5c_branch2c'))\n\n        (self.feed('res5b_relu', \n                   'bn5c_branch2c')\n             .add(name='res5c')\n             .relu(name='res5c_relu') # batch_size x 7 x 7 x 2048\n             .avg_pool(7, 7, 1, 1, padding='VALID', name='pool5'))\n             #.fc(198, relu=False, name='fc_ftnew'))\n"
  },
  {
    "path": "get_Rts.py",
    "content": "\"\"\"3D pose estimation network: get R and ts                                                                                                                                 \n\"\"\"\nimport lmdb\nimport sys\nimport time\nimport csv                                                                                                                                        \nimport numpy as np\nimport numpy.matlib\nimport os\n\nimport pose_model as Pose_model\nimport tf_utils as util\n\nimport tensorflow as tf\nimport scipy\nfrom scipy import ndimage, misc\nimport os.path\nimport glob\n\ntf.logging.set_verbosity(tf.logging.INFO)\n\nFLAGS = tf.app.flags.FLAGS\ntf.app.flags.DEFINE_string('mode', 'valid', 'train or eval or valid.')\ntf.app.flags.DEFINE_integer('image_size', 227, 'Image side length.')\ntf.app.flags.DEFINE_string('log_root', '.', 'Directory to keep the checkpoints')\ntf.app.flags.DEFINE_string('model_root', '.', 'Directory to keep the checkpoints')\ntf.app.flags.DEFINE_integer('num_gpus', 0, 'Number of gpus used for training. 
(0 or 1)')\ntf.app.flags.DEFINE_integer('gpu_id', 0, 'GPU ID to be used.')\ntf.app.flags.DEFINE_string('input_csv', 'input.csv', 'input file to process')\ntf.app.flags.DEFINE_string('output_lmdb', 'pose_lmdb', 'output lmdb')\ntf.app.flags.DEFINE_integer('batch_size', 1, 'Batch Size')\n\n\n\ndef run_pose_estimation(root_model_path, inputFile, outputDB, model_used, lr_rate_scalar, if_dropout, keep_rate):\n\n    # Load training images mean: the values are in the range [0,1], so the image pixel values should also be divided by 255\n    file = np.load(root_model_path + \"perturb_Oxford_train_imgs_mean.npz\")\n    train_mean_vec = file[\"train_mean_vec\"]\n    del file\n    \n    # Load training labels mean and std\n    file = np.load(root_model_path +\"perturb_Oxford_train_labels_mean_std.npz\")\n    mean_labels = file[\"mean_labels\"]\n    std_labels = file[\"std_labels\"]\n    del file\n\n\n    # placeholders for the batches\n    x = tf.placeholder(tf.float32, [FLAGS.batch_size, FLAGS.image_size, FLAGS.image_size, 3])\n    y = tf.placeholder(tf.float32, [FLAGS.batch_size, 6])\n   \n    net_data = np.load(root_model_path +\"PAM_frontal_ALexNet.npy\").item()\n    pose_3D_model = Pose_model.ThreeD_Pose_Estimation(x, y, 'valid', if_dropout, keep_rate, keep_rate, lr_rate_scalar, net_data, FLAGS.batch_size, mean_labels, std_labels)\n    pose_3D_model._build_graph()\n    del net_data\n\n    # Add ops to save and restore all the variables.
\n    saver = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.VARIABLES, scope='Spatial_Transformer'))\n\n    pose_lmdb_env = lmdb.Environment(outputDB, map_size=1e12)\n\n    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess, \\\n         pose_lmdb_env.begin(write=True) as pose_txn:\n\n        # Restore variables from disk.\n        load_path = root_model_path + model_used\n\n        saver.restore(sess, load_path)\n\n        print(\"Model restored.\")\n\n        import cv2\n\n        # Load cropped and scaled image file list (csv file)\n        with open(inputFile, 'r') as csvfile:\n            lines = csvfile.readlines()\n            for lin in lines:\n                ### The file is of the form\n                ### key1, image_path_key_1\n                mykey = lin.split(',')[0]\n                image_file_path = lin.split(',')[-1].rstrip('\\n')\n                image = cv2.imread(image_file_path)\n                image = np.asarray(image)\n                # Expand a grayscale image to three channels\n                if len(image.shape) < 3:\n                    image_r = np.reshape(image, (image.shape[0], image.shape[1], 1))\n                    image = np.append(image_r, image_r, axis=2)\n                    image = np.append(image, image_r, axis=2)\n\n                label = np.array([0.,0.,0.,0.,0.,0.])\n                id_labels = np.array([0])\n\n                # Normalize images and labels\n                nr_image, nr_pose_label, id_label = util.input_processing(image, label, id_labels, train_mean_vec, mean_labels, std_labels, 1, FLAGS.image_size, 739)\n                del id_label\n\n                # Reshape the image and label to fit the model\n                nr_image = nr_image.reshape(1, FLAGS.image_size, FLAGS.image_size, 3)\n                nr_pose_label = nr_pose_label.reshape(1,6)\n\n                # Get predicted R-ts\n                
pred_Rts = sess.run(pose_3D_model.preds_unNormalized, feed_dict={x: nr_image, y: nr_pose_label})\n                print 'Predicted pose for: ' + mykey\n                pose_txn.put( mykey , pred_Rts[0].astype('float32') )\n\n\ndef esimatePose(root_model_path, inputFile, outputDB, model_used, lr_rate_scalar, if_dropout, keep_rate, use_gpu=False ):\n    ## Select the visible GPU via CUDA_VISIBLE_DEVICES; when use_gpu is False, TF is forced onto the CPU.\n    os.environ['CUDA_VISIBLE_DEVICES'] = '1' #e.g. str(FLAGS.gpu_id)# '7'\n    if use_gpu == False:\n        dev = '/cpu:0'\n        print \"Using CPU\"\n    elif use_gpu == True:\n        dev = '/gpu:0'\n        print \"Using GPU \" + os.environ['CUDA_VISIBLE_DEVICES']\n    else:\n        raise ValueError('use_gpu must be True or False.')\n    run_pose_estimation( root_model_path, inputFile, outputDB, model_used, lr_rate_scalar, if_dropout, keep_rate )\n"
  },
  {
    "path": "input.csv",
    "content": "ID,FILE,FACE_X,FACE_Y,FACE_WIDTH,FACE_HEIGHT\r\nsubject1_a,images/input1.jpg,108.2642,119.6774,170,179\r\nsubject2_a,images/input2.jpg,48.51129913,38.1551857,141.19125366,149.40893555\r\nsubject3_a,images/input3.jpg,47.94947433,26.95211983,126.64208984,169.57138062\r\nsubject4_a,images/input4.jpg,41.02483749,81.23366547,122.9382019,79.80832672\r\nsubject5_a,images/input5.jpg,44.65912247,30.22106934,138.8326416,156.31950378\r\nsubject6_a,images/input6.jpg,54.94252396,41.26684189,117.19006348,137.38693237\r\nsubject7_a,images/input7.jpg,63.90779114,54.21474075,159.63040161,90.42936707\r\nsubject8_a,images/input8.jpg,53.62681198,48.40485001,78.09403992,101.56494141\r\nsubject9_a,images/input9.jpg,55.74394226,72.12078094,76.75720215,114.19478607\r\nsubject10_a,images/input10.jpg,48.07297897,30.98786163,145.96961975,124.47624969\r\n"
  },
  {
    "path": "input_list.txt",
    "content": "./input_samples/HELEN_30427236_2_0.jpg,132.8177,213.8680,183.1707,171.1925\n./input_samples/LFPW_image_test_0001_0.jpg,132.3776,213.9545,178.3532,168.3772\n./input_samples/LFPW_image_test_0008_0.jpg,138.6822,210.8271,172.3008,174.2025"
  },
  {
    "path": "input_samples/README",
    "content": "Three input sample images to run our new FPN\n"
  },
  {
    "path": "kaffe/__init__.py",
    "content": "from .graph import GraphBuilder, NodeMapper\nfrom .errors import KaffeError, print_stderr\n\nfrom . import tensorflow\n"
  },
  {
    "path": "kaffe/errors.py",
    "content": "import sys\n\nclass KaffeError(Exception):\n    pass\n\ndef print_stderr(msg):\n    sys.stderr.write('%s\\n' % msg)\n"
  },
  {
    "path": "kaffe/graph.py",
    "content": "from google.protobuf import text_format\n\nfrom .caffe import get_caffe_resolver\nfrom .errors import KaffeError, print_stderr\nfrom .layers import LayerAdapter, LayerType, NodeKind, NodeDispatch\nfrom .shapes import TensorShape\n\nclass Node(object):\n\n    def __init__(self, name, kind, layer=None):\n        self.name = name\n        self.kind = kind\n        self.layer = LayerAdapter(layer, kind) if layer else None\n        self.parents = []\n        self.children = []\n        self.data = None\n        self.output_shape = None\n        self.metadata = {}\n\n    def add_parent(self, parent_node):\n        assert parent_node not in self.parents\n        self.parents.append(parent_node)\n        if self not in parent_node.children:\n            parent_node.children.append(self)\n\n    def add_child(self, child_node):\n        assert child_node not in self.children\n        self.children.append(child_node)\n        if self not in child_node.parents:\n            child_node.parents.append(self)\n\n    def get_only_parent(self):\n        if len(self.parents) != 1:\n            raise KaffeError('Node (%s) expected to have 1 parent. Found %s.' 
%\n                             (self, len(self.parents)))\n        return self.parents[0]\n\n    @property\n    def parameters(self):\n        if self.layer is not None:\n            return self.layer.parameters\n        return None\n\n    def __str__(self):\n        return '[%s] %s' % (self.kind, self.name)\n\n    def __repr__(self):\n        return '%s (0x%x)' % (self.name, id(self))\n\n\nclass Graph(object):\n\n    def __init__(self, nodes=None, name=None):\n        self.nodes = nodes or []\n        self.node_lut = {node.name: node for node in self.nodes}\n        self.name = name\n\n    def add_node(self, node):\n        self.nodes.append(node)\n        self.node_lut[node.name] = node\n\n    def get_node(self, name):\n        try:\n            return self.node_lut[name]\n        except KeyError:\n            raise KaffeError('Layer not found: %s' % name)\n\n    def get_input_nodes(self):\n        return [node for node in self.nodes if len(node.parents) == 0]\n\n    def get_output_nodes(self):\n        return [node for node in self.nodes if len(node.children) == 0]\n\n    def topologically_sorted(self):\n        sorted_nodes = []\n        unsorted_nodes = list(self.nodes)\n        temp_marked = set()\n        perm_marked = set()\n\n        def visit(node):\n            if node in temp_marked:\n                raise KaffeError('Graph is not a DAG.')\n            if node in perm_marked:\n                return\n            temp_marked.add(node)\n            for child in node.children:\n                visit(child)\n            perm_marked.add(node)\n            temp_marked.remove(node)\n            sorted_nodes.insert(0, node)\n\n        while len(unsorted_nodes):\n            visit(unsorted_nodes.pop())\n        return sorted_nodes\n\n    def compute_output_shapes(self):\n        sorted_nodes = self.topologically_sorted()\n        for node in sorted_nodes:\n            node.output_shape = TensorShape(*NodeKind.compute_output_shape(node))\n\n    def 
replaced(self, new_nodes):\n        return Graph(nodes=new_nodes, name=self.name)\n\n    def transformed(self, transformers):\n        graph = self\n        for transformer in transformers:\n            graph = transformer(graph)\n            if graph is None:\n                raise KaffeError('Transformer failed: {}'.format(transformer))\n            assert isinstance(graph, Graph)\n        return graph\n\n    def __contains__(self, key):\n        return key in self.node_lut\n\n    def __str__(self):\n        hdr = '{:<20} {:<30} {:>20} {:>20}'.format('Type', 'Name', 'Param', 'Output')\n        s = [hdr, '-' * 94]\n        for node in self.topologically_sorted():\n            # If the node has learned parameters, display the first one's shape.\n            # In case of convolutions, this corresponds to the weights.\n            data_shape = node.data[0].shape if node.data else '--'\n            out_shape = node.output_shape or '--'\n            s.append('{:<20} {:<30} {:>20} {:>20}'.format(node.kind, node.name, data_shape,\n                                                          tuple(out_shape)))\n        return '\\n'.join(s)\n\n\nclass GraphBuilder(object):\n    '''Constructs a model graph from a Caffe protocol buffer definition.'''\n\n    def __init__(self, def_path, phase='test'):\n        '''\n        def_path: Path to the model definition (.prototxt)\n        phase: Either 'test' or 'train'. 
Used for filtering phase-specific nodes.\n        '''\n        self.def_path = def_path\n        self.phase = phase\n        self.load()\n\n    def load(self):\n        '''Load the layer definitions from the prototxt.'''\n        self.params = get_caffe_resolver().NetParameter()\n        with open(self.def_path, 'rb') as def_file:\n            text_format.Merge(def_file.read(), self.params)\n\n    def filter_layers(self, layers):\n        '''Filter out layers based on the current phase.'''\n        phase_map = {0: 'train', 1: 'test'}\n        filtered_layer_names = set()\n        filtered_layers = []\n        for layer in layers:\n            phase = self.phase\n            if len(layer.include):\n                phase = phase_map[layer.include[0].phase]\n            if len(layer.exclude):\n                phase = phase_map[1 - layer.exclude[0].phase]\n            exclude = (phase != self.phase)\n            # Dropout layers appear in a fair number of Caffe\n            # test-time networks. These are just ignored. 
We'll\n            # filter them out here.\n            if (not exclude) and (phase == 'test'):\n                exclude = (layer.type == LayerType.Dropout)\n            if not exclude:\n                filtered_layers.append(layer)\n                # Guard against dupes.\n                assert layer.name not in filtered_layer_names\n                filtered_layer_names.add(layer.name)\n        return filtered_layers\n\n    def make_node(self, layer):\n        '''Create a graph node for the given layer.'''\n        kind = NodeKind.map_raw_kind(layer.type)\n        if kind is None:\n            raise KaffeError('Unknown layer type encountered: %s' % layer.type)\n        # We want to use the layer's top names (the \"output\" names), rather than the\n        # name attribute, which is more of a readability thing than a functional one.\n        # Other layers will refer to a node by its \"top name\".\n        return Node(layer.name, kind, layer=layer)\n\n    def make_input_nodes(self):\n        '''\n        Create data input nodes.\n\n        This method is for old-style inputs, where the input specification\n        was not treated as a first-class layer in the prototxt.\n        Newer models use the \"Input\" layer type.\n        '''\n        nodes = [Node(name, NodeKind.Data) for name in self.params.input]\n        if len(nodes):\n            input_dim = map(int, self.params.input_dim)\n            if not input_dim:\n                if len(self.params.input_shape) > 0:\n                    input_dim = map(int, self.params.input_shape[0].dim)\n                else:\n                    raise KaffeError('Dimensions for input not specified.')\n            for node in nodes:\n                node.output_shape = tuple(input_dim)\n        return nodes\n\n    def build(self):\n        '''\n        Builds the graph from the Caffe layer definitions.\n        '''\n        # Get the layers\n        layers = self.params.layers or self.params.layer\n        # Filter out 
phase-excluded layers\n        layers = self.filter_layers(layers)\n        # Get any separately-specified input layers\n        nodes = self.make_input_nodes()\n        nodes += [self.make_node(layer) for layer in layers]\n        # Initialize the graph\n        graph = Graph(nodes=nodes, name=self.params.name)\n        # Connect the nodes\n        #\n        # A note on layers and outputs:\n        # In Caffe, each layer can produce multiple outputs (\"tops\") from a set of inputs\n        # (\"bottoms\"). The bottoms refer to other layers' tops. The top can rewrite a bottom\n        # (in case of in-place operations). Note that the layer's name is not used for establishing\n        # any connectivity. It's only used for data association. By convention, a layer with a\n        # single top will often use the same name (although this is not required).\n        #\n        # The current implementation only supports single-output nodes (note that a node can still\n        # have multiple children, since multiple child nodes can refer to the single top's name).\n        node_outputs = {}\n        for layer in layers:\n            node = graph.get_node(layer.name)\n            for input_name in layer.bottom:\n                assert input_name != layer.name\n                parent_node = node_outputs.get(input_name)\n                if (parent_node is None) or (parent_node == node):\n                    parent_node = graph.get_node(input_name)\n                node.add_parent(parent_node)\n            if len(layer.top)>1:\n                raise KaffeError('Multiple top nodes are not supported.')\n            for output_name in layer.top:\n                if output_name == layer.name:\n                    # Output is named the same as the node. 
No further action required.\n                    continue\n                # There are two possibilities here:\n                #\n                # Case 1: output_name refers to another node in the graph.\n                # This is an \"in-place operation\" that overwrites an existing node.\n                # This would create a cycle in the graph. We'll undo the in-placing\n                # by substituting this node wherever the overwritten node is referenced.\n                #\n                # Case 2: output_name violates the convention layer.name == output_name.\n                # Since we are working in the single-output regime, we can rename it to\n                # match the layer name.\n                #\n                # In both cases, future references to this top re-route to this node.\n                node_outputs[output_name] = node\n\n        graph.compute_output_shapes()\n        return graph\n\n\nclass NodeMapper(NodeDispatch):\n\n    def __init__(self, graph):\n        self.graph = graph\n\n    def map(self):\n        nodes = self.graph.topologically_sorted()\n        # Remove input nodes - we'll handle them separately.\n        input_nodes = self.graph.get_input_nodes()\n        nodes = [t for t in nodes if t not in input_nodes]\n        # Decompose DAG into chains.\n        chains = []\n        for node in nodes:\n            attach_to_chain = None\n            if len(node.parents) == 1:\n                parent = node.get_only_parent()\n                for chain in chains:\n                    if chain[-1] == parent:\n                        # Node is part of an existing chain.\n                        attach_to_chain = chain\n                        break\n            if attach_to_chain is None:\n                # Start a new chain for this node.\n                attach_to_chain = []\n                chains.append(attach_to_chain)\n            attach_to_chain.append(node)\n        # Map each chain.\n        mapped_chains = []\n       
 for chain in chains:\n            mapped_chains.append(self.map_chain(chain))\n        return self.commit(mapped_chains)\n\n    def map_chain(self, chain):\n        return [self.map_node(node) for node in chain]\n\n    def map_node(self, node):\n        map_func = self.get_handler(node.kind, 'map')\n        mapped_node = map_func(node)\n        assert mapped_node is not None\n        mapped_node.node = node\n        return mapped_node\n\n    def commit(self, mapped_chains):\n        raise NotImplementedError('Must be implemented by subclass.')\n"
  },
  {
    "path": "kaffe/layers.py",
    "content": "import re\nimport numbers\nfrom collections import namedtuple\n\nfrom .shapes import *\n\nLAYER_DESCRIPTORS = {\n\n    # Caffe Types\n    'AbsVal': shape_identity,\n    'Accuracy': shape_scalar,\n    'ArgMax': shape_not_implemented,\n    'BatchNorm': shape_identity,\n    'BNLL': shape_not_implemented,\n    'Concat': shape_concat,\n    'ContrastiveLoss': shape_scalar,\n    'Convolution': shape_convolution,\n    'Deconvolution': shape_not_implemented,\n    'Data': shape_data,\n    'Dropout': shape_identity,\n    'DummyData': shape_data,\n    'EuclideanLoss': shape_scalar,\n    'Eltwise': shape_identity,\n    'Exp': shape_identity,\n    'Flatten': shape_not_implemented,\n    'HDF5Data': shape_data,\n    'HDF5Output': shape_identity,\n    'HingeLoss': shape_scalar,\n    'Im2col': shape_not_implemented,\n    'ImageData': shape_data,\n    'InfogainLoss': shape_scalar,\n    'InnerProduct': shape_inner_product,\n    'Input': shape_data,\n    'LRN': shape_identity,\n    'MemoryData': shape_mem_data,\n    'MultinomialLogisticLoss': shape_scalar,\n    'MVN': shape_not_implemented,\n    'Pooling': shape_pool,\n    'Power': shape_identity,\n    'ReLU': shape_identity,\n    'Scale': shape_identity,\n    'Sigmoid': shape_identity,\n    'SigmoidCrossEntropyLoss': shape_scalar,\n    'Silence': shape_not_implemented,\n    'Softmax': shape_identity,\n    'SoftmaxWithLoss': shape_scalar,\n    'Split': shape_not_implemented,\n    'Slice': shape_not_implemented,\n    'TanH': shape_identity,\n    'WindowData': shape_not_implemented,\n    'Threshold': shape_identity,\n}\n\nLAYER_TYPES = LAYER_DESCRIPTORS.keys()\n\nLayerType = type('LayerType', (), {t: t for t in LAYER_TYPES})\n\nclass NodeKind(LayerType):\n\n    @staticmethod\n    def map_raw_kind(kind):\n        if kind in LAYER_TYPES:\n            return kind\n        return None\n\n    @staticmethod\n    def compute_output_shape(node):\n        try:\n            val = LAYER_DESCRIPTORS[node.kind](node)\n            
return val\n        except NotImplementedError:\n            raise KaffeError('Output shape computation not implemented for type: %s' % node.kind)\n\n\nclass NodeDispatchError(KaffeError):\n\n    pass\n\n\nclass NodeDispatch(object):\n\n    @staticmethod\n    def get_handler_name(node_kind):\n        if len(node_kind) <= 4:\n            # A catch-all for things like ReLU and tanh\n            return node_kind.lower()\n        # Convert from CamelCase to under_scored\n        name = re.sub('(.)([A-Z][a-z]+)', r'\\1_\\2', node_kind)\n        return re.sub('([a-z0-9])([A-Z])', r'\\1_\\2', name).lower()\n\n    def get_handler(self, node_kind, prefix):\n        name = self.get_handler_name(node_kind)\n        name = '_'.join((prefix, name))\n        try:\n            return getattr(self, name)\n        except AttributeError:\n            raise NodeDispatchError('No handler found for node kind: %s (expected: %s)' %\n                                    (node_kind, name))\n\n\nclass LayerAdapter(object):\n\n    def __init__(self, layer, kind):\n        self.layer = layer\n        self.kind = kind\n\n    @property\n    def parameters(self):\n        name = NodeDispatch.get_handler_name(self.kind)\n        name = '_'.join((name, 'param'))\n        try:\n            return getattr(self.layer, name)\n        except AttributeError:\n            raise NodeDispatchError('Caffe parameters not found for layer kind: %s' % (self.kind))\n\n    @staticmethod\n    def get_kernel_value(scalar, repeated, idx, default=None):\n        if scalar:\n            return scalar\n        if repeated:\n            if isinstance(repeated, numbers.Number):\n                return repeated\n            if len(repeated) == 1:\n                # Same value applies to all spatial dimensions\n                return int(repeated[0])\n            assert idx < len(repeated)\n            # Extract the value for the given spatial dimension\n            return repeated[idx]\n        if default is None:\n        
    raise ValueError('Unable to determine kernel parameter!')\n        return default\n\n    @property\n    def kernel_parameters(self):\n        assert self.kind in (NodeKind.Convolution, NodeKind.Pooling)\n        params = self.parameters\n        k_h = self.get_kernel_value(params.kernel_h, params.kernel_size, 0)\n        k_w = self.get_kernel_value(params.kernel_w, params.kernel_size, 1)\n        s_h = self.get_kernel_value(params.stride_h, params.stride, 0, default=1)\n        s_w = self.get_kernel_value(params.stride_w, params.stride, 1, default=1)\n        p_h = self.get_kernel_value(params.pad_h, params.pad, 0, default=0)\n        p_w = self.get_kernel_value(params.pad_w, params.pad, 1, default=0)\n        return KernelParameters(k_h, k_w, s_h, s_w, p_h, p_w)\n\n\nKernelParameters = namedtuple('KernelParameters', ['kernel_h', 'kernel_w', 'stride_h', 'stride_w',\n                                                   'pad_h', 'pad_w'])\n"
  },
  {
    "path": "kaffe/shapes.py",
    "content": "import math\nfrom collections import namedtuple\n\nfrom .errors import KaffeError\n\nTensorShape = namedtuple('TensorShape', ['batch_size', 'channels', 'height', 'width'])\n\n\ndef get_filter_output_shape(i_h, i_w, params, round_func):\n    o_h = (i_h + 2 * params.pad_h - params.kernel_h) / float(params.stride_h) + 1\n    o_w = (i_w + 2 * params.pad_w - params.kernel_w) / float(params.stride_w) + 1\n    return (int(round_func(o_h)), int(round_func(o_w)))\n\n\ndef get_strided_kernel_output_shape(node, round_func):\n    assert node.layer is not None\n    input_shape = node.get_only_parent().output_shape\n    o_h, o_w = get_filter_output_shape(input_shape.height, input_shape.width,\n                                       node.layer.kernel_parameters, round_func)\n    params = node.layer.parameters\n    has_c_o = hasattr(params, 'num_output')\n    c = params.num_output if has_c_o else input_shape.channels\n    return TensorShape(input_shape.batch_size, c, o_h, o_w)\n\n\ndef shape_not_implemented(node):\n    raise NotImplementedError\n\n\ndef shape_identity(node):\n    assert len(node.parents) > 0\n    return node.parents[0].output_shape\n\n\ndef shape_scalar(node):\n    return TensorShape(1, 1, 1, 1)\n\n\ndef shape_data(node):\n    if node.output_shape:\n        # Old-style input specification\n        return node.output_shape\n    try:\n        # New-style input specification\n        return map(int, node.parameters.shape[0].dim)\n    except:\n        # We most likely have a data layer on our hands. The problem is,\n        # Caffe infers the dimensions of the data from the source (eg: LMDB).\n        # We want to avoid reading datasets here. 
Fail for now.\n        # This can be temporarily fixed by transforming the data layer to\n        # Caffe's \"input\" layer (as is usually used in the \"deploy\" version).\n        # TODO: Find a better solution for this.\n        raise KaffeError('Cannot determine dimensions of data layer.\\n'\n                         'See comments in function shape_data for more info.')\n\n\ndef shape_mem_data(node):\n    params = node.parameters\n    return TensorShape(params.batch_size, params.channels, params.height, params.width)\n\n\ndef shape_concat(node):\n    axis = node.layer.parameters.axis\n    output_shape = None\n    for parent in node.parents:\n        if output_shape is None:\n            output_shape = list(parent.output_shape)\n        else:\n            output_shape[axis] += parent.output_shape[axis]\n    return tuple(output_shape)\n\n\ndef shape_convolution(node):\n    return get_strided_kernel_output_shape(node, math.floor)\n\n\ndef shape_pool(node):\n    return get_strided_kernel_output_shape(node, math.ceil)\n\n\ndef shape_inner_product(node):\n    input_shape = node.get_only_parent().output_shape\n    return TensorShape(input_shape.batch_size, node.layer.parameters.num_output, 1, 1)\n"
  },
  {
    "path": "kaffe/tensorflow/__init__.py",
    "content": "from .transformer import TensorFlowTransformer\nfrom .network import Network\n"
  },
  {
    "path": "kaffe/tensorflow/network_shape.py",
    "content": "import numpy as np\nimport tensorflow as tf\n\nDEFAULT_PADDING = 'SAME'\n\n\ndef layer(op):\n    '''Decorator for composable network layers.'''\n\n    def layer_decorated(self, *args, **kwargs):\n        # Automatically set a name if not provided.\n        name = kwargs.setdefault('name', self.get_unique_name(op.__name__))\n        # Figure out the layer inputs.\n        if len(self.terminals) == 0:\n            raise RuntimeError('No input variables found for layer %s.' % name)\n        elif len(self.terminals) == 1:\n            layer_input = self.terminals[0]\n        else:\n            layer_input = list(self.terminals)\n        # Perform the operation and get the output.\n        layer_output = op(self, layer_input, *args, **kwargs)\n        # Add to layer LUT.\n        self.layers[name] = layer_output\n        # This output is now the input for the next layer.\n        self.feed(layer_output)\n        # Return self for chained calls.\n        return self\n\n    return layer_decorated\n\n\nclass Network_Shape(object):\n\n    def __init__(self, inputs, trainable=True):\n        # The input nodes for this network\n        self.inputs = inputs\n        # The current list of terminal nodes\n        self.terminals = []\n        # Mapping from layer names to layers\n        self.layers = dict(inputs)\n        # If true, the resulting variables are set as trainable\n        self.trainable = trainable\n        # Switch variable for dropout\n        self.use_dropout = tf.placeholder_with_default(tf.constant(1.0),\n                                                       shape=[],\n                                                       name='use_dropout')\n        self.setup()\n\n    def setup(self):\n        '''Construct the network. 
'''\n        raise NotImplementedError('Must be implemented by the subclass.')\n\n    def load(self, data_path, prefix_name, session, ignore_missing=False):\n        '''Load network weights.\n        data_path: The path to the numpy-serialized network weights\n        prefix_name: Variable-scope prefix under which the network variables were created\n        session: The current TensorFlow session\n        ignore_missing: If true, serialized weights for missing layers are ignored.\n        '''\n        data_dict = np.load(data_path).item()\n        for op_name in data_dict:\n            # Skip the fc_ftnew layer; its weights are not restored here.\n            if op_name == 'fc_ftnew':\n                continue\n\n            with tf.variable_scope(prefix_name + '/' + op_name, reuse=True):\n                for param_name, data in data_dict[op_name].iteritems():\n                    try:\n                        var = tf.get_variable(param_name)\n                        session.run(var.assign(data))\n                    except ValueError:\n                        if not ignore_missing:\n                            raise\n\n    def load_specific_vars(self, data_path, op_name, session, ignore_missing=False):\n        '''Load the weights of a single layer.\n        data_path: The path to the numpy-serialized network weights\n        op_name: Name of the layer whose weights should be restored\n        session: The current TensorFlow session\n        ignore_missing: If true, serialized weights for missing layers are ignored.\n        '''\n        data_dict = np.load(data_path).item()\n        with tf.variable_scope(op_name, reuse=True):\n            for param_name, data in data_dict[op_name].iteritems():\n                try:\n                    var = tf.get_variable(param_name)\n                    session.run(var.assign(data))\n                except ValueError:\n                    if not ignore_missing:\n                        raise\n\n    def feed(self, *args):\n        '''Set the input(s) for the next operation by replacing the terminal nodes.\n        The arguments can be either layer names or the actual layers.\n        '''\n        assert len(args) != 0\n        self.terminals = []\n        for fed_layer in args:\n            if isinstance(fed_layer, basestring):\n                try:\n                    fed_layer = 
self.layers[fed_layer]\n                except KeyError:\n                    raise KeyError('Unknown layer name fed: %s' % fed_layer)\n            self.terminals.append(fed_layer)\n        return self\n\n    def get_output(self):\n        '''Returns the current network output.'''\n        return self.terminals[-1]\n\n    def get_unique_name(self, prefix):\n        '''Returns an index-suffixed unique name for the given prefix.\n        This is used for auto-generating layer names based on the type-prefix.\n        '''\n        ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1\n        return '%s_%d' % (prefix, ident)\n\n    def make_var(self, name, shape):\n        '''Creates a new TensorFlow variable.'''\n        return tf.get_variable(name, shape, trainable=self.trainable)\n\n    def make_var_fixed(self, name, shape):\n        '''Creates a new non-trainable TensorFlow variable.'''\n        return tf.get_variable(name, shape, trainable=False)\n\n    def validate_padding(self, padding):\n        '''Verifies that the padding is one of the supported ones.'''\n        assert padding in ('SAME', 'VALID')\n\n    @layer\n    def conv(self,\n             input,\n             k_h,\n             k_w,\n             c_o,\n             s_h,\n             s_w,\n             name,\n             relu=True,\n             padding=DEFAULT_PADDING,\n             group=1,\n             biased=True):\n        # Verify that the padding is acceptable\n        self.validate_padding(padding)\n        # Get the number of channels in the input\n        c_i = input.get_shape()[-1]\n        # Verify that the grouping parameter is valid\n        assert c_i % group == 0\n        assert c_o % group == 0\n        # 
Convolution for a given input and kernel\n        convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)\n        with tf.variable_scope(name) as scope:\n            # All kernels are created through make_var; whether they are trainable\n            # is controlled by self.trainable.\n            kernel = self.make_var('weights', shape=[k_h, k_w, c_i / group, c_o])\n            if group == 1:\n                # This is the common-case. Convolve the input without any further complications.\n                output = convolve(input, kernel)\n            else:\n                # Split the input into groups and then convolve each of them independently\n                input_groups = tf.split(3, group, input)\n                kernel_groups = tf.split(3, group, kernel)\n                output_groups = [convolve(i, k) for i, k in zip(input_groups, kernel_groups)]\n                # Concatenate the groups\n                output = tf.concat(3, output_groups)\n            # Add the biases\n            if biased:\n                biases = self.make_var('biases', [c_o])\n                output = tf.nn.bias_add(output, biases)\n            if relu:\n                # ReLU non-linearity\n                output = tf.nn.relu(output, name=scope.name)\n            return output\n\n    @layer\n    def relu(self, input, name):\n        return tf.nn.relu(input, name=name)\n\n    @layer\n    def max_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):\n        self.validate_padding(padding)\n        return tf.nn.max_pool(input,\n                              ksize=[1, k_h, k_w, 1],\n                              strides=[1, s_h, s_w, 1],\n                              padding=padding,\n                              name=name)\n\n    @layer\n    def avg_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):\n        self.validate_padding(padding)\n        return tf.nn.avg_pool(input,\n                              ksize=[1, k_h, k_w, 1],\n                              strides=[1, s_h, s_w, 1],\n                              padding=padding,\n                              name=name)\n\n    @layer\n    def lrn(self, input, radius, alpha, beta, name, bias=1.0):\n        return tf.nn.local_response_normalization(input,\n                                                  depth_radius=radius,\n                                                  alpha=alpha,\n                                                  beta=beta,\n                                                  bias=bias,\n                                                  name=name)\n\n    @layer\n    def concat(self, inputs, axis, name):\n        return tf.concat(concat_dim=axis, values=inputs, name=name)\n\n    @layer\n    def add(self, inputs, name):\n        return tf.add_n(inputs, name=name)\n\n    @layer\n    def fc(self, input, num_out, name, relu=True):\n        with tf.variable_scope(name) as scope:\n            input_shape = input.get_shape()\n            if input_shape.ndims == 4:\n                # The input is spatial. Vectorize it first.\n                dim = 1\n                for d in input_shape[1:].as_list():\n                    dim *= d\n                feed_in = tf.reshape(input, [-1, dim])\n            else:\n                feed_in, dim = (input, input_shape[-1].value)\n            weights = self.make_var('weights', shape=[dim, num_out])\n            biases = self.make_var('biases', [num_out])\n            op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b\n            fc = op(feed_in, weights, biases, name=scope.name)\n            return fc\n\n    @layer\n    def softmax(self, input, name):\n        input_shape = map(lambda v: v.value, input.get_shape())\n        if len(input_shape) > 2:\n            # For certain models (like NiN), the singleton spatial dimensions\n            # need to be explicitly squeezed, since they're not broadcast-able\n            # in TensorFlow's NHWC ordering (unlike Caffe's NCHW).\n            if input_shape[1] == 1 and input_shape[2] == 1:\n                input = tf.squeeze(input, squeeze_dims=[1, 2])\n            else:\n                raise ValueError('Rank 2 tensor input expected for softmax!')\n        return tf.nn.softmax(input, name)\n\n    @layer\n    def batch_normalization(self, input, name, scale_offset=True, relu=False):\n        # NOTE: Currently, only inference is supported, so the statistics\n        # (and optional scale/offset) are created as non-trainable variables.\n        with tf.variable_scope(name) as scope:\n            shape = [input.get_shape()[-1]]\n            if scale_offset:\n                scale = self.make_var_fixed('scale', shape=shape)\n                offset = self.make_var_fixed('offset', shape=shape)\n            else:\n                scale, offset = (None, None)\n            output = tf.nn.batch_normalization(\n                input,\n                mean=self.make_var_fixed('mean', shape=shape),\n                variance=self.make_var_fixed('variance', shape=shape),\n                offset=offset,\n                scale=scale,\n                # TODO: This is the default Caffe batch norm eps\n                # Get the actual eps from parameters\n                variance_epsilon=1e-5,\n                name=name)\n            if relu:\n                output = tf.nn.relu(output)\n            return output\n\n    @layer\n    def dropout(self, input, keep_prob, name):\n        keep = 1 - self.use_dropout + (self.use_dropout * keep_prob)\n        return tf.nn.dropout(input, keep, name=name)\n"
  },
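For reference, the `load` and `load_specific_vars` methods above consume a numpy-pickled dict of the form `{op_name: {param_name: ndarray}}`. A minimal sketch of writing and reading that format — the layer names and shapes are illustrative only, and recent NumPy additionally requires `allow_pickle=True`:

```python
import os
import tempfile

import numpy as np

# Serialization format consumed by load(): a dict keyed by layer (op) name,
# each value a dict mapping parameter name -> ndarray. Names/shapes here are
# hypothetical, not taken from the real model.
weights = {
    'conv1': {'weights': np.zeros((7, 7, 3, 64), dtype=np.float32),
              'biases': np.zeros(64, dtype=np.float32)},
    'fc_ftnew': {'weights': np.zeros((2048, 6), dtype=np.float32)},
}
path = os.path.join(tempfile.mkdtemp(), 'weights.npy')
np.save(path, weights)

# Mirrors data_dict = np.load(data_path).item() in load(); newer NumPy needs
# allow_pickle=True to deserialize the pickled object array.
data_dict = np.load(path, allow_pickle=True).item()
```

`load()` then walks this dict and assigns each array to the matching `tf.get_variable` inside the layer's variable scope, skipping the freshly trained `fc_ftnew` layer.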
  {
    "path": "kaffe/transformers.py",
    "content": "'''\nA collection of graph transforms.\n\nA transformer is a callable that accepts a graph and returns a transformed version.\n'''\n\nimport numpy as np\n\nfrom .caffe import get_caffe_resolver, has_pycaffe\nfrom .errors import KaffeError, print_stderr\nfrom .layers import NodeKind\n\n\nclass DataInjector(object):\n    '''\n    Associates parameters loaded from a .caffemodel file with their corresponding nodes.\n    '''\n\n    def __init__(self, def_path, data_path):\n        # The .prototxt file defining the graph\n        self.def_path = def_path\n        # The .caffemodel file containing the learned parameters\n        self.data_path = data_path\n        # Set to true if the fallback protocol-buffer based backend was used\n        self.did_use_pb = False\n        # A list containing (layer name, parameters) tuples\n        self.params = None\n        # Load the parameters\n        self.load()\n\n    def load(self):\n        if has_pycaffe():\n            self.load_using_caffe()\n        else:\n            self.load_using_pb()\n\n    def load_using_caffe(self):\n        caffe = get_caffe_resolver().caffe\n        net = caffe.Net(self.def_path, self.data_path, caffe.TEST)\n        data = lambda blob: blob.data\n        self.params = [(k, map(data, v)) for k, v in net.params.items()]\n\n    def load_using_pb(self):\n        data = get_caffe_resolver().NetParameter()\n        data.MergeFromString(open(self.data_path, 'rb').read())\n        pair = lambda layer: (layer.name, self.normalize_pb_data(layer))\n        layers = data.layers or data.layer\n        self.params = [pair(layer) for layer in layers if layer.blobs]\n        self.did_use_pb = True\n\n    def normalize_pb_data(self, layer):\n        transformed = []\n        for blob in layer.blobs:\n            if len(blob.shape.dim):\n                dims = blob.shape.dim\n                c_o, c_i, h, w = map(int, [1] * (4 - len(dims)) + list(dims))\n            else:\n                c_o = 
blob.num\n                c_i = blob.channels\n                h = blob.height\n                w = blob.width\n            data = np.array(blob.data, dtype=np.float32).reshape(c_o, c_i, h, w)\n            transformed.append(data)\n        return transformed\n\n    def adjust_parameters(self, node, data):\n        if not self.did_use_pb:\n            return data\n        # When using the protobuf-backend, each parameter initially has four dimensions.\n        # In certain cases (like FC layers), we want to eliminate the singleton dimensions.\n        # This implementation takes care of the common cases. However, it does leave the\n        # potential for future issues.\n        # The Caffe-backend does not suffer from this problem.\n        data = list(data)\n        squeeze_indices = [1]  # Squeeze biases.\n        if node.kind == NodeKind.InnerProduct:\n            squeeze_indices.append(0)  # Squeeze FC.\n        for idx in squeeze_indices:\n            data[idx] = np.squeeze(data[idx])\n        return data\n\n    def __call__(self, graph):\n        for layer_name, data in self.params:\n            if layer_name in graph:\n                node = graph.get_node(layer_name)\n                node.data = self.adjust_parameters(node, data)\n            else:\n                print_stderr('Ignoring parameters for non-existent layer: %s' % layer_name)\n        return graph\n\n\nclass DataReshaper(object):\n\n    def __init__(self, mapping, replace=True):\n        # A dictionary mapping NodeKind to the transposed order.\n        self.mapping = mapping\n        # The node kinds eligible for reshaping\n        self.reshaped_node_types = self.mapping.keys()\n        # If true, the reshaped data will replace the old one.\n        # Otherwise, it's set to the reshaped_data attribute.\n        self.replace = replace\n\n    def has_spatial_parent(self, node):\n        try:\n            parent = node.get_only_parent()\n            s = parent.output_shape\n            return 
s.height > 1 or s.width > 1\n        except KaffeError:\n            return False\n\n    def map(self, node_kind):\n        try:\n            return self.mapping[node_kind]\n        except KeyError:\n            raise KaffeError('Ordering not found for node kind: {}'.format(node_kind))\n\n    def __call__(self, graph):\n        for node in graph.nodes:\n            if node.data is None:\n                continue\n            if node.kind not in self.reshaped_node_types:\n                # Check for 2+ dimensional data\n                if any(len(tensor.shape) > 1 for tensor in node.data):\n                    print_stderr('Warning: parameters not reshaped for node: {}'.format(node))\n                continue\n            transpose_order = self.map(node.kind)\n            weights = node.data[0]\n            if (node.kind == NodeKind.InnerProduct) and self.has_spatial_parent(node):\n                # The FC layer connected to the spatial layer needs to be\n                # re-wired to match the new spatial ordering.\n                in_shape = node.get_only_parent().output_shape\n                fc_shape = weights.shape\n                output_channels = fc_shape[0]\n                weights = weights.reshape((output_channels, in_shape.channels, in_shape.height,\n                                           in_shape.width))\n                weights = weights.transpose(self.map(NodeKind.Convolution))\n                node.reshaped_data = weights.reshape(fc_shape[transpose_order[0]],\n                                                     fc_shape[transpose_order[1]])\n            else:\n                node.reshaped_data = weights.transpose(transpose_order)\n\n        if self.replace:\n            for node in graph.nodes:\n                if hasattr(node, 'reshaped_data'):\n                    # Set the weights\n                    node.data[0] = node.reshaped_data\n                    del node.reshaped_data\n        return graph\n\n\nclass SubNodeFuser(object):\n    '''\n 
   An abstract helper for merging a single-child with its single-parent.\n    '''\n\n    def __call__(self, graph):\n        nodes = graph.nodes\n        fused_nodes = []\n        for node in nodes:\n            if len(node.parents) != 1:\n                # We're only fusing nodes with single parents\n                continue\n            parent = node.get_only_parent()\n            if len(parent.children) != 1:\n                # We can only fuse a node if its parent's\n                # value isn't used by any other node.\n                continue\n            if not self.is_eligible_pair(parent, node):\n                continue\n            # Rewrite the fused node's children to its parent.\n            for child in node.children:\n                child.parents.remove(node)\n                parent.add_child(child)\n            # Disconnect the fused node from the graph.\n            parent.children.remove(node)\n            fused_nodes.append(node)\n            # Let the sub-class merge the fused node in any arbitrary way.\n            self.merge(parent, node)\n        transformed_nodes = [node for node in nodes if node not in fused_nodes]\n        return graph.replaced(transformed_nodes)\n\n    def is_eligible_pair(self, parent, child):\n        '''Returns true if this parent/child pair is eligible for fusion.'''\n        raise NotImplementedError('Must be implemented by subclass.')\n\n    def merge(self, parent, child):\n        '''Merge the child node into the parent.'''\n        raise NotImplementedError('Must be implemented by subclass')\n\n\nclass ReLUFuser(SubNodeFuser):\n    '''\n    Fuses rectified linear units with their parent nodes.\n    '''\n\n    def __init__(self, allowed_parent_types=None):\n        # Fuse ReLUs when the parent node is one of the given types.\n        # If None, all node types are eligible.\n        self.allowed_parent_types = allowed_parent_types\n\n    def is_eligible_pair(self, parent, child):\n        return 
((self.allowed_parent_types is None or parent.kind in self.allowed_parent_types) and\n                child.kind == NodeKind.ReLU)\n\n    def merge(self, parent, _):\n        parent.metadata['relu'] = True\n\n\nclass BatchNormScaleBiasFuser(SubNodeFuser):\n    '''\n    The original batch normalization paper includes two learned\n    parameters: a scaling factor \\gamma and a bias \\beta.\n    Caffe's implementation does not include these two. However, it is commonly\n    replicated by adding a scaling+bias layer immediately after the batch norm.\n\n    This fuser merges the scaling+bias layer with the batch norm.\n    '''\n\n    def is_eligible_pair(self, parent, child):\n        return (parent.kind == NodeKind.BatchNorm and child.kind == NodeKind.Scale and\n                child.parameters.axis == 1 and child.parameters.bias_term == True)\n\n    def merge(self, parent, child):\n        parent.scale_bias_node = child\n\n\nclass BatchNormPreprocessor(object):\n    '''\n    Prescale batch normalization parameters.\n    Concatenate gamma (scale) and beta (bias) terms if set.\n    '''\n\n    def __call__(self, graph):\n        for node in graph.nodes:\n            if node.kind != NodeKind.BatchNorm:\n                continue\n            assert node.data is not None\n            assert len(node.data) == 3\n            mean, variance, scale = node.data\n            # Prescale the stats\n            scaling_factor = 1.0 / scale if scale != 0 else 0\n            mean *= scaling_factor\n            variance *= scaling_factor\n            # Replace with the updated values\n            node.data = [mean, variance]\n            if hasattr(node, 'scale_bias_node'):\n                # Include the scale and bias terms\n                gamma, beta = node.scale_bias_node.data\n                node.data += [gamma, beta]\n        return graph\n\n\nclass NodeRenamer(object):\n    '''\n    Renames nodes in the graph using a given unary function that\n    accepts a node and returns its 
new name.\n    '''\n\n    def __init__(self, renamer):\n        self.renamer = renamer\n\n    def __call__(self, graph):\n        for node in graph.nodes:\n            node.name = self.renamer(node)\n        return graph\n\n\nclass ParameterNamer(object):\n    '''\n    Convert layer data arrays to a dictionary mapping parameter names to their values.\n    '''\n\n    def __call__(self, graph):\n        for node in graph.nodes:\n            if node.data is None:\n                continue\n            if node.kind in (NodeKind.Convolution, NodeKind.InnerProduct):\n                names = ('weights',)\n                if node.parameters.bias_term:\n                    names += ('biases',)\n            elif node.kind == NodeKind.BatchNorm:\n                names = ('mean', 'variance')\n                if len(node.data) == 4:\n                    names += ('scale', 'offset')\n            else:\n                print_stderr('WARNING: Unhandled parameters: {}'.format(node.kind))\n                continue\n            assert len(names) == len(node.data)\n            node.data = dict(zip(names, node.data))\n        return graph\n"
  },
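The `BatchNormPreprocessor` in `kaffe/transformers.py` above divides Caffe's accumulated mean/variance blobs by the stored moving-average scale factor before export. A toy numpy sketch of just that prescaling step (the values are made up):

```python
import numpy as np

# Caffe BatchNorm layers store three blobs: accumulated mean, accumulated
# variance, and a moving-average scale factor. The usable statistics are the
# accumulations divided by that factor, exactly as BatchNormPreprocessor does.
mean = np.array([2.0, 4.0], dtype=np.float32)
variance = np.array([1.0, 9.0], dtype=np.float32)
scale = 2.0

scaling_factor = 1.0 / scale if scale != 0 else 0
mean = mean * scaling_factor
variance = variance * scaling_factor
```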
  {
    "path": "main_fpn.py",
    "content": "import sys\nimport os\nimport csv\nimport numpy as np\nimport cv2\nimport math\nimport pose_utils\nimport os\nimport myparse\nimport renderer_fpn\n## To make tensorflow print less (this can be useful for debug though)\n#os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' \n#import ctypes; \nprint '> loading getRts'\nimport get_Rts as getRts\n######## TMP FOLDER #####################\n_tmpdir = './tmp/'#os.environ['TMPDIR'] + '/'\nprint '> make dir'\nif not os.path.exists( _tmpdir):\n    os.makedirs( _tmpdir )\n#########################################\n##INPUT/OUTPUT\ninput_file = str(sys.argv[1]) #'input.csv'\noutpu_proc = 'output_preproc.csv'\noutput_pose_db =  './output_pose.lmdb'\noutput_render = './output_render'\n#################################################\nprint '> network'\n_alexNetSize = 227\n_factor = 0.25 #0.1\n\n# ***** please download the model in https://www.dropbox.com/s/r38psbq55y2yj4f/fpn_new_model.tar.gz?dl=0 ***** #\nmodel_folder = './fpn_new_model/'\nmodel_used = 'model_0_1.0_1.0_1e-07_1_16000.ckpt' #'model_0_1.0_1.0_1e-05_0_6000.ckpt'\nlr_rate_scalar = 1.0\nif_dropout = 0\nkeep_rate = 1\n################################\ndata_dict = myparse.parse_input(input_file)\n## Pre-processing the images \nprint '> preproc'\npose_utils.preProcessImage( _tmpdir, data_dict, './',\\\n                            _factor, _alexNetSize, outpu_proc )\n## Runnin FacePoseNet\nprint '> run'\n## Running the pose estimation\ngetRts.esimatePose( model_folder, outpu_proc, output_pose_db, model_used, lr_rate_scalar, if_dropout, keep_rate, use_gpu=False )\n\n\nrenderer_fpn.render_fpn(outpu_proc, output_pose_db, output_render)\n"
  },
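`main_fpn.py` above drives the full estimate-then-render pipeline, and `main_predict_6DoF.py` below converts a predicted 6DoF pose into a 3x4 weak-perspective projection matrix via `ProjMat[:, :3] = scale * R` and `ProjMat[:, 3] = [tx, ty, 0]`. A numpy sketch of that construction; the explicit Euler-angle rotation below is an assumption standing in for `pose_utils.RotationMatrix`, whose exact convention is not shown here:

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    # Euler-angle rotation (x=pitch, y=yaw, z=roll); this convention is an
    # assumption, not necessarily the one used by pose_utils.RotationMatrix.
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch),  np.cos(pitch)]])
    Ry = np.array([[ np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rz = np.array([[np.cos(roll), -np.sin(roll), 0],
                   [np.sin(roll),  np.cos(roll), 0],
                   [0, 0, 1]])
    return Rz.dot(Ry).dot(Rx)

# 6DoF pose -> 3x4 weak-perspective projection matrix (toy values)
scale, pitch, yaw, roll, tx, ty = 0.001, 0.1, 0.3, 0.05, 112.0, 112.0
R = rotation_matrix(pitch, yaw, roll)
ProjMat = np.zeros([3, 4])
ProjMat[:, :3] = scale * R   # rotation scaled by the weak-perspective factor
ProjMat[:, 3] = np.array([tx, ty, 0])  # 2D translation, no depth shift
```

Applying `ProjMat` to homogeneous mean-shape vertices (4 x N) then yields the posed 3D shape, as done per image in `main_predict_6DoF.py`.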
  {
    "path": "main_predict_6DoF.py",
    "content": "import sys\nimport numpy as np\nimport tensorflow as tf\nimport cv2\nimport scipy.io as sio\nsys.path.append('./utils')\nimport pose_utils as pu\nimport os\nimport os.path\nfrom glob import glob\nimport time\nimport pickle\n\nsys.path.append('./kaffe')\nsys.path.append('./ResNet')\nfrom ThreeDMM_shape import ResNet_101 as resnet101_shape\n\n\n\n# Global parameters\nfactor = 0.25\n_resNetSize = 224\nn_hidden1 = 2048\nn_hidden2 = 4096\nifdropout = 0\n\n\ngpuID = int(sys.argv[1])\ninput_sample_list_path = str(sys.argv[2]) #'./input_list.txt' # You can change to your own image list\n\n\ntf.logging.set_verbosity(tf.logging.INFO)\n\nFLAGS = tf.app.flags.FLAGS\ntf.app.flags.DEFINE_integer('image_size', 224, 'Image side length.')\n\noutput_path = './output_6DoF' \ntf.app.flags.DEFINE_string('save_output_path', output_path, 'Directory to keep the checkpoints')\ntf.app.flags.DEFINE_integer('num_gpus', 1, 'Number of gpus used for training. (0 or 1)')\ntf.app.flags.DEFINE_integer('batch_size', 1, 'Batch Size') # 60\n\n\n\n\nif not os.path.exists(FLAGS.save_output_path):\n        os.makedirs(FLAGS.save_output_path)\n\n\n\n\n\ndef extract_3dmm_pose():\n\n\n        ########################################\n        # Load train image mean, train label mean and std\n        ########################################\n        \n\n        \n\n        # labels stats on 300W-LP\n        train_label_mean = np.load('./train_stats/train_label_mean_300WLP.npy')\n        train_label_std = np.load('./train_stats/train_label_std_300WLP.npy')\n\n        Pose_label_mean = train_label_mean[:6]\n        Pose_label_std = train_label_std[:6]\n\n        #ShapeExpr_label_mean_300WLP = train_label_mean[6:]\n        #ShapeExpr_label_std_300WLP = train_label_std[6:]\n\n       \n        # Get training image mean from Anh's ShapeNet (CVPR2017)\n        mean_image_shape = np.load('./train_stats/3DMM_shape_mean.npy') # 3 x 224 x 224 \n        train_image_mean = np.transpose(mean_image_shape, 
[1,2,0]) # 224 x 224 x 3, [0,255]\n\n        \n\n       \n\n        ########################################\n        # Build CNN graph\n        ########################################\n\n        # placeholders for the batches                                                                                                                                      \n        x_img = tf.placeholder(tf.float32, [None, FLAGS.image_size, FLAGS.image_size, 3])\n       \n    \n\n        # Resize Image\n        x2 = tf.image.resize_bilinear(x_img, tf.constant([224,224], dtype=tf.int32))\n        x2 = tf.cast(x2, 'float32')\n        x2 = tf.reshape(x2, [-1, 224, 224, 3])\n        \n        # Image normalization\n        mean = tf.reshape(train_image_mean, [1, 224, 224, 3])\n        mean = tf.cast(mean, 'float32')\n        x2 = x2 - mean\n       \n\n\n        ########################################\n        # New-FPN with ResNet structure\n        ########################################\n\n        with tf.variable_scope('shapeCNN'):\n                net_shape = resnet101_shape({'input': x2}, trainable=True) # False: Freeze the ResNet Layers\n                pool5 = net_shape.layers['pool5']\n                pool5 = tf.squeeze(pool5)\n                pool5 = tf.reshape(pool5, [1, 2048])\n                print pool5.get_shape() # batch_size x 2048\n\n           \n        with tf.variable_scope('Pose'):   \n\n                with tf.variable_scope('fc1'):\n                       \n                        fc1W = tf.Variable(tf.random_normal(tf.stack([pool5.get_shape()[1].value, n_hidden1]), mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc1b = tf.Variable(tf.zeros([n_hidden1]), trainable=True, name='baises')\n               \n                        fc1 = tf.nn.relu_layer(tf.reshape(pool5, [-1, int(np.prod(pool5.get_shape()[1:]))]), fc1W, fc1b, name='fc1')\n                        print \"\\nfc1 shape:\"\n                        print fc1.get_shape(), 
fc1W.get_shape(), fc1b.get_shape() # (batch_size, 2048) (2048, 2048) (2048,)\n\n                        if ifdropout == 1:\n                                # NOTE: 'prob' (the keep probability) is not defined in this script;\n                                # define it (e.g. as a placeholder) before enabling dropout.\n                                fc1 = tf.nn.dropout(fc1, prob, name='fc1_dropout')\n\n                with tf.variable_scope('fc2'):\n\n                        fc2W = tf.Variable(tf.random_normal([n_hidden1, n_hidden2], mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc2b = tf.Variable(tf.zeros([n_hidden2]), trainable=True, name='baises') # typo kept: this name is stored in the released checkpoint\n\n                        fc2 = tf.nn.relu_layer(fc1, fc2W, fc2b, name='fc2')\n                        print fc2.get_shape(), fc2W.get_shape(), fc2b.get_shape() # (batch_size, 4096) (2048, 4096) (4096,)\n\n                        if ifdropout == 1:\n                                fc2 = tf.nn.dropout(fc2, prob, name='fc2_dropout')\n\n                with tf.variable_scope('fc3'):\n\n                        # Move everything into depth so we can perform a single matrix multiplication.
\n                        fc2 = tf.reshape(fc2, [FLAGS.batch_size, -1])\n\n                        dim = fc2.get_shape()[1].value\n                        print \"\\nfc2 dim:\"\n                        print fc2.get_shape(), dim\n\n                        fc3W = tf.Variable(tf.random_normal(tf.stack([dim,6]), mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc3b = tf.Variable(tf.zeros([6]), trainable=True, name='baises')\n                        Pose_params_ZNorm = tf.nn.xw_plus_b(fc2, fc3W, fc3b)\n                        print \"\\nfc3 shape:\"\n                        print Pose_params_ZNorm.get_shape(), fc3W.get_shape(), fc3b.get_shape()\n\n                        Pose_label_mean = tf.cast(tf.reshape(Pose_label_mean, [1, -1]), 'float32')\n                        Pose_label_std = tf.cast(tf.reshape(Pose_label_std, [1, -1]), 'float32')\n                        # Un-normalize the z-scored network output back to 6DoF pose units\n                        Pose_params = Pose_params_ZNorm * (Pose_label_std + 1e-18) + Pose_label_mean\n\n\n        ########################################\n        # Start extracting 3dmm pose\n        ########################################        \n        init_op = tf.global_variables_initializer()\n        saver = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))\n        saver_ini_shape_net = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='shapeCNN'))\n        saver_shapeCNN = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='shapeCNN'))\n        saver_Pose = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='Pose'))\n\n        config = tf.ConfigProto(allow_soft_placement=True) #, log_device_placement=True)\n        #config.gpu_options.per_process_gpu_memory_fraction = 0.5\n        config.gpu_options.allow_growth = 
True\n        with tf.Session(config=config) as sess:\n               \n\n                sess.run(init_op)\n                start_time = time.time()\n                \n                \n                # For non-trainable parameters such as the parameters for batch normalization\n                load_path = \"./models/ini_shapeNet_model_L7L_trainable.ckpt\"\n                saver_ini_shape_net.restore(sess, load_path)\n                \n                # For other trainable parameters\n                load_path = \"./models/model_0.0001_1_18_0.0_2048_4096.ckpt\"\n                saver_shapeCNN.restore(sess, load_path)\n                saver_Pose.restore(sess, load_path)\n\n\n                load_model_time = time.time() - start_time                \n                print(\"Model restored: \" + str(load_model_time))\n\n\n                with open(input_sample_list_path, 'r') as fin:\n\n                        for line in fin:\n\n                                curr_line = line.strip().split(',')\n                                image_path = curr_line[0]\n                                bbox = np.array([float(curr_line[1]), float(curr_line[2]), float(curr_line[3]), float(curr_line[4])]) # [lt_x, lt_y, w, h]\n                                image_key = image_path.split('/')[-1][:-4]\n\n\n                                image = cv2.imread(image_path,1) # BGR                                                                            \n                                image = np.asarray(image)\n                        \n\n                                # Fix the grey image                                                                                                                       \n                                if len(image.shape) < 3:\n                                        image_r = np.reshape(image, (image.shape[0], image.shape[1], 1))\n                                        image = np.append(image_r, image_r, axis=2)\n                                    
     image = np.append(image, image_r, axis=2)\n\n\n\n                                # Crop and expand (25%) the image based on the tight bbox (from the face detector or detected lmks)\n                                factor = [1.9255, 2.2591, 1.9423, 1.6087]\n                                img_new = pu.preProcessImage_v2(image.copy(), bbox.copy(), factor, _resNetSize, 1)\n                                image_array = np.reshape(img_new, [1, _resNetSize, _resNetSize, 3])\n\n\n\n                                (params_pose, pool5_feats) = sess.run([Pose_params, pool5], feed_dict={x_img: image_array}) # [scale, pitch, yaw, roll, translation_x, translation_y]\n                                params_pose = params_pose[0]\n                                print params_pose\n\n\n                                # save the predicted pose\n                                with open(FLAGS.save_output_path + '/' + image_key + '.txt', 'w') as fout:\n\n                                        for pp in params_pose:\n                                                fout.write(str(pp) + '\\n')\n\n\n                                # Convert the 6DoF predicted pose to 3x4 projection matrix (weak-perspective projection)\n                                # Load BFM model (NOTE: for large image lists these .mat files could be loaded once, outside the loop)\n                                shape_mat = sio.loadmat('./BFM/Model_Shape.mat')\n                                mu_shape = shape_mat['mu_shape'].astype('float32')\n\n                                expr_mat = sio.loadmat('./BFM/Model_Exp.mat')\n                                mu_exp = expr_mat['mu_exp'].astype('float32')\n\n                                mu = mu_shape + mu_exp\n                                len_mu = len(mu)\n                                mu = np.reshape(mu, [-1,1])\n\n                                keypoints = 
np.reshape(shape_mat['keypoints'], [-1]) - 1 # -1 for python index         \n                                keypoints = keypoints.astype('int32')\n                                \n                                \n                        \n                                vertex = np.reshape(mu, [len_mu/3, 3]) # # of vertices x 3\n                                # mean shape\n                                mesh = vertex.T # 3 x # of vertices\n                                mesh_1 = np.concatenate([mesh, np.ones([1,len_mu/3])], axis=0) # 4 x # of vertices\n\n\n                                # Get projection matrix from 6DoF pose\n                                scale, pitch, yaw, roll, tx, ty = params_pose\n                                R = pu.RotationMatrix(pitch, yaw, roll)\n                                ProjMat = np.zeros([3,4])\n                                ProjMat[:,:3] = scale * R\n                                ProjMat[:,3] = np.array([tx,ty,0])\n                                \n\n                                # Get predicted shape\n                                #print ProjMat, ProjMat.shape\n                                #print mesh_1, mesh_1.shape\n                                pred_shape = np.matmul(ProjMat, mesh_1) # 3 x # of vertices\n                                pred_shape = pred_shape.T # # of vertices x 3\n                                \n\n                                pred_shape_x = np.reshape(pred_shape[:,0], [len_mu/3, 1])\n                                pred_shape_z = np.reshape(pred_shape[:,2], [len_mu/3, 1])\n                                pred_shape_y = 224 + 1 - pred_shape[:,1]\n                                pred_shape_y = np.reshape(pred_shape_y, [len_mu/3, 1])\n                                pred_shape = np.concatenate([pred_shape_x, pred_shape_y, pred_shape_z], 1)\n                \n\n\n\n                                # Convert shape and lmks back to the original image scale\n\n                             
   _, bbox_new, _, lmks_filling, old_h, old_w, img_new = pu.resize_crop_rescaleCASIA(image.copy(), bbox.copy(), pred_shape.copy(), factor)\n                                #print lmks_filling\n                                pred_shape[:,0] = pred_shape[:,0] * old_w / 224.\n                                pred_shape[:,1] = pred_shape[:,1] * old_h / 224.\n                                pred_shape[:,0] = pred_shape[:,0] + bbox_new[0]\n                                pred_shape[:,1] = pred_shape[:,1] + bbox_new[1]\n\n                                # Get predicted lmks\n                                pred_lmks = pred_shape[keypoints]\n\n\n\n                                sio.savemat(FLAGS.save_output_path + '/' + image_key + '.mat', {'shape_3D': pred_shape, 'lmks_3D': pred_lmks})\n                                #cv2.imwrite(FLAGS.save_output_path + '/' + image_key + '.jpg', img_new)\n\n                        \n\n\n             \n\n\n               \n               \n              \n\n\ndef main(_):\n        os.environ[\"CUDA_DEVICE_ORDER\"]=\"PCI_BUS_ID\"\n        os.environ[\"CUDA_VISIBLE_DEVICES\"]=str(gpuID)\n        if FLAGS.num_gpus == 0:\n                dev = '/cpu:0'\n        elif FLAGS.num_gpus == 1:\n                dev = '/gpu:0'\n        else:\n                raise ValueError('Only support 0 or 1 gpu.')\n\n        \n        print dev\n        with tf.device(dev):\n\t       extract_3dmm_pose()\n\n\n\nif __name__ == '__main__':\n        tf.app.run()\n"
  },
  {
    "path": "main_predict_ProjMat.py",
    "content": "import sys\nimport numpy as np\nimport tensorflow as tf\nimport cv2\nimport scipy.io as sio\nsys.path.append('./utils')\nimport pose_utils as pu\nimport os\nimport os.path\nfrom glob import glob\nimport time\nimport pickle\n\nsys.path.append('./kaffe')\nsys.path.append('./ResNet')\nfrom ThreeDMM_shape import ResNet_101 as resnet101_shape\n\n\n\n# Global parameters\nfactor = 0.25\n_resNetSize = 224\nn_hidden1 = 2048\nn_hidden2 = 4096\nifdropout = 0\n\n\ngpuID = int(sys.argv[1])\ninput_sample_list_path = str(sys.argv[2]) #'./input_list.txt' # You can change to your own image list\n\n\ntf.logging.set_verbosity(tf.logging.INFO)\n\nFLAGS = tf.app.flags.FLAGS\ntf.app.flags.DEFINE_integer('image_size', 224, 'Image side length.')\n\noutput_path = './output_ProjMat' \ntf.app.flags.DEFINE_string('save_output_path', output_path, 'Directory to keep the checkpoints')\ntf.app.flags.DEFINE_integer('num_gpus', 1, 'Number of gpus used for training. (0 or 1)')\ntf.app.flags.DEFINE_integer('batch_size', 1, 'Batch Size') # 60\n\n\n\n\nif not os.path.exists(FLAGS.save_output_path):\n        os.makedirs(FLAGS.save_output_path)\n\n\n\n\n\ndef extract_3dmm_ProjMat():\n\n\n        ########################################\n        # Load train image mean, train label mean and std\n        ########################################\n        \n\n        \n\n        # labels stats on 300W-LP\n        train_label_mean = np.load('./train_stats/train_label_mean_ProjMat.npy')\n        train_label_std = np.load('./train_stats/train_label_std_ProjMat.npy')\n\n        ProjMat_label_mean = train_label_mean[-12:-1]\n        ProjMat_label_std = train_label_std[-12:-1]\n\n       \n        # Get training image mean from Anh's ShapeNet (CVPR2017)\n        mean_image_shape = np.load('./train_stats/3DMM_shape_mean.npy') # 3 x 224 x 224 \n        train_image_mean = np.transpose(mean_image_shape, [1,2,0]) # 224 x 224 x 3, [0,255]\n\n        \n\n       \n\n        
########################################\n        # Build CNN graph\n        ########################################\n\n        # placeholders for the batches                                                                                                                                      \n        x_img = tf.placeholder(tf.float32, [None, FLAGS.image_size, FLAGS.image_size, 3])\n       \n    \n\n        # Resize Image\n        x2 = tf.image.resize_bilinear(x_img, tf.constant([224,224], dtype=tf.int32))\n        x2 = tf.cast(x2, 'float32')\n        x2 = tf.reshape(x2, [-1, 224, 224, 3])\n        \n        # Image normalization\n        mean = tf.reshape(train_image_mean, [1, 224, 224, 3])\n        mean = tf.cast(mean, 'float32')\n        x2 = x2 - mean\n       \n\n\n\n\n        ########################################\n        # New-FPN with ResNet structure\n        ########################################\n\n        with tf.variable_scope('shapeCNN'):\n                net_shape = resnet101_shape({'input': x2}, trainable=True) # False: Freeze the ResNet Layers\n                pool5 = net_shape.layers['pool5']\n                pool5 = tf.squeeze(pool5)\n                pool5 = tf.reshape(pool5, [1, 2048])\n                print pool5.get_shape() # batch_size x 2048\n\n           \n        with tf.variable_scope('Pose'):   \n\n                with tf.variable_scope('fc1'):\n                       \n                        fc1W = tf.Variable(tf.random_normal(tf.stack([pool5.get_shape()[1].value, n_hidden1]), mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc1b = tf.Variable(tf.zeros([n_hidden1]), trainable=True, name='baises')\n               \n                        fc1 = tf.nn.relu_layer(tf.reshape(pool5, [-1, int(np.prod(pool5.get_shape()[1:]))]), fc1W, fc1b, name='fc1')\n                        print \"\\nfc1 shape:\"\n                        print fc1.get_shape(), fc1W.get_shape(), fc1b.get_shape() # (batch_size, 4096) (2048, 
4096) (4096,)\n                        \n                        if ifdropout == 1:\n                                fc1 = tf.nn.dropout(fc1, prob, name='fc1_dropout')\n\n                with tf.variable_scope('fc2'):\n\n                        fc2W = tf.Variable(tf.random_normal([n_hidden1, n_hidden2], mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc2b = tf.Variable(tf.zeros([n_hidden2]), trainable=True, name='baises')\n\n                        fc2 = tf.nn.relu_layer(fc1, fc2W, fc2b, name='fc2')\n                        print fc2.get_shape(), fc2W.get_shape(), fc2b.get_shape() # (batch_size, 29 (2048, 2048) (2048,)\n\n                        if ifdropout == 1:\n                                fc2 = tf.nn.dropout(fc2, prob, name='fc2_dropout')\n\n                with tf.variable_scope('fc3'):\n               \n                        # Move everything into depth so we can perform a single matrix multiplication.                            \n                        fc2 = tf.reshape(fc2, [FLAGS.batch_size, -1])\n                \n                        dim = fc2.get_shape()[1].value\n                        print \"\\nfc2 dim:\"\n                        print fc2.get_shape(), dim\n                \n                        fc3W = tf.Variable(tf.random_normal(tf.stack([dim,11]), mean=0.0, stddev=0.01), trainable=True, name='W')\n                        fc3b = tf.Variable(tf.zeros([11]), trainable=True, name='baises')\n                        #print \"*** label shape: \" + str(len(train_label_mean))\n                        ProjMat_preds_ZNorm = tf.nn.xw_plus_b(fc2, fc3W, fc3b)  \n                        print \"\\nfc3 shape:\"\n                        print ProjMat_preds_ZNorm.get_shape(), fc3W.get_shape(), fc3b.get_shape() \n\n\n                        label_mean = tf.cast(tf.reshape(ProjMat_label_mean, [1, -1]), 'float32')\n                        label_std = tf.cast(tf.reshape(ProjMat_label_std, [1, -1]), 'float32')\n                    
    ProjMat_preds = ProjMat_preds_ZNorm * (label_std + 0.000000000000000001) + label_mean\n\n             \n                ProjMat_preds = tf.concat([ProjMat_preds, tf.zeros([FLAGS.batch_size,1])], 1)  \n\n\n\n\n\n        ########################################\n        # Start extracting 3dmm pose\n        ########################################        \n        init_op = tf.global_variables_initializer()\n        saver = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))\n        saver_ini_shape_net = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='shapeCNN'))\n        saver_shapeCNN = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='shapeCNN'))\n        saver_Pose = tf.train.Saver(var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='Pose'))\n       \n\n        config = tf.ConfigProto(allow_soft_placement=True) #, log_device_placement=True)\n        #config.gpu_options.per_process_gpu_memory_fraction = 0.5\n        config.gpu_options.allow_growth = True\n        with tf.Session(config=config) as sess:\n               \n\n                sess.run(init_op)\n                start_time = time.time()\n\n\n                load_path = \"./models/ini_shapeNet_model_L7L_trainable.ckpt\"\n                saver_ini_shape_net.restore(sess, load_path)\n                \n                load_path = \"./models/model_0.0001_1_18_0.0_2048_4096.ckpt\"\n                saver_shapeCNN.restore(sess, load_path)\n\n                load_path = \"./models/model_iniLR_0.001_wProjMat_1.0_wLmks_10.0_wd_0.0_do_1_122.ckpt\"\n                saver_Pose.restore(sess, load_path)\n\n                \n\n                load_model_time = time.time() - start_time                \n                print(\"Model restored: \" + str(load_model_time))\n\n\n                with open(input_sample_list_path, 'r') as fin:\n\n                        for line in fin:\n\n                                
curr_line = line.strip().split(',')\n                                image_path = curr_line[0]\n                                bbox = np.array([float(curr_line[1]), float(curr_line[2]), float(curr_line[3]), float(curr_line[4])]) # [lt_x, lt_y, w, h]\n                                image_key = image_path.split('/')[-1][:-4]\n\n\n                                image = cv2.imread(image_path,1) # BGR\n                                image = np.asarray(image)\n\n\n                                # Fix the grey image\n                                if len(image.shape) < 3:\n                                        image_r = np.reshape(image, (image.shape[0], image.shape[1], 1))\n                                        image = np.append(image_r, image_r, axis=2)\n                                        image = np.append(image, image_r, axis=2)\n\n\n\n                                # Crop and expand (25%) the image based on the tight bbox (from the face detector or detected lmks)\n                                factor = [1.9255, 2.2591, 1.9423, 1.6087];\n                                img_new = pu.preProcessImage_v2(image.copy(), bbox.copy(), factor, _resNetSize, 1)\n                                image_array = np.reshape(img_new, [1, _resNetSize, _resNetSize, 3])\n\n\n                                #print image_array\n                                (params_ProjMat, pool5_feats) = sess.run([ProjMat_preds, pool5], feed_dict={x_img: image_array}) # 12 entries of the 3x4 projection matrix (11 predicted + appended 0)\n                                params_ProjMat = params_ProjMat[0]\n                                #print params_ProjMat, pool5_feats\n\n\n                                # save the predicted projection matrix\n                                with open(FLAGS.save_output_path + '/' + image_key + '.txt', 'w') as fout:\n\n                                        for pp in params_ProjMat:\n                                                fout.write(str(pp) + '\\n')\n                                \n\n                                \n                                # Apply the predicted 3x4 projection matrix (weak-perspective projection) to the mean face shape\n                                # Load BFM model\n                                shape_mat = sio.loadmat('./BFM/Model_Shape.mat')\n                                mu_shape = shape_mat['mu_shape'].astype('float32')\n                                \n                                expr_mat = sio.loadmat('./BFM/Model_Exp.mat')\n                                mu_exp = expr_mat['mu_exp'].astype('float32')\n                                \n                                mu = mu_shape + mu_exp\n                                len_mu = len(mu)\n                                mu = np.reshape(mu, [-1,1])\n\n                                keypoints = np.reshape(shape_mat['keypoints'], [-1]) - 1 # -1 for python index\n                                keypoints = keypoints.astype('int32')\n\n\n\n                                vertex = np.reshape(mu, [len_mu/3, 3]) # # of vertices x 3\n                                # mean shape\n                                mesh = vertex.T # 3 x # of vertices\n                                mesh_1 = np.concatenate([mesh, np.ones([1,len_mu/3])], axis=0) # 4 x # of vertices\n\n\n                                # Reshape the predicted parameters into the 3x4 projection matrix\n                                ProjMat = np.reshape(params_ProjMat, [4,3])\n                                ProjMat = ProjMat.T\n\n\n\n                                # Get predicted shape\n                                #print 
ProjMat, ProjMat.shape\n                                #print mesh_1, mesh_1.shape\n                                pred_shape = np.matmul(ProjMat, mesh_1) # 3 x # of vertices\n                                pred_shape = pred_shape.T # # of vertices x 3\n                                \n\n                                pred_shape_x = np.reshape(pred_shape[:,0], [len_mu/3, 1])\n                                pred_shape_z = np.reshape(pred_shape[:,2], [len_mu/3, 1])\n                                pred_shape_y = 224 + 1 - pred_shape[:,1]\n                                pred_shape_y = np.reshape(pred_shape_y, [len_mu/3, 1])\n                                pred_shape = np.concatenate([pred_shape_x, pred_shape_y, pred_shape_z], 1)\n                \n\n\n\n                                # Convert shape and lmks back to the original image scale\n\n                                _, bbox_new, _, _, old_h, old_w, _ = pu.resize_crop_rescaleCASIA(image.copy(), bbox.copy(), pred_shape.copy(), factor)\n                                pred_shape[:,0] = pred_shape[:,0] * old_w / 224.\n                                pred_shape[:,1] = pred_shape[:,1] * old_h / 224.\n                                pred_shape[:,0] = pred_shape[:,0] + bbox_new[0]\n                                pred_shape[:,1] = pred_shape[:,1] + bbox_new[1]\n\n                                # Get predicted lmks\n                                pred_lmks = pred_shape[keypoints]\n\n\n\n                                sio.savemat(FLAGS.save_output_path + '/' + image_key + '.mat', {'shape_3D': pred_shape, 'lmks_3D': pred_lmks})\n\n\n                        \n\n                                # Obtain pose from ProjMat\n                                scale,R,t3d = pu.P2sRt(ProjMat) # decompose affine matrix to s, R, t\n                                pose = pu.matrix2angle(R) # yaw, pitch, roll\n\n                                # print scale, pitch, yaw , roll, translation_x, translation_y\n                 
               print scale, pose[1], pose[0], pose[2], t3d[0], t3d[1] \n\n\n             \n\n\n               \n               \n              \n\n\ndef main(_):\n        os.environ[\"CUDA_DEVICE_ORDER\"]=\"PCI_BUS_ID\"\n        os.environ[\"CUDA_VISIBLE_DEVICES\"]=str(gpuID)\n        if FLAGS.num_gpus == 0:\n                dev = '/cpu:0'\n        elif FLAGS.num_gpus == 1:\n                dev = '/gpu:0'\n        else:\n                raise ValueError('Only support 0 or 1 gpu.')\n\n        \n        print dev\n        with tf.device(dev):\n\t       extract_3dmm_ProjMat()\n\n\n\nif __name__ == '__main__':\n        tf.app.run()\n"
  },
  {
    "path": "models/README",
    "content": "Please download all the model files and place them in this folder.\n"
  },
  {
    "path": "myparse.py",
    "content": "import csv\n\ndef parse_input(input_file):\n\t#### Reading the metadata into a DICT\n\tdata_dict = dict()\n\twith open(input_file, 'r') as fin:\n\t\treader = csv.DictReader(fin)\n\t\tfor line in reader:\n\t\t\tkey = line['ID']\n\t\t\tdata_dict[key] = {'file': line['FILE'],\n\t\t\t                  'x': float(line['FACE_X']),\n\t\t\t                  'y': float(line['FACE_Y']),\n\t\t\t                  'width': float(line['FACE_WIDTH']),\n\t\t\t                  'height': float(line['FACE_HEIGHT']),\n\t\t\t                  }\n\treturn data_dict\n"
  },
  {
    "path": "output_render/README.md",
    "content": "The rendered images will be saved here!\n\n## Subject 1 ##\n### input: ### \n![sbj1](../images/input1.jpg)\n### rendering: ### \n![sbj1](./subject1/subject1_a_rendered_aug_-00_00_10.jpg)\n![sbj1](./subject1/subject1_a_rendered_aug_-22_00_10.jpg)\n![sbj1](./subject1/subject1_a_rendered_aug_-40_00_10.jpg)\n![sbj1](./subject1/subject1_a_rendered_aug_-55_00_10.jpg)\n![sbj1](./subject1/subject1_a_rendered_aug_-75_00_10.jpg)\n\n\n## Subject 2 ##\n### input: ### \n![sbj2](../images/input2.jpg)\n### rendering: ### \n![sbj2](./subject2/subject2_a_rendered_aug_-40_00_10.jpg)\n![sbj2](./subject2/subject2_a_rendered_aug_-55_00_10.jpg)\n![sbj2](./subject2/subject2_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 3 ##\n### input: ### \n![sbj3](../images/input3.jpg)\n### rendering: ### \n![sbj3](./subject3/subject3_a_rendered_aug_-40_00_10.jpg)\n![sbj3](./subject3/subject3_a_rendered_aug_-55_00_10.jpg)\n![sbj3](./subject3/subject3_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 4 ##\n### input: ### \n![sbj4](../images/input4.jpg)\n### rendering: ### \n![sbj4](./subject4/subject4_a_rendered_aug_-40_00_10.jpg)\n![sbj4](./subject4/subject4_a_rendered_aug_-55_00_10.jpg)\n![sbj4](./subject4/subject4_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 5 ##\n### input: ### \n![sbj5](../images/input5.jpg)\n### rendering: ### \n![sbj5](./subject5/subject5_a_rendered_aug_-40_00_10.jpg)\n![sbj5](./subject5/subject5_a_rendered_aug_-55_00_10.jpg)\n![sbj5](./subject5/subject5_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 6 ##\n### input: ### \n![sbj6](../images/input6.jpg)\n### rendering: ### \n![sbj6](./subject6/subject6_a_rendered_aug_-40_00_10.jpg)\n![sbj6](./subject6/subject6_a_rendered_aug_-55_00_10.jpg)\n![sbj6](./subject6/subject6_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 7 ##\n### input: ### \n![sbj7](../images/input7.jpg)\n### rendering: ### 
\n![sbj7](./subject7/subject7_a_rendered_aug_-00_00_10.jpg)\n![sbj7](./subject7/subject7_a_rendered_aug_-22_00_10.jpg)\n![sbj7](./subject7/subject7_a_rendered_aug_-40_00_10.jpg)\n![sbj7](./subject7/subject7_a_rendered_aug_-55_00_10.jpg)\n![sbj7](./subject7/subject7_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 8 ##\n### input: ### \n![sbj8](../images/input8.jpg)\n### rendering: ### \n![sbj8](./subject8/subject8_a_rendered_aug_-40_00_10.jpg)\n![sbj8](./subject8/subject8_a_rendered_aug_-55_00_10.jpg)\n![sbj8](./subject8/subject8_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 9 ##\n### input: ### \n![sbj9](../images/input9.jpg)\n### rendering: ### \n![sbj9](./subject9/subject9_a_rendered_aug_-40_00_10.jpg)\n![sbj9](./subject9/subject9_a_rendered_aug_-55_00_10.jpg)\n![sbj9](./subject9/subject9_a_rendered_aug_-75_00_10.jpg)\n\n## Subject 10 ##\n### input: ### \n![sbj10](../images/input10.jpg)\n### rendering: ### \n![sbj10](./subject10/subject10_a_rendered_aug_-00_00_10.jpg)\n![sbj10](./subject10/subject10_a_rendered_aug_-22_00_10.jpg)\n![sbj10](./subject10/subject10_a_rendered_aug_-40_00_10.jpg)\n![sbj10](./subject10/subject10_a_rendered_aug_-55_00_10.jpg)\n![sbj10](./subject10/subject10_a_rendered_aug_-75_00_10.jpg)\n\n"
  },
  {
    "path": "pose_model.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\n\"\"\"ResNet model.\n\nRelated papers:\nhttps://arxiv.org/pdf/1603.05027v2.pdf\nhttps://arxiv.org/pdf/1512.03385v1.pdf\nhttps://arxiv.org/pdf/1605.07146v1.pdf\n\"\"\"\n\nimport numpy as np\nimport tensorflow as tf\nfrom tensorflow.python.training import moving_averages\n\n#import sys\n#sys.path.append('/staging/pn/fengjuch/transformer')\n#from spatial_transformer import transformer\n#from tf_utils import weight_variable, bias_variable, dense_to_one_hot\n\n\"\"\"\nHParams = namedtuple('HParams',\n                     'batch_size, num_classes, min_lrn_rate, lrn_rate, '\n                     'num_residual_units, use_bottleneck, weight_decay_rate, '\n                     'relu_leakiness, optimizer')\n\"\"\"\n\nclass ThreeD_Pose_Estimation(object):\n  \"\"\"ResNet model.\"\"\"\n\n  def __init__(self, images, labels, mode, ifdropout, keep_rate_fc6, keep_rate_fc7, lr_rate_fac, net_data, batch_size, mean_labels, std_labels):\n    \"\"\"ResNet constructor.\n\n    Args:\n      hps: Hyperparameters.\n      images: Batches of images. [batch_size, image_size, image_size, 3]\n      labels: Batches of labels. 
[batch_size, num_classes]\n      mode: One of 'train' and 'eval'.\n    \"\"\"\n    #self.hps = hps\n    self.batch_size = batch_size\n    self._images = images\n    self.labels = labels\n    \n    self.mode = mode\n    self.ifdropout = ifdropout\n    self.keep_rate_fc6 = keep_rate_fc6\n    self.keep_rate_fc7 = keep_rate_fc7\n    self.ifadd_weight_decay = 0 #ifadd_weight_decay\n    self.net_data = net_data\n    self.lr_rate_fac = lr_rate_fac\n    self._extra_train_ops = []\n    self.optimizer = 'Adam'\n    self.mean_labels = mean_labels\n    self.std_labels = std_labels\n    #self.train_mean_vec = train_mean_vec\n\n  def _build_graph(self):\n    \"\"\"Build a whole graph for the model.\"\"\"\n    self.global_step = tf.Variable(0, name='global_step', trainable=False)\n    self._build_model()\n    \n    if self.mode == 'train':\n      self._build_train_op()\n    \n    #self.summaries = tf.merge_all_summaries()\n\n  def _stride_arr(self, stride):\n    \"\"\"Map a stride scalar to the stride array for tf.nn.conv2d.\"\"\"\n    return [1, stride, stride, 1]\n\n  def _build_model(self):\n    \"\"\"Build the core model within the graph.\"\"\"\n    #with tf.variable_scope('init'):\n     # x = self._images\n     # print x, x.get_shape()\n     # x = self._conv('init_conv', x, 3, 3, 16, self._stride_arr(1))\n     # print x, x.get_shape()\n    with tf.variable_scope('Spatial_Transformer'):\n      x = self._images\n      x = tf.image.resize_bilinear(x, tf.constant([227,227], dtype=tf.int32)) # the image should be 227 x 227 x 3\n      print x.get_shape()\n      self.resized_img = x\n      theta = self._ST('ST2', x, 3, (16,16), 3, 16, self._stride_arr(1))\n      #print \"*** \", x.get_shape()\n   \n\n    #with tf.variable_scope('logit'):\n    #  logits = self._fully_connected(theta, self.hps.num_classes)\n    #  self.predictions = tf.nn.softmax(logits)\n      #print \"*** \", logits, self.predictions\n\n    with tf.variable_scope('costs'):\n      self.predictions = theta\n      
self.preds_unNormalized = theta * (self.std_labels + 0.000000000000000001) + self.mean_labels\n      pred_dim1 = theta.get_shape()[0]\n      pred_dim2 = theta.get_shape()[1]\n\n      del theta\n      #diff = self.predictions - self.labels\n      #print diff\n      \n      #xent = tf.mul(diff, diff) #tf.nn.l2_loss(diff)\n      #print xent\n      #xent = tf.reduce_sum(xent, 1)\n      pow_res = tf.pow(self.predictions-self.labels, 2)\n      \"\"\"\n      print pow_res, pow_res.get_shape()\n      const1 = tf.constant(1.0,shape=[pred_dim1, 3],dtype=tf.float32)\n      const2 = tf.constant(1.0,shape=[pred_dim1, 3],dtype=tf.float32)\n      #print const1, const2, const1.get_shape(), const2.get_shape()\n      const = tf.concat(1,[const1, const2])\n      print const, const.get_shape()\n      cpow_res = tf.mul(const,pow_res) \n      xent = tf.reduce_sum(cpow_res,1)\n      print xent\n      \"\"\"\n      xent = tf.reduce_sum(pow_res,1)\n      self.cost = tf.reduce_mean(xent, name='xent')\n      #print self.cost\n      \n      #self.cost = tf.nn.l2_loss(diff)\n      #  Add weight decay of needed\n      if self.ifadd_weight_decay == 1:\n        self.cost += self._decay()\n      \n\n      #self.train_step = tf.train.GradientDescentOptimizer(self.hps.lrn_rate).minimize(self.cost)\n\n      #tf.scalar_summary('cost', self.cost)\n\n\n\n  def conv(self, input, kernel, biases, k_h, k_w, c_o, s_h, s_w,  padding=\"VALID\", group=1):\n    '''From https://github.com/ethereon/caffe-tensorflow\n    '''\n    c_i = input.get_shape()[-1]\n    assert c_i%group==0\n    assert c_o%group==0\n    convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)\n    \n    \n    if group==1:\n        conv = convolve(input, kernel)\n    else:\n        #input_groups = tf.split(3, group, input)\n        #kernel_groups = tf.split(3, group, kernel)\n        input_groups = tf.split(input, group, 3)\n        kernel_groups = tf.split(kernel, group, 3)\n        output_groups = [convolve(i, k) for 
i,k in zip(input_groups, kernel_groups)]\n        #conv = tf.concat(3, output_groups)\n        conv = tf.concat(output_groups, 3)\n    return  tf.reshape(tf.nn.bias_add(conv, biases), [-1]+conv.get_shape().as_list()[1:])\n\n\n\n\n\n  def _ST(self, name, x, channel_x, out_size, filter_size, out_filters, strides):\n    \"\"\" Spatial Transformer. \"\"\"\n\n    with tf.variable_scope(name):\n\n      # zero-mean input [B,G,R]: [93.5940, 104.7624, 129.1863] --> provided by vgg-face\n      \"\"\"\n      with tf.name_scope('preprocess') as scope:\n        mean = tf.constant(tf.reshape(self.train_mean_vec*255.0, [3]), dtype=tf.float32, shape=[1, 1, 1, 3], name='img_mean')\n        x = x - mean\n      \"\"\"\n\n      # conv1\n      with tf.name_scope('conv1') as scope:\n        #conv(11, 11, 96, 4, 4, padding='VALID', name='conv1')\n        k_h = 11; k_w = 11; c_o = 96; s_h = 4; s_w = 4\n        conv1W = tf.Variable(self.net_data[\"conv1\"][\"weights\"], trainable=True, name='W')\n        conv1b = tf.Variable(self.net_data[\"conv1\"][\"biases\"], trainable=True, name='baises')\n        conv1_in = self.conv(x, conv1W, conv1b, k_h, k_w, c_o, s_h, s_w, padding=\"SAME\", group=1)\n        conv1 = tf.nn.relu(conv1_in, name='conv1')\n        print x.get_shape(), conv1.get_shape()\n        \n\n        #maxpool1\n        #max_pool(3, 3, 2, 2, padding='VALID', name='pool1')\n        k_h = 3; k_w = 3; s_h = 2; s_w = 2; padding = 'VALID'\n        maxpool1 = tf.nn.max_pool(conv1, ksize=[1, k_h, k_w, 1], strides=[1, s_h, s_w, 1], padding=padding, name='pool1')\n        print maxpool1.get_shape()\n        \n\n        #lrn1                                                                                                                   \n        #lrn(2, 2e-05, 0.75, name='norm1')                                                                                      \n        radius = 2; alpha = 2e-05; beta = 0.75; bias = 1.0\n        lrn1 = tf.nn.local_response_normalization(maxpool1,\n     
                                             depth_radius=radius,\n                                                  alpha=alpha,\n                                                  beta=beta,\n                                                  bias=bias, name='norm1')\n\n      # conv2\n      with tf.name_scope('conv2') as scope:\n        #conv(5, 5, 256, 1, 1, group=2, name='conv2')\n        k_h = 5; k_w = 5; c_o = 256; s_h = 1; s_w = 1; group = 2\n        conv2W = tf.Variable(self.net_data[\"conv2\"][\"weights\"], trainable=True, name='W')\n        conv2b = tf.Variable(self.net_data[\"conv2\"][\"biases\"], trainable=True, name='biases')\n        conv2_in = self.conv(lrn1, conv2W, conv2b, k_h, k_w, c_o, s_h, s_w, padding=\"SAME\", group=group)\n        conv2 = tf.nn.relu(conv2_in, name='conv2')\n        print conv2.get_shape()\n\n        #maxpool2\n        #max_pool(3, 3, 2, 2, padding='VALID', name='pool2')\n        k_h = 3; k_w = 3; s_h = 2; s_w = 2; padding = 'VALID'\n        maxpool2 = tf.nn.max_pool(conv2, ksize=[1, k_h, k_w, 1], strides=[1, s_h, s_w, 1], padding=padding, name='pool2')\n        print maxpool2.get_shape()\n\n        #lrn2\n        #lrn(2, 2e-05, 0.75, name='norm2')\n        radius = 2; alpha = 2e-05; beta = 0.75; bias = 1.0\n        lrn2 = tf.nn.local_response_normalization(maxpool2,\n                                                  depth_radius=radius,\n                                                  alpha=alpha,\n                                                  beta=beta,\n                                                  bias=bias, name='norm2')\n\n      # conv3\n      with tf.name_scope('conv3') as scope:\n        #conv(3, 3, 384, 1, 1, name='conv3')\n        k_h = 3; k_w = 3; c_o = 384; s_h = 1; s_w = 1; group = 1\n        conv3W = tf.Variable(self.net_data[\"conv3\"][\"weights\"], trainable=True, name='W')\n        conv3b = tf.Variable(self.net_data[\"conv3\"][\"biases\"], trainable=True, name='biases')\n        conv3_in = self.conv(lrn2, conv3W, conv3b, k_h, k_w, c_o, s_h, s_w, padding=\"SAME\", group=group)\n        conv3 = tf.nn.relu(conv3_in, name='conv3')\n        print conv3.get_shape()\n\n      # conv4\n      with tf.name_scope('conv4') as scope:\n        #conv(3, 3, 384, 1, 1, group=2, name='conv4')\n        k_h = 3; k_w = 3; c_o = 384; s_h = 1; s_w = 1; group = 2\n        conv4W = tf.Variable(self.net_data[\"conv4\"][\"weights\"], trainable=True, name='W')\n        conv4b = tf.Variable(self.net_data[\"conv4\"][\"biases\"], trainable=True, name='biases')\n        conv4_in = self.conv(conv3, conv4W, conv4b, k_h, k_w, c_o, s_h, s_w, padding=\"SAME\", group=group)\n        conv4 = tf.nn.relu(conv4_in, name='conv4')\n        print conv4.get_shape()\n\n      # conv5\n      with tf.name_scope('conv5') as scope:\n        #conv(3, 3, 256, 1, 1, group=2, name='conv5')\n        k_h = 3; k_w = 3; c_o = 256; s_h = 1; s_w = 1; group = 2\n        conv5W = tf.Variable(self.net_data[\"conv5\"][\"weights\"], trainable=True, name='W')\n        conv5b = tf.Variable(self.net_data[\"conv5\"][\"biases\"], trainable=True, name='biases')\n        self.conv5b = conv5b\n        conv5_in = self.conv(conv4, conv5W, conv5b, k_h, k_w, c_o, s_h, s_w, padding=\"SAME\", group=group)\n        conv5 = tf.nn.relu(conv5_in, name='conv5')\n        print conv5.get_shape()\n\n        #maxpool5\n        #max_pool(3, 3, 2, 2, padding='VALID', name='pool5')\n        k_h = 3; k_w = 3; s_h = 2; s_w = 2; padding = 'VALID'\n        maxpool5 = tf.nn.max_pool(conv5, ksize=[1, k_h, k_w, 1], strides=[1, s_h, s_w, 1], padding=padding, name='pool5')\n        print maxpool5.get_shape(), maxpool5.get_shape()[1:], int(np.prod(maxpool5.get_shape()[1:]))\n\n      # fc6\n      with tf.variable_scope('fc6') as scope:\n        #fc(4096, name='fc6')\n        fc6W = tf.Variable(self.net_data[\"fc6\"][\"weights\"], trainable=True, name='W')\n        fc6b = tf.Variable(self.net_data[\"fc6\"][\"biases\"], trainable=True, name='biases')\n        self.fc6W = fc6W\n        self.fc6b = fc6b\n        fc6 = tf.nn.relu_layer(tf.reshape(maxpool5, [-1, int(np.prod(maxpool5.get_shape()[1:]))]), fc6W, fc6b, name='fc6')\n        print fc6.get_shape()\n        if self.ifdropout == 1:\n          fc6 = tf.nn.dropout(fc6, self.keep_rate_fc6, name='fc6_dropout')\n\n      # fc7\n      with tf.variable_scope('fc7') as scope:\n        #fc(4096, name='fc7')\n        fc7W = tf.Variable(self.net_data[\"fc7\"][\"weights\"], trainable=True, name='W')\n        fc7b = tf.Variable(self.net_data[\"fc7\"][\"biases\"], trainable=True, name='biases')\n        self.fc7b = fc7b\n        fc7 = tf.nn.relu_layer(fc6, fc7W, fc7b, name='fc7')\n        print fc7.get_shape()\n        if self.ifdropout == 1:\n          fc7 = tf.nn.dropout(fc7, self.keep_rate_fc7, name='fc7_dropout')\n\n      # fc8\n      with tf.variable_scope('fc8') as scope:\n        \"\"\"\n        #fc(6, relu=False, name='fc8')\n        fc8W = tf.Variable(net_data[\"fc8\"][0])\n        fc8b = tf.Variable(net_data[\"fc8\"][1])\n        fc8 = tf.nn.xw_plus_b(fc7, fc8W, fc8b)\n        \"\"\"\n\n        # Move everything into depth so we can perform a single matrix multiplication.\n        fc7 = tf.reshape(fc7, [self.batch_size, -1])\n        dim = fc7.get_shape()[1].value\n        fc8W = tf.Variable(tf.random_normal([dim, 6], mean=0.0, stddev=0.01), trainable=True, name='W')\n        fc8b = tf.Variable(tf.zeros([6]), trainable=True, name='biases')\n        self.fc8b = fc8b\n        theta = tf.nn.xw_plus_b(fc7, fc8W, fc8b)\n\n        \"\"\"\n        weights = self._variable_with_weight_decay('weights', shape=[dim, 6],\n                                          stddev=0.04, wd=None) #wd=0.004)\n        biases = self._variable_on_cpu('biases', [6], tf.constant_initializer(0.1))\n        theta = tf.matmul(reshape, weights) + biases\n\n        print theta.get_shape()\n        \"\"\"\n\n        self.theta = theta\n        self.fc8W = fc8W\n        self.fc8b = fc8b\n        # %% We'll create a spatial transformer module to identify discriminative\n        # %% patches\n        #h_trans = self._transform(theta, x, out_size, channel_x)\n        #print h_trans.get_shape()\n      return theta\n\n  def _variable_with_weight_decay(self, name, shape, stddev, wd):\n    \"\"\"Helper to create an initialized Variable with weight decay.\n    Note that the Variable is initialized with a truncated normal distribution.\n    A weight decay is added only if one is specified.
\n    Args:\n      name: name of the variable\n      shape: list of ints\n      stddev: standard deviation of a truncated Gaussian\n      wd: add L2Loss weight decay multiplied by this float. If None, weight\n        decay is not added for this Variable.\n    Returns:\n      Variable Tensor\n    \"\"\"\n    dtype = tf.float32 #if FLAGS.use_fp16 else tf.float32\n    var = self._variable_on_cpu(\n      name,\n      shape,\n      tf.truncated_normal_initializer(stddev=stddev, dtype=dtype))\n    if wd is not None:\n      weight_decay = tf.mul(tf.nn.l2_loss(var), wd, name='weight_loss')\n      tf.add_to_collection('losses', weight_decay)\n    return var\n\n  def _variable_on_cpu(self, name, shape, initializer):\n    \"\"\"Helper to create a Variable stored on CPU memory.\n    Args:\n      name: name of the variable\n      shape: list of ints\n      initializer: initializer for Variable\n    Returns:\n      Variable Tensor\n    \"\"\"\n    with tf.device('/cpu:0'):\n      dtype = tf.float32 # if FLAGS.use_fp16 else tf.float32\n      var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype)\n    return var\n\n  def _build_train_op(self):\n    \"\"\"Build training specific ops for the graph.\"\"\"\n    #self.lrn_rate = tf.constant(self.hps.lrn_rate, tf.float32)\n    #tf.scalar_summary('learning rate', self.lrn_rate)\n    \"\"\"\n    trainable_variables = tf.trainable_variables()\n    grads = tf.gradients(self.cost, trainable_variables)\n    \"\"\"\n    if self.optimizer == 'sgd':\n      optimizer =
tf.train.GradientDescentOptimizer(self.lrn_rate)\n    elif self.optimizer == 'Adam':\n      optimizer = tf.train.AdamOptimizer(0.001 * self.lr_rate_fac)\n    elif self.optimizer == 'mom':\n      optimizer = tf.train.MomentumOptimizer(self.lrn_rate, 0.9)\n    \n    \"\"\"\n    apply_op = optimizer.apply_gradients(\n        zip(grads, trainable_variables),\n        global_step=self.global_step, name='train_step')\n\n    train_ops = [apply_op] + self._extra_train_ops\n    self.train_op = tf.group(*train_ops)\n    \"\"\"\n\n    self.train_op = optimizer.minimize(self.cost)\n\n  # TODO(xpan): Consider batch_norm in contrib/layers/python/layers/layers.py\n  def _batch_norm(self, name, x):\n    \"\"\"Batch normalization.\"\"\"\n    with tf.variable_scope(name):\n      params_shape = [x.get_shape()[-1]]\n      #print x.get_shape(), params_shape\n      beta = tf.get_variable(\n          'beta', params_shape, tf.float32,\n          initializer=tf.constant_initializer(0.0, tf.float32))\n      gamma = tf.get_variable(\n          'gamma', params_shape, tf.float32,\n          initializer=tf.constant_initializer(1.0, tf.float32))\n\n      if self.mode == 'train':\n        mean, variance = tf.nn.moments(x, [0, 1, 2], name='moments')\n\n        moving_mean = tf.get_variable(\n            'moving_mean', params_shape, tf.float32,\n            initializer=tf.constant_initializer(0.0, tf.float32),\n            trainable=False)\n        moving_variance = tf.get_variable(\n            'moving_variance', params_shape, tf.float32,\n            initializer=tf.constant_initializer(1.0, tf.float32),\n            trainable=False)\n\n        self._extra_train_ops.append(moving_averages.assign_moving_average(\n            moving_mean, mean, 0.9))\n        self._extra_train_ops.append(moving_averages.assign_moving_average(\n            moving_variance, variance, 0.9))\n      else:\n        mean = tf.get_variable(\n            'moving_mean', params_shape, tf.float32,\n            
initializer=tf.constant_initializer(0.0, tf.float32),\n            trainable=False)\n        variance = tf.get_variable(\n            'moving_variance', params_shape, tf.float32,\n            initializer=tf.constant_initializer(1.0, tf.float32),\n            trainable=False)\n        tf.histogram_summary(mean.op.name, mean)\n        tf.histogram_summary(variance.op.name, variance)\n      # epsilon used to be 1e-5. Maybe 0.001 solves NaN problem in deeper net.\n      y = tf.nn.batch_normalization(\n          x, mean, variance, beta, gamma, 0.001)\n      y.set_shape(x.get_shape())\n      return y\n\n  def _residual(self, x, in_filter, out_filter, stride,\n                activate_before_residual=False):\n    \"\"\"Residual unit with 2 sub layers.\"\"\"\n    if activate_before_residual:\n      with tf.variable_scope('shared_activation'):\n        x = self._batch_norm('init_bn', x)\n        x = self._relu(x, self.hps.relu_leakiness)\n        orig_x = x\n    else:\n      with tf.variable_scope('residual_only_activation'):\n        orig_x = x\n        x = self._batch_norm('init_bn', x)\n        x = self._relu(x, self.hps.relu_leakiness)\n\n    with tf.variable_scope('sub1'):\n      x = self._conv('conv1', x, 3, in_filter, out_filter, stride)\n\n    with tf.variable_scope('sub2'):\n      x = self._batch_norm('bn2', x)\n      x = self._relu(x, self.hps.relu_leakiness)\n      x = self._conv('conv2', x, 3, out_filter, out_filter, [1, 1, 1, 1])\n\n    with tf.variable_scope('sub_add'):\n      if in_filter != out_filter:\n        orig_x = tf.nn.avg_pool(orig_x, stride, stride, 'VALID')\n        orig_x = tf.pad(\n            orig_x, [[0, 0], [0, 0], [0, 0],\n                     [(out_filter-in_filter)//2, (out_filter-in_filter)//2]])\n      x += orig_x\n\n    tf.logging.info('image after unit %s', x.get_shape())\n    return x\n\n  def _bottleneck_residual(self, x, in_filter, out_filter, stride,\n                           activate_before_residual=False):\n    \"\"\"Bottleneck
residual unit with 3 sub layers.\"\"\"\n    if activate_before_residual:\n      with tf.variable_scope('common_bn_relu'):\n        x = self._batch_norm('init_bn', x)\n        x = self._relu(x, self.hps.relu_leakiness)\n        orig_x = x\n    else:\n      with tf.variable_scope('residual_bn_relu'):\n        orig_x = x\n        x = self._batch_norm('init_bn', x)\n        x = self._relu(x, self.hps.relu_leakiness)\n\n    with tf.variable_scope('sub1'):\n      x = self._conv('conv1', x, 1, in_filter, out_filter/4, stride)\n\n    with tf.variable_scope('sub2'):\n      x = self._batch_norm('bn2', x)\n      x = self._relu(x, self.hps.relu_leakiness)\n      x = self._conv('conv2', x, 3, out_filter/4, out_filter/4, [1, 1, 1, 1])\n\n    with tf.variable_scope('sub3'):\n      x = self._batch_norm('bn3', x)\n      x = self._relu(x, self.hps.relu_leakiness)\n      x = self._conv('conv3', x, 1, out_filter/4, out_filter, [1, 1, 1, 1])\n\n    with tf.variable_scope('sub_add'):\n      if in_filter != out_filter:\n        orig_x = self._conv('project', orig_x, 1, in_filter, out_filter, stride)\n      x += orig_x\n\n    tf.logging.info('image after unit %s', x.get_shape())\n    return x\n\n  def _decay(self):\n    \"\"\"L2 weight decay loss.\"\"\"\n    costs = []\n    for var in tf.trainable_variables():\n      if var.op.name.find(r'DW') > 0:\n        costs.append(tf.nn.l2_loss(var))\n        # tf.histogram_summary(var.op.name, var)\n\n    return tf.mul(self.hps.weight_decay_rate, tf.add_n(costs))\n\n  def _conv(self, name, x, filter_size, in_filters, out_filters, strides):\n    \"\"\"Convolution.\"\"\"\n    with tf.variable_scope(name):\n      n = filter_size * filter_size * out_filters\n      kernel = tf.get_variable(\n          'DW', [filter_size, filter_size, in_filters, out_filters],\n          tf.float32, initializer=tf.random_normal_initializer(\n              stddev=np.sqrt(2.0/n)))\n      return tf.nn.conv2d(x, kernel,
strides, padding='SAME')\n\n  def _relu(self, x, leakiness=0.0):\n    \"\"\"Relu, with optional leaky support.\"\"\"\n    return tf.select(tf.less(x, 0.0), leakiness * x, x, name='leaky_relu')\n\n  def _fully_connected(self, x, out_dim):\n    \"\"\"FullyConnected layer for final output.\"\"\"\n    x = tf.reshape(x, [self.hps.batch_size, -1])\n    #print \"*** \", x.get_shape()\n    w = tf.get_variable(\n        'DW', [x.get_shape()[1], out_dim],\n        initializer=tf.uniform_unit_scaling_initializer(factor=1.0))\n    #print \"*** \", w.get_shape()\n    b = tf.get_variable('biases', [out_dim],\n                        initializer=tf.constant_initializer())\n    #print \"*** \", b.get_shape()\n    aaa = tf.nn.xw_plus_b(x, w, b)\n    #print \"*** \", aaa.get_shape()\n    return tf.nn.xw_plus_b(x, w, b)\n\n \n  def _fully_connected_ST(self, x, out_dim):\n    \"\"\"FullyConnected layer for final output of the localization network in the spatial transformer\"\"\"\n    x = tf.reshape(x, [self.hps.batch_size, -1])\n    w = tf.get_variable(\n        'DW2', [x.get_shape()[1], out_dim],\n        initializer=tf.uniform_unit_scaling_initializer(factor=1.0))\n    initial = np.array([[1., 0, 0], [0, 1., 0]])\n    initial = initial.astype('float32')\n    initial = initial.flatten()\n    b = tf.get_variable('biases2', [out_dim],\n                        initializer=tf.constant_initializer(initial))\n    return tf.nn.xw_plus_b(x, w, b)\n\n   \n\n  def _global_avg_pool(self, x):\n    assert x.get_shape().ndims == 4\n    return tf.reduce_mean(x, [1, 2])\n\n\n  def _repeat(self, x, n_repeats):\n    with tf.variable_scope('_repeat'):\n      rep = tf.transpose(\n        tf.expand_dims(tf.ones(shape=tf.pack([n_repeats, ])), 1), [1, 0])\n      rep = tf.cast(rep, 'int32')\n      x = tf.matmul(tf.reshape(x, (-1, 1)), rep)\n      return tf.reshape(x, [-1])\n\n\n  def _interpolate(self, im, x, y, out_size, channel_x):\n    with tf.variable_scope('_interpolate2'):\n      # constants\n      
num_batch = self.hps.batch_size #tf.shape(im)[0]\n      print num_batch\n      height = tf.shape(im)[1]\n      width = tf.shape(im)[2]\n      channels = tf.shape(im)[3]\n      print channels\n      #channels = tf.cast(channels, tf.int32)\n      #print channels\n      x = tf.cast(x, 'float32')\n      y = tf.cast(y, 'float32')\n      height_f = tf.cast(height, 'float32')\n      width_f = tf.cast(width, 'float32')\n      out_height = out_size[0]\n      out_width = out_size[1]\n      zero = tf.zeros([], dtype='int32')\n      #max_y = tf.cast(tf.shape(im)[1] - 1, 'int32')\n      #max_x = tf.cast(tf.shape(im)[2] - 1, 'int32')\n      \n      max_y = tf.cast(height - 1, 'int32')\n      max_x = tf.cast(width - 1, 'int32')\n      # scale indices from [-1, 1] to [0, width/height]\n      x = (x + 1.0)*(width_f) / 2.0\n      y = (y + 1.0)*(height_f) / 2.0\n\n      # do sampling\n      x0 = tf.cast(tf.floor(x), 'int32')\n      x1 = x0 + 1\n      y0 = tf.cast(tf.floor(y), 'int32')\n      y1 = y0 + 1\n\n      x0 = tf.clip_by_value(x0, zero, max_x)\n      x1 = tf.clip_by_value(x1, zero, max_x)\n      y0 = tf.clip_by_value(y0, zero, max_y)\n      y1 = tf.clip_by_value(y1, zero, max_y)\n      dim2 = width\n      dim1 = width*height\n      base = self._repeat(tf.range(num_batch)*dim1, out_height*out_width)\n      base_y0 = base + y0*dim2\n      base_y1 = base + y1*dim2\n      idx_a = base_y0 + x0\n      idx_b = base_y1 + x0\n      idx_c = base_y0 + x1\n      idx_d = base_y1 + x1\n\n      # use indices to lookup pixels in the flat image and restore\n      # channels dim\n      im_flat = tf.reshape(im, tf.pack([-1, channel_x]))\n      #aa = tf.pack([-1, channels])\n      #im_flat = tf.reshape(im, [-1, channels])\n      #print im.get_shape(), im_flat.get_shape() #, aa.get_shape()\n      im_flat = tf.cast(im_flat, 'float32')\n      Ia = tf.gather(im_flat, idx_a)\n      Ib = tf.gather(im_flat, idx_b)\n      Ic = tf.gather(im_flat, idx_c)\n      Id = tf.gather(im_flat, idx_d)\n      #print 
im_flat.get_shape(), idx_a.get_shape()\n      #print Ia.get_shape(), Ib.get_shape(), Ic.get_shape(), Id.get_shape()\n      # and finally calculate interpolated values\n      x0_f = tf.cast(x0, 'float32')\n      x1_f = tf.cast(x1, 'float32')\n      y0_f = tf.cast(y0, 'float32')\n      y1_f = tf.cast(y1, 'float32')\n      wa = tf.expand_dims(((x1_f-x) * (y1_f-y)), 1)\n      wb = tf.expand_dims(((x1_f-x) * (y-y0_f)), 1)\n      wc = tf.expand_dims(((x-x0_f) * (y1_f-y)), 1)\n      wd = tf.expand_dims(((x-x0_f) * (y-y0_f)), 1)\n      #print wa.get_shape(), wb.get_shape(), wc.get_shape(), wd.get_shape()\n      output = tf.add_n([wa*Ia, wb*Ib, wc*Ic, wd*Id])\n      #print output.get_shape()\n      return output\n\n  def _meshgrid(self, height, width):\n    with tf.variable_scope('_meshgrid'):\n      # This should be equivalent to:\n      #  x_t, y_t = np.meshgrid(np.linspace(-1, 1, width),\n      #                         np.linspace(-1, 1, height))\n      #  ones = np.ones(np.prod(x_t.shape))\n      #  grid = np.vstack([x_t.flatten(), y_t.flatten(), ones])\n      x_t = tf.matmul(tf.ones(shape=tf.pack([height, 1])),\n                        tf.transpose(tf.expand_dims(tf.linspace(-1.0, 1.0, width), 1), [1, 0]))\n      y_t = tf.matmul(tf.expand_dims(tf.linspace(-1.0, 1.0, height), 1),\n                        tf.ones(shape=tf.pack([1, width])))\n\n      x_t_flat = tf.reshape(x_t, (1, -1))\n      y_t_flat = tf.reshape(y_t, (1, -1))\n\n      ones = tf.ones_like(x_t_flat)\n      grid = tf.concat(0, [x_t_flat, y_t_flat, ones])\n      return grid\n\n  def _transform(self, theta, input_dim, out_size, channel_input):\n    with tf.variable_scope('_transform'):\n      print input_dim.get_shape(), theta.get_shape(), out_size[0], out_size[1]\n      num_batch = self.hps.batch_size #tf.shape(input_dim)[0]\n      height = tf.shape(input_dim)[1]\n      width = tf.shape(input_dim)[2]\n      num_channels = tf.shape(input_dim)[3]\n      theta = tf.reshape(theta, (-1, 2, 3))\n      theta = 
tf.cast(theta, 'float32')\n      \n      # grid of (x_t, y_t, 1), eq (1) in ref [1]\n      height_f = tf.cast(height, 'float32')\n      width_f = tf.cast(width, 'float32')\n      out_height = out_size[0]\n      out_width = out_size[1]\n      grid = self._meshgrid(out_height, out_width)\n      #print grid, grid.get_shape()\n      grid = tf.expand_dims(grid, 0)\n      grid = tf.reshape(grid, [-1])\n      grid = tf.tile(grid, tf.pack([num_batch]))\n      grid = tf.reshape(grid, tf.pack([num_batch, 3, -1]))\n      #print grid, grid.get_shape()\n\n      # Transform A x (x_t, y_t, 1)^T -> (x_s, y_s)\n      T_g = tf.batch_matmul(theta, grid)\n      x_s = tf.slice(T_g, [0, 0, 0], [-1, 1, -1])\n      y_s = tf.slice(T_g, [0, 1, 0], [-1, 1, -1])\n      x_s_flat = tf.reshape(x_s, [-1])\n      y_s_flat = tf.reshape(y_s, [-1])\n      #print x_s_flat.get_shape(), y_s_flat.get_shape()\n      input_transformed = self._interpolate(input_dim, x_s_flat, y_s_flat, out_size, channel_input)\n      #print input_transformed.get_shape()\n\n      output = tf.reshape(input_transformed, tf.pack([num_batch, out_height, out_width, channel_input]))\n      return output\n      #return input_dim\n"
  },
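The `_meshgrid`, `_interpolate`, and `_transform` helpers in the file above implement spatial-transformer bilinear sampling: the affine `theta` maps a normalized output grid into the input image, and each output pixel is an area-weighted blend of its four neighboring input pixels. A minimal NumPy sketch of the same sampling arithmetic (standalone and for illustration only; the function name is ours, not part of the repo):

```python
import numpy as np

def bilinear_sample(im, x_t, y_t):
    """Sample im at normalized coordinates x_t, y_t in [-1, 1].

    Mirrors the arithmetic of _interpolate: scale to pixel coordinates,
    take the four surrounding pixels, blend them with area weights.
    """
    h, w = im.shape[:2]
    # scale indices from [-1, 1] to [0, width/height]
    x = (x_t + 1.0) * w / 2.0
    y = (y_t + 1.0) * h / 2.0
    x0 = np.clip(np.floor(x).astype(np.int64), 0, w - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    y0 = np.clip(np.floor(y).astype(np.int64), 0, h - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    # area weights of the four corners (wa..wd match Ia..Id above)
    wa = (x1 - x) * (y1 - y)
    wb = (x1 - x) * (y - y0)
    wc = (x - x0) * (y1 - y)
    wd = (x - x0) * (y - y0)
    return (wa * im[y0, x0] + wb * im[y1, x0] +
            wc * im[y0, x1] + wd * im[y1, x1])
```

As in the TF version, coordinates are clipped to the image, so samples at the extreme edge of the normalized range pick up boundary artifacts rather than raising errors.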
  {
    "path": "pose_utils.py",
    "content": "import sys\nimport os\n#sys.path.append('+glaive_pylib+')\n#import JanusUtils\nimport numpy as np\nimport cv2\nimport math\nimport fileinput\nimport shutil\n\ndef increaseBbox(bbox, factor):\n    tlx = bbox[0] \n    tly = bbox[1] \n    brx = bbox[2] \n    bry = bbox[3] \n    dx = factor\n    dy = factor\n    dw = 1 + factor\n    dh = 1 + factor\n    #Getting bbox height and width\n    w = brx-tlx;\n    h = bry-tly;\n    tlx2 = tlx - w * dx\n    tly2 = tly - h * dy\n    brx2 = tlx + w * dw\n    bry2 = tly + h * dh\n    nbbox = np.zeros( (4,1), dtype=np.float32 )\n    nbbox[0] = tlx2\n    nbbox[1] = tly2\n    nbbox[2] = brx2\n    nbbox[3] = bry2 \n    return nbbox\n\ndef image_bbox_processing_v2(img, bbox):\n    img_h, img_w, img_c = img.shape\n    lt_x = bbox[0]\n    lt_y = bbox[1]\n    rb_x = bbox[2]\n    rb_y = bbox[3]\n\n    fillings = np.zeros( (4,1), dtype=np.int32)\n    if lt_x < 0: ## 0 for python\n        fillings[0] = math.ceil(-lt_x)\n    if lt_y < 0:\n        fillings[1] = math.ceil(-lt_y)\n    if rb_x > img_w-1:\n        fillings[2] = math.ceil(rb_x - img_w + 1)\n    if rb_y > img_h-1:\n        fillings[3] = math.ceil(rb_y - img_h + 1)\n    new_bbox = np.zeros( (4,1), dtype=np.float32 )\n    # img = [zeros(size(img,1),fillings(1),img_c), img]\n    # img = [zeros(fillings(2), size(img,2),img_c); img]\n    # img = [img, zeros(size(img,1), fillings(3),img_c)]\n\n    # new_img = [img; zeros(fillings(4), size(img,2),img_c)]\n    imgc = img.copy()\n    if fillings[0] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [np.zeros( (img_h, fillings[0][0], img_c), dtype=np.uint8 ), imgc] )    \n    if fillings[1] > 0:\n\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [np.zeros( (fillings[1][0], img_w, img_c), dtype=np.uint8 ), imgc] )\n    if fillings[2] > 0:\n\n\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [ imgc, np.zeros( (img_h, fillings[2][0], img_c), dtype=np.uint8 ) ] )    \n 
   if fillings[3] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [ imgc, np.zeros( (fillings[3][0], img_w, img_c), dtype=np.uint8) ] )\n\n\n    new_bbox[0] = lt_x + fillings[0]\n    new_bbox[1] = lt_y + fillings[1]\n    new_bbox[2] = rb_x + fillings[0]\n    new_bbox[3] = rb_y + fillings[1]\n    return imgc, new_bbox\n\ndef preProcessImage(_savingDir, data_dict, data_root, factor, _alexNetSize, _listFile):\n    #### Formatting the images as needed\n    file_output = _listFile\n    count = 1\n    fileIn = open(file_output , 'w' )\n    for key in  data_dict.keys():\n        filename = data_dict[key]['file']\n        im = cv2.imread(data_root +  filename)\n        if im is not None:\n            print 'Processing ' + filename + ' '+ str(count)\n            sys.stdout.flush()\n            lt_x = data_dict[key]['x']\n            lt_y = data_dict[key]['y']\n            rb_x = lt_x + data_dict[key]['width']\n            rb_y = lt_y + data_dict[key]['height']\n            w = data_dict[key]['width']\n            h = data_dict[key]['height']\n            center = ( (lt_x+rb_x)/2, (lt_y+rb_y)/2 )\n            side_length = max(w,h);\n            bbox = np.zeros( (4,1), dtype=np.float32 )\n            bbox[0] = center[0] - side_length/2\n            bbox[1] = center[1] - side_length/2\n            bbox[2] = center[0] + side_length/2\n            bbox[3] = center[1] + side_length/2\n            #img_2, bbox_green = image_bbox_processing_v2(im, bbox)\n            #%% Get the expanded square bbox\n            bbox_red = increaseBbox(bbox, factor)\n            #[img, bbox_red] = image_bbox_processing_v2(img, bbox_red);\n            img_3, bbox_new = image_bbox_processing_v2(im, bbox_red)\n            #%% Crop and resized\n            #bbox_red = ceil(bbox_red);\n            bbox_new =  np.ceil( bbox_new )\n            #side_length = max(bbox_new(3) - bbox_new(1), bbox_new(4) - bbox_new(2));\n            side_length = max( bbox_new[2] - bbox_new[0], 
bbox_new[3] - bbox_new[1] )\n            bbox_new[2:4] = bbox_new[0:2] + side_length\n            #crop_img = img(bbox_red(2):bbox_red(4), bbox_red(1):bbox_red(3), :);\n            #resized_crop_img = imresize(crop_img, [227, 227]);# % re-scaling to 227 x 227\n            bbox_new = bbox_new.astype(int)\n            crop_img = img_3[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :];\n            resized_crop_img = cv2.resize(crop_img, ( _alexNetSize, _alexNetSize ), interpolation = cv2.INTER_CUBIC)\n            cv2.imwrite(_savingDir + key + '.jpg', resized_crop_img )\n            \n            # flip image for latter use\n            img_flip = cv2.flip(resized_crop_img,1)\n            cv2.imwrite(_savingDir + key + '_flip.jpg', img_flip )\n\n            ## Tracking pose image\n            fileIn.write(key + ',')\n            fileIn.write(_savingDir + key + '.jpg\\n')\n            fileIn.write(key + '_flip,')\n            fileIn.write(_savingDir + key + '_flip.jpg\\n')\n        else:\n            print ' '.join(['Skipping image:', filename, 'Image is None', str(count)])\n        count+=1\n    fileIn.close()\n\ndef replaceInFile(filep, before, after):\n    for line in fileinput.input(filep, inplace=True):\n        print line.replace(before,after),\n\n"
  },
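`increaseBbox` in `pose_utils.py` grows a `[tlx, tly, brx, bry]` box outward by `factor` times its width/height on each side before the square crop is resized to the network input. A worked sketch of the same arithmetic (a standalone re-implementation with our own naming, shown only to make the geometry concrete):

```python
import numpy as np

def increase_bbox(bbox, factor):
    # Same arithmetic as increaseBbox: the top-left corner moves out by
    # factor * (w, h), and the bottom-right becomes top-left plus
    # (1 + factor) * (w, h), so the box grows by factor on every side.
    tlx, tly, brx, bry = bbox
    w = brx - tlx
    h = bry - tly
    return np.array([tlx - w * factor,
                     tly - h * factor,
                     tlx + w * (1 + factor),
                     tly + h * (1 + factor)], dtype=np.float32)
```

For example, a 40x40 box `[10, 20, 50, 60]` with `factor = 0.25` expands to `[0, 10, 60, 70]`, i.e. a 60x60 crop (each dimension grows by `1 + 2 * factor`).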
  {
    "path": "renderer_fpn.py",
    "content": "import csv\nimport lmdb\nimport sys\nimport numpy as np\nimport cv2\nimport os\nthis_path = os.path.dirname(os.path.abspath(__file__))\nrender_path = this_path+'/face_renderer/'\nsys.path.append(render_path)\ntry:\n    import myutil\nexcept ImportError as ie:\n    print '****************************************************************'\n    print '**** Have you forgotten to \"git clone --recursive\"?          ****'\n    print '**** You have to do that to also download the face renderer ****'\n    print '****************************************************************'\n    print ie.message\n    exit(0)\nimport config\nopts = config.parse()\nimport camera_calibration as calib\nimport ThreeD_Model\nimport renderer as renderer_core\nimport get_Rts as getRts\n#pose_models = ['model3D_aug_-00_00','model3D_aug_-22_00','model3D_aug_-40_00','model3D_aug_-55_00','model3D_aug_-75_00']\nnewModels = opts.getboolean('renderer', 'newRenderedViews')\nif opts.getboolean('renderer', 'newRenderedViews'):\n    pose_models_folder = '/models3d_new/'\n    pose_models = ['model3D_aug_-00_00','model3D_aug_-22_00','model3D_aug_-40_00','model3D_aug_-55_00','model3D_aug_-75_00']\nelse:\n    pose_models_folder = '/models3d/'\n    pose_models = ['model3D_aug_-00','model3D_aug_-40','model3D_aug_-75',]\nnSub = 10\nallModels = myutil.preload(render_path,pose_models_folder,pose_models,nSub)\n\ndef render_fpn(inputFile, output_pose_db, outputFolder):\n    ## Opening FPN pose db\n    pose_env = lmdb.open( output_pose_db, readonly=True )\n    pose_cnn_lmdb = pose_env.begin()\n    ## looping over images\n    with open(inputFile, 'r') as csvfile:\n        csvreader = csv.reader(csvfile, delimiter=',')\n        lines = csvfile.readlines()\n        for lin in lines:\n            ### key1, image_path_key_1\n            image_key = lin.split(',')[0]\n            if 'flip' in image_key:\n                continue\n\n            image_path = lin.split(',')[-1].rstrip('\\n')\n            img = 
cv2.imread(image_path, 1)\n            pose_Rt_raw = pose_cnn_lmdb.get( image_key )\n            pose_Rt_flip_raw = pose_cnn_lmdb.get(image_key + '_flip')\n\n            if pose_Rt_raw is not None:\n                pose_Rt = np.frombuffer( pose_Rt_raw, np.float32 )\n                pose_Rt_flip = np.frombuffer( pose_Rt_flip_raw, np.float32 )\n\n                yaw = myutil.decideSide_from_db(img, pose_Rt, allModels)\n\n                if yaw < 0: # Flip image and get the corresponding pose\n                    img = cv2.flip(img,1)\n                    pose_Rt = pose_Rt_flip\n\n                listPose = myutil.decidePose(yaw, opts, newModels)\n                ## Looping over the poses\n                for poseId in listPose:\n                    posee = pose_models[poseId]\n                    ## Looping over the subjects\n                    for subj in [10]:\n                        pose = posee + '_' + str(subj).zfill(2) +'.mat'\n                        print '> Looking at file: ' + image_path + ' with ' + pose\n                        # load detections performed by dlib library on 3D model and Reference Image\n                        print \"> Using pose model in \" + pose\n                        ## Indexing the right model instead of loading it each time from memory.\n                        model3D = allModels[pose]\n                        eyemask = model3D.eyemask\n                        # perform camera calibration according to the first face detected\n                        proj_matrix, camera_matrix, rmat, tvec = calib.estimate_camera(model3D, pose_Rt, pose_db_on=True)\n                        ## We use eyemask only for frontal\n                        if not myutil.isFrontal(pose):\n                            eyemask = None\n                        ##### Main part of the code: doing the rendering #############\n                        rendered_raw, rendered_sym, face_proj,
background_proj, temp_proj2_out_2, sym_weight = renderer_core.render(img, proj_matrix,\\\n                                                                                                 model3D.ref_U, eyemask, model3D.facemask, opts)\n                        ########################################################\n\n                        if myutil.isFrontal(pose):\n                            rendered_raw = rendered_sym\n                        ## Cropping if required by crop_models\n                        #rendered_raw = myutil.cropFunc(pose,rendered_raw,crop_models[poseId])\n                        ## Resizing if required\n                        #if resizeCNN:\n                        #    rendered_raw = cv2.resize(rendered_raw, ( cnnSize, cnnSize ), interpolation=cv2.INTER_CUBIC )\n                        ## Saving if required\n                        if opts.getboolean('general', 'saveON'):\n                            subjFolder = outputFolder + '/'+ image_key.split('_')[0]\n                            myutil.mymkdir(subjFolder)\n                            savingString = subjFolder +  '/' + image_key +'_rendered_'+ pose[8:-7]+'_'+str(subj).zfill(2)+'.jpg'\n                            cv2.imwrite(savingString,rendered_raw)\n"
  },
  {
    "path": "tf_utils.py",
    "content": "# Copyright 2016 The TensorFlow Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n# ==============================================================================\n\n# %% Borrowed utils from here: https://github.com/pkmital/tensorflow_tutorials/\n#import tensorflow as tf\nimport numpy as np\nimport csv\n\ndef conv2d(x, n_filters,\n           k_h=5, k_w=5,\n           stride_h=2, stride_w=2,\n           stddev=0.02,\n           activation=lambda x: x,\n           bias=True,\n           padding='SAME',\n           name=\"Conv2D\"):\n    \"\"\"2D Convolution with options for kernel size, stride, and init deviation.\n    Parameters\n    ----------\n    x : Tensor\n        Input tensor to convolve.\n    n_filters : int\n        Number of filters to apply.\n    k_h : int, optional\n        Kernel height.\n    k_w : int, optional\n        Kernel width.\n    stride_h : int, optional\n        Stride in rows.\n    stride_w : int, optional\n        Stride in cols.\n    stddev : float, optional\n        Initialization's standard deviation.\n    activation : arguments, optional\n        Function which applies a nonlinearity\n    padding : str, optional\n        'SAME' or 'VALID'\n    name : str, optional\n        Variable scope to use.\n    Returns\n    -------\n    x : Tensor\n        Convolved input.\n    \"\"\"\n    with tf.variable_scope(name):\n        w = tf.get_variable(\n            'w', [k_h, k_w, x.get_shape()[-1], 
n_filters],\n            initializer=tf.truncated_normal_initializer(stddev=stddev))\n        conv = tf.nn.conv2d(\n            x, w, strides=[1, stride_h, stride_w, 1], padding=padding)\n        if bias:\n            b = tf.get_variable(\n                'b', [n_filters],\n                initializer=tf.truncated_normal_initializer(stddev=stddev))\n            conv = conv + b\n        return activation(conv)\n\ndef linear(x, n_units, scope=None, stddev=0.02,\n           activation=lambda x: x):\n    \"\"\"Fully-connected layer.\n    Parameters\n    ----------\n    x : Tensor\n        Input tensor to the network.\n    n_units : int\n        Number of units to connect to.\n    scope : str, optional\n        Variable scope to use.\n    stddev : float, optional\n        Initialization's standard deviation.\n    activation : arguments, optional\n        Function which applies a nonlinearity\n    Returns\n    -------\n    x : Tensor\n        Fully-connected output.\n    \"\"\"\n    shape = x.get_shape().as_list()\n\n    with tf.variable_scope(scope or \"Linear\"):\n        matrix = tf.get_variable(\"Matrix\", [shape[1], n_units], tf.float32,\n                                 tf.random_normal_initializer(stddev=stddev))\n        return activation(tf.matmul(x, matrix))\n\n# %%\ndef weight_variable(shape):\n    '''Helper function to create a weight variable initialized with\n    a normal distribution\n    Parameters\n    ----------\n    shape : list\n        Size of weight variable\n    '''\n    #initial = tf.random_normal(shape, mean=0.0, stddev=0.01)\n    initial = tf.zeros(shape)\n    return tf.Variable(initial)\n\n# %%\ndef bias_variable(shape):\n    '''Helper function to create a bias variable initialized with\n    a constant value.\n    Parameters\n    ----------\n    shape : list\n        Size of bias variable\n    '''\n    initial = tf.random_normal(shape, mean=0.0, stddev=0.01)\n    return tf.Variable(initial)\n\n# %%\ndef dense_to_one_hot(labels, n_classes=2):\n    \"\"\"Convert class labels from scalars to one-hot vectors.\"\"\"\n    labels = np.array(labels).astype('int32')\n    n_labels = labels.shape[0]\n    index_offset = (np.arange(n_labels) * n_classes).astype('int32')\n    labels_one_hot = np.zeros((n_labels, n_classes), dtype=np.float32)\n    labels_one_hot.flat[index_offset + labels.ravel()] = 1\n    return labels_one_hot\n\n\ndef prepare_trainVal_img_list(img_list, num_subjs):\n    #num_imgs_per_subj = np.zeros([num_subjs])\n    id_label_list = []\n    for row in img_list:\n        id_label = int(row[8])\n        #num_imgs_per_subj[id_label] += 1\n        id_label_list.append(id_label)\n\n    id_label_list = np.asarray(id_label_list)\n    id_label_list = np.reshape(id_label_list, [-1])\n\n    train_indices_list = []\n    valid_indices_list = []\n    eval_train_indices_list = []\n    eval_valid_indices_list = []\n    for i in range(num_subjs):\n        print i\n        curr_subj_idx = np.nonzero(id_label_list == i)[0]\n        tmp = np.random.permutation(curr_subj_idx)\n        # slice indices must be integers, so cast the 80% split point\n        per80 = int(np.floor(len(curr_subj_idx) * 0.8))\n        t_inds = tmp[0:per80]\n        v_inds = tmp[per80:]\n\n        train_indices_list.append(t_inds)\n        valid_indices_list.append(v_inds)\n\n        eval_train_indices_list.append(t_inds[0])\n        eval_valid_indices_list.append(v_inds[0])\n\n    train_indices_list = np.asarray(train_indices_list)\n    valid_indices_list = np.asarray(valid_indices_list)\n    eval_train_indices_list = np.asarray(eval_train_indices_list)\n    eval_valid_indices_list = np.asarray(eval_valid_indices_list)\n    #print train_indices_list, train_indices_list.shape\n\n    train_indices_list = np.hstack(train_indices_list).astype('int')\n    valid_indices_list = np.hstack(valid_indices_list).astype('int')\n    eval_train_indices_list = np.hstack(eval_train_indices_list).astype('int')\n    eval_valid_indices_list = np.hstack(eval_valid_indices_list).astype('int')\n    print train_indices_list.shape, 
valid_indices_list.shape, eval_train_indices_list.shape, eval_valid_indices_list.shape\n    \n    img_list = np.asarray(img_list)\n    print img_list.shape\n    train_list = img_list[train_indices_list]\n    valid_list = img_list[valid_indices_list]\n    eval_train_list = img_list[eval_train_indices_list]\n    eval_valid_list = img_list[eval_valid_indices_list]\n\n    np.savez(\"Oxford_trainVal_data_3DSTN.npz\", train_list=train_list, valid_list=valid_list, eval_train_list=eval_train_list, eval_valid_list=eval_valid_list)\n\n        \ndef select_eval_img_list(img_list, num_subjs, save_file_name):\n    # number of validation subjects\n\n    id_label_list = []\n    for row in img_list:\n        id_label = int(row[8])\n        id_label_list.append(id_label)\n\n    id_label_list = np.asarray(id_label_list)\n    id_label_list = np.reshape(id_label_list, [-1])\n\n\n    eval_indices_list = []\n    for i in range(num_subjs):\n        print i\n        curr_subj_idx = np.nonzero(id_label_list == i)[0]\n        tmp = np.random.permutation(curr_subj_idx)\n        inds = tmp[0:min(5, len(curr_subj_idx))]\n\n        eval_indices_list.append(inds)\n\n\n    eval_indices_list = np.asarray(eval_indices_list)\n    eval_indices_list = np.hstack(eval_indices_list).astype('int')\n    print eval_indices_list.shape\n\n    img_list = np.asarray(img_list)\n    print img_list.shape\n    eval_list = img_list[eval_indices_list]\n\n    np.savez(save_file_name, eval_list=eval_list)\n\n    \n\n    \"\"\"\n    # Record the number of images per subject                                                                                                                                                               \n    num_imgs_per_subj =np.zeros([num_subjs])\n    for row in valid_img_list:\n        id_label = int(row[8])\n        num_imgs_per_subj[id_label] += 1\n\n\n\n    hist_subj = np.zeros([num_subjs])\n    idx = 0\n    count = 0\n    for row in valid_img_list:\n        count += 1\n        print 
count\n        image_key = row[0]                                               \n        image_path = row[1]\n        id_label = int(row[8])\n         \n        if idx >= num_subjs:\n            break\n        \n        if hist_subj[idx] < min(1, num_imgs_per_subj[idx]):\n            if id_label == idx:\n                with open(save_file_name, \"a\") as f:                                                                                                                                         \n                    f.write(image_key + \",\" + image_path + \",\" + row[2] + \",\" + row[3] + \",\" + row[4] + \",\" + row[5] + \",\" + row[6] + \",\" + row[7] + \",\" + str(id_label) + \"\\n\")\n                hist_subj[idx] += 1\n        else:\n            idx += 1\n    \"\"\"\n\ndef input_processing(images, pose_labels, id_labels, train_mean_vec, mean_labels, std_labels, num_imgs, image_size, num_classes):\n\n    images = images.reshape([num_imgs, image_size, image_size, 3])\n    pose_labels = pose_labels.reshape([num_imgs, 6])\n    id_labels = id_labels.reshape([num_imgs, 1])\n\n                                                                                                                                                               \n    id_labels = dense_to_one_hot(id_labels, num_classes)\n\n    # Subtract train image mean                                                                                                                                                  \n    images = images / 255.\n    train_mean_mat = train_mean_vec2mat(train_mean_vec, images)\n    normalized_images = images - train_mean_mat\n   \n    # Normalize labels\n    normalized_pose_labels = (pose_labels - mean_labels) / (std_labels + 0.000000000000000001)\n\n                                \n    return normalized_images, normalized_pose_labels, id_labels\n\n\n\ndef train_mean_vec2mat(train_mean, images_array):\n        height = images_array.shape[1]\n        width = images_array.shape[2]\n    
    #batch = images_array.shape[0]                                                                                                                                                                       \n        train_mean_R = np.matlib.repmat(train_mean[0],height,width)\n        train_mean_G = np.matlib.repmat(train_mean[1],height,width)\n        train_mean_B = np.matlib.repmat(train_mean[2],height,width)\n\n        R = np.reshape(train_mean_R, (height,width,1))\n        G = np.reshape(train_mean_G, (height,width,1))\n        B = np.reshape(train_mean_B, (height,width,1))\n        train_mean_image = np.append(R, G, axis=2)\n        train_mean_image = np.append(train_mean_image, B, axis=2)\n\n        return train_mean_image\n\n\ndef create_file_list(csv_file_path):\n\n        with open(csv_file_path, 'r') as csvfile:\n                csvreader = csv.reader(csvfile, delimiter=',')\n                csv_list = list(csvreader)\n\n        return csv_list\n"
  },
  {
    "path": "train_stats/README",
    "content": "Here are the precomputed training data statistics\n"
  },
  {
    "path": "utils/README",
    "content": "Some utility functions are here\n"
  },
  {
    "path": "utils/pose_utils.py",
    "content": "import sys\nimport os\nimport numpy as np\nimport cv2\nimport math\nfrom math import cos, sin, atan2, asin\nimport fileinput\n\n## Index to remap landmarks in case we flip an image                                                                                                                                                     \nrepLand = [ 17,16,15,14,13,12,11,10, 9,8,7,6,5,4,3,2,1,27,26,25, \\\n            24,23,22,21,20,19,18,28,29,30,31,36,35,34,33,32,46,45,44,43, \\\n            48,47,40,39,38,37,42,41,55,54,53,52,51,50,49,60,59,58,57,56, \\\n            65,64,63,62,61,68,67,66 ]\n\n\ndef increaseBbox(bbox, factor):\n    tlx = bbox[0] \n    tly = bbox[1] \n    brx = bbox[2] \n    bry = bbox[3] \n    dx = factor\n    dy = factor\n    dw = 1 + factor\n    dh = 1 + factor\n    #Getting bbox height and width\n    w = brx-tlx;\n    h = bry-tly;\n    tlx2 = tlx - w * dx\n    tly2 = tly - h * dy\n    brx2 = tlx + w * dw\n    bry2 = tly + h * dh\n    nbbox = np.zeros( (4,1), dtype=np.float32 )\n    nbbox[0] = tlx2\n    nbbox[1] = tly2\n    nbbox[2] = brx2\n    nbbox[3] = bry2 \n    return nbbox\n\n\ndef increaseBbox_rescaleCASIA(bbox, factor):\n        tlx = bbox[0] \n        tly = bbox[1] \n        brx = bbox[2] \n        bry = bbox[3] \n    \n        ww = brx - tlx; \n        hh = bry - tly; \n        cx = tlx + ww/2;\n        cy = tly + hh/2;\n        tsize = max(ww,hh)/2;\n    \n        bl = cx - factor[0]*tsize;\n        bt = cy - factor[1]*tsize;\n        br = cx + factor[2]*tsize;\n        bb = cy + factor[3]*tsize;\n    \n        nbbox = np.zeros( (4,1), dtype=np.float32 )\n        nbbox[0] = bl;\n        nbbox[1] = bt;\n        nbbox[2] = br;\n        nbbox[3] = bb;\n\n    \n        return nbbox\n\n\ndef increaseBbox_rescaleYOLO(bbox, im):\n\n    rescaleFrontal = [1.4421, 2.2853, 1.4421, 1.4286];\n    rescaleCS2 = [0.9775, 1.5074, 0.9563, 0.9436];\n\n\n    l = bbox[0]\n    t = bbox[1]\n    ww = bbox[2]\n    hh = bbox[3]\n\n    # Approximate LM 
tight BB\n    h = im.shape[0];\n    w = im.shape[1];\n    \n    cx = l + ww/2;\n    cy = t + hh/2;\n    tsize = max(ww,hh)/2;\n    l = cx - tsize;\n    t = cy - tsize;\n    cx = l + (2*tsize)/(rescaleCS2[0]+rescaleCS2[2]) * rescaleCS2[0];\n    cy = t + (2*tsize)/(rescaleCS2[1]+rescaleCS2[3]) * rescaleCS2[1];\n    tsize = 2*tsize/(rescaleCS2[0]+rescaleCS2[2]);\n\n    \n    \"\"\"\n    # Approximate inplane align (frontal)\n    nbbox = np.zeros( (4,1), dtype=np.float32 )\n    nbbox[0] = cx - rescaleFrontal[0]*tsize;\n    nbbox[1] = cy - rescaleFrontal[1]*tsize;\n    nbbox[2] = cx + rescaleFrontal[2]*tsize;\n    nbbox[3] = cy + rescaleFrontal[3]*tsize;\n    \"\"\"\n    \n    nbbox = np.zeros( (4,1), dtype=np.float32 )\n    nbbox[0] = cx - tsize;\n    nbbox[1] = cy - tsize;\n    nbbox[2] = cx + tsize;\n    nbbox[3] = cy + tsize;\n    \n    return nbbox\n\n\n\n\ndef image_bbox_processing_v2(img, bbox, landmarks=None):\n    img_h, img_w, img_c = img.shape\n    lt_x = bbox[0]\n    lt_y = bbox[1]\n    rb_x = bbox[2]\n    rb_y = bbox[3]\n\n    fillings = np.zeros( (4,1), dtype=np.int32)\n    if lt_x < 0: ## 0 for python\n        fillings[0] = math.ceil(-lt_x)\n    if lt_y < 0:\n        fillings[1] = math.ceil(-lt_y)\n    if rb_x > img_w-1:\n        fillings[2] = math.ceil(rb_x - img_w + 1)\n    if rb_y > img_h-1:\n        fillings[3] = math.ceil(rb_y - img_h + 1)\n    new_bbox = np.zeros( (4,1), dtype=np.float32 )\n    # img = [zeros(size(img,1),fillings(1),img_c), img]\n    # img = [zeros(fillings(2), size(img,2),img_c); img]\n    # img = [img, zeros(size(img,1), fillings(3),img_c)]\n\n    # new_img = [img; zeros(fillings(4), size(img,2),img_c)]\n    imgc = img.copy()\n    if fillings[0] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [np.zeros( (img_h, fillings[0][0], img_c), dtype=np.uint8 ), imgc] )    \n    if fillings[1] > 0:\n\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [np.zeros( (fillings[1][0], img_w, img_c), 
dtype=np.uint8 ), imgc] )\n    if fillings[2] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [ imgc, np.zeros( (img_h, fillings[2][0], img_c), dtype=np.uint8 ) ] )\n    if fillings[3] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [ imgc, np.zeros( (fillings[3][0], img_w, img_c), dtype=np.uint8) ] )\n\n    new_bbox[0] = lt_x + fillings[0]\n    new_bbox[1] = lt_y + fillings[1]\n    new_bbox[2] = rb_x + fillings[0]\n    new_bbox[3] = rb_y + fillings[1]\n\n    # guard against the default landmarks=None before calling len()\n    if landmarks is None or len(landmarks) == 0:\n        return imgc, new_bbox\n    else:\n        landmarks_new = np.zeros([landmarks.shape[0], landmarks.shape[1]])\n        #print \"landmarks_new's shape: \\n\"\n        #print landmarks_new.shape\n        landmarks_new[:,0] = landmarks[:,0] + fillings[0]\n        landmarks_new[:,1] = landmarks[:,1] + fillings[1]\n        return imgc, new_bbox, landmarks_new\n\n\ndef image_bbox_processing_v3(img, bbox):\n    img_h, img_w, img_c = img.shape\n    lt_x = bbox[0]\n    lt_y = bbox[1]\n    rb_x = bbox[2]\n    rb_y = bbox[3]\n\n    fillings = np.zeros( (4,1), dtype=np.int32)\n    if lt_x < 0: ## 0 for python\n        fillings[0] = math.ceil(-lt_x)\n    if lt_y < 0:\n        fillings[1] = math.ceil(-lt_y)\n    if rb_x > img_w-1:\n        fillings[2] = math.ceil(rb_x - img_w + 1)\n    if rb_y > img_h-1:\n        fillings[3] = math.ceil(rb_y - img_h + 1)\n    new_bbox = np.zeros( (4,1), dtype=np.float32 )\n    # img = [zeros(size(img,1),fillings(1),img_c), img]\n    # img = [zeros(fillings(2), size(img,2),img_c); img]\n    # img = [img, 
zeros(size(img,1), fillings(3),img_c)]\n\n    # new_img = [img; zeros(fillings(4), size(img,2),img_c)]\n    imgc = img.copy()\n    if fillings[0] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [np.zeros( (img_h, fillings[0][0], img_c), dtype=np.uint8 ), imgc] )\n    if fillings[1] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [np.zeros( (fillings[1][0], img_w, img_c), dtype=np.uint8 ), imgc] )\n    if fillings[2] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.hstack( [ imgc, np.zeros( (img_h, fillings[2][0], img_c), dtype=np.uint8 ) ] )\n    if fillings[3] > 0:\n        img_h, img_w, img_c = imgc.shape\n        imgc = np.vstack( [ imgc, np.zeros( (fillings[3][0], img_w, img_c), dtype=np.uint8) ] )\n\n    new_bbox[0] = lt_x + fillings[0]\n    new_bbox[1] = lt_y + fillings[1]\n    new_bbox[2] = rb_x + fillings[0]\n    new_bbox[3] = rb_y + fillings[1]\n\n    return imgc, new_bbox\n\n\ndef preProcessImage(im, lmks, bbox, factor, _alexNetSize, flipped):\n\n        sys.stdout.flush()\n\n        if flipped == 1: # flip landmarks and indices if it's a flipped image\n            lmks = flip_lmk_idx(im, lmks)\n\n        lmks_flip = lmks\n\n        lt_x = bbox[0]\n        lt_y = bbox[1]\n        rb_x = lt_x + bbox[2]\n        rb_y = lt_y + bbox[3]\n        w = bbox[2]\n        h = bbox[3]\n        center = ( (lt_x+rb_x)/2, (lt_y+rb_y)/2 )\n        side_length = max(w,h)\n\n        # make the bbox be square\n        bbox = np.zeros( (4,1), dtype=np.float32 )\n        bbox[0] = center[0] - side_length/2\n        bbox[1] = center[1] - side_length/2\n        bbox[2] = center[0] + side_length/2\n        bbox[3] = center[1] + side_length/2\n        img_2, bbox_green = image_bbox_processing_v2(im, bbox)\n\n        #%% Get the expanded square bbox\n        bbox_red = increaseBbox(bbox_green, factor)\n        bbox_red2 = increaseBbox(bbox, 
factor)\n        bbox_red2[2] = bbox_red2[2] - bbox_red2[0]\n        bbox_red2[3] = bbox_red2[3] - bbox_red2[1]\n        bbox_red2 = np.reshape(bbox_red2, [4])\n\n        img_3, bbox_new, lmks = image_bbox_processing_v2(img_2, bbox_red, lmks)\n    \n        #%% Crop and resized\n        bbox_new =  np.ceil( bbox_new )\n        side_length = max( bbox_new[2] - bbox_new[0], bbox_new[3] - bbox_new[1] )\n        bbox_new[2:4] = bbox_new[0:2] + side_length\n        bbox_new = bbox_new.astype(int)\n\n        crop_img = img_3[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :];\n        lmks_new = np.zeros([lmks.shape[0],2])\n        lmks_new[:,0] = lmks[:,0] - bbox_new[0][0]\n        lmks_new[:,1] = lmks[:,1] - bbox_new[1][0]\n\n        resized_crop_img = cv2.resize(crop_img, ( _alexNetSize, _alexNetSize ), interpolation = cv2.INTER_CUBIC)\n        old_h, old_w, channels = crop_img.shape\n        lmks_new2 = np.zeros([lmks.shape[0],2])\n        lmks_new2[:,0] = lmks_new[:,0] * _alexNetSize / old_w\n        lmks_new2[:,1] = lmks_new[:,1] * _alexNetSize / old_h\n        #print _alexNetSize, old_w, old_h\n       \n\n        return  resized_crop_img, lmks_new2, bbox_red2, lmks_flip, side_length, center\n\n\ndef resize_crop_rescaleCASIA(im, bbox, lmks, factor):\n\n    lt_x = bbox[0]\n    lt_y = bbox[1]\n    rb_x = lt_x + bbox[2]\n    rb_y = lt_y + bbox[3]\n    bbox = np.reshape([lt_x, lt_y, rb_x, rb_y], [-1])\n\n    # Get the expanded square bbox\n    bbox_red = increaseBbox_rescaleCASIA(bbox, factor)\n\n\n    img_3, bbox_new, lmks = image_bbox_processing_v2(im, bbox_red, lmks);\n    lmks_filling = lmks.copy()\n\n\n    #%% Crop and resized\n    bbox_new =  np.ceil( bbox_new )\n    side_length = max( bbox_new[2] - bbox_new[0], bbox_new[3] - bbox_new[1] )\n    bbox_new[2:4] = bbox_new[0:2] + side_length\n\n    #bbox_new[0] = max(0, bbox_new[0]) \n    #bbox_new[1] = max(0, bbox_new[1])\n    #bbox_new[2] = min(img_3.shape[1]-1, bbox_new[2])\n    #bbox_new[3] = 
min(img_3.shape[0]-1, bbox_new[3])\n    bbox_new = bbox_new.astype(int)\n\n\n    crop_img = img_3[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :];\n    lmks_new = np.zeros([lmks.shape[0],2])\n    lmks_new[:,0] = lmks[:,0] - bbox_new[0][0]\n    lmks_new[:,1] = lmks[:,1] - bbox_new[1][0]\n    old_h, old_w, channels = crop_img.shape\n\n\n    resized_crop_img = cv2.resize(crop_img, ( 224, 224 ), interpolation = cv2.INTER_CUBIC)\n    lmks_new2 = np.zeros([lmks.shape[0],2])\n    lmks_new2[:,0] = lmks_new[:,0] * 224 / old_w\n    lmks_new2[:,1] = lmks_new[:,1] * 224 / old_h\n\n\n\n    return resized_crop_img, bbox_new, lmks_new2, lmks_filling, old_h, old_w, img_3\n\n\n\ndef resize_crop_rescaleCASIA_v2(im, bbox, lmks, factor, bbox_type):\n\n    \n\n    # Get the expanded square bbox\n    if bbox_type == \"casia\":\n\n        lt_x = bbox[0]\n        lt_y = bbox[1]\n        rb_x = lt_x + bbox[2]\n        rb_y = lt_y + bbox[3]\n        bbox = np.reshape([lt_x, lt_y, rb_x, rb_y], [-1])\n\n        bbox_red = increaseBbox_rescaleCASIA(bbox, factor)\n\n    elif bbox_type == \"yolo\":\n\n        lt_x = bbox[0]\n        lt_y = bbox[1]\n        rb_x = lt_x + bbox[2]\n        rb_y = lt_y + bbox[3]\n        w = bbox[2]\n        h = bbox[3]\n        center = ( (lt_x+rb_x)/2, (lt_y+rb_y)/2 )\n        side_length = max(w,h);\n        \n        # make the bbox be square\n        bbox = np.zeros( (4,1), dtype=np.float32 )\n        bbox[0] = center[0] - side_length/2\n        bbox[1] = center[1] - side_length/2\n        bbox[2] = center[0] + side_length/2\n        bbox[3] = center[1] + side_length/2\n        img_2, bbox_green = image_bbox_processing_v3(im, bbox)\n        \n        #%% Get the expanded square bbox\n        bbox_red = increaseBbox(bbox_green, factor)\n                \n        \n\n\n\n    img_3, bbox_new, lmks = image_bbox_processing_v2(im, bbox_red, lmks);\n    lmks_filling = lmks.copy()\n\n\n    #%% Crop and resized\n    bbox_new =  np.ceil( bbox_new )\n    
side_length = max( bbox_new[2] - bbox_new[0], bbox_new[3] - bbox_new[1] )\n    bbox_new[2:4] = bbox_new[0:2] + side_length\n\n    #bbox_new[0] = max(0, bbox_new[0])\n    #bbox_new[1] = max(0, bbox_new[1])\n    #bbox_new[2] = min(img_3.shape[1]-1, bbox_new[2])\n    #bbox_new[3] = min(img_3.shape[0]-1, bbox_new[3])\n    bbox_new = bbox_new.astype(int)\n\n\n    crop_img = img_3[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :]\n    lmks_new = np.zeros([lmks.shape[0],2])\n    lmks_new[:,0] = lmks[:,0] - bbox_new[0][0]\n    lmks_new[:,1] = lmks[:,1] - bbox_new[1][0]\n    old_h, old_w, channels = crop_img.shape\n\n\n    resized_crop_img = cv2.resize(crop_img, ( 224, 224 ), interpolation = cv2.INTER_CUBIC)\n    lmks_new2 = np.zeros([lmks.shape[0],2])\n    lmks_new2[:,0] = lmks_new[:,0] * 224 / old_w\n    lmks_new2[:,1] = lmks_new[:,1] * 224 / old_h\n\n\n\n    return resized_crop_img, bbox_new, lmks_new2, lmks_filling, old_h, old_w, img_3\n\n\ndef resize_crop_AFLW(im, bbox, lmks):\n\n    lt_x = bbox[0]\n    lt_y = bbox[1]\n    rb_x = lt_x + bbox[2]\n    rb_y = lt_y + bbox[3]\n    bbox = np.reshape([lt_x, lt_y, rb_x, rb_y], [-1])\n\n    # crop from the input image `im` (was a NameError on undefined `img`)\n    crop_img = im[bbox[1]:bbox[3], bbox[0]:bbox[2], :]\n    lmks_new = np.zeros([lmks.shape[0],2])\n    lmks_new[:,0] = lmks[:,0] - bbox[0]\n    lmks_new[:,1] = lmks[:,1] - bbox[1]\n    old_h, old_w, channels = crop_img.shape\n\n\n    resized_crop_img = cv2.resize(crop_img, ( 224, 224 ), interpolation = cv2.INTER_CUBIC)\n    lmks_new2 = np.zeros([lmks.shape[0],2])\n    lmks_new2[:,0] = lmks_new[:,0] * 224 / old_w\n    lmks_new2[:,1] = lmks_new[:,1] * 224 / old_h\n\n\n    bbox_new = np.zeros([4])\n    bbox_new[0] = bbox[0] * 224 / old_w\n    bbox_new[1] = bbox[1] * 224 / old_h\n    bbox_new[2] = bbox[2] * 224 / old_w\n    bbox_new[3] = bbox[3] * 224 / old_h\n\n    bbox_new[2] = bbox_new[2] - bbox_new[0] # box width\n    bbox_new[3] = bbox_new[3] - bbox_new[1] # box height\n\n\n    return resized_crop_img, bbox_new, 
lmks_new2\n\n\n\n\n\ndef preProcessImage_v2(im, bbox, factor, _resNetSize, if_cropbyLmks_rescaleCASIA):\n\n        sys.stdout.flush()\n        \n\n        if if_cropbyLmks_rescaleCASIA == 0:\n                lt_x = bbox[0]\n                lt_y = bbox[1]\n                rb_x = lt_x + bbox[2]\n                rb_y = lt_y + bbox[3]\n                w = bbox[2]\n                h = bbox[3]\n                center = ( (lt_x+rb_x)/2, (lt_y+rb_y)/2 )\n                side_length = max(w,h);\n                \n                # make the bbox be square\n                bbox = np.zeros( (4,1), dtype=np.float32 )\n                bbox[0] = center[0] - side_length/2\n                bbox[1] = center[1] - side_length/2\n                bbox[2] = center[0] + side_length/2\n                bbox[3] = center[1] + side_length/2\n                img_2, bbox_green = image_bbox_processing_v2(im, bbox)\n                \n                #%% Get the expanded square bbox\n                bbox_red = increaseBbox(bbox_green, factor)\n                img_3, bbox_new = image_bbox_processing_v2(img_2, bbox_red)\n\n        elif if_cropbyLmks_rescaleCASIA == 1:\n               \n                bbox[2] = bbox[0] + bbox[2]\n                bbox[3] = bbox[1] + bbox[3]\n\n                bbox_red = increaseBbox_rescaleCASIA(bbox, factor)\n                #print bbox_red\n                img_3, bbox_new = image_bbox_processing_v3(im, bbox_red)\n\n        else:\n            \n                bbox2 = increaseBbox_rescaleYOLO(bbox, im)\n                bbox_red = increaseBbox_rescaleCASIA(bbox2, factor)\n\n                img_3, bbox_new = image_bbox_processing_v2(im, bbox_red)\n\n\n\n        #bbox_red2 = increaseBbox(bbox, factor)\n        #bbox_red2[2] = bbox_red2[2] - bbox_red2[0]\n        #bbox_red2[3] = bbox_red2[3] - bbox_red2[1]\n        #bbox_red2 = np.reshape(bbox_red2, [4])\n    \n        #%% Crop and resized\n        bbox_new =  np.ceil( bbox_new )\n        side_length = max( bbox_new[2] - 
bbox_new[0], bbox_new[3] - bbox_new[1] )\n        bbox_new[2:4] = bbox_new[0:2] + side_length\n        bbox_new = bbox_new.astype(int)\n\n        crop_img = img_3[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :]\n        #print crop_img.shape\n\n        resized_crop_img = cv2.resize(crop_img, ( _resNetSize, _resNetSize ), interpolation = cv2.INTER_CUBIC)\n\n        return resized_crop_img\n\n\n\ndef preProcessImage_useGTBBox(im, lmks, bbox, factor, _alexNetSize, flipped, to_train_scale, yolo_bbox):\n\n        sys.stdout.flush()\n        #print bbox, yolo_bbox, to_train_scale\n\n        if flipped == 1: # flip landmarks and indices if it's a flipped image\n            lmks = flip_lmk_idx(im, lmks)\n\n        lmks_flip = lmks\n\n        lt_x = bbox[0]\n        lt_y = bbox[1]\n        rb_x = lt_x + bbox[2]\n        rb_y = lt_y + bbox[3]\n        w = bbox[2]\n        h = bbox[3]\n        center = ( (lt_x+rb_x)/2, (lt_y+rb_y)/2 )\n        side_length = max(w,h)\n\n        # make the bbox be square\n        bbox = np.zeros( (4,1), dtype=np.float32 )\n        #print bbox\n        bbox_red = np.zeros( (4,1), dtype=np.float32 )\n\n        if to_train_scale == 1:\n                _, _, _, _, side_length2, center2 = preProcessImage(im, lmks, yolo_bbox, factor, _alexNetSize, flipped)\n\n                center3 = ( (center[0]+center2[0])/2, (center[1]+center2[1])/2 )\n                bbox[0] = center3[0] - side_length2/2\n                bbox[1] = center3[1] - side_length2/2\n                bbox[2] = center3[0] + side_length2/2\n                bbox[3] = center3[1] + side_length2/2\n\n                bbox_red[0] = center3[0] - side_length2/2\n                bbox_red[1] = center3[1] - side_length2/2\n                bbox_red[2] = side_length2\n                bbox_red[3] = side_length2\n\n        else:\n\n                bbox[0] = center[0] - side_length/2\n                bbox[1] = center[1] - side_length/2\n                bbox[2] = center[0] + side_length/2\n                bbox[3] = center[1] + side_length/2\n                #print center, side_length, bbox[0], bbox[1], bbox[2], bbox[3]\n\n                bbox_red[0] = center[0] - side_length/2\n                bbox_red[1] = center[1] - side_length/2\n                bbox_red[2] = side_length\n                bbox_red[3] = side_length\n\n        bbox_red = np.reshape(bbox_red, [4])\n\n        #print bbox, bbox_red\n\n        img_2, bbox_green = image_bbox_processing_v2(im, bbox)\n        #print img_2.shape, bbox_green\n\n        #%% Crop and resized\n        bbox_new =  np.ceil( bbox_green )\n        side_length = max( bbox_new[2] - bbox_new[0], bbox_new[3] - bbox_new[1] )\n        bbox_new[2:4] = bbox_new[0:2] + side_length\n        bbox_new = bbox_new.astype(int)\n\n        #print bbox_new\n        crop_img = img_2[bbox_new[1][0]:bbox_new[3][0], bbox_new[0][0]:bbox_new[2][0], :]\n        lmks_new = np.zeros([68,2])\n        lmks_new[:,0] = lmks[:,0] - bbox_new[0][0]\n        lmks_new[:,1] = lmks[:,1] - bbox_new[1][0]\n\n        #print crop_img.shape\n\n        resized_crop_img = cv2.resize(crop_img, ( _alexNetSize, _alexNetSize ), interpolation = cv2.INTER_CUBIC)\n        old_h, old_w, channels = crop_img.shape\n        lmks_new2 = np.zeros([68,2])\n        lmks_new2[:,0] = lmks_new[:,0] * _alexNetSize / old_w\n        lmks_new2[:,1] = lmks_new[:,1] * _alexNetSize / old_h\n        #print _alexNetSize, old_w, old_h\n\n        return resized_crop_img, lmks_new2, bbox_red, lmks_flip\n\n\n\ndef replaceInFile(filep, before, after):\n    for line in fileinput.input(filep, inplace=True):\n        print line.replace(before,after),\n\n\n\ndef flip_lmk_idx(img, lmarks):\n\n    # Flipping X values for landmarks\n    lmarks[:,0] = 
img.shape[1] - lmarks[:,0]\n\n    # Creating flipped landmarks with new indexing                                                                                                                                                       \n    lmarks_flip =  np.zeros((68,2))\n    for i in range(len(repLand)):\n        lmarks_flip[i,:] = lmarks[repLand[i]-1,:]\n\n\n    return lmarks_flip\n\n\n\n\ndef pose_to_LMs(pose_Rt):\n\n        pose_Rt = np.reshape(pose_Rt, [6])\n        ref_lm = np.loadtxt('./lm_m10.txt', delimiter=',')\n        ref_lm_t = np.transpose(ref_lm)\n        numLM = ref_lm_t.shape[1] \n        #PI = np.array([[  4.22519775e+03,0.00000000e+00,1.15000000e+02], [0.00000000e+00, 4.22519775e+03, 1.15000000e+02], [0, 0, 1]]);\n        PI = np.array([[  2.88000000e+03, 0.00000000e+00, 1.12000000e+02], [0.00000000e+00, 2.88000000e+03, 1.12000000e+02], [0, 0, 1]]);\n\n\n        rvecs = pose_Rt[0:3]\n        tvec = np.reshape(pose_Rt[3:6], [3,1])\n        tsum = np.repeat(tvec,numLM,1)\n        rmat, jacobian = cv2.Rodrigues(rvecs, None)\n        transformed_lms = np.matmul(rmat, ref_lm_t) + tsum\n        transformed_lms = np.matmul(PI, transformed_lms)\n        transformed_lms[0,:] = transformed_lms[0,:]/transformed_lms[2,:]\n        transformed_lms[1,:] = transformed_lms[1,:]/transformed_lms[2,:]\n        lms = np.transpose(transformed_lms[:2,:])\n\n\n        return lms\n\n\n\ndef RotationMatrix(angle_x, angle_y, angle_z):\n        # get rotation matrix by rotate angle\n\n        phi = angle_x; # pitch\n        gamma = angle_y; # yaw\n        theta = angle_z; # roll\n\n        R_x = np.array([ [1, 0, 0] , [0, np.cos(phi), np.sin(phi)] , [0, -np.sin(phi), np.cos(phi)] ]);\n        R_y = np.array([ [np.cos(gamma), 0, -np.sin(gamma)] , [0, 1, 0] , [np.sin(gamma), 0, np.cos(gamma)] ]);\n        R_z = np.array([ [np.cos(theta), np.sin(theta), 0] , [-np.sin(theta), np.cos(theta), 0] , [0, 0, 1] ]);\n\n        R = np.matmul( R_x , np.matmul(R_y , R_z) );\n\n\n        
return R\n\n\n\ndef matrix2angle(R):\n    ''' compute three Euler angles from a Rotation Matrix. Ref: http://www.gregslabaugh.net/publications/euler.pdf\n    Args:\n        R: (3,3). rotation matrix\n    Returns:\n        x: yaw\n        y: pitch\n        z: roll\n    '''\n    # assert(isRotationMatrix(R))\n\n    # note: this must be `and`; with `or` the condition is always true and\n    # the gimbal-lock branch below is never reached\n    if R[2,0] != 1 and R[2,0] != -1:\n        x = -asin(R[2,0])\n        y = atan2(R[2,1]/cos(x), R[2,2]/cos(x))\n        z = atan2(R[1,0]/cos(x), R[0,0]/cos(x))\n\n    else: # Gimbal lock\n        z = 0 # can be anything\n        if R[2,0] == -1:\n            x = np.pi/2\n            y = z + atan2(R[0,1], R[0,2])\n        else:\n            x = -np.pi/2\n            y = -z + atan2(-R[0,1], -R[0,2])\n\n    return x, y, z\n\n\ndef P2sRt(P):\n    ''' decompose an affine camera matrix P into scale, rotation, and translation.\n    Args:\n        P: (3, 4). Affine Camera Matrix.\n    Returns:\n        s: scale factor.\n        R: (3, 3). rotation matrix.\n        t3d: (3,). 3d translation.\n    '''\n    t3d = P[:, 3]\n    R1 = P[0:1, :3]\n    R2 = P[1:2, :3]\n    s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0\n    r1 = R1/np.linalg.norm(R1)\n    r2 = R2/np.linalg.norm(R2)\n    r3 = np.cross(r1, r2)\n\n    R = np.concatenate((r1, r2, r3), 0)\n    return s, R, t3d"
  }
]