Repository: yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras Branch: master Commit: 0b3bd8cdee32 Files: 22 Total size: 85.2 KB Directory structure: gitextract_r2c2s0cn/ ├── .gitignore ├── LICENSE ├── README.md ├── data/ │ └── placeholder.txt ├── src/ │ ├── data_gen/ │ │ ├── data_generator.py │ │ ├── data_process.py │ │ ├── dataset.py │ │ ├── kpAnno.py │ │ ├── ohem.py │ │ └── utils.py │ ├── eval/ │ │ ├── eval_callback.py │ │ ├── evaluation.py │ │ └── post_process.py │ ├── top/ │ │ ├── demo.py │ │ ├── test.py │ │ └── train.py │ └── unet/ │ ├── fashion_net.py │ ├── refinenet.py │ ├── refinenet_mask_v3.py │ └── resnet101.py ├── submission/ │ └── placeholder.txt └── trained_models/ └── placeholder.txt ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ .idea *.pyc *.pkl ================================================ FILE: LICENSE ================================================ MIT License Copyright (c) 2018 VictorLi Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================ # AiFashion - Author: VictorLi, yuanyuan.li85@gmail.com - Code for FashionAI Global Challenge—Key Points Detection of Apparel [2018 TianChi](https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100068.5678.1.4ccc289bCzDJXu&raceId=231648&_lang=en_US) - Rank 45/2322 at 1st round competition, score 0.61 - Rank 46 at 2nd round competition, score 0.477 ## Images with detected keypoints ### Dress ![Dress](./images/dress.jpg) ### Blouse ![Blouse](./images/blouse.jpg) ### Outwear ![Outwear](./images/outwear.jpg) ### Skirt ![Skirt](./images/skirt.jpg) ### Trousers ![Trousers](./images/trousers.jpg) ## Basic idea - The key idea comes from the paper [Cascaded Pyramid Network for Multi-Person Pose Estimation](https://arxiv.org/abs/1711.07319). We use a two-stage network made of a global net and a refine net, both U-Net-like. The network is trained to predict heatmaps of clothing keypoints. The backbone network is resnet101. - To avoid negative interference between categories, an `input_mask` is introduced to zero out invalid keypoints. For example, skirt has 4 valid keypoints: `waistband_left`, `waistband_right`, `hemline_left` and `hemline_right`. In `input_mask`, only those 4 valid channels are 1.0, while the other 20 channels are set to zero.
- Online hard example mining (OHEM): at the last stage of refinenet, only the top keypoint losses are taken into account, while the easy ones (small loss) are ignored ## Dependency - Keras 2.0 - TensorFlow - OpenCV/NumPy/Pandas - Pre-trained resnet101 model weights ## Folder Structure - `data`: folder to store training and testing images and annotations - `trained_models`: folder to store trained models and logs - `submission`: folder to store generated submissions for evaluation. - `src`: folder for all source code. `src/data_gen`: data generator code, including data augmentation and pre-processing `src/eval`: evaluation code, including inference and post-processing. `src/unet`: CNN model definition, including train, fine-tune, loss and optimizer definition. `src/top`: top-level code for train, test and demo. ## How to train network - Download the dataset from the competition webpage and put it under `data`. `data/train`: data used for training. `data/test`: data used for testing - Download the [resnet101](https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294) model and save it as `data/resnet101_weights_tf.h5`. Note: all the models here use the channels_last dim order. - Train the all-in-one network from scratch ``` python train.py --category all --epochs 30 --network v11 --batchSize 3 --gpuID 2 ``` - The trained model and log will be put under `trained_models/all/xxxx`, e.g. `trained_models/all/2018_05_23_15_18_07/` - The evaluation will run after each epoch and details are saved to `val.log` - Resume training from a specific model.
``` python train.py --gpuID 2 --category all --epochs 30 --network v11 --batchSize 3 --resume True --resumeModel /path/to/model/start/with --initEpoch 6 ``` ## How to test and generate submission - Run test and generate the submission. The command below searches `modelpath` for the best-scoring model and uses it to generate the submission ``` python test.py --gpuID 2 --modelpath ../../trained_models/all/xxx --outpath ../../submission/2018_04_19/ --augment True ``` The submission will be saved as `submission.csv` ## How to run demo - Download the pre-trained weights from [BaiduDisk](https://pan.baidu.com/s/1t7fB5wnRfW1Vny0gw7xUDQ) (password `1ae2`) or [GoogleDrive](https://drive.google.com/open?id=1VY-AO2F1XMQLBjEZjy6CrOSIPWWaHUGr) - Save them somewhere, e.g. `trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5` - Or use your own trained model. - Run the demo and the cloth image with detected keypoints marked will be displayed. ``` python demo.py --gpuID 2 --modelfile ../../trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5 ``` ## Reference - Resnet 101 Keras : https://github.com/statech/resnet ================================================ FILE: data/placeholder.txt ================================================ ================================================ FILE: src/data_gen/data_generator.py ================================================ import os import cv2 import pandas as pd import numpy as np import random from kpAnno import KpAnno from dataset import getKpNum, getKpKeys, getFlipMapID, generate_input_mask from utils import make_gaussian, load_annotation_from_df from data_process import pad_image, resize_image, normalize_image, rotate_image, \ rotate_image_float, rotate_mask, crop_image from ohem import generate_topk_mask_ohem class DataGenerator(object): def __init__(self, category, annfile): self.category = category self.annfile = annfile self._initialize() def get_dim_order(self): # default tensorflow dim order return "channels_last" def get_dataset_size(self): return
len(self.annDataFrame) def generator_with_mask_ohem(self, graph, kerasModel, batchSize=16, inputSize=(512, 512), flipFlag=False, cropFlag=False, shuffle=True, rotateFlag=True, nStackNum=1): ''' Input: batch_size * Height (512) * Width (512) * Channel (3) Input: batch_size * 256 * 256 * Channel (N+1). Mask for each category. 1.0 for valid parts in category. 0.0 for invalid parts Output: batch_size * Height/2 (256) * Width/2 (256) * Channel (N+1) ''' xdf = self.annDataFrame targetHeight, targetWidth = inputSize # train_input: npfloat, height, width, channels # train_gthmap: npfloat, N heatmap + 1 background heatmap, train_input = np.zeros((batchSize, targetHeight, targetWidth, 3), dtype=np.float) train_mask = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) train_gthmap = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) train_ohem_mask = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) train_ohem_gthmap = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) ## generator need to be infinite loop while 1: # random shuffle at first if shuffle: xdf = xdf.sample(frac=1) count = 0 for _index, _row in xdf.iterrows(): xindex = count % batchSize xinput, xhmap = self._prcoess_img(_row, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag=True) xmask = generate_input_mask(_row['image_category'], (targetHeight, targetWidth, getKpNum(self.category))) xohem_mask, xohem_gthmap = generate_topk_mask_ohem([xinput, xmask], xhmap, kerasModel, graph, 8, _row['image_category'], dynamicFlag=False) train_input[xindex, :, :, :] = xinput train_mask[xindex, :, :, :] = xmask train_gthmap[xindex, :, :, :] = xhmap train_ohem_mask[xindex, :, :, :] = xohem_mask train_ohem_gthmap[xindex, :, :, :] = xohem_gthmap # if refinenet enable, refinenet has two outputs, globalnet and refinenet if xindex == 0 and count 
!= 0: gthamplst = list() for i in range(nStackNum): gthamplst.append(train_gthmap) # last stack will use ohem gthmap gthamplst.append(train_ohem_gthmap) yield [train_input, train_mask, train_ohem_mask], gthamplst count += 1 def _initialize(self): self._load_anno() def _load_anno(self): ''' Load annotations from train.csv ''' # Todo: check if category legal self.train_img_path = "../../data/train" # read into dataframe xpd = pd.read_csv(self.annfile) xpd = load_annotation_from_df(xpd, self.category) self.annDataFrame = xpd def _prcoess_img(self, dfrow, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag): mlist = dfrow[getKpKeys(self.category)] imgName, kpStr = mlist[0], mlist[1:] # read kp annotation from csv file kpAnnlst = list() for _kpstr in kpStr: _kpAn = KpAnno.readFromStr(_kpstr) kpAnnlst.append(_kpAn) assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst))+" is not the same as "+str(getKpNum(self.category)) xcvmat = cv2.imread(os.path.join(self.train_img_path, imgName)) if xcvmat is None: return None, None #flip as first operation. 
# flip image if random.choice([0, 1]) and flipFlag: xcvmat, kpAnnlst = self.flip_image(xcvmat, kpAnnlst) #if cropFlag: # xcvmat, kpAnnlst = crop_image(xcvmat, kpAnnlst, 0.8, 0.95) # pad image to 512x512 paddedImg, kpAnnlst = pad_image(xcvmat, kpAnnlst, inputSize[0], inputSize[1]) assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst)) + " is not the same as " + str( getKpNum(self.category)) # output ground truth heatmap is 256x256 trainGtHmap = self.__generate_hmap(paddedImg, kpAnnlst) if random.choice([0,1]) and rotateFlag: rAngle = np.random.randint(-1*40, 40) rotatedImage, _ = rotate_image(paddedImg, list(), rAngle) rotatedGtHmap = rotate_mask(trainGtHmap, rAngle) else: rotatedImage = paddedImg rotatedGtHmap = trainGtHmap # resize image resizedImg = cv2.resize(rotatedImage, inputSize) resizedGtHmap = cv2.resize(rotatedGtHmap, (inputSize[0]//2, inputSize[1]//2)) return normalize_image(resizedImg), resizedGtHmap def __generate_hmap(self, cvmat, kpAnnolst): # kpnum + background gthmp = np.zeros((cvmat.shape[0], cvmat.shape[1], getKpNum(self.category)), dtype=np.float) for i, _kpAnn in enumerate(kpAnnolst): if _kpAnn.visibility == -1: continue radius = 100 gaussMask = make_gaussian(radius, radius, 20, None) # avoid out of boundary top_x, top_y = max(0, _kpAnn.x - radius/2), max(0, _kpAnn.y - radius/2) bottom_x, bottom_y = min(cvmat.shape[1], _kpAnn.x + radius/2), min(cvmat.shape[0], _kpAnn.y + radius/2) top_x_offset = top_x - (_kpAnn.x - radius/2) top_y_offset = top_y - (_kpAnn.y - radius/2) gthmp[ top_y:bottom_y, top_x:bottom_x, i] = gaussMask[top_y_offset:top_y_offset + bottom_y-top_y, top_x_offset:top_x_offset + bottom_x-top_x] return gthmp def flip_image(self, orgimg, orgKpAnolst): flipImg = cv2.flip(orgimg, flipCode=1) flipannlst = self.flip_annlst(orgKpAnolst, orgimg.shape) return flipImg, flipannlst def flip_annlst(self, kpannlst, imgshape): height, width, channels = imgshape # flip first flipAnnlst = list() for _kp in kpannlst: flip_x = width - 
_kp.x flipAnnlst.append(KpAnno(flip_x, _kp.y, _kp.visibility)) # exchange location of flip keypoints, left->right outAnnlst = flipAnnlst[:] for i, _kp in enumerate(flipAnnlst): mapId = getFlipMapID('all', i) outAnnlst[mapId] = _kp return outAnnlst ================================================ FILE: src/data_gen/data_process.py ================================================ import pandas as pd import numpy as np import cv2 import os from kpAnno import KpAnno def normalize_image(cvmat): assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 to float -0.5 ~ 0.5'" cvmat = cvmat.astype(np.float) cvmat = (cvmat - 128.0) / 256.0 return cvmat def resize_image(cvmat, targetWidth, targetHeight): assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in resize_image'" # get scale srcHeight, srcWidth, channles = cvmat.shape minScale = min( targetHeight*1.0/srcHeight, targetWidth*1.0/srcWidth) # resize resizedMat = cv2.resize(cvmat, None, fx=minScale, fy=minScale) reHeight, reWidth, channles = resizedMat.shape # pad to targetWidth or targetHeight outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 if targetHeight == reHeight and targetWidth == reWidth: outmat = resizedMat elif targetWidth != reWidth and targetHeight == reHeight: # add pad to width outmat[:, 0:reWidth, :] = resizedMat elif targetHeight != reHeight and targetWidth == reWidth: # add padding to height outmat[0:reHeight, :, :] = resizedMat else: assert(0), "after resize either width or height same as target width or target height" return (outmat, minScale) def pad_image(cvmat, kpAnno, targetWidth, targetHeight): ''' :param cvmat: input mat :param targetWidth: width to pad :param targetHeight: height to pad :return: ''' assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in pad_image'" + str(cvmat.dtype) srcHeight, srcWidth, channles = cvmat.shape outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 if targetHeight == 
srcHeight and targetWidth == srcWidth: outmat = cvmat outkpAnno = kpAnno elif targetWidth != srcWidth and targetHeight == srcHeight: # add pad to width outmat[:, 0:srcWidth, :] = cvmat outkpAnno = kpAnno elif targetHeight != srcHeight and targetWidth == srcWidth: # add padding to height outmat[0:srcHeight, :, :] = cvmat outkpAnno = kpAnno else: # resize at first, then pad outmat, scale = resize_image(cvmat, targetWidth, targetHeight) outkpAnno = list() for _kpAnno in kpAnno: _nkp = KpAnno.applyScale(_kpAnno, scale) outkpAnno.append(_nkp) return (outmat, outkpAnno) def pad_image_inference(cvmat, targetWidth, targetHeight): ''' :param cvmat: input mat :param targetWidth: width to pad :param targetHeight: height to pad :return: ''' assert (cvmat.dtype == np.uint8), " only support normalize np.uint8 in pad_image'" + str(cvmat.dtype) srcHeight, srcWidth, channles = cvmat.shape outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 if targetHeight == srcHeight and targetWidth == srcWidth: outmat = cvmat scale = 1.0 elif targetWidth > srcWidth and targetHeight == srcHeight: # add pad to width outmat[:, 0:srcWidth, :] = cvmat scale = 1.0 elif targetHeight > srcHeight and targetWidth == srcWidth: # add padding to height outmat[0:srcHeight, :, :] = cvmat scale = 1.0 else: # resize at first, then pad outmat, scale = resize_image(cvmat, targetWidth, targetHeight) return (outmat, scale) def rotate_image(cvmat, kpAnnLst, rotateAngle): assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in rotate_image'" ##Make sure cvmat is square? 
height, width, channel = cvmat.shape center = ( width//2, height//2) rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) newW = int((height*sin)+(width*cos)) newH = int((height*cos)+(width*sin)) rotateMatrix[0,2] += (newW/2) - center[0] #x rotateMatrix[1,2] += (newH/2) - center[1] #y # rotate image; cv2.warpAffine takes dsize as (width, height) outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=(128, 128, 128)) # rotate annotations nKpLst = list() for _kp in kpAnnLst: _newkp = KpAnno.applyRotate(_kp, rotateMatrix) nKpLst.append(_newkp) return (outMat, nKpLst) def rotate_image_with_invrmat(cvmat, rotateAngle): assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in rotate_image_with_invrmat'" ##Make sure cvmat is square? height, width, channel = cvmat.shape center = ( width//2, height//2) rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) newW = int((height*sin)+(width*cos)) newH = int((height*cos)+(width*sin)) rotateMatrix[0,2] += (newW/2) - center[0] #x rotateMatrix[1,2] += (newH/2) - center[1] #y # rotate image; cv2.warpAffine takes dsize as (width, height) outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=(128, 128, 128)) # generate inv rotate matrix invRotateMatrix = cv2.invertAffineTransform(rotateMatrix) return (outMat, invRotateMatrix, (width, height)) def rotate_mask(mask, rotateAngle): outmask = rotate_image_float(mask, rotateAngle) return outmask def rotate_image_float(cvmat, rotateAngle, borderValue=(0.0, 0.0, 0.0)): assert (cvmat.dtype == np.float) , " only support normalize np.float in rotate_image_float'" ##Make sure cvmat is square?
height, width, channels = cvmat.shape center = ( width//2, height//2) rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) newW = int((height*sin)+(width*cos)) newH = int((height*cos)+(width*sin)) rotateMatrix[0,2] += (newW/2) - center[0] #x rotateMatrix[1,2] += (newH/2) - center[1] #y # rotate image; cv2.warpAffine takes dsize as (width, height) outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=borderValue) return outMat def crop_image(cvmat, kpAnnLst, lowLimitRatio, upLimitRatio): import random assert(lowLimitRatio < 1.0), 'lowLimitRatio should be less than 1.0' assert(upLimitRatio < 1.0), 'upLimitRatio should be less than 1.0' height, width, channels = cvmat.shape cropHeight = random.randrange(int(lowLimitRatio*height), int(upLimitRatio*height)) cropWidth = random.randrange(int(lowLimitRatio*width), int(upLimitRatio*width)) top_x = random.randrange(0, width - cropWidth) top_y = random.randrange(0, height - cropHeight) # apply offset for keypoints nKpLst = list() for _kp in kpAnnLst: if _kp.visibility == -1: _newkp = _kp else: _newkp = KpAnno.applyOffset(_kp, (top_x, top_y)) if _newkp.x <=0 or _newkp.y <=0: # negative location, return original image return cvmat, kpAnnLst if _newkp.x >= cropWidth or _newkp.y >= cropHeight: # keypoints are cropped out return cvmat, kpAnnLst nKpLst.append(_newkp) return cvmat[top_y:top_y+cropHeight, top_x:top_x+cropWidth], nKpLst if __name__ == "__main__": pass ================================================ FILE: src/data_gen/dataset.py ================================================ def getKpNum(category): # remove one column 'image_id' return len(getKpKeys(category)) - 1 TROUSERS_PART_KYES=['waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] TROUSERS_PART_FLIP_KYES=['waistband_right', 'waistband_left', 'crotch', 'bottom_right_in', 'bottom_right_out', 'bottom_left_in', 'bottom_left_out']
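The paired key lists above encode, for every keypoint, the name it swaps with under a horizontal flip; `getFlipMapID` further down in this file resolves that into an index mapping. A minimal self-contained sketch of that lookup (editor's illustration using a copy of the trousers lists; `KEYS`, `FLIP_KEYS` and `flip_map_id` are illustrative names, not part of the module):

```python
# Editor's sketch: map a keypoint index to the index of its mirror
# partner after a horizontal flip. KEYS/FLIP_KEYS copy the trousers lists.
KEYS = ['waistband_left', 'waistband_right', 'crotch',
        'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out']
FLIP_KEYS = ['waistband_right', 'waistband_left', 'crotch',
             'bottom_right_in', 'bottom_right_out', 'bottom_left_in', 'bottom_left_out']

def flip_map_id(partid):
    # keypoint `partid` ends up at the index of its mirrored name
    return KEYS.index(FLIP_KEYS[partid])

# flip_map_id(0) -> 1 ('waistband_left' becomes 'waistband_right')
# flip_map_id(2) -> 2 ('crotch' maps to itself)
```

This mirrors the lookup done by `getFlipMapID('trousers', partid)` via `getFlipKeys`.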
SKIRT_PART_KEYS=['waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'] SKIRT_PART_FLIP_KEYS=['waistband_right', 'waistband_left', 'hemline_right', 'hemline_left'] DRESS_PART_KEYS= ['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right'] DRESS_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 'center_front', 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'hemline_right', 'hemline_left'] BLOUSE_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out'] BLOUSE_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 'center_front', 'armpit_right', 'armpit_left', 'top_hem_right', 'top_hem_left', 'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out'] OUTWEAR_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'] OUTWEAR_PART_FLIP_KEYS = ['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left'] ALL_PART_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 
'top_hem_right', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] ALL_PART_FLIP_KEYS = [ 'neckline_right', 'neckline_left', 'center_front', 'shoulder_right', 'shoulder_left', 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left', 'waistband_right','waistband_left', 'hemline_right', 'hemline_left', 'crotch', 'bottom_right_in', 'bottom_right_out', 'bottom_left_in', 'bottom_left_out'] def getFlipKeys(category): if category == 'skirt': keys, mapkeys = SKIRT_PART_KEYS, SKIRT_PART_FLIP_KEYS elif category == 'dress': keys, mapkeys = DRESS_PART_KEYS, DRESS_PART_FLIP_KEYS elif category == 'trousers': keys, mapkeys = TROUSERS_PART_KYES, TROUSERS_PART_FLIP_KYES elif category == 'blouse': keys, mapkeys = BLOUSE_PART_KEYS, BLOUSE_PART_FLIP_KEYS elif category == 'outwear': keys, mapkeys = OUTWEAR_PART_KEYS, OUTWEAR_PART_FLIP_KEYS elif category == 'all': keys, mapkeys = ALL_PART_KEYS, ALL_PART_FLIP_KEYS else: assert (0), category + " not supported" xdict = dict() for i in range(len(keys)): xdict[keys[i]] = mapkeys[i] return keys, xdict def getFlipMapID(category, partid): keys, mapDict = getFlipKeys(category) mapKey = mapDict[keys[partid]] mapID = keys.index(mapKey) return mapID def getKpKeys(category): ''' :param category: :return: get the keypoint keys in annotation csv ''' SKIRT_KP_KEYS = ['image_id', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'] DRESS_KP_KEYS = ['image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 'armpit_left', 'armpit_right' , 'waistline_left' , 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right'] TROUSERS_KP_KEYS=['image_id', 'waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 
'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] BLOUSE_KP_KEYS = [ 'image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out'] OUTWEAR_KP_KEYS= ['image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'] ALL_KP_KESY = ['image_id','neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right' , 'crotch', 'bottom_left_in' , 'bottom_left_out', 'bottom_right_in' ,'bottom_right_out'] if category == 'skirt': return SKIRT_KP_KEYS elif category == 'dress': return DRESS_KP_KEYS elif category == 'trousers': return TROUSERS_KP_KEYS elif category == 'blouse': return BLOUSE_KP_KEYS elif category == 'outwear': return OUTWEAR_KP_KEYS elif category == 'all': return ALL_KP_KESY else: assert(0), category + ' not supported' def fill_dataframe(kplst, category, dfrow): keys = getKpKeys(category)[1:] # fill category dfrow['image_category'] = category assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys)) for i, _key in enumerate(keys): kpann = kplst[i] outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1) dfrow[_key] = outstr def get_kp_index_from_allkeys(kpname): ALL_KP_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 
'top_hem_right', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] return ALL_KP_KEYS.index(kpname) def generate_input_mask(image_category, shape, nobgFlag=True): import numpy as np # 0.0 for invalid key points for each category # 1.0 for valid key points for each category h, w, c = shape mask = np.zeros((h // 2, w // 2, c), dtype=np.float) for key in getKpKeys(image_category)[1:]: index = get_kp_index_from_allkeys(key) mask[:, :, index] = 1.0 # for last channel, background if nobgFlag: mask[:, :, -1] = 0.0 else: mask[:, :, -1] = 1.0 return mask ================================================ FILE: src/data_gen/kpAnno.py ================================================ import numpy as np class KpAnno(object): ''' Convert string to x, y, visibility ''' def __init__(self, x, y, visibility): self.x = int(x) self.y = int(y) self.visibility = visibility @classmethod def readFromStr(cls, xstr): xarray = xstr.split('_') x = int(xarray[0]) y = int(xarray[1]) visibility = int(xarray[2]) return cls(x,y, visibility) @classmethod def applyScale(cls, kpAnno, scale): x = int(kpAnno.x*scale) y = int(kpAnno.y*scale) v = kpAnno.visibility return cls(x, y, v) @classmethod def applyRotate(cls, kpAnno, rotateMatrix): vector = [kpAnno.x, kpAnno.y, 1] rotatedV = np.dot(rotateMatrix, vector) return cls( int(rotatedV[0]), int(rotatedV[1]), kpAnno.visibility) @classmethod def applyOffset(cls, kpAnno, offset): x = kpAnno.x - offset[0] y = kpAnno.y - offset[1] v = kpAnno.visibility return cls(x, y, v) @staticmethod def calcDistance(kpA, kpB): distance = (kpA.x - kpB.x)**2 + (kpA.y - kpB.y)**2 return np.sqrt(distance) ================================================ FILE: src/data_gen/ohem.py ================================================ import sys sys.path.insert(0, "../unet/") from keras.models import * from keras.layers import * from utils import np_euclidean_l2 from dataset import 
getKpNum def generate_topk_mask_ohem(input_data, gthmap, keras_model, graph, topK, image_category, dynamicFlag=False): ''' :param input_data: input :param gthmap: ground truth :param keras_model: keras model :param graph: tf graph, to work around a threading issue :param topK: number of kp selected :return: ''' # do inference, and calculate loss of each channel mimg, mmask = input_data ximg = mimg[np.newaxis,:,:,:] xmask = mmask[np.newaxis,:,:,:] if len(keras_model.input_layers) == 3: # use original mask as ohem_mask inputs = [ximg, xmask, xmask] else: inputs = [ximg, xmask] with graph.as_default(): keras_output = keras_model.predict(inputs) # heatmap of last stage outhmap = keras_output[-1] channel_num = gthmap.shape[-1] # calculate loss mloss = list() for i in range(channel_num): _dtmap = outhmap[0, :, :, i] _gtmap = gthmap[:, :, i] loss = np_euclidean_l2(_dtmap, _gtmap) mloss.append(loss) # refill input_mask, set topk as 1.0 and fill 0.0 for rest # fixme: topK may need to differ between categories if dynamicFlag: topK = getKpNum(image_category)//2 ohem_mask = adjust_mask(mloss, mmask, topK) ohem_gthmap = ohem_mask * gthmap return ohem_mask, ohem_gthmap def adjust_mask(loss, input_mask, topk): # pick topk loss from losses # fill topk with 1.0 and fill the rest as 0.0 assert (len(loss) == input_mask.shape[-1]), \ "shapes should be the same " + str(len(loss)) + " vs " + str(input_mask.shape) outmask = np.zeros(input_mask.shape, dtype=np.float) topk_index = sorted(range(len(loss)), key=lambda i:loss[i])[-topk:] for i in range(len(loss)): if i in topk_index: outmask[:,:,i] = 1.0 return outmask ================================================ FILE: src/data_gen/utils.py ================================================ import numpy as np import pandas as pd import os def make_gaussian(width, height, sigma=3, center=None): ''' generate 2d gaussian heatmap :return: ''' x = np.arange(0, width, 1, float) y = np.arange(0, height, 1, float)[:, np.newaxis] if center is None: x0 = width // 2 y0 = height // 2
else: x0 = center[0] y0 = center[1] return np.exp( -4*np.log(2)*((x-x0)**2 + (y-y0)**2)/sigma**2) def split_csv_train_val(allcsv, traincsv, valcsv, ratio=0.8): xdf = pd.read_csv(allcsv) # random shuffle xdf = xdf.sample(frac=1) # random sampling msk = np.random.rand(len(xdf)) < ratio trainDf = xdf[msk] valDf = xdf[~msk] print "total", len(xdf), "split into train ", len(trainDf), ' val', len(valDf) #save to file trainDf.to_csv(traincsv, index=False) valDf.to_csv(valcsv, index=False) def np_euclidean_l2(x, y): assert (x.shape == y.shape), "shape mismatched " + str(x.shape) + " : " + str(y.shape) loss = np.sum((x - y)**2) loss = np.sqrt(loss) return loss def load_annotation_from_df(df, category): if category == 'all': return df else: return df[df['image_category'] == category] ================================================ FILE: src/eval/eval_callback.py ================================================ import keras import os import datetime from evaluation import Evaluation from time import time class NormalizedErrorCallBack(keras.callbacks.Callback): def __init__(self, foldpath, category, multiOut=False, resumeFolder=None): self.parentFoldPath = foldpath self.category = category if resumeFolder is None: self.foldPath = os.path.join(self.parentFoldPath, self.category, datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S')) if not os.path.exists(self.foldPath): os.mkdir(self.foldPath) else: self.foldPath = resumeFolder self.valLog = os.path.join(self.foldPath, 'val.log') self.multiOut = multiOut def get_folder_path(self): return self.foldPath def on_epoch_end(self, epoch, logs=None): modelName = os.path.join(self.foldPath, self.category+"_weights_"+str(epoch)+".hdf5") keras.models.save_model(self.model, modelName) print "Saving model to ", modelName print "Running evaluation ........."
xEval = Evaluation(self.category, None) xEval.init_from_model(self.model) start = time() neScore, categoryDict = xEval.eval(self.multiOut, details=True) end = time() print "Evaluation Done", str(neScore), " cost ", end - start, " seconds!" for key in categoryDict.keys(): scores = categoryDict[key] print key, ' score ', sum(scores)/len(scores) with open(self.valLog, 'a+') as xfile: xfile.write(modelName + ", Score "+ str(neScore)+"\n") for key in categoryDict.keys(): scores = categoryDict[key] xfile.write(key + ": " + str(sum(scores)/len(scores)) + "\n") xfile.close() ================================================ FILE: src/eval/evaluation.py ================================================ import sys sys.path.insert(0, "../data_gen/") sys.path.insert(0, "../unet/") import pandas as pd from dataset import getKpKeys, getKpNum, getFlipMapID, get_kp_index_from_allkeys, generate_input_mask from kpAnno import KpAnno from post_process import post_process_heatmap from keras.models import load_model import os from refinenet_mask_v3 import euclidean_loss import numpy as np import cv2 from resnet101 import Scale from utils import load_annotation_from_df from collections import defaultdict import copy from data_process import pad_image_inference class Evaluation(object): def __init__(self, category, modelFile): self.category = category self.train_img_path = "../../data/train" if modelFile is not None: self._initialize(modelFile) def init_from_model(self, model): self._load_anno() self.net = model def eval(self, multiOut=False, details=False, flip=True): xdf = self.annDataFrame scores = list() xdict = dict() xcategoryDict = defaultdict(list) for _index, _row in xdf.iterrows(): imgId = _row['image_id'] category = _row['image_category'] imgFile = os.path.join(self.train_img_path, imgId) gtKpAnno = self._get_groundtruth_kpAnno(_row) if flip: predKpAnno = self.predict_kp_with_flip(imgFile, category) else: predKpAnno = self.predict_kp(imgFile, category, multiOut) neScore =
Evaluation.calc_ne_score(category, predKpAnno, gtKpAnno) scores.extend(neScore) if details: xcategoryDict[category].extend(neScore) if details: return sum(scores)/len(scores), xcategoryDict else: return sum(scores)/len(scores) def _initialize(self, modelFile): self._load_anno() self._initialize_network(modelFile) def _initialize_network(self, modelFile): self.net = load_model(modelFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale}) def _load_anno(self): ''' Load annotations from val_split.csv ''' self.annfile = os.path.join("../../data/train/Annotations", "val_split.csv") # read into dataframe xpd = pd.read_csv(self.annfile) xpd = load_annotation_from_df(xpd, self.category) self.annDataFrame = xpd def _get_groundtruth_kpAnno(self, dfrow): mlist = dfrow[getKpKeys(self.category)] imgName, kpStr = mlist[0], mlist[1:] # read kp annotation from csv file kpAnnlst = [KpAnno.readFromStr(_kpstr) for _kpstr in kpStr] return kpAnnlst def _net_inference_with_mask(self, imgFile, imgCategory): import cv2 from data_process import normalize_image, pad_image_inference assert (len(self.net.input_layers) > 1), "model expects more than one input layer" # load image and preprocess img = cv2.imread(imgFile) img, scale = pad_image_inference(img, 512, 512) img = normalize_image(img) input_img = img[np.newaxis, :, :, :] input_mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category))) input_mask = input_mask[np.newaxis, :, :, :] # inference heatmap = self.net.predict([input_img, input_mask, input_mask]) return (heatmap, scale) def _heatmap_sum(self, heatmaplst): outheatmap = np.copy(heatmaplst[0]) for i in range(1, len(heatmaplst), 1): outheatmap += heatmaplst[i] return outheatmap def predict_kp(self, imgFile, imgCategory, multiOutput=False): xnetout, scale = self._net_inference_with_mask(imgFile, imgCategory) if multiOutput: #todo: fixme, it is tricky that the previous stage has better performance than the last stage's output.
#todo: here, we are using the sum of multiple stages' outputs. heatmap = self._heatmap_sum(xnetout) else: heatmap = xnetout detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2) # scale to padded resolution 256X256 -> 512X512 scaleTo512 = 2.0 # apply scale to original resolution detectedKps = [KpAnno(_kp.x*scaleTo512/scale, _kp.y*scaleTo512/scale, _kp.visibility) for _kp in detectedKps] return detectedKps def predict_kp_with_flip(self, imgFile, imgCategory): # inference with flip and original image heatmap, scale = self._net_inference_flip(imgFile, imgCategory) detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2) # scale to padded resolution 256X256 -> 512X512 scaleTo512 = 2.0 # apply scale to original resolution detectedKps = [KpAnno(_kp.x * scaleTo512 / scale, _kp.y * scaleTo512 / scale, _kp.visibility) for _kp in detectedKps] return detectedKps def _net_inference_flip(self, imgFile, imgCategory): import cv2 from data_process import normalize_image, pad_image_inference assert (len(self.net.input_layers) > 1), "model expects more than one input layer" batch_size = 2 input_img = np.zeros(shape=(batch_size, 512, 512, 3), dtype=np.float) input_mask = np.zeros(shape=(batch_size, 256, 256, getKpNum(self.category)), dtype=np.float) # load image and preprocess orgimage = cv2.imread(imgFile) padimg, scale = pad_image_inference(orgimage, 512, 512) flipimg = cv2.flip(padimg, flipCode=1) input_img[0,:,:,:] = normalize_image(padimg) input_img[1,:,:,:] = normalize_image(flipimg) mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category))) input_mask[0,:,:,:] = mask input_mask[1,:,:,:] = mask # inference if len(self.net.input_layers) == 2: heatmap = self.net.predict([input_img, input_mask]) elif len(self.net.input_layers) == 3: heatmap = self.net.predict([input_img, input_mask, input_mask]) else: assert (0), str(len(self.net.input_layers)) + " input layers, should be 2 or 3" # sum heatmap avgheatmap = self._heatmap_sum(heatmap) orgheatmap = avgheatmap[0,:,:,:] # convert
to the same keypoint order as the original heatmap flipheatmap = avgheatmap[1,:,:,:] flipheatmap = self._flip_out_heatmap(flipheatmap) # sum original and flipped heatmaps outheatmap = flipheatmap + orgheatmap outheatmap = outheatmap[np.newaxis, :, :, :] return (outheatmap, scale) def predict_kp_with_rotate(self, imgFile, imgCategory): # inference with rotated image rotateheatmap = self._net_inference_rotate(imgFile, imgCategory) rotateheatmap = rotateheatmap[np.newaxis, :, :, :] # original image and flip image orgflipmap, scale = self._net_inference_flip(imgFile, imgCategory) mflipmap = cv2.resize(orgflipmap[0,:,:,:], None, fx=2.0/scale, fy=2.0/scale) # add mflipmap and rotateheatmap avgheatmap = mflipmap[np.newaxis, :, :, :] b, h, w, c = rotateheatmap.shape avgheatmap[:, 0:h, 0:w, :] += rotateheatmap # generate key point locations detectedKps = post_process_heatmap(avgheatmap, kpConfidenceTh=0.2) return detectedKps def _net_inference_rotate(self, imgFile, imgCategory): from data_process import normalize_image, pad_image_inference, rotate_image_with_invrmat # load image and preprocess orgimage = cv2.imread(imgFile) anglelst = [-20, -10, 10, 20] input_img = np.zeros(shape=(len(anglelst), 512, 512, 3), dtype=np.float) input_mask = np.zeros(shape=(len(anglelst), 256, 256, getKpNum(self.category)), dtype=np.float) mlist = list() for i, angle in enumerate(anglelst): rotateimg, invRotMatrix, orgImgSize = rotate_image_with_invrmat(orgimage, angle) padimg, scale = pad_image_inference(rotateimg, 512, 512) _img = normalize_image(padimg) input_img[i, :, :, :] = _img mlist.append((scale, invRotMatrix)) mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category))) for i, angle in enumerate(anglelst): input_mask[i, :,:,:] = mask # inference heatmap = self.net.predict([input_img, input_mask, input_mask]) heatmap = self._heatmap_sum(heatmap) # rotate back to original resolution sumheatmap = np.zeros(shape=(orgimage.shape[0], orgimage.shape[1], getKpNum(self.category)),
dtype=np.float) for i, item in enumerate(mlist): _heatmap = heatmap[i, :, :, :] _scale, _invRotMatrix = item _heatmap = cv2.resize(_heatmap, None, fx=2.0 / _scale, fy=2.0 / _scale) _invheatmap = cv2.warpAffine(_heatmap, _invRotMatrix, (orgimage.shape[1], orgimage.shape[0])) sumheatmap += _invheatmap return sumheatmap def _flip_out_heatmap(self, flipout): outmap = np.zeros(flipout.shape, dtype=np.float) for i in range(flipout.shape[-1]): flipid = getFlipMapID(self.category, i) mask = np.copy(flipout[:, :, i]) outmap[:, :, flipid] = cv2.flip(mask, flipCode=1) return outmap @staticmethod def get_normized_distance(category, gtKp): ''' :param category: cloth category :param gtKp: ground truth keypoint list :return: normalization distance; a big number 1e6 if either reference keypoint is missing ''' if category in ['skirt', 'trousers']: ##waistband left and right waistband_left_index = get_kp_index_from_allkeys('waistband_left') waistband_right_index = get_kp_index_from_allkeys('waistband_right') if gtKp[waistband_left_index].visibility != -1 and gtKp[waistband_right_index].visibility != -1: distance = KpAnno.calcDistance(gtKp[waistband_left_index], gtKp[waistband_right_index]) else: distance = 1e6 return distance elif category in ['blouse', 'dress', 'outwear']: armpit_left_index = get_kp_index_from_allkeys('armpit_left') armpit_right_index = get_kp_index_from_allkeys('armpit_right') ##armpit_left and armpit_right if gtKp[armpit_left_index].visibility != -1 and gtKp[armpit_right_index].visibility != -1: distance = KpAnno.calcDistance(gtKp[armpit_left_index], gtKp[armpit_right_index]) else: distance = 1e6 return distance else: assert (0), category + " not implemented in get_normized_distance" @staticmethod def calc_ne_score(category, dtKp, gtKp): assert (len(dtKp) == len(gtKp)), "predicted keypoint number should be the same as ground truth keypoints: " + \ str(dtKp) + " vs " + str(gtKp) # calculate normalized error as score normalizedDistance = Evaluation.get_normized_distance(category, gtKp) mlist = list() for i in
range(len(gtKp)): if gtKp[i].visibility == 1: dk = KpAnno.calcDistance(dtKp[i], gtKp[i]) mlist.append(dk/normalizedDistance) return mlist ================================================ FILE: src/eval/post_process.py ================================================ import cv2 import numpy as np from scipy.ndimage import gaussian_filter, maximum_filter from keras.layers import * from kpAnno import KpAnno def post_process_heatmap(heatMap, kpConfidenceTh=0.2): kplst = list() for i in range(heatMap.shape[-1]): # iterate over all keypoint channels _map = heatMap[0, :, :, i] _map = gaussian_filter(_map, sigma=0.5) _nmsPeaks = non_max_suppression(_map, windowSize=3, threshold=1e-6) y, x = np.where(_nmsPeaks == _nmsPeaks.max()) confidence = np.amax(_nmsPeaks) if confidence > kpConfidenceTh: kplst.append(KpAnno(x[0], y[0], 1)) else: kplst.append(KpAnno(x[0], y[0], -1)) return kplst def non_max_suppression(plain, windowSize=3, threshold=1e-6): # clear values less than threshold under_th_indices = plain < threshold plain[under_th_indices] = 0 return plain * (plain == maximum_filter(plain, footprint=np.ones((windowSize, windowSize)))) ================================================ FILE: src/top/demo.py ================================================ import sys sys.path.insert(0, "../data_gen/") sys.path.insert(0, "../eval/") sys.path.insert(0, "../unet/") import argparse import os import pandas as pd import cv2 from evaluation import Evaluation from dataset import getKpKeys, get_kp_index_from_allkeys def visualize_keypoint(imageName, category, dtkp): cvmat = cv2.imread(imageName) for key in getKpKeys(category)[1:]: index = get_kp_index_from_allkeys(key) _kp = dtkp[index] cv2.circle(cvmat, center=(_kp.x, _kp.y), radius=7, color=(1.0, 0.0, 0.0), thickness=2) cv2.imshow('demo', cvmat) cv2.waitKey() def demo(modelfile): # load network xEval = Evaluation('all', modelfile) # load images and run prediction testfile = os.path.join("../../data/test/", 'test.csv') xdf =
pd.read_csv(testfile) xdf = xdf.sample(frac=1.0) for _index, _row in xdf.iterrows(): _image_id = _row['image_id'] _category = _row['image_category'] imageName = os.path.join("../../data/test", _image_id) print _image_id, _category dtkp = xEval.predict_kp_with_rotate(imageName, _category) visualize_keypoint(imageName, _category, dtkp) if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--gpuID", default=0, type=int, help='gpu id') parser.add_argument("--modelfile", help="file of model") args = parser.parse_args() print args os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) demo(args.modelfile) ================================================ FILE: src/top/test.py ================================================ import sys sys.path.insert(0, "../data_gen/") sys.path.insert(0, "../eval/") sys.path.insert(0, "../unet/") import argparse import os from fashion_net import FashionNet from dataset import getKpNum, getKpKeys import pandas as pd from evaluation import Evaluation import pickle import numpy as np def get_best_single_model(valfile): ''' :param valfile: the log file with validation score for each snapshot :return: model file and score ''' def get_key(item): return item[1] with open(valfile) as xval: lines = xval.readlines() xlist = list() for linenum, xline in enumerate(lines): if 'hdf5' in xline and 'Socre' in xline: modelname = xline.strip().split(',')[0] overallscore = xline.strip().split(',')[1] xlist.append((modelname, overallscore)) bestmodel = sorted(xlist, key=get_key)[0] return bestmodel def fill_dataframe(kplst, keys, dfrow, image_category): # fill category dfrow['image_category'] = image_category assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys)) for i, _key in enumerate(keys): kpann = kplst[i] outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1) dfrow[_key] = outstr def get_kp_from_dict(mdict, image_category, image_id): if 
image_category in mdict.keys(): xdict = mdict[image_category] else: xdict = mdict['all'] return xdict[image_id] def submission(pklpath): xdf = pd.read_csv("../../data/train/Annotations/train.csv") trainKeys = xdf.keys() testdf = pd.read_csv("../../data/test/test.csv") print len(testdf), " samples in test.csv" mdict = dict() for xfile in os.listdir(pklpath): if xfile.endswith('.pkl'): category = xfile.strip().split('.')[0] with open(os.path.join(pklpath, xfile), 'rb') as pkl: mdict[category] = pickle.load(pkl) print testdf.keys() print mdict.keys() submissionDf = pd.DataFrame(columns=trainKeys, index=np.arange(testdf.shape[0])) submissionDf = submissionDf.fillna(value='-1_-1_-1') submissionDf['image_id'] = testdf['image_id'] submissionDf['image_category'] = testdf['image_category'] for _index, _row in submissionDf.iterrows(): image_id = _row['image_id'] image_category = _row['image_category'] kplst = get_kp_from_dict(mdict, image_category, image_id) fill_dataframe(kplst, getKpKeys('all')[1:], _row, image_category) print len(submissionDf), "saved to ", os.path.join(pklpath, 'submission.csv') submissionDf.to_csv(os.path.join(pklpath, 'submission.csv'), index=False) def load_image_names(annfile, category): # read into dataframe xdf = pd.read_csv(annfile) xdf = xdf[xdf['image_category'] == category] return xdf def main_test(savepath, modelpath, augmentFlag): valfile = os.path.join(modelpath, 'val.log') bestmodels = get_best_single_model(valfile) print bestmodels, augmentFlag xEval = Evaluation('all', bestmodels[0]) # load images and run prediction testfile = os.path.join("../../data/test/", 'test.csv') for category in ['skirt', 'blouse', 'trousers', 'outwear', 'dress']: xdict = dict() xdf = load_image_names(testfile, category) print len(xdf), " images to process ", category count = 0 for _index, _row in xdf.iterrows(): count += 1 if count % 1000 == 0: print count, "images have been processed" _image_id = _row['image_id'] imageName = os.path.join("../../data/test", _image_id) if
augmentFlag: dtkp = xEval.predict_kp_with_rotate(imageName, _row['image_category']) else: dtkp = xEval.predict_kp(imageName, _row['image_category'], multiOutput=True) xdict[_image_id] = dtkp savefile = os.path.join(savepath, category+'.pkl') with open(savefile, 'wb') as xfile: pickle.dump(xdict, xfile) print "prediction saved to ", savefile if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--gpuID", default=0, type=int, help='gpu id') parser.add_argument("--modelpath", help="path of trained model") parser.add_argument("--outpath", help="path to save predicted keypoints") parser.add_argument("--augment", default=False, type=bool, help="augment or not") args = parser.parse_args() print args os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) main_test(args.outpath, args.modelpath, args.augment) submission(args.outpath) ================================================ FILE: src/top/train.py ================================================ import sys sys.path.insert(0, "../data_gen/") sys.path.insert(0, "../unet/") import argparse import os from fashion_net import FashionNet from dataset import getKpNum import tensorflow as tf from keras import backend as k if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--gpuID", default=0, type=int, help='gpu id') parser.add_argument("--category", help="specify cloth category") parser.add_argument("--network", help="specify network arch") parser.add_argument("--batchSize", default=8, type=int, help='batch size for training') parser.add_argument("--epochs", default=20, type=int, help="number of training epochs") parser.add_argument("--resume", default=False, type=bool, help="resume training or not") parser.add_argument("--lrdecay", default=False, type=bool, help="lr decay or not") parser.add_argument("--resumeModel", help="start point to retrain") parser.add_argument("--initEpoch", type=int, help="epoch to resume") args =
parser.parse_args() os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) # TensorFlow wizardry config = tf.ConfigProto() # Don't pre-allocate memory; allocate as-needed config.gpu_options.allow_growth = True # Allow the process to use up to the full GPU memory config.gpu_options.per_process_gpu_memory_fraction = 1.0 # Create a session with the above options specified. k.tensorflow_backend.set_session(tf.Session(config=config)) if not args.resume: xnet = FashionNet(512, 512, getKpNum(args.category)) xnet.build_model(modelName=args.network, show=True) xnet.train(args.category, epochs=args.epochs, batchSize=args.batchSize, lrschedule=args.lrdecay) else: xnet = FashionNet(512, 512, getKpNum(args.category)) xnet.resume_train(args.category, args.resumeModel, args.network, args.initEpoch, epochs=args.epochs, batchSize=args.batchSize) ================================================ FILE: src/unet/fashion_net.py ================================================ import sys sys.path.insert(0, "../data_gen/") sys.path.insert(0, "../eval/") from data_generator import DataGenerator from keras.callbacks import ModelCheckpoint, CSVLogger from keras.models import load_model from data_process import pad_image, normalize_image import os import cv2 import numpy as np import datetime from eval_callback import NormalizedErrorCallBack from refinenet_mask_v3 import Res101RefineNetMaskV3, euclidean_loss from resnet101 import Scale import tensorflow as tf class FashionNet(object): def __init__(self, inputHeight, inputWidth, nClasses): self.inputWidth = inputWidth self.inputHeight = inputHeight self.nClass = nClasses def build_model(self, modelName='v2', show=False): self.modelName = modelName self.model = Res101RefineNetMaskV3(self.nClass, self.inputHeight, self.inputWidth, nStackNum=2) self.nStackNum = 2 # show model summary and layer name if show: self.model.summary() for layer in self.model.layers: print layer.name, layer.trainable
def train(self, category, batchSize=8, epochs=20, lrschedule=False): trainDt = DataGenerator(category, os.path.join("../../data/train/Annotations", "train_split.csv")) trainGen = trainDt.generator_with_mask_ohem( graph=tf.get_default_graph(), kerasModel=self.model, batchSize= batchSize, inputSize=(self.inputHeight, self.inputWidth), nStackNum=self.nStackNum, flipFlag=False, cropFlag=False) normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, True) csvlogger = CSVLogger( os.path.join(normalizedErrorCallBack.get_folder_path(), "csv_train_"+self.modelName+"_"+str(datetime.datetime.now().strftime('%H:%M'))+".csv")) xcallbacks = [normalizedErrorCallBack, csvlogger] self.model.fit_generator(generator=trainGen, steps_per_epoch=trainDt.get_dataset_size()//batchSize, epochs=epochs, callbacks=xcallbacks) def load_model(self, netWeightFile): self.model = load_model(netWeightFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale}) def resume_train(self, category, pretrainModel, modelName, initEpoch, batchSize=8, epochs=20): self.modelName = modelName self.load_model(pretrainModel) refineNetflag = True self.nStackNum = 2 modelPath = os.path.dirname(pretrainModel) trainDt = DataGenerator(category, os.path.join("../../data/train/Annotations", "train_split.csv")) trainGen = trainDt.generator_with_mask_ohem(graph=tf.get_default_graph(), kerasModel=self.model, batchSize=batchSize, inputSize=(self.inputHeight, self.inputWidth), nStackNum=self.nStackNum, flipFlag=False, cropFlag=False) normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, refineNetflag, resumeFolder=modelPath) csvlogger = CSVLogger(os.path.join(normalizedErrorCallBack.get_folder_path(), "csv_train_" + self.modelName + "_" + str( datetime.datetime.now().strftime('%H:%M')) + ".csv")) self.model.fit_generator(initial_epoch=initEpoch, generator=trainGen, steps_per_epoch=trainDt.get_dataset_size() // batchSize, epochs=epochs, 
callbacks=[normalizedErrorCallBack, csvlogger]) def predict_image(self, imgfile): # load image and preprocess img = cv2.imread(imgfile) img, _ = pad_image(img, list(), 512, 512) img = normalize_image(img) input = img[np.newaxis,:,:,:] # inference heatmap = self.model.predict(input) return heatmap def predict(self, input): # inference heatmap = self.model.predict(input) return heatmap ================================================ FILE: src/unet/refinenet.py ================================================ from keras.models import * from keras.layers import * from keras.optimizers import Adam, SGD from keras import backend as K from keras.applications.resnet50 import ResNet50 IMAGE_ORDERING = 'channels_last' def Res101RefineNetDilated(n_classes, inputHeight, inputWidth): model = build_network_resnet101(inputHeight, inputWidth, n_classes, dilated=True) return model def Res101RefineNetStacked(n_classes, inputHeight, inputWidth, nStackNum): model = build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStackNum) return model def euclidean_loss(x, y): return K.sqrt(K.sum(K.square(x - y))) def create_global_net(lowlevelFeatures, n_classes): lf2x, lf4x, lf8x, lf16x = lowlevelFeatures o = lf16x o = (Conv2D(256, (3, 3), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf8x], axis=-1)) o = (Conv2D(128, (3, 3), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) fup8x = o o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf4x], axis=-1)) o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o) o = 
(BatchNormalization())(o) fup4x = o o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf2x], axis=-1)) o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) fup2x = o out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x) out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x) out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x) x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x) eadd4x = Add(name='global4x')([x4x, out4x]) x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x) eadd2x = Add(name='global2x')([x2x, out2x]) return (fup8x, eadd4x, eadd2x) def create_refine_net(inputFeatures, n_classes): f8x, f4x, f2x = inputFeatures # 2 Conv2DTranspose f8x -> fup8x fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_1', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f8x) fup8x = (BatchNormalization())(fup8x) fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_2', padding='same', activation='relu', data_format=IMAGE_ORDERING))(fup8x) fup8x = (BatchNormalization())(fup8x) # 1 Conv2DTranspose f4x -> fup4x fup4x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine4x_deconv', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f4x) fup4x = (BatchNormalization())(fup4x) # 1 conv f2x -> fup2x fup2x = (Conv2D(128, (3, 3), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x) fup2x = (BatchNormalization())(fup2x) # concat f2x, fup8x, fup4x fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, 
name='refine_concat')) # 1x1 to map to required feature map out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat) return out2x def create_refine_net_bottleneck(inputFeatures, n_classes): f8x, f4x, f2x = inputFeatures # 2 Conv2DTranspose f8x -> fup8x fup8x = (Conv2D(256, kernel_size=(1, 1), name='refine8x_1', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f8x) fup8x = (BatchNormalization())(fup8x) fup8x = (Conv2D(128, kernel_size=(1, 1), name='refine8x_2', padding='same', activation='relu', data_format=IMAGE_ORDERING))(fup8x) fup8x = (BatchNormalization())(fup8x) fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x) # 1 Conv2DTranspose f4x -> fup4x fup4x = (Conv2D(128, kernel_size=(1, 1), name='refine4x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f4x) fup4x = (BatchNormalization())(fup4x) fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x) # 1 conv f2x -> fup2x fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x) fup2x = (BatchNormalization())(fup2x) # concat f2x, fup8x, fup4x fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name='refine_concat')) # 1x1 to map to required feature map out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat) return out2x def create_stack_refinenet(inputFeatures, n_classes, layerName): f8x, f4x, f2x = inputFeatures # 2 Conv2DTranspose f8x -> fup8x fup8x = (Conv2D(256, kernel_size=(1, 1), name=layerName+'_refine8x_1', padding='same', activation='relu'))(f8x) fup8x = (BatchNormalization())(fup8x) fup8x = (Conv2D(128, kernel_size=(1, 1), name=layerName+'refine8x_2', padding='same', activation='relu'))(fup8x) fup8x = (BatchNormalization())(fup8x) out8x = fup8x fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x) # 1 Conv2DTranspose f4x -> fup4x fup4x = 
(Conv2D(128, kernel_size=(1, 1), name=layerName+'refine4x', padding='same', activation='relu'))(f4x) fup4x = (BatchNormalization())(fup4x) out4x = fup4x fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x) # 1 conv f2x -> fup2x fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name=layerName+'refine2x_conv'))(f2x) fup2x = (BatchNormalization())(fup2x) # concat f2x, fup8x, fup4x fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name=layerName+'refine_concat')) # 1x1 to map to required feature map out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name=layerName+'refine2x')(fconcat) return out8x, out4x, out2x def create_global_net_dilated(lowlevelFeatures, n_classes): lf2x, lf4x, lf8x, lf16x = lowlevelFeatures o = lf16x o = (Conv2D(256, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf8x], axis=-1)) o = (Conv2D(128, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) fup8x = o o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf4x], axis=-1)) o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o) o = (BatchNormalization())(o) fup4x = o o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(o) o = (concatenate([o, lf2x], axis=-1)) o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o) o = 
(BatchNormalization())(o) fup2x = o out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x) out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x) out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x) x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x) eadd4x = Add(name='global4x')([x4x, out4x]) x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x) eadd2x = Add(name='global2x')([x2x, out2x]) return (fup8x, eadd4x, eadd2x) def build_network_resnet101(inputHeight, inputWidth, n_classes, frozenlayers=True, dilated=False): input, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) # global net 8x, 4x, and 2x if dilated: g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) else: g8x, g4x, g2x = create_global_net((lf2x, lf4x, lf8x, lf16x), n_classes) # refine net, only 2x as output refine2x = create_refine_net_bottleneck((g8x, g4x, g2x), n_classes) model = Model(inputs=input, outputs=[g2x, refine2x]) adam = Adam(lr=1e-4) model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) return model def build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStack): # backbone network input, lf2x,lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) # global net g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) s8x, s4x, s2x = g8x, g4x, g2x outputs = [g2x] for i in range(nStack): s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i)) outputs.append(s2x) model = Model(inputs=input, outputs=outputs) adam = Adam(lr=1e-4) model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) return model def load_backbone_res101net(inputHeight, inputWidth): from resnet101 import ResNet101 xresnet = 
ResNet101(weights='imagenet', include_top=False, input_shape=(inputHeight, inputWidth, 3)) xresnet.load_weights("../../data/resnet101_weights_tf.h5", by_name=True) lf16x = xresnet.get_layer('res4b22_relu').output lf8x = xresnet.get_layer('res3b2_relu').output lf4x = xresnet.get_layer('res2c_relu').output lf2x = xresnet.get_layer('conv1_relu').output # add one padding for lf4x whose shape is 127x127 lf4xp = ZeroPadding2D(padding=((0, 1), (0, 1)))(lf4x) return (xresnet.input, lf2x, lf4xp, lf8x, lf16x) ================================================ FILE: src/unet/refinenet_mask_v3.py ================================================ from refinenet import load_backbone_res101net, create_global_net_dilated, create_stack_refinenet from keras.models import * from keras.layers import * from keras.optimizers import Adam, SGD from keras import backend as K import keras def Res101RefineNetMaskV3(n_classes, inputHeight, inputWidth, nStackNum): model = build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStackNum) return model def euclidean_loss(x, y): return K.sqrt(K.sum(K.square(x - y))) def apply_mask_to_output(output, mask): output_with_mask = keras.layers.multiply([output, mask]) return output_with_mask def build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStack): input_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='mask') input_ohem_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='ohem_mask') # backbone network input_image, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) # global net g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) s8x, s4x, s2x = g8x, g4x, g2x g2x_mask = apply_mask_to_output(g2x, input_mask) outputs = [g2x_mask] for i in range(nStack): s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i)) if i == (nStack-1): # last stack with ohem_mask s2x_mask = apply_mask_to_output(s2x, input_ohem_mask) else:
s2x_mask = apply_mask_to_output(s2x, input_mask) outputs.append(s2x_mask) model = Model(inputs=[input_image, input_mask, input_ohem_mask], outputs=outputs) adam = Adam(lr=1e-4) model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) return model ================================================ FILE: src/unet/resnet101.py ================================================ # -*- coding: utf-8 -*- """ResNet-101 model for Keras. # Reference: - [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) Slightly modified Felix Yu's (https://github.com/flyyufelix) implementation of ResNet-101 to have consistent API as those pre-trained models within `keras.applications`. The original implementation is found here https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294#file-resnet-101_keras-py Implementation is based on Keras 2.0 """ from keras.layers import ( Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Flatten, Activation, GlobalAveragePooling2D, GlobalMaxPooling2D, add) from keras.layers.normalization import BatchNormalization from keras.models import Model from keras import initializers from keras.engine import Layer, InputSpec from keras.engine.topology import get_source_inputs from keras import backend as K from keras.applications.imagenet_utils import _obtain_input_shape from keras.utils.data_utils import get_file import warnings import sys sys.setrecursionlimit(3000) WEIGHTS_PATH_TH = 'https://dl.dropboxusercontent.com/s/rrp56zm347fbrdn/resnet101_weights_th.h5?dl=0' WEIGHTS_PATH_TF = 'https://dl.dropboxusercontent.com/s/a21lyqwgf88nz9b/resnet101_weights_tf.h5?dl=0' MD5_HASH_TH = '3d2e9a49d05192ce6e22200324b7defe' MD5_HASH_TF = '867a922efc475e9966d0f3f7b884dc15' class Scale(Layer): '''Learns a set of weights and biases used for scaling the input data. 
    The output is simply an element-wise multiplication of the input by a set
    of weights, plus a set of biases:

        out = in * gamma + beta,

    where 'gamma' and 'beta' are the weights and biases learned.

    # Arguments
        axis: integer, axis along which to normalize in mode 0. For instance,
            if your input tensor has shape (samples, channels, rows, cols),
            set axis to 1 to normalize per feature map (channels axis).
        momentum: momentum in the computation of the exponential average
            of the mean and standard deviation of the data, for
            feature-wise normalization.
        weights: Initialization weights.
            List of 2 Numpy arrays, with shapes:
            `[(input_shape,), (input_shape,)]`
        beta_init: name of initialization function for shift parameter
            (see [initializers](../initializers.md)), or alternatively,
            Theano/TensorFlow function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights`
            argument.
        gamma_init: name of initialization function for scale parameter
            (see [initializers](../initializers.md)), or alternatively,
            Theano/TensorFlow function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights`
            argument.
    '''

    def __init__(self, weights=None, axis=-1, momentum=0.9,
                 beta_init='zero', gamma_init='one', **kwargs):
        self.momentum = momentum
        self.axis = axis
        self.beta_init = initializers.get(beta_init)
        self.gamma_init = initializers.get(gamma_init)
        self.initial_weights = weights
        super(Scale, self).__init__(**kwargs)

    def build(self, input_shape):
        self.input_spec = [InputSpec(shape=input_shape)]
        shape = (int(input_shape[self.axis]),)

        self.gamma = K.variable(
            self.gamma_init(shape), name='{}_gamma'.format(self.name))
        self.beta = K.variable(
            self.beta_init(shape), name='{}_beta'.format(self.name))
        self.trainable_weights = [self.gamma, self.beta]

        if self.initial_weights is not None:
            self.set_weights(self.initial_weights)
            del self.initial_weights

    def call(self, x, mask=None):
        input_shape = self.input_spec[0].shape
        broadcast_shape = [1] * len(input_shape)
        broadcast_shape[self.axis] = input_shape[self.axis]

        out = K.reshape(
            self.gamma, broadcast_shape) * x + K.reshape(self.beta, broadcast_shape)
        return out

    def get_config(self):
        config = {"momentum": self.momentum, "axis": self.axis}
        base_config = super(Scale, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer of the main path
        filters: list of integers, the nb_filters of the 3 conv layers of the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b', ..., current block label, used for generating layer names
    '''
    eps = 1.1e-5

    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1

    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Conv2D(nb_filter1, (1, 1),
               name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Conv2D(nb_filter2, (kernel_size, kernel_size),
               name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2c')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer of the main path
        filters: list of integers, the nb_filters of the 3 conv layers of the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b', ..., current block label, used for generating layer names

    Note that from stage 3, the first conv layer of the main path has strides=(2, 2).
    And the shortcut should have strides=(2, 2) as well.
    '''
    eps = 1.1e-5

    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1

    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Conv2D(nb_filter1, (1, 1), strides=strides,
               name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Conv2D(nb_filter2, (kernel_size, kernel_size),
               name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2c')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)

    shortcut = Conv2D(nb_filter3, (1, 1), strides=strides,
                      name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=bn_axis, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def ResNet101(include_top=True, weights='imagenet',
              input_tensor=None, input_shape=None,
              pooling=None, classes=1000):
    """Instantiates the ResNet-101 architecture.

    Optionally loads weights pre-trained on ImageNet. Note that when using
    TensorFlow, for best performance you should set
    `image_data_format='channels_last'` in your Keras config
    at ~/.keras/keras.json.
    The model and the weights are compatible with both TensorFlow and Theano.
    The data format convention used by the model is the one specified in your
    Keras config file.

    Parameters
    ----------
    include_top: whether to include the fully-connected layer at the
        top of the network.
    weights: one of `None` (random initialization) or 'imagenet'
        (pre-training on ImageNet).
    input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
        to use as image input for the model.
    input_shape: optional shape tuple, only to be specified if `include_top`
        is False (otherwise the input shape has to be `(224, 224, 3)`
        (with `channels_last` data format) or `(3, 224, 224)`
        (with `channels_first` data format). It should have exactly
        3 input channels, and width and height should be no smaller
        than 197. E.g. `(200, 200, 3)` would be one valid value.
    pooling: Optional pooling mode for feature extraction when
        `include_top` is `False`.
        - `None` means that the output of the model will be the 4D
          tensor output of the last convolutional layer.
        - `avg` means that global average pooling will be applied to
          the output of the last convolutional layer, and thus the
          output of the model will be a 2D tensor.
        - `max` means that global max pooling will be applied.
    classes: optional number of classes to classify images into, only
        to be specified if `include_top` is True, and if no `weights`
        argument is specified.

    Returns
    -------
    A Keras model instance.

    Raises
    ------
    ValueError: in case of invalid argument for `weights`,
        or invalid input shape.
""" if weights not in {'imagenet', None}: raise ValueError('The `weights` argument should be either ' '`None` (random initialization) or `imagenet` ' '(pre-training on ImageNet).') if weights == 'imagenet' and include_top and classes != 1000: raise ValueError('If using `weights` as imagenet with `include_top`' ' as true, `classes` should be 1000') # Determine proper input shape input_shape = _obtain_input_shape(input_shape, default_size=224, min_size=197, data_format=K.image_data_format(), require_flatten=include_top, weights=weights) if input_tensor is None: img_input = Input(shape=input_shape, name='data') else: if not K.is_keras_tensor(input_tensor): img_input = Input( tensor=input_tensor, shape=input_shape, name='data') else: img_input = input_tensor if K.image_data_format() == 'channels_last': bn_axis = 3 else: bn_axis = 1 eps = 1.1e-5 x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) x = BatchNormalization(epsilon=eps, axis=bn_axis, name='bn_conv1')(x) x = Scale(axis=bn_axis, name='scale_conv1')(x) x = Activation('relu', name='conv1_relu')(x) x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1)) x = identity_block(x, 3, [64, 64, 256], stage=2, block='b') x = identity_block(x, 3, [64, 64, 256], stage=2, block='c') x = conv_block(x, 3, [128, 128, 512], stage=3, block='a') for i in range(1, 3): x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i)) x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a') for i in range(1, 23): x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i)) x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a') x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b') x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c') x = AveragePooling2D((7, 7), name='avg_pool')(x) if include_top: x = Flatten()(x) x = Dense(classes, 
                  activation='softmax', name='mmfc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
    else:
        inputs = img_input

    # Create model.
    model = Model(inputs, x, name='resnet101')

    '''
    # load weights
    if weights == 'imagenet':
        filename = 'resnet101_weights_{}.h5'.format(K.image_dim_ordering())
        if K.backend() == 'theano':
            path = WEIGHTS_PATH_TH
            md5_hash = MD5_HASH_TH
        else:
            path = WEIGHTS_PATH_TF
            md5_hash = MD5_HASH_TF

        weights_path = get_file(
            fname=filename, origin=path, cache_subdir='models',
            md5_hash=md5_hash, hash_algorithm='md5')
        model.load_weights(weights_path, by_name=True)

        if K.image_data_format() == 'channels_first' and K.backend() == 'tensorflow':
            warnings.warn('You are using the TensorFlow backend, yet you '
                          'are using the Theano '
                          'image data format convention '
                          '(`image_data_format="channels_first"`). '
                          'For best performance, set '
                          '`image_data_format="channels_last"` in '
                          'your Keras config '
                          'at ~/.keras/keras.json.')
    '''
    return model


================================================
FILE: submission/placeholder.txt
================================================



================================================
FILE: trained_models/placeholder.txt
================================================
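For reference, the masking idea behind `apply_mask_to_output` and `euclidean_loss` in `src/unet/refinenet_mask_v3.py` (zeroing the heatmap channels of keypoints that do not belong to the current category, so they contribute nothing to the loss) can be sketched with plain NumPy. This is a minimal illustration, not part of the repository: the heatmap size and the skirt channel indices below are hypothetical placeholders.

```python
import numpy as np

# Hypothetical sizes: 24 keypoint channels on a 256x256 heatmap
# (i.e. inputHeight // 2 for a 512x512 input).
n_classes, h, w = 24, 256, 256

# Skirt has 4 valid keypoints; the mask is 1.0 on those channels and
# 0.0 everywhere else, mirroring `input_mask`. Channel indices here
# are illustrative only.
valid_channels = [15, 16, 22, 23]
mask = np.zeros((h, w, n_classes), dtype=np.float32)
mask[..., valid_channels] = 1.0

# apply_mask_to_output is an element-wise multiply: on invalid channels
# both prediction and target become zero, so their difference vanishes.
pred = np.random.rand(h, w, n_classes).astype(np.float32)
target = np.random.rand(h, w, n_classes).astype(np.float32)
masked_pred = pred * mask
masked_target = target * mask

# euclidean_loss(x, y) = sqrt(sum((x - y)^2)), taken over the whole batch
# of masked heatmaps; only the valid channels contribute.
loss = np.sqrt(np.sum(np.square(masked_pred - masked_target)))
```

Because `(pred - target) * mask` is zero outside the valid channels, the loss computed over all 24 channels equals the loss computed over the 4 valid ones, which is exactly why the invalid keypoints of other categories stop polluting the gradient.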