Repository: yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras
Branch: master
Commit: 0b3bd8cdee32
Files: 22
Total size: 85.2 KB
Directory structure:
gitextract_r2c2s0cn/
├── .gitignore
├── LICENSE
├── README.md
├── data/
│ └── placeholder.txt
├── src/
│ ├── data_gen/
│ │ ├── data_generator.py
│ │ ├── data_process.py
│ │ ├── dataset.py
│ │ ├── kpAnno.py
│ │ ├── ohem.py
│ │ └── utils.py
│ ├── eval/
│ │ ├── eval_callback.py
│ │ ├── evaluation.py
│ │ └── post_process.py
│ ├── top/
│ │ ├── demo.py
│ │ ├── test.py
│ │ └── train.py
│ └── unet/
│ ├── fashion_net.py
│ ├── refinenet.py
│ ├── refinenet_mask_v3.py
│ └── resnet101.py
├── submission/
│ └── placeholder.txt
└── trained_models/
└── placeholder.txt
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
.idea
*.pyc
*.pkl
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2018 VictorLi
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# AiFashion
- Author: VictorLi, yuanyuan.li85@gmail.com
- Code for FashionAI Global Challenge—Key Points Detection of Apparel
[2018 TianChi](https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100068.5678.1.4ccc289bCzDJXu&raceId=231648&_lang=en_US)
- Ranked 45/2322 in the 1st round, score 0.61
- Ranked 46 in the 2nd round, score 0.477
## Images with detected keypoints
### Dress

### Blouse

### Outwear

### Skirt

### Trousers

## Basic idea
- The key idea comes from the paper [Cascaded Pyramid Network for Multi-Person Pose Estimation](https://arxiv.org/abs/1711.07319). We use a two-stage network made of a global net and a refine net, both U-Net-like. The network is trained to predict heatmaps of clothing keypoints. The backbone network used here is ResNet-101.
- To overcome the negative impact of mixing categories, an `input_mask` is introduced to zero out invalid keypoints. For example, skirt has 4 valid keypoints: `waistband_left`, `waistband_right`, `hemline_left` and `hemline_right`. In `input_mask`, only those valid channels are set to 1.0, while the other 20 channels are set to zero.
- Online hard example mining (OHEM): at the last stage of the refine net, only the top-k channel losses are taken into account; the easy channels (small loss) are ignored.
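The category-mask idea above can be sketched as follows (a minimal, self-contained illustration; `category_mask` is a hypothetical helper, not the repo's `generate_input_mask`, but the 24-keypoint order matches `src/data_gen/dataset.py`):

```
import numpy as np

# 24 keypoint names shared across all categories (order from src/data_gen/dataset.py)
ALL_KP_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
               'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in',
               'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right',
               'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right', 'crotch',
               'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out']

def category_mask(valid_keys, height=256, width=256):
    # one channel per keypoint: 1.0 everywhere in channels valid for this category
    mask = np.zeros((height, width, len(ALL_KP_KEYS)), dtype=np.float32)
    for key in valid_keys:
        mask[:, :, ALL_KP_KEYS.index(key)] = 1.0
    return mask

# skirt has 4 valid keypoints; the other 20 channels stay zero
skirt_mask = category_mask(['waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'])
```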
## Dependency
- Keras 2.0
- TensorFlow
- OpenCV / NumPy / Pandas
- Pretrained ResNet-101 model weights
## Folder Structure
- `data`: folder to store training and testing images and annotations
- `trained_models`: folder to store trained models and logs
- `submission`: folder to store generated submission for evaluation.
- `src`: folder for all source code.
  - `src/data_gen`: data generator code, including data augmentation and pre-processing.
  - `src/eval`: evaluation code, including inference and post-processing.
  - `src/unet`: CNN model definitions, including train, fine-tune, loss and optimizer definitions.
  - `src/top`: top-level code for train, test and demo.
## How to train network
- Download the dataset from the competition webpage and put it under `data`.
`data/train`: data used for training. `data/test`: data used for testing.
- Download [resnet101](https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294) model and save it as `data/resnet101_weights_tf.h5`.
Note: all the models here use the `channels_last` dim order.
- Train all-in-one network from scratch
```
python train.py --category all --epochs 30 --network v11 --batchSize 3 --gpuID 2
```
- The trained model and logs will be put under `trained_models/all/xxxx`, e.g. `trained_models/all/2018_05_23_15_18_07/`
- The evaluation runs after each epoch and details are saved to `val.log`
- Resume training from a specific model.
```
python train.py --gpuID 2 --category all --epochs 30 --network v11 --batchSize 3 --resume True --resumeModel /path/to/model/start/with --initEpoch 6
```
## How to test and generate submission
- Run test and generate the submission.
The command below searches `modelpath` for the best-scoring model and uses it to generate the submission.
```
python test.py --gpuID 2 --modelpath ../../trained_models/all/xxx --outpath ../../submission/2018_04_19/ --augment True
```
The submission will be saved as `submission.csv`
## How to run demo
- Download the pre-trained weights from [BaiduDisk](https://pan.baidu.com/s/1t7fB5wnRfW1Vny0gw7xUDQ) (password `1ae2`) or [GoogleDrive](https://drive.google.com/open?id=1VY-AO2F1XMQLBjEZjy6CrOSIPWWaHUGr)
- Save them somewhere, e.g. `trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5`
- Or use your own trained model.
- Run the demo; the image will be displayed with the detected keypoints marked.
```
python demo.py --gpuID 2 --modelfile ../../trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5
```
## Reference
- Resnet 101 Keras : https://github.com/statech/resnet
================================================
FILE: data/placeholder.txt
================================================
================================================
FILE: src/data_gen/data_generator.py
================================================
import os
import cv2
import pandas as pd
import numpy as np
import random
from kpAnno import KpAnno
from dataset import getKpNum, getKpKeys, getFlipMapID, generate_input_mask
from utils import make_gaussian, load_annotation_from_df
from data_process import pad_image, resize_image, normalize_image, rotate_image, \
rotate_image_float, rotate_mask, crop_image
from ohem import generate_topk_mask_ohem
class DataGenerator(object):
def __init__(self, category, annfile):
self.category = category
self.annfile = annfile
self._initialize()
def get_dim_order(self):
# default tensorflow dim order
return "channels_last"
def get_dataset_size(self):
return len(self.annDataFrame)
def generator_with_mask_ohem(self, graph, kerasModel, batchSize=16, inputSize=(512, 512), flipFlag=False, cropFlag=False,
shuffle=True, rotateFlag=True, nStackNum=1):
'''
Input: batch_size * 512 (height) * 512 (width) * 3 (channels) images
Input: batch_size * 256 * 256 * N input_mask, one channel per keypoint: 1.0 for keypoints valid in the category, 0.0 for invalid ones
Output: batch_size * 256 * 256 * N ground-truth heatmaps
'''
xdf = self.annDataFrame
targetHeight, targetWidth = inputSize
# train_input: npfloat, height, width, channels
# train_gthmap: npfloat, N heatmap + 1 background heatmap,
train_input = np.zeros((batchSize, targetHeight, targetWidth, 3), dtype=np.float)
train_mask = np.zeros((batchSize, targetHeight // 2, targetWidth // 2, getKpNum(self.category)), dtype=np.float)
train_gthmap = np.zeros((batchSize, targetHeight // 2, targetWidth // 2, getKpNum(self.category)), dtype=np.float)
train_ohem_mask = np.zeros((batchSize, targetHeight // 2, targetWidth // 2, getKpNum(self.category)), dtype=np.float)
train_ohem_gthmap = np.zeros((batchSize, targetHeight // 2, targetWidth // 2, getKpNum(self.category)), dtype=np.float)
# the generator needs to loop forever
while 1:
# random shuffle at first
if shuffle:
xdf = xdf.sample(frac=1)
count = 0
for _index, _row in xdf.iterrows():
xindex = count % batchSize
xinput, xhmap = self._prcoess_img(_row, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag=True)
xmask = generate_input_mask(_row['image_category'],
(targetHeight, targetWidth, getKpNum(self.category)))
xohem_mask, xohem_gthmap = generate_topk_mask_ohem([xinput, xmask], xhmap, kerasModel, graph,
8, _row['image_category'], dynamicFlag=False)
train_input[xindex, :, :, :] = xinput
train_mask[xindex, :, :, :] = xmask
train_gthmap[xindex, :, :, :] = xhmap
train_ohem_mask[xindex, :, :, :] = xohem_mask
train_ohem_gthmap[xindex, :, :, :] = xohem_gthmap
# if refinenet is enabled, the model has two outputs: globalnet and refinenet
if xindex == 0 and count != 0:
gthmaplst = list()
for i in range(nStackNum):
gthmaplst.append(train_gthmap)
# the last stack uses the ohem gthmap
gthmaplst.append(train_ohem_gthmap)
yield [train_input, train_mask, train_ohem_mask], gthmaplst
count += 1
def _initialize(self):
self._load_anno()
def _load_anno(self):
'''
Load annotations from train.csv
'''
# Todo: check if category legal
self.train_img_path = "../../data/train"
# read into dataframe
xpd = pd.read_csv(self.annfile)
xpd = load_annotation_from_df(xpd, self.category)
self.annDataFrame = xpd
def _prcoess_img(self, dfrow, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag):
mlist = dfrow[getKpKeys(self.category)]
imgName, kpStr = mlist[0], mlist[1:]
# read kp annotation from csv file
kpAnnlst = list()
for _kpstr in kpStr:
_kpAn = KpAnno.readFromStr(_kpstr)
kpAnnlst.append(_kpAn)
assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst))+" is not the same as "+str(getKpNum(self.category))
xcvmat = cv2.imread(os.path.join(self.train_img_path, imgName))
if xcvmat is None:
return None, None
# flip image first
if random.choice([0, 1]) and flipFlag:
xcvmat, kpAnnlst = self.flip_image(xcvmat, kpAnnlst)
#if cropFlag:
# xcvmat, kpAnnlst = crop_image(xcvmat, kpAnnlst, 0.8, 0.95)
# pad image to 512x512
paddedImg, kpAnnlst = pad_image(xcvmat, kpAnnlst, inputSize[0], inputSize[1])
assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst)) + " is not the same as " + str(
getKpNum(self.category))
# output ground truth heatmap is 256x256
trainGtHmap = self.__generate_hmap(paddedImg, kpAnnlst)
if random.choice([0,1]) and rotateFlag:
rAngle = np.random.randint(-1*40, 40)
rotatedImage, _ = rotate_image(paddedImg, list(), rAngle)
rotatedGtHmap = rotate_mask(trainGtHmap, rAngle)
else:
rotatedImage = paddedImg
rotatedGtHmap = trainGtHmap
# resize image
resizedImg = cv2.resize(rotatedImage, inputSize)
resizedGtHmap = cv2.resize(rotatedGtHmap, (inputSize[0]//2, inputSize[1]//2))
return normalize_image(resizedImg), resizedGtHmap
def __generate_hmap(self, cvmat, kpAnnolst):
# one channel per keypoint
gthmp = np.zeros((cvmat.shape[0], cvmat.shape[1], getKpNum(self.category)), dtype=np.float)
for i, _kpAnn in enumerate(kpAnnolst):
if _kpAnn.visibility == -1:
continue
radius = 100
gaussMask = make_gaussian(radius, radius, 20, None)
# clip the gaussian patch so it stays inside the image boundary
top_x, top_y = max(0, _kpAnn.x - radius//2), max(0, _kpAnn.y - radius//2)
bottom_x, bottom_y = min(cvmat.shape[1], _kpAnn.x + radius//2), min(cvmat.shape[0], _kpAnn.y + radius//2)
top_x_offset = top_x - (_kpAnn.x - radius//2)
top_y_offset = top_y - (_kpAnn.y - radius//2)
gthmp[ top_y:bottom_y, top_x:bottom_x, i] = gaussMask[top_y_offset:top_y_offset + bottom_y-top_y,
top_x_offset:top_x_offset + bottom_x-top_x]
return gthmp
def flip_image(self, orgimg, orgKpAnolst):
flipImg = cv2.flip(orgimg, flipCode=1)
flipannlst = self.flip_annlst(orgKpAnolst, orgimg.shape)
return flipImg, flipannlst
def flip_annlst(self, kpannlst, imgshape):
height, width, channels = imgshape
# flip first
flipAnnlst = list()
for _kp in kpannlst:
flip_x = width - _kp.x
flipAnnlst.append(KpAnno(flip_x, _kp.y, _kp.visibility))
# exchange location of flip keypoints, left->right
outAnnlst = flipAnnlst[:]
for i, _kp in enumerate(flipAnnlst):
mapId = getFlipMapID('all', i)
outAnnlst[mapId] = _kp
return outAnnlst
================================================
FILE: src/data_gen/data_process.py
================================================
import pandas as pd
import numpy as np
import cv2
import os
from kpAnno import KpAnno
def normalize_image(cvmat):
assert (cvmat.dtype == np.uint8), "only np.uint8 input is supported; output is float in -0.5 ~ 0.5"
cvmat = cvmat.astype(np.float)
cvmat = (cvmat - 128.0) / 256.0
return cvmat
def resize_image(cvmat, targetWidth, targetHeight):
assert (cvmat.dtype == np.uint8), "resize_image only supports np.uint8 input"
# get scale
srcHeight, srcWidth, channels = cvmat.shape
minScale = min( targetHeight*1.0/srcHeight, targetWidth*1.0/srcWidth)
# resize
resizedMat = cv2.resize(cvmat, None, fx=minScale, fy=minScale)
reHeight, reWidth, channels = resizedMat.shape
# pad to targetWidth or targetHeight
outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128
if targetHeight == reHeight and targetWidth == reWidth:
outmat = resizedMat
elif targetWidth != reWidth and targetHeight == reHeight:
# add pad to width
outmat[:, 0:reWidth, :] = resizedMat
elif targetHeight != reHeight and targetWidth == reWidth:
# add padding to height
outmat[0:reHeight, :, :] = resizedMat
else:
assert(0), "after resize, either the width or the height must match the target size"
return (outmat, minScale)
def pad_image(cvmat, kpAnno, targetWidth, targetHeight):
'''
:param cvmat: input mat
:param kpAnno: list of keypoint annotations
:param targetWidth: width to pad to
:param targetHeight: height to pad to
:return:
'''
assert (cvmat.dtype == np.uint8), "pad_image only supports np.uint8 input, got " + str(cvmat.dtype)
srcHeight, srcWidth, channels = cvmat.shape
outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128
if targetHeight == srcHeight and targetWidth == srcWidth:
outmat = cvmat
outkpAnno = kpAnno
elif targetWidth != srcWidth and targetHeight == srcHeight:
# add pad to width
outmat[:, 0:srcWidth, :] = cvmat
outkpAnno = kpAnno
elif targetHeight != srcHeight and targetWidth == srcWidth:
# add padding to height
outmat[0:srcHeight, :, :] = cvmat
outkpAnno = kpAnno
else:
# resize at first, then pad
outmat, scale = resize_image(cvmat, targetWidth, targetHeight)
outkpAnno = list()
for _kpAnno in kpAnno:
_nkp = KpAnno.applyScale(_kpAnno, scale)
outkpAnno.append(_nkp)
return (outmat, outkpAnno)
def pad_image_inference(cvmat, targetWidth, targetHeight):
'''
:param cvmat: input mat
:param targetWidth: width to pad to
:param targetHeight: height to pad to
:return:
'''
assert (cvmat.dtype == np.uint8), "pad_image_inference only supports np.uint8 input, got " + str(cvmat.dtype)
srcHeight, srcWidth, channels = cvmat.shape
outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128
if targetHeight == srcHeight and targetWidth == srcWidth:
outmat = cvmat
scale = 1.0
elif targetWidth > srcWidth and targetHeight == srcHeight:
# add pad to width
outmat[:, 0:srcWidth, :] = cvmat
scale = 1.0
elif targetHeight > srcHeight and targetWidth == srcWidth:
# add padding to height
outmat[0:srcHeight, :, :] = cvmat
scale = 1.0
else:
# resize at first, then pad
outmat, scale = resize_image(cvmat, targetWidth, targetHeight)
return (outmat, scale)
def rotate_image(cvmat, kpAnnLst, rotateAngle):
assert (cvmat.dtype == np.uint8), "rotate_image only supports np.uint8 input"
height, width, channel = cvmat.shape
center = ( width//2, height//2)
rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0)
cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1])
newW = int((height*sin)+(width*cos))
newH = int((height*cos)+(width*sin))
rotateMatrix[0,2] += (newW/2) - center[0] # x translation
rotateMatrix[1,2] += (newH/2) - center[1] # y translation
# rotate image; cv2.warpAffine takes dsize as (width, height)
outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=(128, 128, 128))
# rotate annotations
nKpLst = list()
for _kp in kpAnnLst:
_newkp = KpAnno.applyRotate(_kp, rotateMatrix)
nKpLst.append(_newkp)
return (outMat, nKpLst)
def rotate_image_with_invrmat(cvmat, rotateAngle):
assert (cvmat.dtype == np.uint8), "rotate_image_with_invrmat only supports np.uint8 input"
height, width, channel = cvmat.shape
center = ( width//2, height//2)
rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0)
cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1])
newW = int((height*sin)+(width*cos))
newH = int((height*cos)+(width*sin))
rotateMatrix[0,2] += (newW/2) - center[0] # x translation
rotateMatrix[1,2] += (newH/2) - center[1] # y translation
# rotate image; cv2.warpAffine takes dsize as (width, height)
outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=(128, 128, 128))
# generate inv rotate matrix
invRotateMatrix = cv2.invertAffineTransform(rotateMatrix)
return (outMat, invRotateMatrix, (width, height))
def rotate_mask(mask, rotateAngle):
outmask = rotate_image_float(mask, rotateAngle)
return outmask
def rotate_image_float(cvmat, rotateAngle, borderValue=(0.0, 0.0, 0.0)):
assert (cvmat.dtype == np.float), "rotate_image_float only supports np.float input"
height, width, channels = cvmat.shape
center = ( width//2, height//2)
rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0)
cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1])
newW = int((height*sin)+(width*cos))
newH = int((height*cos)+(width*sin))
rotateMatrix[0,2] += (newW/2) - center[0] # x translation
rotateMatrix[1,2] += (newH/2) - center[1] # y translation
# rotate image; cv2.warpAffine takes dsize as (width, height)
outMat = cv2.warpAffine(cvmat, rotateMatrix, (newW, newH), borderValue=borderValue)
return outMat
def crop_image(cvmat, kpAnnLst, lowLimitRatio, upLimitRatio):
import random
assert(lowLimitRatio < 1.0), 'lowLimitRatio should be less than 1.0'
assert(upLimitRatio < 1.0), 'upLimitRatio should be less than 1.0'
height, width, channels = cvmat.shape
cropHeight = random.randrange(int(lowLimitRatio*height), int(upLimitRatio*height))
cropWidth = random.randrange(int(lowLimitRatio*width), int(upLimitRatio*width))
top_x = random.randrange(0, width - cropWidth)
top_y = random.randrange(0, height - cropHeight)
# apply offset for keypoints
nKpLst = list()
for _kp in kpAnnLst:
if _kp.visibility == -1:
_newkp = _kp
else:
_newkp = KpAnno.applyOffset(_kp, (top_x, top_y))
if _newkp.x <=0 or _newkp.y <=0:
# negative location, return original image
return cvmat, kpAnnLst
if _newkp.x >= cropWidth or _newkp.y >= cropHeight:
# keypoints are cropped out
return cvmat, kpAnnLst
nKpLst.append(_newkp)
return cvmat[top_y:top_y+cropHeight, top_x:top_x+cropWidth], nKpLst
if __name__ == "__main__":
pass
================================================
FILE: src/data_gen/dataset.py
================================================
def getKpNum(category):
# remove one column 'image_id'
return len(getKpKeys(category)) - 1
TROUSERS_PART_KYES=['waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out']
TROUSERS_PART_FLIP_KYES=['waistband_right', 'waistband_left', 'crotch', 'bottom_right_in', 'bottom_right_out', 'bottom_left_in', 'bottom_left_out']
SKIRT_PART_KEYS=['waistband_left', 'waistband_right', 'hemline_left', 'hemline_right']
SKIRT_PART_FLIP_KEYS=['waistband_right', 'waistband_left', 'hemline_right', 'hemline_left']
DRESS_PART_KEYS= ['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in',
'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right']
DRESS_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 'center_front',
'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in',
'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'hemline_right', 'hemline_left']
BLOUSE_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right',
'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right',
'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out']
BLOUSE_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left',
'center_front', 'armpit_right', 'armpit_left', 'top_hem_right', 'top_hem_left',
'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out']
OUTWEAR_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in',
'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right']
OUTWEAR_PART_FLIP_KEYS = ['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left',
'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in',
'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left']
ALL_PART_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out',
'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right',
'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out',
'bottom_right_in', 'bottom_right_out']
ALL_PART_FLIP_KEYS = [ 'neckline_right', 'neckline_left', 'center_front', 'shoulder_right', 'shoulder_left',
'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 'cuff_right_out',
'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left', 'waistband_right','waistband_left',
'hemline_right', 'hemline_left', 'crotch', 'bottom_right_in', 'bottom_right_out',
'bottom_left_in', 'bottom_left_out']
def getFlipKeys(category):
if category == 'skirt':
keys, mapkeys = SKIRT_PART_KEYS, SKIRT_PART_FLIP_KEYS
elif category == 'dress':
keys, mapkeys = DRESS_PART_KEYS, DRESS_PART_FLIP_KEYS
elif category == 'trousers':
keys, mapkeys = TROUSERS_PART_KYES, TROUSERS_PART_FLIP_KYES
elif category == 'blouse':
keys, mapkeys = BLOUSE_PART_KEYS, BLOUSE_PART_FLIP_KEYS
elif category == 'outwear':
keys, mapkeys = OUTWEAR_PART_KEYS, OUTWEAR_PART_FLIP_KEYS
elif category == 'all':
keys, mapkeys = ALL_PART_KEYS, ALL_PART_FLIP_KEYS
else:
assert (0), category + " not supported"
xdict = dict()
for i in range(len(keys)):
xdict[keys[i]] = mapkeys[i]
return keys, xdict
def getFlipMapID(category, partid):
keys, mapDict = getFlipKeys(category)
mapKey = mapDict[keys[partid]]
mapID = keys.index(mapKey)
return mapID
def getKpKeys(category):
'''
:param category:
:return: get the keypoint keys in annotation csv
'''
SKIRT_KP_KEYS = ['image_id', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right']
DRESS_KP_KEYS = ['image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front',
'armpit_left', 'armpit_right' , 'waistline_left' , 'waistline_right', 'cuff_left_in',
'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right']
TROUSERS_KP_KEYS=['image_id', 'waistband_left', 'waistband_right', 'crotch', 'bottom_left_in',
'bottom_left_out', 'bottom_right_in', 'bottom_right_out']
BLOUSE_KP_KEYS = [ 'image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right',
'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right',
'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out']
OUTWEAR_KP_KEYS= ['image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in',
'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right']
ALL_KP_KEYS = ['image_id', 'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in',
'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right',
'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out']
if category == 'skirt':
return SKIRT_KP_KEYS
elif category == 'dress':
return DRESS_KP_KEYS
elif category == 'trousers':
return TROUSERS_KP_KEYS
elif category == 'blouse':
return BLOUSE_KP_KEYS
elif category == 'outwear':
return OUTWEAR_KP_KEYS
elif category == 'all':
return ALL_KP_KEYS
else:
assert(0), category + ' not supported'
def fill_dataframe(kplst, category, dfrow):
keys = getKpKeys(category)[1:]
# fill category
dfrow['image_category'] = category
assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys))
for i, _key in enumerate(keys):
kpann = kplst[i]
outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1)
dfrow[_key] = outstr
def get_kp_index_from_allkeys(kpname):
ALL_KP_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out',
'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right',
'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out']
return ALL_KP_KEYS.index(kpname)
def generate_input_mask(image_category, shape, nobgFlag=True):
import numpy as np
# 0.0 for invalid key points for each category
# 1.0 for valid key points for each category
h, w, c = shape
mask = np.zeros((h // 2, w // 2, c), dtype=np.float)
for key in getKpKeys(image_category)[1:]:
index = get_kp_index_from_allkeys(key)
mask[:, :, index] = 1.0
# for last channel, background
if nobgFlag: mask[:, :, -1] = 0.0
else: mask[:, :, -1] = 1.0
return mask
================================================
FILE: src/data_gen/kpAnno.py
================================================
import numpy as np
class KpAnno(object):
'''
Keypoint annotation (x, y, visibility), parsed from the "x_y_visibility" string format
'''
def __init__(self, x, y, visibility):
self.x = int(x)
self.y = int(y)
self.visibility = visibility
@classmethod
def readFromStr(cls, xstr):
xarray = xstr.split('_')
x = int(xarray[0])
y = int(xarray[1])
visibility = int(xarray[2])
return cls(x,y, visibility)
@classmethod
def applyScale(cls, kpAnno, scale):
x = int(kpAnno.x*scale)
y = int(kpAnno.y*scale)
v = kpAnno.visibility
return cls(x, y, v)
@classmethod
def applyRotate(cls, kpAnno, rotateMatrix):
vector = [kpAnno.x, kpAnno.y, 1]
rotatedV = np.dot(rotateMatrix, vector)
return cls( int(rotatedV[0]), int(rotatedV[1]), kpAnno.visibility)
@classmethod
def applyOffset(cls, kpAnno, offset):
x = kpAnno.x - offset[0]
y = kpAnno.y - offset[1]
v = kpAnno.visibility
return cls(x, y, v)
@staticmethod
def calcDistance(kpA, kpB):
distance = (kpA.x - kpB.x)**2 + (kpA.y - kpB.y)**2
return np.sqrt(distance)
================================================
FILE: src/data_gen/ohem.py
================================================
import sys
sys.path.insert(0, "../unet/")
from keras.models import *
from keras.layers import *
from utils import np_euclidean_l2
from dataset import getKpNum
def generate_topk_mask_ohem(input_data, gthmap, keras_model, graph, topK, image_category, dynamicFlag=False):
'''
:param input_data: [image, mask] input
:param gthmap: ground-truth heatmap
:param keras_model: keras model
:param graph: tf graph, to work around a threading issue
:param topK: number of keypoint channels to keep
:param image_category: category of the input image
:param dynamicFlag: if True, derive topK from the category's keypoint count
:return: ohem mask and masked ground-truth heatmap
'''
# do inference, and calculate loss of each channel
mimg, mmask = input_data
ximg = mimg[np.newaxis,:,:,:]
xmask = mmask[np.newaxis,:,:,:]
if len(keras_model.input_layers) == 3:
# use original mask as ohem_mask
inputs = [ximg, xmask, xmask]
else:
inputs = [ximg, xmask]
with graph.as_default():
keras_output = keras_model.predict(inputs)
# heatmap of last stage
outhmap = keras_output[-1]
channel_num = gthmap.shape[-1]
# calculate loss
mloss = list()
for i in range(channel_num):
_dtmap = outhmap[0, :, :, i]
_gtmap = gthmap[:, :, i]
loss = np_euclidean_l2(_dtmap, _gtmap)
mloss.append(loss)
# refill input_mask: set the top-k channels to 1.0 and the rest to 0.0
# FIXME: topK may differ between categories
if dynamicFlag:
topK = getKpNum(image_category)//2
ohem_mask = adjust_mask(mloss, mmask, topK)
ohem_gthmap = ohem_mask * gthmap
return ohem_mask, ohem_gthmap
def adjust_mask(loss, input_mask, topk):
# pick the top-k largest losses and set those channels to 1.0, the rest to 0.0
assert (len(loss) == input_mask.shape[-1]), \
"shape should be same " + str(len(loss)) + " vs " + str(input_mask.shape)
outmask = np.zeros(input_mask.shape, dtype=np.float)
topk_index = sorted(range(len(loss)), key=lambda i:loss[i])[-topk:]
for i in range(len(loss)):
if i in topk_index:
outmask[:,:,i] = 1.0
return outmask
================================================
FILE: src/data_gen/utils.py
================================================
import numpy as np
import pandas as pd
import os
def make_gaussian(width, height, sigma=3, center=None):
'''
generate a 2D gaussian heatmap
:return:
'''
x = np.arange(0, width, 1, float)
y = np.arange(0, height, 1, float)[:, np.newaxis]
if center is None:
x0 = width // 2
y0 = height // 2
else:
x0 = center[0]
y0 = center[1]
return np.exp( -4*np.log(2)*((x-x0)**2 + (y-y0)**2)/sigma**2)
def split_csv_train_val(allcsv, traincsv, valcsv, ratio=0.8):
xdf = pd.read_csv(allcsv)
# random shuffle
xdf = xdf.sample(frac=1)
# random sampling
msk = np.random.rand(len(xdf)) < ratio
trainDf= xdf[msk]
valDf= xdf[~msk]
print "total", len(xdf), "split into train ", len(trainDf), ' val', len(valDf)
#save to file
trainDf.to_csv(traincsv, index=False)
valDf.to_csv(valcsv, index=False)
def np_euclidean_l2(x, y):
assert (x.shape == y.shape), "shape mismatched " + str(x.shape) + " : " + str(y.shape)
loss = np.sum((x - y)**2)
loss = np.sqrt(loss)
return loss
def load_annotation_from_df(df, category):
if category == 'all':
return df
else:
return df[df['image_category'] == category]
================================================
FILE: src/eval/eval_callback.py
================================================
import keras
import os
import datetime
from evaluation import Evaluation
from time import time
class NormalizedErrorCallBack(keras.callbacks.Callback):
def __init__(self, foldpath, category, multiOut=False, resumeFolder=None):
self.parentFoldPath = foldpath
self.category = category
if resumeFolder is None:
self.foldPath = os.path.join(self.parentFoldPath, self.category, datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S'))
if not os.path.exists(self.foldPath):
os.mkdir(self.foldPath)
else:
self.foldPath = resumeFolder
self.valLog = os.path.join(self.foldPath, 'val.log')
self.multiOut = multiOut
def get_folder_path(self):
return self.foldPath
def on_epoch_end(self, epoch, logs=None):
modelName = os.path.join(self.foldPath, self.category+"_weights_"+str(epoch)+".hdf5")
keras.models.save_model(self.model, modelName)
print "Saving model to ", modelName
print "Running evaluation ........."
xEval = Evaluation(self.category, None)
xEval.init_from_model(self.model)
start = time()
neScore, categoryDict = xEval.eval(self.multiOut, details=True)
end = time()
print "Evaluation Done", str(neScore), " cost ", end - start, " seconds!"
for key in categoryDict.keys():
scores = categoryDict[key]
print key, ' score ', sum(scores)/len(scores)
with open(self.valLog, 'a+') as xfile:
xfile.write(modelName + ", Score " + str(neScore) + "\n")
for key in categoryDict.keys():
scores = categoryDict[key]
xfile.write(key + ": " + str(sum(scores)/len(scores)) + "\n")
================================================
FILE: src/eval/evaluation.py
================================================
import sys
sys.path.insert(0, "../data_gen/")
sys.path.insert(0, "../unet/")
import pandas as pd
from dataset import getKpKeys, getKpNum, getFlipMapID, get_kp_index_from_allkeys, generate_input_mask
from kpAnno import KpAnno
from post_process import post_process_heatmap
from keras.models import load_model
import os
from refinenet_mask_v3 import euclidean_loss
import numpy as np
import cv2
from resnet101 import Scale
from utils import load_annotation_from_df
from collections import defaultdict
import copy
from data_process import pad_image_inference
class Evaluation(object):
def __init__(self, category, modelFile):
self.category = category
self.train_img_path = "../../data/train"
if modelFile is not None:
self._initialize(modelFile)
def init_from_model(self, model):
self._load_anno()
self.net = model
def eval(self, multiOut=False, details=False, flip=True):
xdf = self.annDataFrame
scores = list()
xdict = dict()
xcategoryDict = defaultdict(list)
for _index, _row in xdf.iterrows():
imgId = _row['image_id']
category = _row['image_category']
imgFile = os.path.join(self.train_img_path, imgId)
gtKpAnno = self._get_groundtruth_kpAnno(_row)
if flip:
predKpAnno = self.predict_kp_with_flip(imgFile, category)
else:
predKpAnno = self.predict_kp(imgFile, category, multiOut)
neScore = Evaluation.calc_ne_score(category, predKpAnno, gtKpAnno)
scores.extend(neScore)
if details:
xcategoryDict[category].extend(neScore)
if details:
return sum(scores)/len(scores), xcategoryDict
else:
return sum(scores)/len(scores)
def _initialize(self, modelFile):
self._load_anno()
self._initialize_network(modelFile)
def _initialize_network(self, modelFile):
self.net = load_model(modelFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale})
def _load_anno(self):
'''
Load annotations from train.csv
'''
self.annfile = os.path.join("../../data/train/Annotations", "val_split.csv")
# read into dataframe
xpd = pd.read_csv(self.annfile)
xpd = load_annotation_from_df(xpd, self.category)
self.annDataFrame = xpd
def _get_groundtruth_kpAnno(self, dfrow):
mlist = dfrow[getKpKeys(self.category)]
imgName, kpStr = mlist[0], mlist[1:]
# read kp annotation from csv file
kpAnnlst = [KpAnno.readFromStr(_kpstr) for _kpstr in kpStr]
return kpAnnlst
def _net_inference_with_mask(self, imgFile, imgCategory):
        from data_process import normalize_image, pad_image_inference
        assert (len(self.net.input_layers) > 1), "model is expected to have more than one input layer"
# load image and preprocess
img = cv2.imread(imgFile)
img, scale = pad_image_inference(img, 512, 512)
img = normalize_image(img)
input_img = img[np.newaxis, :, :, :]
input_mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category)) )
input_mask = input_mask[np.newaxis, :, :, :]
# inference
heatmap = self.net.predict([input_img, input_mask, input_mask])
return (heatmap, scale)
def _heatmap_sum(self, heatmaplst):
outheatmap = np.copy(heatmaplst[0])
for i in range(1, len(heatmaplst), 1):
outheatmap += heatmaplst[i]
return outheatmap
def predict_kp(self, imgFile, imgCategory, multiOutput=False):
xnetout, scale = self._net_inference_with_mask(imgFile, imgCategory)
if multiOutput:
            #todo: fix me. Curiously, an earlier stage's output performs better than the last stage's,
            #so we sum the heatmaps from all stages here instead of taking only the last one.
heatmap = self._heatmap_sum(xnetout)
else:
heatmap = xnetout
detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2)
# scale to padded resolution 256X256 -> 512X512
scaleTo512 = 2.0
# apply scale to original resolution
detectedKps = [KpAnno(_kp.x*scaleTo512/scale , _kp.y*scaleTo512/scale, _kp.visibility) for _kp in detectedKps]
return detectedKps
def predict_kp_with_flip(self, imgFile, imgCategory):
# inference with flip and original image
heatmap, scale = self._net_inference_flip(imgFile, imgCategory)
detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2)
# scale to padded resolution 256X256 -> 512X512
scaleTo512 = 2.0
# apply scale to original resolution
detectedKps = [KpAnno(_kp.x * scaleTo512 / scale, _kp.y * scaleTo512 / scale, _kp.visibility) for _kp in
detectedKps]
return detectedKps
def _net_inference_flip(self, imgFile, imgCategory):
        from data_process import normalize_image, pad_image_inference
        assert (len(self.net.input_layers) > 1), "model is expected to have more than one input layer"
        batch_size = 2
input_img = np.zeros(shape=(batch_size, 512, 512, 3), dtype=np.float)
input_mask = np.zeros(shape=(batch_size, 256, 256, getKpNum(self.category)), dtype=np.float)
# load image and preprocess
orgimage = cv2.imread(imgFile)
padimg, scale = pad_image_inference(orgimage, 512, 512)
flipimg = cv2.flip(padimg, flipCode=1)
input_img[0,:,:,:] = normalize_image(padimg)
input_img[1,:,:,:] = normalize_image(flipimg)
mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category)))
input_mask[0,:,:,:] = mask
input_mask[1,:,:,:] = mask
# inference
if len(self.net.input_layers) == 2:
heatmap = self.net.predict([input_img, input_mask])
elif len(self.net.input_layers) == 3:
heatmap = self.net.predict([input_img, input_mask, input_mask])
else:
            assert (0), "unexpected number of input layers: " + str(len(self.net.input_layers)) + ", should be 2 or 3"
# sum heatmap
avgheatmap = self._heatmap_sum(heatmap)
orgheatmap = avgheatmap[0,:,:,:]
        # remap the flipped heatmap back to the original channel order
flipheatmap = avgheatmap[1,:,:,:]
flipheatmap = self._flip_out_heatmap(flipheatmap)
# average original and flip heatmap
outheatmap = flipheatmap + orgheatmap
outheatmap = outheatmap[np.newaxis, :, :, :]
return (outheatmap, scale)
def predict_kp_with_rotate(self, imgFile, imgCategory):
# inference with rotated image
rotateheatmap = self._net_inference_rotate(imgFile, imgCategory)
rotateheatmap = rotateheatmap[np.newaxis, :, :, :]
# original image and flip image
orgflipmap, scale = self._net_inference_flip(imgFile, imgCategory)
mflipmap = cv2.resize(orgflipmap[0,:,:,:], None, fx=2.0/scale, fy=2.0/scale)
# add mflipmap and rotateheatmap
avgheatmap = mflipmap[np.newaxis, :, :, :]
b, h, w , c = rotateheatmap.shape
avgheatmap[:, 0:h, 0:w,:] += rotateheatmap
# generate key point locations
detectedKps = post_process_heatmap(avgheatmap, kpConfidenceTh=0.2)
return detectedKps
def _net_inference_rotate(self, imgFile, imgCategory):
from data_process import normalize_image, pad_image_inference, rotate_image_with_invrmat
# load image and preprocess
orgimage = cv2.imread(imgFile)
anglelst = [-20, -10, 10, 20]
input_img = np.zeros(shape=(len(anglelst), 512, 512, 3), dtype=np.float)
input_mask = np.zeros(shape=(len(anglelst), 256, 256, getKpNum(self.category)), dtype=np.float)
mlist = list()
for i, angle in enumerate(anglelst):
rotateimg, invRotMatrix, orgImgSize = rotate_image_with_invrmat(orgimage, angle)
padimg, scale = pad_image_inference(rotateimg, 512, 512)
_img = normalize_image(padimg)
input_img[i, :, :, :] = _img
mlist.append((scale, invRotMatrix))
mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category)))
for i, angle in enumerate(anglelst):
input_mask[i, :,:,:] = mask
# inference
heatmap = self.net.predict([input_img, input_mask, input_mask])
heatmap = self._heatmap_sum(heatmap)
# rotate back to original resolution
sumheatmap = np.zeros(shape=(orgimage.shape[0], orgimage.shape[1], getKpNum(self.category)), dtype=np.float)
for i, item in enumerate(mlist):
_heatmap = heatmap[i, :, :, :]
_scale, _invRotMatrix = item
_heatmap = cv2.resize(_heatmap, None, fx=2.0 / _scale, fy=2.0 / _scale)
_invheatmap = cv2.warpAffine(_heatmap, _invRotMatrix, (orgimage.shape[1], orgimage.shape[0]))
sumheatmap += _invheatmap
return sumheatmap
def _flip_out_heatmap(self, flipout):
outmap = np.zeros(flipout.shape, dtype=np.float)
for i in range(flipout.shape[-1]):
flipid = getFlipMapID(self.category, i)
mask = np.copy(flipout[:, :, i])
outmap[:, :, flipid] = cv2.flip(mask, flipCode=1)
return outmap
@staticmethod
def get_normized_distance(category, gtKp):
        '''
        :param category: cloth category, e.g. 'skirt' or 'blouse'
        :param gtKp: list of ground-truth KpAnno keypoints
        :return: normalization distance (waistband width for skirt/trousers,
                 armpit width for blouse/dress/outwear). If either reference
                 point is missing, return a big number (1e6) so the sample's
                 normalized errors become negligible.
        '''
if category in ['skirt' ,'trousers']:
##waistband left and right
waistband_left_index = get_kp_index_from_allkeys('waistband_left')
waistband_right_index = get_kp_index_from_allkeys('waistband_right')
if gtKp[waistband_left_index].visibility != -1 and gtKp[waistband_right_index].visibility != -1:
distance = KpAnno.calcDistance(gtKp[waistband_left_index], gtKp[waistband_right_index])
else:
distance = 1e6
return distance
elif category in ['blouse', 'dress', 'outwear']:
armpit_left_index = get_kp_index_from_allkeys('armpit_left')
armpit_right_index = get_kp_index_from_allkeys('armpit_right')
##armpit_left armpit_right'
if gtKp[armpit_left_index].visibility != -1 and gtKp[armpit_right_index].visibility != -1:
distance = KpAnno.calcDistance(gtKp[armpit_left_index], gtKp[armpit_right_index])
else:
distance = 1e6
return distance
else:
            assert (0), category + " not implemented in get_normized_distance"
@staticmethod
def calc_ne_score(category, dtKp, gtKp):
assert (len(dtKp) == len(gtKp)), "predicted keypoint number should be the same as ground truth keypoints" + \
str(dtKp) + " vs " + str(gtKp)
# calculate normalized error as score
normalizedDistance = Evaluation.get_normized_distance(category, gtKp)
mlist = list()
for i in range(len(gtKp)):
if gtKp[i].visibility == 1:
dk = KpAnno.calcDistance(dtKp[i], gtKp[i])
mlist.append( dk/normalizedDistance)
return mlist
================================================
FILE: src/eval/post_process.py
================================================
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter
from kpAnno import KpAnno
def post_process_heatmap(heatMap, kpConfidenceTh=0.2):
    kplst = list()
    for i in range(heatMap.shape[-1]):
        # smooth the i-th keypoint channel, then keep only its local maxima
        _map = heatMap[0, :, :, i]
        _map = gaussian_filter(_map, sigma=0.5)
        _nmsPeaks = non_max_suppression(_map, windowSize=3, threshold=1e-6)
        y, x = np.where(_nmsPeaks == _nmsPeaks.max())
        confidence = np.amax(_nmsPeaks)
        if confidence > kpConfidenceTh:
            kplst.append(KpAnno(x[0], y[0], 1))
        else:
            kplst.append(KpAnno(x[0], y[0], -1))
    return kplst
def non_max_suppression(plain, windowSize=3, threshold=1e-6):
    # zero out values below threshold, then keep only pixels that equal
    # the maximum of their windowSize x windowSize neighbourhood
    under_th_indices = plain < threshold
    plain[under_th_indices] = 0
    return plain * (plain == maximum_filter(plain, footprint=np.ones((windowSize, windowSize))))
================================================
FILE: src/top/demo.py
================================================
import sys
sys.path.insert(0, "../data_gen/")
sys.path.insert(0, "../eval/")
sys.path.insert(0, "../unet/")
import argparse
import os
import pandas as pd
import cv2
from evaluation import Evaluation
from dataset import getKpKeys, get_kp_index_from_allkeys
def visualize_keypoint(imageName, category, dtkp):
cvmat = cv2.imread(imageName)
for key in getKpKeys(category)[1:]:
index = get_kp_index_from_allkeys(key)
_kp = dtkp[index]
        cv2.circle(cvmat, center=(int(_kp.x), int(_kp.y)), radius=7, color=(255, 0, 0), thickness=2)
cv2.imshow('demo', cvmat)
cv2.waitKey()
def demo(modelfile):
# load network
xEval = Evaluation('all', modelfile)
# load images and run prediction
testfile = os.path.join("../../data/test/", 'test.csv')
xdf = pd.read_csv(testfile)
xdf = xdf.sample(frac=1.0)
for _index, _row in xdf.iterrows():
_image_id = _row['image_id']
_category = _row['image_category']
imageName = os.path.join("../../data/test", _image_id)
print _image_id, _category
dtkp = xEval.predict_kp_with_rotate(imageName, _category)
visualize_keypoint(imageName, _category, dtkp)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--gpuID", default=0, type=int, help='gpu id')
parser.add_argument("--modelfile", help="file of model")
args = parser.parse_args()
print args
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID)
demo(args.modelfile)
================================================
FILE: src/top/test.py
================================================
import sys
sys.path.insert(0, "../data_gen/")
sys.path.insert(0, "../eval/")
sys.path.insert(0, "../unet/")
import argparse
import os
from dataset import getKpKeys
import pandas as pd
from evaluation import Evaluation
import pickle
import numpy as np
def get_best_single_model(valfile):
'''
:param valfile: the log file with validation score for each snapshot
:return: model file and score
'''
def get_key(item):
return item[1]
with open(valfile) as xval:
lines = xval.readlines()
xlist = list()
    for xline in lines:
        # a val.log line looks like "<model>.hdf5, Score <ne>" ("Socre" in older logs)
        if 'hdf5' in xline and ('Socre' in xline or 'Score' in xline):
            modelname = xline.strip().split(',')[0]
            # parse the numeric score so snapshots sort numerically, not lexicographically
            overallscore = float(xline.strip().split(',')[1].split()[-1])
            xlist.append((modelname, overallscore))
bestmodel = sorted(xlist, key=get_key)[0]
return bestmodel
def fill_dataframe(kplst, keys, dfrow, image_category):
# fill category
dfrow['image_category'] = image_category
assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys))
for i, _key in enumerate(keys):
kpann = kplst[i]
outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1)
dfrow[_key] = outstr
def get_kp_from_dict(mdict, image_category, image_id):
if image_category in mdict.keys():
xdict = mdict[image_category]
else:
xdict = mdict['all']
return xdict[image_id]
def submission(pklpath):
xdf = pd.read_csv("../../data/train/Annotations/train.csv")
trainKeys = xdf.keys()
testdf = pd.read_csv("../../data/test/test.csv")
print len(testdf), " samples in test.csv"
mdict = dict()
    for xfile in os.listdir(pklpath):
        if xfile.endswith('.pkl'):
            category = xfile.strip().split('.')[0]
            with open(os.path.join(pklpath, xfile), 'rb') as pkl:
                mdict[category] = pickle.load(pkl)
print testdf.keys()
print mdict.keys()
submissionDf = pd.DataFrame(columns=trainKeys, index=np.arange(testdf.shape[0]))
submissionDf = submissionDf.fillna(value='-1_-1_-1')
submissionDf['image_id'] = testdf['image_id']
submissionDf['image_category'] = testdf['image_category']
for _index, _row in submissionDf.iterrows():
image_id = _row['image_id']
image_category = _row['image_category']
kplst = get_kp_from_dict(mdict, image_category, image_id)
fill_dataframe(kplst, getKpKeys('all')[1:], _row, image_category)
print len(submissionDf), "save to ", os.path.join(pklpath, 'submission.csv')
submissionDf.to_csv( os.path.join(pklpath, 'submission.csv'), index=False )
def load_image_names(annfile, category):
# read into dataframe
xdf = pd.read_csv(annfile)
xdf = xdf[xdf['image_category'] == category]
return xdf
def main_test(savepath, modelpath, augmentFlag):
valfile = os.path.join(modelpath, 'val.log')
bestmodels = get_best_single_model(valfile)
print bestmodels, augmentFlag
xEval = Evaluation('all', bestmodels[0])
# load images and run prediction
testfile = os.path.join("../../data/test/", 'test.csv')
for category in ['skirt', 'blouse', 'trousers', 'outwear', 'dress']:
xdict = dict()
xdf = load_image_names(testfile, category)
print len(xdf), " images to process ", category
count = 0
for _index, _row in xdf.iterrows():
count += 1
if count%1000 == 0:
print count, "images have been processed"
_image_id = _row['image_id']
imageName = os.path.join("../../data/test", _image_id)
if augmentFlag:
dtkp = xEval.predict_kp_with_rotate(imageName, _row['image_category'])
else:
dtkp = xEval.predict_kp(imageName, _row['image_category'], multiOutput=True)
xdict[_image_id] = dtkp
savefile = os.path.join(savepath, category+'.pkl')
with open(savefile, 'wb') as xfile:
pickle.dump(xdict, xfile)
        print "prediction saved to ", savefile
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--gpuID", default=0, type=int, help='gpu id')
parser.add_argument("--modelpath", help="path of trained model")
parser.add_argument("--outpath", help="path to save predicted keypoints")
    parser.add_argument("--augment", default=False, action='store_true', help="run test-time augmentation (flip + rotate) or not")
args = parser.parse_args()
print args
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID)
main_test(args.outpath, args.modelpath, args.augment)
submission(args.outpath)
================================================
FILE: src/top/train.py
================================================
import sys
sys.path.insert(0, "../data_gen/")
sys.path.insert(0, "../unet/")
import argparse
import os
from fashion_net import FashionNet
from dataset import getKpNum
import tensorflow as tf
from keras import backend as k
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--gpuID", default=0, type=int, help='gpu id')
parser.add_argument("--category", help="specify cloth category")
    parser.add_argument("--network", help="specify network arch")
parser.add_argument("--batchSize", default=8, type=int, help='batch size for training')
    parser.add_argument("--epochs", default=20, type=int, help="number of training epochs")
    parser.add_argument("--resume", default=False, action='store_true', help="resume training or not")
    parser.add_argument("--lrdecay", default=False, action='store_true', help="enable lr decay or not")
parser.add_argument("--resumeModel", help="start point to retrain")
parser.add_argument("--initEpoch", type=int, help="epoch to resume")
args = parser.parse_args()
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID)
# TensorFlow wizardry
config = tf.ConfigProto()
# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True
    # Allow TensorFlow to use up to the full GPU memory
    config.gpu_options.per_process_gpu_memory_fraction = 1.0
# Create a session with the above options specified.
k.tensorflow_backend.set_session(tf.Session(config=config))
if not args.resume :
xnet = FashionNet(512, 512, getKpNum(args.category))
xnet.build_model(modelName=args.network, show=True)
xnet.train(args.category, epochs=args.epochs, batchSize=args.batchSize, lrschedule=args.lrdecay)
else:
xnet = FashionNet(512, 512, getKpNum(args.category))
xnet.resume_train(args.category, args.resumeModel, args.network, args.initEpoch,
epochs=args.epochs, batchSize=args.batchSize)
================================================
FILE: src/unet/fashion_net.py
================================================
import sys
sys.path.insert(0, "../data_gen/")
sys.path.insert(0, "../eval/")
from data_generator import DataGenerator
from keras.callbacks import ModelCheckpoint, CSVLogger
from keras.models import load_model
from data_process import pad_image, normalize_image
import os
import cv2
import numpy as np
import datetime
from eval_callback import NormalizedErrorCallBack
from refinenet_mask_v3 import Res101RefineNetMaskV3, euclidean_loss
from resnet101 import Scale
import tensorflow as tf
class FashionNet(object):
def __init__(self, inputHeight, inputWidth, nClasses):
self.inputWidth = inputWidth
self.inputHeight = inputHeight
self.nClass = nClasses
def build_model(self, modelName='v2', show=False):
self.modelName = modelName
self.model = Res101RefineNetMaskV3(self.nClass, self.inputHeight, self.inputWidth, nStackNum=2)
self.nStackNum = 2
# show model summary and layer name
if show:
self.model.summary()
for layer in self.model.layers:
print layer.name, layer.trainable
def train(self, category, batchSize=8, epochs=20, lrschedule=False):
trainDt = DataGenerator(category, os.path.join("../../data/train/Annotations", "train_split.csv"))
trainGen = trainDt.generator_with_mask_ohem( graph=tf.get_default_graph(), kerasModel=self.model,
batchSize= batchSize, inputSize=(self.inputHeight, self.inputWidth),
nStackNum=self.nStackNum, flipFlag=False, cropFlag=False)
normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, True)
csvlogger = CSVLogger( os.path.join(normalizedErrorCallBack.get_folder_path(),
"csv_train_"+self.modelName+"_"+str(datetime.datetime.now().strftime('%H:%M'))+".csv"))
xcallbacks = [normalizedErrorCallBack, csvlogger]
self.model.fit_generator(generator=trainGen, steps_per_epoch=trainDt.get_dataset_size()//batchSize,
epochs=epochs, callbacks=xcallbacks)
def load_model(self, netWeightFile):
self.model = load_model(netWeightFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale})
def resume_train(self, category, pretrainModel, modelName, initEpoch, batchSize=8, epochs=20):
self.modelName = modelName
self.load_model(pretrainModel)
refineNetflag = True
self.nStackNum = 2
modelPath = os.path.dirname(pretrainModel)
trainDt = DataGenerator(category, os.path.join("../../data/train/Annotations", "train_split.csv"))
trainGen = trainDt.generator_with_mask_ohem(graph=tf.get_default_graph(), kerasModel=self.model,
batchSize=batchSize, inputSize=(self.inputHeight, self.inputWidth),
nStackNum=self.nStackNum, flipFlag=False, cropFlag=False)
normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, refineNetflag, resumeFolder=modelPath)
csvlogger = CSVLogger(os.path.join(normalizedErrorCallBack.get_folder_path(),
"csv_train_" + self.modelName + "_" + str(
datetime.datetime.now().strftime('%H:%M')) + ".csv"))
self.model.fit_generator(initial_epoch=initEpoch, generator=trainGen, steps_per_epoch=trainDt.get_dataset_size() // batchSize,
epochs=epochs, callbacks=[normalizedErrorCallBack, csvlogger])
def predict_image(self, imgfile):
# load image and preprocess
img = cv2.imread(imgfile)
img, _ = pad_image(img, list(), 512, 512)
img = normalize_image(img)
input = img[np.newaxis,:,:,:]
# inference
heatmap = self.model.predict(input)
return heatmap
def predict(self, input):
# inference
heatmap = self.model.predict(input)
return heatmap
================================================
FILE: src/unet/refinenet.py
================================================
from keras.models import *
from keras.layers import *
from keras.optimizers import Adam, SGD
from keras import backend as K
from keras.applications.resnet50 import ResNet50
IMAGE_ORDERING = 'channels_last'
def Res101RefineNetDilated(n_classes, inputHeight, inputWidth):
model = build_network_resnet101(inputHeight, inputWidth, n_classes, dilated=True)
return model
def Res101RefineNetStacked(n_classes, inputHeight, inputWidth, nStackNum):
model = build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStackNum)
return model
def euclidean_loss(x, y):
return K.sqrt(K.sum(K.square(x - y)))
def create_global_net(lowlevelFeatures, n_classes):
lf2x, lf4x, lf8x, lf16x = lowlevelFeatures
o = lf16x
o = (Conv2D(256, (3, 3), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf8x], axis=-1))
o = (Conv2D(128, (3, 3), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup8x = o
o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf4x], axis=-1))
o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup4x = o
o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf2x], axis=-1))
o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup2x = o
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x)
out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x)
out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x)
x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x)
eadd4x = Add(name='global4x')([x4x, out4x])
x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x)
eadd2x = Add(name='global2x')([x2x, out2x])
return (fup8x, eadd4x, eadd2x)
def create_refine_net(inputFeatures, n_classes):
f8x, f4x, f2x = inputFeatures
# 2 Conv2DTranspose f8x -> fup8x
fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_1', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(f8x)
fup8x = (BatchNormalization())(fup8x)
fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_2', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(fup8x)
fup8x = (BatchNormalization())(fup8x)
# 1 Conv2DTranspose f4x -> fup4x
fup4x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine4x_deconv', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(f4x)
fup4x = (BatchNormalization())(fup4x)
# 1 conv f2x -> fup2x
fup2x = (Conv2D(128, (3, 3), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x)
fup2x = (BatchNormalization())(fup2x)
# concat f2x, fup8x, fup4x
fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name='refine_concat'))
# 1x1 to map to required feature map
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat)
return out2x
def create_refine_net_bottleneck(inputFeatures, n_classes):
f8x, f4x, f2x = inputFeatures
    # two 1x1 convs, then 4x upsampling: f8x -> fup8x
fup8x = (Conv2D(256, kernel_size=(1, 1), name='refine8x_1', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f8x)
fup8x = (BatchNormalization())(fup8x)
fup8x = (Conv2D(128, kernel_size=(1, 1), name='refine8x_2', padding='same', activation='relu', data_format=IMAGE_ORDERING))(fup8x)
fup8x = (BatchNormalization())(fup8x)
fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x)
    # one 1x1 conv, then 2x upsampling: f4x -> fup4x
fup4x = (Conv2D(128, kernel_size=(1, 1), name='refine4x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f4x)
fup4x = (BatchNormalization())(fup4x)
fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x)
# 1 conv f2x -> fup2x
fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x)
fup2x = (BatchNormalization())(fup2x)
# concat f2x, fup8x, fup4x
fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name='refine_concat'))
# 1x1 to map to required feature map
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat)
return out2x
def create_stack_refinenet(inputFeatures, n_classes, layerName):
f8x, f4x, f2x = inputFeatures
    # two 1x1 convs, then 4x upsampling: f8x -> fup8x
fup8x = (Conv2D(256, kernel_size=(1, 1), name=layerName+'_refine8x_1', padding='same', activation='relu'))(f8x)
fup8x = (BatchNormalization())(fup8x)
fup8x = (Conv2D(128, kernel_size=(1, 1), name=layerName+'refine8x_2', padding='same', activation='relu'))(fup8x)
fup8x = (BatchNormalization())(fup8x)
out8x = fup8x
fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x)
    # one 1x1 conv, then 2x upsampling: f4x -> fup4x
fup4x = (Conv2D(128, kernel_size=(1, 1), name=layerName+'refine4x', padding='same', activation='relu'))(f4x)
fup4x = (BatchNormalization())(fup4x)
out4x = fup4x
fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x)
# 1 conv f2x -> fup2x
fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name=layerName+'refine2x_conv'))(f2x)
fup2x = (BatchNormalization())(fup2x)
# concat f2x, fup8x, fup4x
fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name=layerName+'refine_concat'))
# 1x1 to map to required feature map
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name=layerName+'refine2x')(fconcat)
return out8x, out4x, out2x
def create_global_net_dilated(lowlevelFeatures, n_classes):
lf2x, lf4x, lf8x, lf16x = lowlevelFeatures
o = lf16x
o = (Conv2D(256, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf8x], axis=-1))
o = (Conv2D(128, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup8x = o
o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf4x], axis=-1))
o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup4x = o
o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu',
data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, lf2x], axis=-1))
o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
fup2x = o
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x)
out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x)
out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x)
x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x)
eadd4x = Add(name='global4x')([x4x, out4x])
x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x)
eadd2x = Add(name='global2x')([x2x, out2x])
return (fup8x, eadd4x, eadd2x)
def build_network_resnet101(inputHeight, inputWidth, n_classes, frozenlayers=True, dilated=False):
input, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth)
# global net 8x, 4x, and 2x
if dilated:
g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes)
else:
g8x, g4x, g2x = create_global_net((lf2x, lf4x, lf8x, lf16x), n_classes)
# refine net, only 2x as output
refine2x = create_refine_net_bottleneck((g8x, g4x, g2x), n_classes)
model = Model(inputs=input, outputs=[g2x, refine2x])
adam = Adam(lr=1e-4)
model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"])
return model
def build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStack):
# backbone network
input, lf2x,lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth)
# global net
g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes)
s8x, s4x, s2x = g8x, g4x, g2x
outputs = [g2x]
for i in range(nStack):
s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i))
outputs.append(s2x)
model = Model(inputs=input, outputs=outputs)
adam = Adam(lr=1e-4)
model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"])
return model
def load_backbone_res101net(inputHeight, inputWidth):
from resnet101 import ResNet101
xresnet = ResNet101(weights='imagenet', include_top=False, input_shape=(inputHeight, inputWidth, 3))
xresnet.load_weights("../../data/resnet101_weights_tf.h5", by_name=True)
lf16x = xresnet.get_layer('res4b22_relu').output
lf8x = xresnet.get_layer('res3b2_relu').output
lf4x = xresnet.get_layer('res2c_relu').output
lf2x = xresnet.get_layer('conv1_relu').output
# add one padding for lf4x whose shape is 127x127
lf4xp = ZeroPadding2D(padding=((0, 1), (0, 1)))(lf4x)
return (xresnet.input, lf2x, lf4xp, lf8x, lf16x)
================================================
FILE: src/unet/refinenet_mask_v3.py
================================================
from refinenet import load_backbone_res101net, create_global_net_dilated, create_stack_refinenet
from keras.models import *
from keras.layers import *
from keras.optimizers import Adam, SGD
from keras import backend as K
import keras
def Res101RefineNetMaskV3(n_classes, inputHeight, inputWidth, nStackNum):
model = build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStackNum)
return model
def euclidean_loss(x, y):
return K.sqrt(K.sum(K.square(x - y)))
def apply_mask_to_output(output, mask):
output_with_mask = keras.layers.multiply([output, mask])
return output_with_mask
def build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStack):
    # masks live at the network's output resolution, i.e. half the input height and width
    input_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='mask')
    input_ohem_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='ohem_mask')
    # backbone network
    input_image, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth)
# global net
g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes)
s8x, s4x, s2x = g8x, g4x, g2x
g2x_mask = apply_mask_to_output(g2x, input_mask)
outputs = [g2x_mask]
for i in range(nStack):
s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i))
if i == (nStack-1): # last stack with ohem_mask
s2x_mask = apply_mask_to_output(s2x, input_ohem_mask)
else:
s2x_mask = apply_mask_to_output(s2x, input_mask)
outputs.append(s2x_mask)
model = Model(inputs=[input_image, input_mask, input_ohem_mask], outputs=outputs)
adam = Adam(lr=1e-4)
model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"])
return model
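A numpy sketch (assumptions mine, not repo code) of what `euclidean_loss` and `apply_mask_to_output` compute together: multiplying an output channel by a zero mask removes that keypoint's contribution to the loss, which is how category-irrelevant keypoints, and OHEM-rejected easy pixels in the last stack, are excluded from training.

```python
import numpy as np

# numpy equivalent of the Keras euclidean_loss above: L2 norm of the difference
def np_euclidean_loss(x, y):
    return np.sqrt(np.sum(np.square(x - y)))

pred = np.ones((4, 4, 3), dtype=np.float32)   # toy heatmaps, 3 keypoint channels
gt = np.zeros((4, 4, 3), dtype=np.float32)
mask = np.ones((4, 4, 3), dtype=np.float32)
mask[:, :, 2] = 0.0                           # 3rd keypoint absent for this category

masked_pred = pred * mask                     # element-wise, like keras.layers.multiply
masked_gt = gt * mask
# only the 2 unmasked channels contribute: sqrt(4*4*2 ones) = sqrt(32)
assert np.isclose(np_euclidean_loss(masked_pred, masked_gt), np.sqrt(32.0))
```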
================================================
FILE: src/unet/resnet101.py
================================================
# -*- coding: utf-8 -*-
"""ResNet-101 model for Keras.
# Reference:
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
Slightly modified Felix Yu's (https://github.com/flyyufelix) implementation of
ResNet-101 to have consistent API as those pre-trained models within
`keras.applications`. The original implementation is found here
https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294#file-resnet-101_keras-py
Implementation is based on Keras 2.0
"""
from keras.layers import (
Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D,
Flatten, Activation, GlobalAveragePooling2D, GlobalMaxPooling2D, add)
from keras.layers.normalization import BatchNormalization
from keras.models import Model
from keras import initializers
from keras.engine import Layer, InputSpec
from keras.engine.topology import get_source_inputs
from keras import backend as K
from keras.applications.imagenet_utils import _obtain_input_shape
from keras.utils.data_utils import get_file
import warnings
import sys
sys.setrecursionlimit(3000)
WEIGHTS_PATH_TH = 'https://dl.dropboxusercontent.com/s/rrp56zm347fbrdn/resnet101_weights_th.h5?dl=0'
WEIGHTS_PATH_TF = 'https://dl.dropboxusercontent.com/s/a21lyqwgf88nz9b/resnet101_weights_tf.h5?dl=0'
MD5_HASH_TH = '3d2e9a49d05192ce6e22200324b7defe'
MD5_HASH_TF = '867a922efc475e9966d0f3f7b884dc15'
class Scale(Layer):
    '''Learns a set of weights and biases used for scaling the input data.
    The output is simply an element-wise multiplication of the input by a set
    of learned constants plus a learned offset:
        out = in * gamma + beta,
    where 'gamma' and 'beta' are the learned weights and biases.
    # Arguments
        axis: integer, axis along which to normalize in mode 0. For instance,
            if your input tensor has shape (samples, channels, rows, cols),
            set axis to 1 to normalize per feature map (channels axis).
        momentum: momentum in the computation of the
            exponential average of the mean and standard deviation
            of the data, for feature-wise normalization.
        weights: Initialization weights.
            List of 2 Numpy arrays, with shapes:
            `[(input_shape,), (input_shape,)]`
        beta_init: name of initialization function for shift parameter
            (see [initializers](../initializers.md)), or alternatively,
            Theano/TensorFlow function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights`
            argument.
        gamma_init: name of initialization function for scale parameter (see
            [initializers](../initializers.md)), or alternatively,
            Theano/TensorFlow function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights`
            argument.
    '''
def __init__(self,
weights=None,
axis=-1,
momentum=0.9,
beta_init='zero',
gamma_init='one',
**kwargs):
self.momentum = momentum
self.axis = axis
self.beta_init = initializers.get(beta_init)
self.gamma_init = initializers.get(gamma_init)
self.initial_weights = weights
super(Scale, self).__init__(**kwargs)
def build(self, input_shape):
self.input_spec = [InputSpec(shape=input_shape)]
shape = (int(input_shape[self.axis]),)
self.gamma = K.variable(
self.gamma_init(shape),
name='{}_gamma'.format(self.name))
self.beta = K.variable(
self.beta_init(shape),
name='{}_beta'.format(self.name))
self.trainable_weights = [self.gamma, self.beta]
if self.initial_weights is not None:
self.set_weights(self.initial_weights)
del self.initial_weights
def call(self, x, mask=None):
input_shape = self.input_spec[0].shape
broadcast_shape = [1] * len(input_shape)
broadcast_shape[self.axis] = input_shape[self.axis]
out = K.reshape(
self.gamma,
broadcast_shape) * x + K.reshape(self.beta, broadcast_shape)
return out
def get_config(self):
config = {"momentum": self.momentum, "axis": self.axis}
base_config = super(Scale, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
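The `Scale` layer's arithmetic can be checked in plain numpy; this sketch (an illustration, not repo code) mirrors the broadcast-reshape done in `call` for a channels-last tensor:

```python
import numpy as np

# Per-channel affine transform, out = x * gamma + beta, with gamma/beta
# reshaped so they broadcast along the chosen axis (here channels-last).
def np_scale(x, gamma, beta, axis=-1):
    broadcast_shape = [1] * x.ndim
    broadcast_shape[axis] = x.shape[axis]
    return x * gamma.reshape(broadcast_shape) + beta.reshape(broadcast_shape)

x = np.ones((2, 4, 4, 3), dtype=np.float32)          # NHWC toy tensor
gamma = np.array([1.0, 2.0, 3.0], dtype=np.float32)  # one scale per channel
beta = np.array([0.5, 0.0, -0.5], dtype=np.float32)  # one shift per channel
out = np_scale(x, gamma, beta)
assert out.shape == x.shape
assert np.allclose(out[..., 1], 2.0)                 # 1 * 2.0 + 0.0
```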
def identity_block(input_tensor, kernel_size, filters, stage, block):
'''The identity_block is the block that has no conv layer at shortcut
# Arguments
input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main
            path
filters: list of integers, the nb_filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
'''
eps = 1.1e-5
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
nb_filter1, nb_filter2, nb_filter3 = filters
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
scale_name_base = 'scale' + str(stage) + block + '_branch'
x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a',
use_bias=False)(input_tensor)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2a')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
x = Activation('relu', name=conv_name_base + '2a_relu')(x)
x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
x = Conv2D(nb_filter2, (kernel_size, kernel_size),
name=conv_name_base + '2b', use_bias=False)(x)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2b')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
x = Activation('relu', name=conv_name_base + '2b_relu')(x)
x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c',
use_bias=False)(x)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2c')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)
x = add([x, input_tensor], name='res' + str(stage) + block)
x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
return x
def conv_block(input_tensor,
kernel_size,
filters,
stage,
block,
strides=(2, 2)):
'''conv_block is the block that has a conv layer at shortcut
# Arguments
input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main
            path
filters: list of integers, the nb_filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
Note that from stage 3, the first conv layer at main path is with
strides=(2,2). And the shortcut should have strides=(2,2) as well
'''
eps = 1.1e-5
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
nb_filter1, nb_filter2, nb_filter3 = filters
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
scale_name_base = 'scale' + str(stage) + block + '_branch'
x = Conv2D(nb_filter1, (1, 1), strides=strides,
name=conv_name_base + '2a', use_bias=False)(input_tensor)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2a')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
x = Activation('relu', name=conv_name_base + '2a_relu')(x)
x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
x = Conv2D(nb_filter2, (kernel_size, kernel_size),
name=conv_name_base + '2b', use_bias=False)(x)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2b')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
x = Activation('relu', name=conv_name_base + '2b_relu')(x)
x = Conv2D(nb_filter3, (1, 1),
name=conv_name_base + '2c', use_bias=False)(x)
x = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '2c')(x)
x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)
shortcut = Conv2D(nb_filter3, (1, 1), strides=strides,
name=conv_name_base + '1', use_bias=False)(input_tensor)
shortcut = BatchNormalization(epsilon=eps, axis=bn_axis,
name=bn_name_base + '1')(shortcut)
shortcut = Scale(axis=bn_axis, name=scale_name_base + '1')(shortcut)
x = add([x, shortcut], name='res' + str(stage) + block)
x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
return x
def ResNet101(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000):
"""Instantiates the ResNet-101 architecture.
Optionally loads weights pre-trained on ImageNet. Note that when using
TensorFlow, for best performance you should set
    `image_data_format='channels_last'` in your Keras config at
~/.keras/keras.json.
The model and the weights are compatible with both TensorFlow and Theano.
The data format convention used by the model is the one specified in your
Keras config file.
Parameters
----------
include_top: whether to include the fully-connected layer at the top of
the network.
weights: one of `None` (random initialization) or 'imagenet'
(pre-training on ImageNet).
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified if
`include_top` is False (otherwise the input shape has to be
`(224, 224, 3)` (with `channels_last` data format) or
`(3, 224, 224)` (with `channels_first` data format). It should have
        exactly 3 input channels, and width and height should be no
smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction when
`include_top` is `False`.
- `None` means that the output of the model will be the 4D tensor
output of the last convolutional layer.
- `avg` means that global average pooling will be applied to the
output of the last convolutional layer, and thus the output of
the model will be a 2D tensor.
- `max` means that global max pooling will be applied.
classes: optional number of classes to classify images into, only to be
specified if `include_top` is True, and if no `weights` argument is
specified.
Returns
-------
A Keras model instance.
Raises
------
ValueError: in case of invalid argument for `weights`, or invalid input
shape.
"""
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=224,
min_size=197,
data_format=K.image_data_format(),
require_flatten=include_top,
weights=weights)
if input_tensor is None:
img_input = Input(shape=input_shape, name='data')
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(
tensor=input_tensor, shape=input_shape, name='data')
else:
img_input = input_tensor
if K.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
eps = 1.1e-5
x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
x = BatchNormalization(epsilon=eps, axis=bn_axis, name='bn_conv1')(x)
x = Scale(axis=bn_axis, name='scale_conv1')(x)
x = Activation('relu', name='conv1_relu')(x)
x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
for i in range(1, 3):
x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))
x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
for i in range(1, 23):
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))
x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
x = AveragePooling2D((7, 7), name='avg_pool')(x)
if include_top:
x = Flatten()(x)
x = Dense(classes, activation='softmax', name='mmfc1000')(x)
else:
if pooling == 'avg':
x = GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = GlobalMaxPooling2D()(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='resnet101')
'''
# load weights
if weights == 'imagenet':
filename = 'resnet101_weights_{}.h5'.format(K.image_dim_ordering())
if K.backend() == 'theano':
path = WEIGHTS_PATH_TH
md5_hash = MD5_HASH_TH
else:
path = WEIGHTS_PATH_TF
md5_hash = MD5_HASH_TF
weights_path = get_file(
fname=filename,
origin=path,
cache_subdir='models',
md5_hash=md5_hash,
hash_algorithm='md5')
model.load_weights(weights_path, by_name=True)
if K.image_data_format() == 'channels_first' and K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image data format convention '
'(`image_data_format="channels_first"`). '
'For best performance, set '
'`image_data_format="channels_last"` in '
'your Keras config '
'at ~/.keras/keras.json.')
'''
return model
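The backbone tap points used by `load_backbone_res101net` in refinenet.py correspond to fixed downsampling factors of this network. A small plain-Python sketch (my own illustration, not repo code) computes them for a 512x512 input: conv1 is a stride-2 7x7 conv with 3px zero padding, pool1 a stride-2 3x3 max pool with no padding, which is why the stage-2 feature map comes out 127x127 and is zero-padded to 128x128 upstream.

```python
# Standard conv/pool output-size formula: floor((size + 2*pad - kernel)/stride) + 1
def conv_out(size, kernel, stride, pad):
    return (size + 2 * pad - kernel) // stride + 1

s = 512
s = conv_out(s, kernel=7, stride=2, pad=3)   # conv1_relu (lf2x, 1/2 resolution)
assert s == 256
s = conv_out(s, kernel=3, stride=2, pad=0)   # pool1 -> stage-2 output (lf4x)
assert s == 127                              # hence the ZeroPadding2D to 128x128
```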
================================================
FILE: submission/placeholder.txt
================================================
================================================
FILE: trained_models/placeholder.txt
================================================
================================================
SYMBOL INDEX (103 symbols across 15 files)
================================================
FILE: src/data_gen/data_generator.py
class DataGenerator (line 15) | class DataGenerator(object):
method __init__ (line 17) | def __init__(self, category, annfile):
method get_dim_order (line 22) | def get_dim_order(self):
method get_dataset_size (line 26) | def get_dataset_size(self):
method generator_with_mask_ohem (line 29) | def generator_with_mask_ohem(self, graph, kerasModel, batchSize=16, in...
method _initialize (line 83) | def _initialize(self):
method _load_anno (line 86) | def _load_anno(self):
method _prcoess_img (line 98) | def _prcoess_img(self, dfrow, inputSize, rotateFlag, flipFlag, cropFla...
method __generate_hmap (line 148) | def __generate_hmap(self, cvmat, kpAnnolst):
method flip_image (line 171) | def flip_image(self, orgimg, orgKpAnolst):
method flip_annlst (line 177) | def flip_annlst(self, kpannlst, imgshape):
FILE: src/data_gen/data_process.py
function normalize_image (line 7) | def normalize_image(cvmat):
function resize_image (line 13) | def resize_image(cvmat, targetWidth, targetHeight):
function pad_image (line 40) | def pad_image(cvmat, kpAnno, targetWidth, targetHeight):
function pad_image_inference (line 74) | def pad_image_inference(cvmat, targetWidth, targetHeight):
function rotate_image (line 104) | def rotate_image(cvmat, kpAnnLst, rotateAngle):
function rotate_image_with_invrmat (line 133) | def rotate_image_with_invrmat(cvmat, rotateAngle):
function rotate_mask (line 158) | def rotate_mask(mask, rotateAngle):
function rotate_image_float (line 164) | def rotate_image_float(cvmat, rotateAngle, borderValue=(0.0, 0.0, 0.0)):
function crop_image (line 187) | def crop_image(cvmat, kpAnnLst, lowLimitRatio, upLimitRatio):
FILE: src/data_gen/dataset.py
function getKpNum (line 3) | def getKpNum(category):
function getFlipKeys (line 49) | def getFlipKeys(category):
function getFlipMapID (line 70) | def getFlipMapID(category, partid):
function getKpKeys (line 76) | def getKpKeys(category):
function fill_dataframe (line 116) | def fill_dataframe(kplst, category, dfrow):
function get_kp_index_from_allkeys (line 129) | def get_kp_index_from_allkeys(kpname):
function generate_input_mask (line 138) | def generate_input_mask(image_category, shape, nobgFlag=True):
FILE: src/data_gen/kpAnno.py
class KpAnno (line 4) | class KpAnno(object):
method __init__ (line 8) | def __init__(self, x, y, visibility):
method readFromStr (line 14) | def readFromStr(cls, xstr):
method applyScale (line 22) | def applyScale(cls, kpAnno, scale):
method applyRotate (line 29) | def applyRotate(cls, kpAnno, rotateMatrix):
method applyOffset (line 35) | def applyOffset(cls, kpAnno, offset):
method calcDistance (line 42) | def calcDistance(kpA, kpB):
FILE: src/data_gen/ohem.py
function generate_topk_mask_ohem (line 10) | def generate_topk_mask_ohem(input_data, gthmap, keras_model, graph, topK...
function adjsut_mask (line 58) | def adjsut_mask(loss, input_mask, topk):
FILE: src/data_gen/utils.py
function make_gaussian (line 6) | def make_gaussian(width, height, sigma=3, center=None):
function split_csv_train_val (line 25) | def split_csv_train_val(allcsv, traincsv, valcsv, ratio=0.8):
function np_euclidean_l2 (line 41) | def np_euclidean_l2(x, y):
function load_annotation_from_df (line 48) | def load_annotation_from_df(df, category):
FILE: src/eval/eval_callback.py
class NormalizedErrorCallBack (line 7) | class NormalizedErrorCallBack(keras.callbacks.Callback):
method __init__ (line 9) | def __init__(self, foldpath, category, multiOut=False, resumeFolder=No...
method get_folder_path (line 23) | def get_folder_path(self):
method on_epoch_end (line 26) | def on_epoch_end(self, epoch, logs=None):
FILE: src/eval/evaluation.py
class Evaluation (line 21) | class Evaluation(object):
method __init__ (line 22) | def __init__(self, category, modelFile):
method init_from_model (line 28) | def init_from_model(self, model):
method eval (line 32) | def eval(self, multiOut=False, details=False, flip=True):
method _initialize (line 55) | def _initialize(self, modelFile):
method _initialize_network (line 59) | def _initialize_network(self, modelFile):
method _load_anno (line 62) | def _load_anno(self):
method _get_groundtruth_kpAnno (line 74) | def _get_groundtruth_kpAnno(self, dfrow):
method _net_inference_with_mask (line 81) | def _net_inference_with_mask(self, imgFile, imgCategory):
method _heatmap_sum (line 101) | def _heatmap_sum(self, heatmaplst):
method predict_kp (line 107) | def predict_kp(self, imgFile, imgCategory, multiOutput=False):
method predict_kp_with_flip (line 129) | def predict_kp_with_flip(self, imgFile, imgCategory):
method _net_inference_flip (line 144) | def _net_inference_flip(self, imgFile, imgCategory):
method predict_kp_with_rotate (line 190) | def predict_kp_with_rotate(self, imgFile, imgCategory):
method _net_inference_rotate (line 210) | def _net_inference_rotate(self, imgFile, imgCategory):
method _flip_out_heatmap (line 248) | def _flip_out_heatmap(self, flipout):
method get_normized_distance (line 258) | def get_normized_distance(category, gtKp):
method calc_ne_score (line 290) | def calc_ne_score(category, dtKp, gtKp):
FILE: src/eval/post_process.py
function post_process_heatmap (line 7) | def post_process_heatmap(heatMap, kpConfidenceTh=0.2):
function non_max_supression (line 23) | def non_max_supression(plain, windowSize=3, threshold=1e-6):
FILE: src/top/demo.py
function visualize_keypoint (line 13) | def visualize_keypoint(imageName, category, dtkp):
function demo (line 22) | def demo(modelfile):
FILE: src/top/test.py
function get_best_single_model (line 16) | def get_best_single_model(valfile):
function fill_dataframe (line 40) | def fill_dataframe(kplst, keys, dfrow, image_category):
function get_kp_from_dict (line 51) | def get_kp_from_dict(mdict, image_category, image_id):
function submission (line 58) | def submission(pklpath):
function load_image_names (line 91) | def load_image_names(annfile, category):
function main_test (line 97) | def main_test(savepath, modelpath, augmentFlag):
FILE: src/unet/fashion_net.py
class FashionNet (line 19) | class FashionNet(object):
method __init__ (line 21) | def __init__(self, inputHeight, inputWidth, nClasses):
method build_model (line 26) | def build_model(self, modelName='v2', show=False):
method train (line 37) | def train(self, category, batchSize=8, epochs=20, lrschedule=False):
method load_model (line 53) | def load_model(self, netWeightFile):
method resume_train (line 56) | def resume_train(self, category, pretrainModel, modelName, initEpoch, ...
method predict_image (line 80) | def predict_image(self, imgfile):
method predict (line 91) | def predict(self, input):
FILE: src/unet/refinenet.py
function Res101RefineNetDilated (line 9) | def Res101RefineNetDilated(n_classes, inputHeight, inputWidth):
function Res101RefineNetStacked (line 13) | def Res101RefineNetStacked(n_classes, inputHeight, inputWidth, nStackNum):
function euclidean_loss (line 17) | def euclidean_loss(x, y):
function create_global_net (line 21) | def create_global_net(lowlevelFeatures, n_classes):
function create_refine_net (line 62) | def create_refine_net(inputFeatures, n_classes):
function create_refine_net_bottleneck (line 93) | def create_refine_net_bottleneck(inputFeatures, n_classes):
function create_stack_refinenet (line 125) | def create_stack_refinenet(inputFeatures, n_classes, layerName):
function create_global_net_dilated (line 157) | def create_global_net_dilated(lowlevelFeatures, n_classes):
function build_network_resnet101 (line 199) | def build_network_resnet101(inputHeight, inputWidth, n_classes, frozenla...
function build_network_resnet101_stack (line 219) | def build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nS...
function load_backbone_res101net (line 240) | def load_backbone_res101net(inputHeight, inputWidth):
FILE: src/unet/refinenet_mask_v3.py
function Res101RefineNetMaskV3 (line 9) | def Res101RefineNetMaskV3(n_classes, inputHeight, inputWidth, nStackNum):
function euclidean_loss (line 13) | def euclidean_loss(x, y):
function apply_mask_to_output (line 16) | def apply_mask_to_output(output, mask):
function build_resnet101_stack_mask_v3 (line 20) | def build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nS...
FILE: src/unet/resnet101.py
class Scale (line 38) | class Scale(Layer):
method __init__ (line 73) | def __init__(self,
method build (line 87) | def build(self, input_shape):
method call (line 103) | def call(self, x, mask=None):
method get_config (line 113) | def get_config(self):
function identity_block (line 119) | def identity_block(input_tensor, kernel_size, filters, stage, block):
function conv_block (line 165) | def conv_block(input_tensor,
function ResNet101 (line 224) | def ResNet101(include_top=True,