Repository: aimagelab/dress-code Branch: main Commit: 2e9e7ed55bec Files: 10 Total size: 36.2 KB Directory structure: gitextract_dt5ybub4/ ├── LICENCE ├── README.md ├── conf.py ├── data/ │ ├── __init__.py │ ├── dataloader.py │ ├── dataset.py │ └── labelmap.py ├── main.py └── utils/ ├── __init__.py └── label_map.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: LICENCE ================================================ Yoox Net-a-Porter Dress Code Dataset – Licence Terms SUMMARY This is a summary of, and not a substitution for, the licence terms set out below. - You are free to use and distribute the Dress Code Dataset for the purposes of non-commercial academic research, teaching and publication. - You can use the Dress Code Dataset only as provided: you are not permitted to create and distribute alterations to the Dress Code Dataset or any part of it. - When you share or publish any part of the Dress Code Dataset you must include an attribution. - The Dress Code Dataset is provided “as is” and without warranty. - Your rights to use the Dress Code Dataset may be revoked. LICENCE TERMS By making any use of the Dress Code Dataset (as defined below), you accept and agree to comply with the terms and conditions of this Licence. Your right to use the Dress Code Dataset is subject to and conditional upon your compliance with those terms and conditions. *** Definitions *** - The Dress Code Dataset means the dataset of image pairs, for image-based virtual try-on, made available by us through Unimore. - Unimore means The University of Modena and Reggio Emilia, whose address is at Via Università, 4, 41121 Modena MO, Italy. - We - or YNAP means Yoox Net-a-Porter Group S.p.A., whose address is at Via Morimondo, 17, 20143 Milan MI, Italy. 
- You means the natural or legal person making use of the Dress Code Dataset *** What you can do *** You can use the Dress Code Dataset only for the purposes of research, teaching and publication (which must in each case be academic and non-commercial). For the purposes of this provision: - using the Dress Code Dataset means you can use, copy, publish, distribute and transmit it (in whole or in part), but does not allow you to adapt it except as expressly set out below. - non-commercial means you cannot use the Dress Code Dataset for purposes which are mainly directed towards payment or some other commercial advantage. You are not prohibited from academic use simply because it has some incidental commercial nature. For example, use of the Dress Code Dataset in academic publication is permitted even if the relevant journal is subject to a subscription fee, and use of the Dress Code Dataset in academic teaching is permitted even if that teaching is subject to tuition fees. - academic means in connection with education, teaching and research activities undertaken by an accredited not-for-profit academic institution, and excludes research undertaken in collaboration with any commercial entity except under terms which prohibit that entity from making commercial use of any results arising from that research. We grant to you a worldwide, royalty-free, non-sublicensable, non-exclusive license under our respective rights (including copyright and database right) in the Dress Code Dataset for the permitted purposes set out above. You may adapt the Dress Code Dataset only as necessary to make it interoperable with any other systems or technology which you are using for non-commercial academic research, teaching or publication. Otherwise, you may not adapt it without our permission. 
Any adaptations which you do make (for interoperability, or with our permission) and any other works you make using the Dress Code Dataset will be subject to the same terms and restrictions of this Licence: in particular that means you cannot use them for commercial purposes. If you create any adaptations or other works using the Dress Code Dataset, then you hereby grant to us a non-exclusive, royalty free, perpetual, irrevocable licence to use, copy, modify, distribute, and otherwise exploit those adaptations or other works for any purposes. *** Attribution and your downstream obligations *** If you share the Dress Code Dataset (in whole or in part), you must: - retain any notices which identify the origins, authors, or rightsholders of the Dress Code Dataset (including without limitation any copyright notice); - retain any notices which refer to this Licence in whole or in part, or which contain any disclaimers of warranties in relation to the Dress Code Dataset; - actively notify the recipient that their use of the Dress Code Dataset will be subject to this Licence (and provide them with a link); - not seek to impose any additional obligations on the recipient in relation to their use of the Dress Code Dataset which would be incompatible with the terms of this Licence. If you publish the Dress Code Dataset (in whole or in part), you must include in that publication an attribution in the following form: “The Dress Code Dataset is proprietary to and © Yoox Net-a-Porter Group S.p.A., and its licensors. 
It is distributed by the University of Modena and Reggio Emilia, and available for non-commercial academic use under licence terms set out at https://github.com/aimagelab/dress-code.” While you must ensure that you acknowledge us as the source of the Dress Code Dataset as described above, you must not in doing so, or otherwise, suggest that you or your use of the Dress Code Dataset is endorsed or sponsored by us, or otherwise connected with us, unless we have separately given you permission to do so. *** What you must not do *** You must not use the Dress Code Dataset except as expressly permitted above. In particular: - You may not use the Dress Code Dataset, or any adaptations or other works created using the Dress Code Dataset, for commercial purposes. - You may not adapt the Dress Code Dataset except as expressly set out above. In particular, you may not create new works derived from or based upon the Dress Code Dataset in which the Dress Code Dataset is wholly or partially translated, altered, or modified in a manner which would otherwise require permissions from the relevant rightsholder under applicable laws relating to copyright and/or database rights. - The images in the Dress Code Dataset depict garments whose designs may be protected by copyright and/or design rights. You are not granted any licence under those rights except in connection with your permitted use of the Dress Code Dataset. In particular you are not authorised to make or manufacture any of those garments, or to alter their designs in your use of the Dress Code Dataset. - The images in the Dress Code Dataset have been cropped so that the models featured in those images are not individually identifiable. You must not attempt to re-identify those models. If you become aware of any use of the Dress Code Dataset other than as permitted by this Licence (whether by your own organisation or any third party) you must notify us promptly at legalteam_it@ynap.com and segreteria.aimagelab@unimore.it. 
*** No warranty ***

The Dress Code Dataset is provided to you “as is” and we exclude all representations, warranties, and liabilities in relation to the Dress Code Dataset (including without limitation as to quality, fitness for purpose, non-infringement and accuracy) to the maximum extent permitted by law. Without limiting that exclusion, we will not be liable for any errors or omissions in the Dress Code Dataset or for any loss or damages of any kind arising from its use. We do not guarantee that the Dress Code Dataset will remain available, and we may withdraw it at any time.

*** Term and Termination ***

This Licence applies for the entire duration of any copyright, database right or other rights we may have in the Dress Code Dataset. If you breach any of your obligations under this Licence, then without limiting our other rights and remedies, your rights under it will automatically terminate. Your rights will be reinstated automatically if you remedy that breach within thirty days after discovery or may be reinstated by our written approval. We may withdraw this Licence at any time, and if we do you must stop using the Dress Code Dataset as soon as possible unless we make it available to you on alternative terms and conditions.

*** Other provisions ***

We will not be subject to any additional terms or conditions which you may seek to introduce in relation to your use of the Dress Code Dataset. This Licence does not limit your lawful freedoms to use the Dress Code Dataset in reliance on exceptions and exclusions from laws relating to copyright or database rights (such as fair dealing or fair use). If you are in breach of this Licence, we can only waive that breach by written notice. If we do not immediately take action in relation to your breach, we may still do so later. This Licence is the entire agreement between you and us relating to your use of the Dress Code Dataset. 
You acknowledge that you have not entered into this Licence based on any other representation or warranty. This Licence is governed by the laws of Italy.

*** About this Licence ***

This licence is © Yoox Net-a-Porter Group S.p.A.

================================================
FILE: README.md
================================================

# Dress Code Dataset

This repository presents the virtual try-on dataset proposed in:

*D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara*
**Dress Code: High-Resolution Multi-Category Virtual Try-On**
**[[Paper](https://arxiv.org/abs/2204.08532)]** **[[Dataset Request Form](https://forms.gle/72Bpeh48P7zQimin7)]** **[[Try-On Demo](https://ailb-web.ing.unimore.it/dress-code)]**

**IMPORTANT!**
- By making any use of the Dress Code Dataset, you accept and agree to comply with the terms and conditions reported [here](https://github.com/aimagelab/dress-code/blob/main/LICENCE).
- The dataset will not be released to private companies.
- When filling in the dataset request form, non-institutional emails (e.g. gmail.com, qq.com, etc.) are not allowed.
- The signed release agreement form is mandatory (see the dataset request form for more details). Incomplete or unsigned release agreement forms are not accepted and will not receive a response. Typed signatures are not allowed.

**Requests are manually validated on a weekly basis. If you do not receive a response, your request does not meet the outlined requirements.**
Please cite with the following BibTeX:

```
@inproceedings{morelli2022dresscode,
  title={{Dress Code: High-Resolution Multi-Category Virtual Try-On}},
  author={Morelli, Davide and Fincato, Matteo and Cornia, Marcella and Landi, Federico and Cesari, Fabio and Cucchiara, Rita},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}
```

## Dataset

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.
The dataset contains more than 50k high-resolution model-garment image pairs divided into three different categories (*i.e.* dresses, upper-body clothes, lower-body clothes).

### Summary
- 53792 garments
- 107584 images
- 3 categories
  - upper body
  - lower body
  - dresses
- 1024 x 768 image resolution
- additional info
  - keypoints
  - skeletons
  - human label maps
  - human dense poses

### Additional Info
Along with each model-garment image pair, we also provide the keypoints, skeleton, human label map, and dense pose.

### Keypoints
For all image pairs of the dataset, we stored the joint coordinates of human poses. In particular, we used [OpenPose](https://github.com/Hzzone/pytorch-openpose) [1] to extract 18 keypoints for each human body. For each image, we provide a JSON file containing a dictionary with the `keypoints` key. The value of this key is a list of 18 elements, representing the joints of the human body. Each element is a list of 4 values, where the first two indicate the coordinates on the x and y axis, respectively.

### Skeletons
Skeletons are RGB images obtained by connecting the keypoints with lines.

### Human Label Map
We employed a human parser to assign each pixel of the image to a specific category, thus obtaining a segmentation mask for each target model. Specifically, we used the [SCHP model](https://github.com/PeikeLi/Self-Correction-Human-Parsing) [2] trained on the ATR dataset, a large single-person human parsing dataset focused on fashion images with 18 classes. The resulting images are composed of a single channel filled with the category label value. Categories are mapped as follows:

```ruby
0   background
1   hat
2   hair
3   sunglasses
4   upper_clothes
5   skirt
6   pants
7   dress
8   belt
9   left_shoe
10  right_shoe
11  head
12  left_leg
13  right_leg
14  left_arm
15  right_arm
16  bag
17  scarf
```

### Human Dense Pose
We also extracted dense labels and UV mappings from all the model images using [DensePose](https://github.com/facebookresearch/detectron2/tree/main/projects/DensePose) [3].
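As a minimal sketch of how the keypoint files described above can be read (the helper name `load_keypoints` is ours, not part of the repository; the on-disk layout matches what the provided dataset loader expects):

```python
import json

import numpy as np


def load_keypoints(path):
    """Load a Dress Code keypoint JSON file.

    Returns an (18, 4) array: one row per OpenPose joint, where
    columns 0 and 1 are the x and y coordinates, respectively.
    """
    with open(path, 'r') as f:
        pose_label = json.load(f)
    # The file stores a dict with a 'keypoints' key holding 18 joints,
    # each a list of 4 values.
    pose_data = np.array(pose_label['keypoints']).reshape((-1, 4))
    return pose_data
```

Joints with coordinates at or below 1 are treated as missing by the provided dataset code.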
## Experimental Results

### Low Resolution (256 x 192)

| Name | SSIM ↑ | FID ↓ | KID ↓ |
|---|---|---|---|
| CP-VTON [4] | 0.803 | 35.16 | 2.245 |
| CP-VTON+ [5] | 0.902 | 25.19 | 1.586 |
| CP-VTON* [4] | 0.874 | 18.99 | 1.117 |
| PFAFN [6] | 0.902 | 14.38 | 0.743 |
| VITON-GT [7] | 0.899 | 13.80 | 0.711 |
| WUTON [8] | 0.902 | 13.28 | 0.771 |
| ACGPN [9] | 0.868 | 13.79 | 0.818 |
| OURS | 0.906 | 11.40 | 0.570 |
## Code

Due to a collaboration agreement with the company, we cannot release the code. However, we supply a skeleton PyTorch project to load the data.

## References

[1] Cao, et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." IEEE TPAMI, 2019.
[2] Li, et al. "Self-Correction for Human Parsing." arXiv, 2019.
[3] Güler, et al. "DensePose: Dense Human Pose Estimation in the Wild." CVPR, 2018.
[4] Wang, et al. "Toward Characteristic-Preserving Image-based Virtual Try-On Network." ECCV, 2018.
[5] Minar, et al. "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On." CVPR Workshops, 2020.
[6] Ge, et al. "Parser-Free Virtual Try-On via Distilling Appearance Flows." CVPR, 2021.
[7] Fincato, et al. "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations." ICPR, 2020.
[8] Issenhuth, et al. "Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On." ECCV, 2020.
[9] Yang, et al. "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content." CVPR, 2020.

## Contact

If you have any general doubt about our dataset, please use the [public issues section](https://github.com/aimagelab/dress-code/issues) on this GitHub repo. Alternatively, drop us an e-mail at davide.morelli [at] unimore.it or marcella.cornia [at] unimore.it.
================================================
FILE: conf.py
================================================

import argparse


def get_conf(train=True):
    parser = argparse.ArgumentParser()
    parser.add_argument("--exp_name", type=str, default="")
    parser.add_argument("--category", default='all', type=str)
    parser.add_argument("--dataroot", type=str, default="")
    parser.add_argument("--data_pairs", default="{}_pairs")
    parser.add_argument('--checkpoint_dir', type=str, default='', help='save checkpoint infos')
    parser.add_argument('-b', '--batch_size', type=int, default=8)
    parser.add_argument('-j', '--workers', type=int, default=0)
    parser.add_argument("--epochs", type=int, default=150)
    parser.add_argument("--step", type=int, default=100000)
    parser.add_argument("--display_count", type=int, default=1000)
    # note: with default=True and action='store_true', this flag is always enabled
    parser.add_argument("--shuffle", default=True, action='store_true', help='shuffle input data')
    parser.add_argument("--height", type=int, default=256)
    parser.add_argument("--width", type=int, default=192)
    parser.add_argument("--radius", type=int, default=5)
    args = parser.parse_args()
    print(args)
    return args


================================================
FILE: data/__init__.py
================================================

from .dataset import Dataset
from .dataloader import DataLoader


================================================
FILE: data/dataloader.py
================================================

import torch


class DataLoader(object):
    def __init__(self, opt, dataset, dist_sampler=False):
        super(DataLoader, self).__init__()
        if dist_sampler:
            train_sampler = torch.utils.data.distributed.DistributedSampler(
                dataset, num_replicas=opt.world_size, rank=opt.rank, shuffle=True)
        else:
            if opt.shuffle:
                train_sampler = torch.utils.data.sampler.RandomSampler(dataset)
            else:
                train_sampler = None
        self.sampler = train_sampler
        self.data_loader = torch.utils.data.DataLoader(
            dataset, batch_size=opt.batch_size, shuffle=(train_sampler is None),
            num_workers=opt.workers, pin_memory=True, sampler=train_sampler)
        self.dataset = dataset
        self.data_iter = self.data_loader.__iter__()

    def next_batch(self):
        # Re-create the iterator when the underlying loader is exhausted,
        # so batches can be drawn indefinitely.
        try:
            batch = self.data_iter.__next__()
        except StopIteration:
            self.data_iter = self.data_loader.__iter__()
            batch = self.data_iter.__next__()
        return batch


================================================
FILE: data/dataset.py
================================================

import json
import os
from typing import List, Tuple

import cv2
import numpy as np
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
from numpy.linalg import lstsq
from PIL import Image, ImageDraw

from data.labelmap import label_map


class Dataset(data.Dataset):
    def __init__(self, args, dataroot_path: str, phase: str, order: str = 'paired',
                 category: List[str] = ['dresses', 'upper_body', 'lower_body'],
                 size: Tuple[int, int] = (256, 192)):
        """
        Initialize the PyTorch Dataset class
        :param args: argparse parameters
        :type args: argparse
        :param dataroot_path: dataset root folder
        :type dataroot_path: string
        :param phase: phase (train | test)
        :type phase: string
        :param order: setting (paired | unpaired)
        :type order: string
        :param category: clothing category (upper_body | lower_body | dresses)
        :type category: list(str)
        :param size: image size (height, width)
        :type size: tuple(int)
        """
        super(Dataset, self).__init__()
        self.args = args
        self.dataroot = dataroot_path
        self.phase = phase
        self.category = category
        self.height = size[0]
        self.width = size[1]
        self.radius = args.radius
        self.transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])
        self.transform2D = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,))
        ])

        im_names = []
        c_names = []
        dataroot_names = []
        for c in category:
            assert c in ['dresses', 'upper_body', 'lower_body']
            dataroot = os.path.join(self.dataroot, c)
            if phase == 'train':
                filename = os.path.join(dataroot, f"{phase}_pairs.txt")
            else:
                filename = os.path.join(dataroot, f"{phase}_pairs_{order}.txt")
            with open(filename, 'r') as f:
                for line in f.readlines():
                    im_name, c_name = line.strip().split()
                    im_names.append(im_name)
                    c_names.append(c_name)
                    dataroot_names.append(dataroot)
        self.im_names = im_names
        self.c_names = c_names
        self.dataroot_names = dataroot_names

    def __getitem__(self, index):
        """
        For each index return the corresponding sample in the dataset
        :param index: data index
        :type index: int
        :return: dict containing dataset samples
        :rtype: dict
        """
        c_name = self.c_names[index]
        im_name = self.im_names[index]
        dataroot = self.dataroot_names[index]

        # Clothing image
        cloth = Image.open(os.path.join(dataroot, 'images', c_name))
        cloth = cloth.resize((self.width, self.height))
        cloth = self.transform(cloth)  # [-1,1]

        # Person image
        im = Image.open(os.path.join(dataroot, 'images', im_name))
        im = im.resize((self.width, self.height))
        im = self.transform(im)  # [-1,1]

        # Skeleton
        skeleton = Image.open(os.path.join(dataroot, 'skeletons', im_name.replace("_0", "_5")))
        skeleton = skeleton.resize((self.width, self.height))
        skeleton = self.transform(skeleton)

        # Label Map
        parse_name = im_name.replace('_0.jpg', '_4.png')
        im_parse = Image.open(os.path.join(dataroot, 'label_maps', parse_name))
        im_parse = im_parse.resize((self.width, self.height), Image.NEAREST)
        parse_array = np.array(im_parse)

        parse_shape = (parse_array > 0).astype(np.float32)
        parse_head = (parse_array == 1).astype(np.float32) + \
                     (parse_array == 2).astype(np.float32) + \
                     (parse_array == 3).astype(np.float32) + \
                     (parse_array == 11).astype(np.float32)

        parser_mask_fixed = (parse_array == label_map["hair"]).astype(np.float32) + \
                            (parse_array == label_map["left_shoe"]).astype(np.float32) + \
                            (parse_array == label_map["right_shoe"]).astype(np.float32) + \
                            (parse_array == label_map["hat"]).astype(np.float32) + \
                            (parse_array == label_map["sunglasses"]).astype(np.float32) + \
                            (parse_array == label_map["scarf"]).astype(np.float32) + \
                            (parse_array == label_map["bag"]).astype(np.float32)

        parser_mask_changeable = (parse_array == label_map["background"]).astype(np.float32)

        arms = (parse_array == 14).astype(np.float32) + (parse_array == 15).astype(np.float32)

        if dataroot.split('/')[-1] == 'dresses':
            label_cat = 7
            parse_cloth = (parse_array == 7).astype(np.float32)
            parse_mask = (parse_array == 7).astype(np.float32) + \
                         (parse_array == 12).astype(np.float32) + \
                         (parse_array == 13).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))
        elif dataroot.split('/')[-1] == 'upper_body':
            label_cat = 4
            parse_cloth = (parse_array == 4).astype(np.float32)
            parse_mask = (parse_array == 4).astype(np.float32)
            parser_mask_fixed += (parse_array == label_map["skirt"]).astype(np.float32) + \
                                 (parse_array == label_map["pants"]).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))
        elif dataroot.split('/')[-1] == 'lower_body':
            label_cat = 6
            parse_cloth = (parse_array == 6).astype(np.float32)
            parse_mask = (parse_array == 6).astype(np.float32) + \
                         (parse_array == 12).astype(np.float32) + \
                         (parse_array == 13).astype(np.float32)
            parser_mask_fixed += (parse_array == label_map["upper_clothes"]).astype(np.float32) + \
                                 (parse_array == 14).astype(np.float32) + \
                                 (parse_array == 15).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))

        parse_head = torch.from_numpy(parse_head)  # [0,1]
        parse_cloth = torch.from_numpy(parse_cloth)  # [0,1]
        parse_mask = torch.from_numpy(parse_mask)  # [0,1]
        parser_mask_fixed = torch.from_numpy(parser_mask_fixed)
        parser_mask_changeable = torch.from_numpy(parser_mask_changeable)

        # dilation
        parse_without_cloth = np.logical_and(parse_shape, np.logical_not(parse_mask))
        parse_mask = parse_mask.cpu().numpy()

        # Masked cloth
        im_head = im * parse_head - (1 - parse_head)
        im_cloth = im * parse_cloth + (1 - parse_cloth)

        # Shape
        parse_shape = Image.fromarray((parse_shape * 255).astype(np.uint8))
        parse_shape = parse_shape.resize((self.width // 16, self.height // 16), Image.BILINEAR)
        parse_shape = parse_shape.resize((self.width, self.height), Image.BILINEAR)
        shape = self.transform2D(parse_shape)  # [-1,1]

        # Load pose points
        pose_name = im_name.replace('_0.jpg', '_2.json')
        with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
            pose_label = json.load(f)
            pose_data = pose_label['keypoints']
            pose_data = np.array(pose_data)
            pose_data = pose_data.reshape((-1, 4))

        point_num = pose_data.shape[0]
        pose_map = torch.zeros(point_num, self.height, self.width)
        r = self.radius * (self.height / 512.0)
        im_pose = Image.new('L', (self.width, self.height))
        pose_draw = ImageDraw.Draw(im_pose)
        neck = Image.new('L', (self.width, self.height))
        neck_draw = ImageDraw.Draw(neck)
        for i in range(point_num):
            one_map = Image.new('L', (self.width, self.height))
            draw = ImageDraw.Draw(one_map)
            point_x = np.multiply(pose_data[i, 0], self.width / 384.0)
            point_y = np.multiply(pose_data[i, 1], self.height / 512.0)
            if point_x > 1 and point_y > 1:
                draw.rectangle((point_x - r, point_y - r, point_x + r, point_y + r), 'white', 'white')
                pose_draw.rectangle((point_x - r, point_y - r, point_x + r, point_y + r), 'white', 'white')
                if i == 2 or i == 5:
                    neck_draw.ellipse((point_x - r * 4, point_y - r * 4, point_x + r * 4, point_y + r * 4),
                                      'white', 'white')
            one_map = self.transform2D(one_map)
            pose_map[i] = one_map[0]

        # just for visualization
        im_pose = self.transform2D(im_pose)

        im_arms = Image.new('L', (self.width, self.height))
        arms_draw = ImageDraw.Draw(im_arms)
        if dataroot.split('/')[-1] == 'dresses' or dataroot.split('/')[-1] == 'upper_body':
            with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
                data = json.load(f)
                shoulder_right = np.multiply(tuple(data['keypoints'][2][:2]), self.height / 512.0)
                shoulder_left = np.multiply(tuple(data['keypoints'][5][:2]), self.height / 512.0)
                elbow_right = np.multiply(tuple(data['keypoints'][3][:2]), self.height / 512.0)
                elbow_left = np.multiply(tuple(data['keypoints'][6][:2]), self.height / 512.0)
                wrist_right = np.multiply(tuple(data['keypoints'][4][:2]), self.height / 512.0)
                wrist_left = np.multiply(tuple(data['keypoints'][7][:2]), self.height / 512.0)
                if wrist_right[0] <= 1. and wrist_right[1] <= 1.:
                    if elbow_right[0] <= 1. and elbow_right[1] <= 1.:
                        arms_draw.line(np.concatenate(
                            (wrist_left, elbow_left, shoulder_left, shoulder_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                    else:
                        arms_draw.line(np.concatenate(
                            (wrist_left, elbow_left, shoulder_left, shoulder_right, elbow_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                elif wrist_left[0] <= 1. and wrist_left[1] <= 1.:
                    if elbow_left[0] <= 1. and elbow_left[1] <= 1.:
                        arms_draw.line(np.concatenate(
                            (shoulder_left, shoulder_right, elbow_right, wrist_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                    else:
                        arms_draw.line(np.concatenate(
                            (elbow_left, shoulder_left, shoulder_right, elbow_right, wrist_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                else:
                    arms_draw.line(np.concatenate(
                        (wrist_left, elbow_left, shoulder_left, shoulder_right, elbow_right, wrist_right)
                    ).astype(np.uint16).tolist(), 'white', 30, 'curve')

            if self.args.height > 512:
                im_arms = cv2.dilate(np.float32(im_arms), np.ones((10, 10), np.uint16), iterations=5)
            # elif self.args.height > 256:
            #     im_arms = cv2.dilate(np.float32(im_arms), np.ones((5, 5), np.uint16), iterations=5)
            hands = np.logical_and(np.logical_not(im_arms), arms)
            parse_mask += im_arms
            parser_mask_fixed += hands

        # delete neck
        parse_head_2 = torch.clone(parse_head)
        if dataroot.split('/')[-1] == 'dresses' or dataroot.split('/')[-1] == 'upper_body':
            with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
                data = json.load(f)
                points = []
                points.append(np.multiply(tuple(data['keypoints'][2][:2]), self.height / 512.0))
                points.append(np.multiply(tuple(data['keypoints'][5][:2]), self.height / 512.0))
                # fit a line through the two shoulder keypoints and zero out
                # everything below it in the head mask
                x_coords, y_coords = zip(*points)
                A = np.vstack([x_coords, np.ones(len(x_coords))]).T
                m, c = lstsq(A, y_coords, rcond=None)[0]
                for i in range(parse_array.shape[1]):
                    y = i * m + c
                    parse_head_2[int(y - 20 * (self.height / 512.0)):, i] = 0

        parser_mask_fixed = np.logical_or(parser_mask_fixed, np.array(parse_head_2, dtype=np.uint16))
        parse_mask += np.logical_or(parse_mask, np.logical_and(
            np.array(parse_head, dtype=np.uint16),
            np.logical_not(np.array(parse_head_2, dtype=np.uint16))))
        if self.args.height > 512:
            parse_mask = cv2.dilate(parse_mask, np.ones((20, 20), np.uint16), iterations=5)
        # elif self.args.height > 256:
        #     parse_mask = cv2.dilate(parse_mask, np.ones((10, 10), np.uint16), iterations=5)
        else:
            parse_mask = cv2.dilate(parse_mask, np.ones((5, 5), np.uint16), iterations=5)
        parse_mask = np.logical_and(parser_mask_changeable, np.logical_not(parse_mask))
        parse_mask_total = np.logical_or(parse_mask, parser_mask_fixed)
        im_mask = im * parse_mask_total
        parse_mask_total = parse_mask_total.numpy()
        parse_mask_total = parse_array * parse_mask_total
        parse_mask_total = torch.from_numpy(parse_mask_total)

        # Dense pose
        uv = np.load(os.path.join(dataroot, 'dense', im_name.replace('_0.jpg', '_5_uv.npz')))
        uv = uv['uv']
        uv = torch.from_numpy(uv)
        uv = transforms.functional.resize(uv, (self.height, self.width))

        labels = Image.open(os.path.join(dataroot, 'dense', im_name.replace('_0.jpg', '_5.png')))
        labels = labels.resize((self.width, self.height), Image.NEAREST)
        labels = np.array(labels)

        result = {
            'c_name': c_name,  # for visualization
            'im_name': im_name,  # for visualization or ground truth
            'cloth': cloth,  # for input
            'image': im,  # for visualization
            'im_cloth': im_cloth,  # for ground truth
            'shape': shape,  # for visualization
            'im_head': im_head,  # for visualization
            'im_pose': im_pose,  # for visualization
            'pose_map': pose_map,
            'parse_array': parse_array,
            'dense_labels': labels,
            'dense_uv': uv,
            'skeleton': skeleton,
            'm': im_mask,  # for input
            'parse_mask_total': parse_mask_total,
        }
        return result

    def __len__(self):
        return len(self.c_names)


================================================
FILE: data/labelmap.py
================================================

label_map = {
    "background": 0,
    "hat": 1,
    "hair": 2,
    "sunglasses": 3,
    "upper_clothes": 4,
    "skirt": 5,
    "pants": 6,
    "dress": 7,
    "belt": 8,
    "left_shoe": 9,
    "right_shoe": 10,
    "head": 11,
    "left_leg": 12,
    "right_leg": 13,
    "left_arm": 14,
    "right_arm": 15,
    "bag": 16,
    "scarf": 17,
}


================================================
FILE: main.py
================================================

import torch
from tqdm import tqdm

import conf
from data import Dataset, DataLoader
from utils import sem2onehot


def test_unpaired(dataloader, model, e, args):
    with tqdm(desc="Iteration %d - images extraction" % e, unit='it',
              total=len(dataloader.data_loader)) as pbar:
        for step in range(0, len(dataloader.data_loader)):
            inputs = dataloader.next_batch()
            with torch.no_grad():
                image_name = inputs['im_name']
                cloth_name = inputs['c_name']
                image = inputs['image'].cuda()
                cloth = inputs['cloth'].cuda()
                cropped_cloth = inputs['im_cloth'].cuda()
                im_head = inputs['im_head'].cuda()
                pose_map = inputs['pose_map'].cuda()
                skeleton = inputs['skeleton'].cuda()
                im_pose = inputs['im_pose'].cuda()
                shape = inputs['shape'].cuda()
                parse_array = inputs['parse_array'].cuda()
                dense_labels = inputs['dense_labels'].cuda()
                dense_uv = inputs['dense_uv'].cuda()
                parse_array = sem2onehot(18, parse_array)

                # model here

            pbar.update()


def training_loop(dataloader, model, e, args):
    with tqdm(desc="Iteration %d - train" % e, unit='it', total=args.display_count) as pbar:
        for step in range(0, args.display_count):
            inputs = dataloader.next_batch()

            image_name = inputs['im_name']
            cloth_name = inputs['c_name']
            image = inputs['image'].cuda()
            cloth = inputs['cloth'].cuda()
            cropped_cloth = inputs['im_cloth'].cuda()
            im_head = inputs['im_head'].cuda()
            pose_map = inputs['pose_map'].cuda()
            skeleton = inputs['skeleton'].cuda()
            im_pose = inputs['im_pose'].cuda()
            shape = inputs['shape'].cuda()
            parse_array = inputs['parse_array'].cuda()
            dense_labels = inputs['dense_labels'].cuda()
            dense_uv = inputs['dense_uv'].cuda()
            parse_array = sem2onehot(18, parse_array)

            # model here

            pbar.update()


def main_worker(args):
    # Dataset & Dataloader
    dataset_train = Dataset(args, dataroot_path=args.dataroot, phase='train', order='paired',
                            size=(int(args.height), int(args.width)))
    dataloader_train = DataLoader(args, dataset_train, dist_sampler=False)

    dataset_test_unpaired = Dataset(args, dataroot_path=args.dataroot, phase='test', order='unpaired',
                                    size=(int(args.height), int(args.width)))
    dataloader_test_unpaired = DataLoader(args, dataset_test_unpaired, dist_sampler=False)

    # Instantiate your model here
    model = None

    # Loop over epochs
    for e in range(0, args.epochs):
        # Training loop
        training_loop(dataloader_train, model, e, args)
        # Test unpaired
        test_unpaired(dataloader_test_unpaired, model, e, args)


if __name__ == '__main__':
    # Get argparser configuration
    args = conf.get_conf()
    print(args.exp_name)
    # Call main worker
    main_worker(args)


================================================
FILE: utils/__init__.py
================================================

from .label_map import sem2onehot


================================================
FILE: utils/label_map.py
================================================

import torch


def sem2onehot(n, labelmap):
    # Convert an integer label map of shape (B, H, W) into a
    # one-hot tensor of shape (B, n, H, W) on the GPU.
    label_map = labelmap.long().unsqueeze(1).cuda()
    bs, _, h, w = label_map.size()
    nc = n
    input_label = torch.FloatTensor(bs, nc, h, w).zero_().cuda()
    input_semantics = input_label.scatter_(1, label_map, 1.0)
    return input_semantics
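For reference, the one-hot conversion performed by `sem2onehot` above can be sketched in a device-agnostic way (this variant, `sem2onehot_cpu`, is ours and not part of the repository; the repository version moves everything to CUDA):

```python
import torch


def sem2onehot_cpu(n, labelmap):
    """One-hot encode an integer label map of shape (B, H, W) into (B, n, H, W).

    Device-agnostic sketch of utils.label_map.sem2onehot: the output is
    created on the same device as the input instead of being moved to CUDA.
    """
    label = labelmap.long().unsqueeze(1)                 # (B, 1, H, W)
    bs, _, h, w = label.size()
    onehot = torch.zeros(bs, n, h, w, device=label.device)
    # scatter_ writes a 1.0 into the channel given by each pixel's label
    return onehot.scatter_(1, label, 1.0)
```

With the 18-class ATR label map used by this dataset, `n` would be 18.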