Repository: aimagelab/dress-code Branch: main Commit: 2e9e7ed55bec Files: 10 Total size: 36.2 KB Directory structure: gitextract_dt5ybub4/ ├── LICENCE ├── README.md ├── conf.py ├── data/ │ ├── __init__.py │ ├── dataloader.py │ ├── dataset.py │ └── labelmap.py ├── main.py └── utils/ ├── __init__.py └── label_map.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: LICENCE ================================================ Yoox Net-a-Porter Dress Code Dataset – Licence Terms SUMMARY This is a summary of, and not a substitution for, the licence terms set out below. - You are free to use and distribute the Dress Code Dataset for the purposes of non-commercial academic research, teaching and publication. - You can use the Dress Code Dataset only as provided: you are not permitted to create and distribute alterations to the Dress Code Dataset or any part of it. - When you share or publish any part of the Dress Code Dataset you must include an attribution. - The Dress Code Dataset is provided “as is” and without warranty. - Your rights to use the Dress Code Dataset may be revoked. LICENCE TERMS By making any use of the Dress Code Dataset (as defined below), you accept and agree to comply with the terms and conditions of this Licence. Your right to use the Dress Code Dataset is subject to and conditional upon your compliance with those terms and conditions. *** Definitions *** - The Dress Code Dataset means the dataset of image pairs, for image-based virtual try-on, made available by us through Unimore. - Unimore means The University of Modena and Reggio Emilia, whose address is at Via Università, 4, 41121 Modena MO, Italy. - We - or YNAP means Yoox Net-a-Porter Group S.p.A., whose address is at Via Morimondo, 17, 20143 Milan MI, Italy. 
- You means the natural or legal person making use of the Dress Code Dataset *** What you can do *** You can use the Dress Code Dataset only for the purposes of research, teaching and publication (which must in each case be academic and non-commercial). For the purposes of this provision: - using the Dress Code Dataset means you can use, copy, publish, distribute and transmit it (in whole or in part), but does not allow you to adapt it except as expressly set out below. - non-commercial means you cannot use the Dress Code Dataset for purposes which are mainly directed towards payment or some other commercial advantage. You are not prohibited from academic use simply because it has some incidental commercial nature. For example, use of the Dress Code Dataset in academic publication is permitted even if the relevant journal is subject to a subscription fee, and use of the Dress Code Dataset in academic teaching is permitted even if that teaching is subject to tuition fees. - academic means in connection with education, teaching and research activities undertaken by an accredited not-for-profit academic institution, and excludes research undertaken in collaboration with any commercial entity except under terms which prohibit that entity from making commercial use of any results arising from that research. We grant to you a worldwide, royalty-free, non-sublicensable, non-exclusive license under our respective rights (including copyright and database right) in the Dress Code Dataset for the permitted purposes set out above. You may adapt the Dress Code Dataset only as necessary to make it interoperable with any other systems or technology which you are using for non-commercial academic research, teaching or publication. Otherwise, you may not adapt it without our permission. 
Any adaptations which you do make (for interoperability, or with our permission) and any other works you make using the Dress Code Dataset will be subject to the same terms and restrictions of this Licence: in particular that means you cannot use them for commercial purposes. If you create any adaptations or other works using the Dress Code Dataset, then you hereby grant to us a non-exclusive, royalty free, perpetual, irrevocable licence to use, copy, modify, distribute, and otherwise exploit those adaptations or other works for any purposes. *** Attribution and your downstream obligations *** If you share the Dress Code Dataset (in whole or in part), you must: - retain any notices which identify the origins, authors, or rightsholders of the Dress Code Dataset (including without limitation any copyright notice); - retain any notices which refer to this Licence in whole or in part, or which contain any disclaimers of warranties in relation to the Dress Code Dataset; - actively notify the recipient that their use of the Dress Code Dataset will be subject to this Licence (and provide them with a link); - not seek to impose any additional obligations on the recipient in relation to their use of the Dress Code Dataset which would be incompatible with the terms of this Licence. If you publish the Dress Code Dataset (in whole or in part), you must include in that publication an attribution in the following form: “The Dress Code Dataset is proprietary to and © Yoox Net-a-Porter Group S.p.A., and its licensors. 
It is distributed by the University of Modena and Reggio Emilia, and available for non-commercial academic use under licence terms set out at https://github.com/aimagelab/dress-code.” While you must ensure that you acknowledge us as the source of the Dress Code Dataset as described above, you must not in doing so, or otherwise, suggest that you or your use of the Dress Code Dataset is endorsed or sponsored by us, or otherwise connected with us, unless we have separately given you permission to do so. *** What you must not do *** You must not use the Dress Code Dataset except as expressly permitted above. In particular: - You may not use the Dress Code Dataset, or any adaptations or other works created using the Dress Code Dataset, for commercial purposes. - You may not adapt the Dress Code Dataset except as expressly set out above. In particular, you may not create new works derived from or based upon the Dress Code Dataset in which the Dress Code Dataset is wholly or partially translated, altered, or modified in a manner which would otherwise require permissions from the relevant rightsholder under applicable laws relating to copyright and/or database rights. - The images in the Dress Code Dataset depict garments whose designs may be protected by copyright and/or design rights. You are not granted any licence under those rights except in connection with your permitted use of the Dress Code Dataset. In particular you are not authorised to make or manufacture any of those garments, or to alter their designs in your use of the Dress Code Dataset. - The images in the Dress Code Dataset have been cropped so that the models featured in those images are not individually identifiable. You must not attempt to re-identify those models. If you become aware of any use of the Dress Code Dataset other than as permitted by this Licence (whether by your own organisation or any third party) you must notify us promptly at legalteam_it@ynap.com and segreteria.aimagelab@unimore.it. 
*** No warranty ***

The Dress Code Dataset is provided to you “as is” and we exclude all representations, warranties, and liabilities in relation to the Dress Code Dataset (including without limitation as to quality, fitness for purpose, non-infringement and accuracy) to the maximum extent permitted by law. Without limiting that exclusion, we will not be liable for any errors or omissions in the Dress Code Dataset or for any loss or damages of any kind arising from its use. We do not guarantee that the Dress Code Dataset will remain available, and we may withdraw it at any time.

*** Term and Termination ***

This Licence applies for the entire duration of any copyright, database right or other rights we may have in the Dress Code Dataset. If you breach any of your obligations under this Licence, then without limiting our other rights and remedies, your rights under it will automatically terminate. Your rights will be reinstated automatically if you remedy that breach within thirty days after discovery or may be reinstated by our written approval. We may withdraw this Licence at any time, and if we do you must stop using the Dress Code Dataset as soon as possible unless we make it available to you on alternative terms and conditions.

*** Other provisions ***

We will not be subject to any additional terms or conditions which you may seek to introduce in relation to your use of the Dress Code Dataset. This Licence does not limit your lawful freedoms to use the Dress Code Dataset in reliance on exceptions and exclusions from laws relating to copyright or database rights (such as fair dealing or fair use). If you are in breach of this Licence, we can only waive that breach by written notice. If we do not immediately take action in relation to your breach, we may still do so later. This Licence is the entire agreement between you and us relating to your use of the Dress Code Dataset. 
You acknowledge that you have not entered into this Licence based on any other representation or warranty. This Licence is governed by the laws of Italy.

*** About this Licence ***

This licence is © Yoox Net-a-Porter Group S.p.A.

================================================
FILE: README.md
================================================

# Dress Code Dataset

This repository presents the virtual try-on dataset proposed in:

*D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara*
**Dress Code: High-Resolution Multi-Category Virtual Try-On**
**[[Paper](https://arxiv.org/abs/2204.08532)]** **[[Dataset Request Form](https://forms.gle/72Bpeh48P7zQimin7)]** **[[Try-On Demo](https://ailb-web.ing.unimore.it/dress-code)]**

**IMPORTANT!**
- By making any use of the Dress Code Dataset, you accept and agree to comply with the terms and conditions reported [here](https://github.com/aimagelab/dress-code/blob/main/LICENCE).
- The dataset will not be released to private companies.
- When filling in the dataset request form, non-institutional emails (e.g. gmail.com, qq.com, etc.) are not allowed.
- The signed release agreement form is mandatory (see the dataset request form for more details). Incomplete or unsigned release agreement forms are not accepted and will not receive a response. Typed signatures are not allowed.

**Requests are manually validated on a weekly basis. If you do not receive a response, your request does not meet the outlined requirements.**
Please cite with the following BibTeX:

```
@inproceedings{morelli2022dresscode,
  title={{Dress Code: High-Resolution Multi-Category Virtual Try-On}},
  author={Morelli, Davide and Fincato, Matteo and Cornia, Marcella and Landi, Federico and Cesari, Fabio and Cucchiara, Rita},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}
```

## Dataset

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.
The dataset contains more than 50k high-resolution model-garment image pairs divided into three different categories (*i.e.* dresses, upper-body clothes, lower-body clothes).

### Summary
- 53792 garments
- 107584 images
- 3 categories
  - upper body
  - lower body
  - dresses
- 1024 x 768 image resolution
- additional info
  - keypoints
  - skeletons
  - human label maps
  - human dense poses

### Additional Info
Along with each model-garment image pair, we also provide the keypoints, skeleton, human label map, and dense pose.

### Keypoints
For all image pairs of the dataset, we stored the joint coordinates of human poses. In particular, we used [OpenPose](https://github.com/Hzzone/pytorch-openpose) [1] to extract 18 keypoints for each human body. For each image, we provide a JSON file containing a dictionary with the `keypoints` key. The value of this key is a list of 18 elements, representing the joints of the human body. Each element is a list of 4 values, where the first two indicate the coordinates on the x and y axis, respectively.

### Skeletons
Skeletons are RGB images obtained by connecting the keypoints with lines.

### Human Label Map
We employed a human parser to assign each pixel of the image to a specific category, thus obtaining a segmentation mask for each target model. Specifically, we used the [SCHP model](https://github.com/PeikeLi/Self-Correction-Human-Parsing) [2] trained on the ATR dataset, a large single-person human parsing dataset focused on fashion images with 18 classes. The resulting images are composed of a single channel filled with the category label value. Categories are mapped as follows:

```ruby
0   background
1   hat
2   hair
3   sunglasses
4   upper_clothes
5   skirt
6   pants
7   dress
8   belt
9   left_shoe
10  right_shoe
11  head
12  left_leg
13  right_leg
14  left_arm
15  right_arm
16  bag
17  scarf
```

### Human Dense Pose
We also extracted dense labels and UV mappings from all the model images using [DensePose](https://github.com/facebookresearch/detectron2/tree/main/projects/DensePose) [3].
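As a minimal sketch of how the keypoint files described above can be read (the helper name `load_keypoints` is ours, not part of the repository; the on-disk layout matches what the provided dataset loader expects):

```python
import json

import numpy as np


def load_keypoints(path):
    """Load a Dress Code keypoint JSON file.

    Returns an (18, 4) array: one row per OpenPose joint, where
    columns 0 and 1 are the x and y coordinates, respectively.
    """
    with open(path, 'r') as f:
        pose_label = json.load(f)
    # The file stores a dict with a 'keypoints' key holding 18 joints,
    # each a list of 4 values.
    pose_data = np.array(pose_label['keypoints']).reshape((-1, 4))
    return pose_data
```

Joints with coordinates at or below 1 are treated as missing by the provided dataset code.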
## Experimental Results

### Low Resolution (256 x 192)

| Name | SSIM ↑ | FID ↓ | KID ↓ |
|---|---|---|---|
| CP-VTON [4] | 0.803 | 35.16 | 2.245 |
| CP-VTON+ [5] | 0.902 | 25.19 | 1.586 |
| CP-VTON* [4] | 0.874 | 18.99 | 1.117 |
| PFAFN [6] | 0.902 | 14.38 | 0.743 |
| VITON-GT [7] | 0.899 | 13.80 | 0.711 |
| WUTON [8] | 0.902 | 13.28 | 0.771 |
| ACGPN [9] | 0.868 | 13.79 | 0.818 |
| OURS | 0.906 | 11.40 | 0.570 |
## Code

Due to a collaboration agreement with the company, we cannot release the code. However, we supply a skeleton PyTorch project to load the data.

## References

[1] Cao, et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." IEEE TPAMI, 2019.
[2] Li, et al. "Self-Correction for Human Parsing." arXiv, 2019.
[3] Güler, et al. "DensePose: Dense Human Pose Estimation in the Wild." CVPR, 2018.
[4] Wang, et al. "Toward Characteristic-Preserving Image-based Virtual Try-On Network." ECCV, 2018.
[5] Minar, et al. "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On." CVPR Workshops, 2020.
[6] Ge, et al. "Parser-Free Virtual Try-On via Distilling Appearance Flows." CVPR, 2021.
[7] Fincato, et al. "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations." ICPR, 2020.
[8] Issenhuth, et al. "Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On." ECCV, 2020.
[9] Yang, et al. "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content." CVPR, 2020.

## Contact

If you have any general doubt about our dataset, please use the [public issues section](https://github.com/aimagelab/dress-code/issues) on this GitHub repo. Alternatively, drop us an e-mail at davide.morelli [at] unimore.it or marcella.cornia [at] unimore.it.
================================================
FILE: conf.py
================================================

import argparse


def get_conf(train=True):
    parser = argparse.ArgumentParser()
    parser.add_argument("--exp_name", type=str, default="")
    parser.add_argument("--category", default='all', type=str)
    parser.add_argument("--dataroot", type=str, default="")
    parser.add_argument("--data_pairs", default="{}_pairs")
    parser.add_argument('--checkpoint_dir', type=str, default='', help='save checkpoint infos')
    parser.add_argument('-b', '--batch_size', type=int, default=8)
    parser.add_argument('-j', '--workers', type=int, default=0)
    parser.add_argument("--epochs", type=int, default=150)
    parser.add_argument("--step", type=int, default=100000)
    parser.add_argument("--display_count", type=int, default=1000)
    # note: with default=True and action='store_true', this flag is always enabled
    parser.add_argument("--shuffle", default=True, action='store_true', help='shuffle input data')
    parser.add_argument("--height", type=int, default=256)
    parser.add_argument("--width", type=int, default=192)
    parser.add_argument("--radius", type=int, default=5)
    args = parser.parse_args()
    print(args)
    return args


================================================
FILE: data/__init__.py
================================================

from .dataset import Dataset
from .dataloader import DataLoader


================================================
FILE: data/dataloader.py
================================================

import torch


class DataLoader(object):
    def __init__(self, opt, dataset, dist_sampler=False):
        super(DataLoader, self).__init__()
        if dist_sampler:
            train_sampler = torch.utils.data.distributed.DistributedSampler(
                dataset, num_replicas=opt.world_size, rank=opt.rank, shuffle=True)
        else:
            if opt.shuffle:
                train_sampler = torch.utils.data.sampler.RandomSampler(dataset)
            else:
                train_sampler = None
        self.sampler = train_sampler
        self.data_loader = torch.utils.data.DataLoader(
            dataset, batch_size=opt.batch_size, shuffle=(train_sampler is None),
            num_workers=opt.workers, pin_memory=True, sampler=train_sampler)
        self.dataset = dataset
        self.data_iter = self.data_loader.__iter__()

    def next_batch(self):
        # Re-create the iterator when the underlying loader is exhausted,
        # so batches can be drawn indefinitely.
        try:
            batch = self.data_iter.__next__()
        except StopIteration:
            self.data_iter = self.data_loader.__iter__()
            batch = self.data_iter.__next__()
        return batch


================================================
FILE: data/dataset.py
================================================

import json
import os
from typing import List, Tuple

import cv2
import numpy as np
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
from numpy.linalg import lstsq
from PIL import Image, ImageDraw

from data.labelmap import label_map


class Dataset(data.Dataset):
    def __init__(self, args, dataroot_path: str, phase: str, order: str = 'paired',
                 category: List[str] = ['dresses', 'upper_body', 'lower_body'],
                 size: Tuple[int, int] = (256, 192)):
        """
        Initialize the PyTorch Dataset class
        :param args: argparse parameters
        :type args: argparse
        :param dataroot_path: dataset root folder
        :type dataroot_path: string
        :param phase: phase (train | test)
        :type phase: string
        :param order: setting (paired | unpaired)
        :type order: string
        :param category: clothing category (upper_body | lower_body | dresses)
        :type category: list(str)
        :param size: image size (height, width)
        :type size: tuple(int)
        """
        super(Dataset, self).__init__()
        self.args = args
        self.dataroot = dataroot_path
        self.phase = phase
        self.category = category
        self.height = size[0]
        self.width = size[1]
        self.radius = args.radius
        self.transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])
        self.transform2D = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,))
        ])

        im_names = []
        c_names = []
        dataroot_names = []
        for c in category:
            assert c in ['dresses', 'upper_body', 'lower_body']
            dataroot = os.path.join(self.dataroot, c)
            if phase == 'train':
                filename = os.path.join(dataroot, f"{phase}_pairs.txt")
            else:
                filename = os.path.join(dataroot, f"{phase}_pairs_{order}.txt")
            with open(filename, 'r') as f:
                for line in f.readlines():
                    im_name, c_name = line.strip().split()
                    im_names.append(im_name)
                    c_names.append(c_name)
                    dataroot_names.append(dataroot)
        self.im_names = im_names
        self.c_names = c_names
        self.dataroot_names = dataroot_names

    def __getitem__(self, index):
        """
        For each index return the corresponding sample in the dataset
        :param index: data index
        :type index: int
        :return: dict containing dataset samples
        :rtype: dict
        """
        c_name = self.c_names[index]
        im_name = self.im_names[index]
        dataroot = self.dataroot_names[index]

        # Clothing image
        cloth = Image.open(os.path.join(dataroot, 'images', c_name))
        cloth = cloth.resize((self.width, self.height))
        cloth = self.transform(cloth)  # [-1,1]

        # Person image
        im = Image.open(os.path.join(dataroot, 'images', im_name))
        im = im.resize((self.width, self.height))
        im = self.transform(im)  # [-1,1]

        # Skeleton
        skeleton = Image.open(os.path.join(dataroot, 'skeletons', im_name.replace("_0", "_5")))
        skeleton = skeleton.resize((self.width, self.height))
        skeleton = self.transform(skeleton)

        # Label Map
        parse_name = im_name.replace('_0.jpg', '_4.png')
        im_parse = Image.open(os.path.join(dataroot, 'label_maps', parse_name))
        im_parse = im_parse.resize((self.width, self.height), Image.NEAREST)
        parse_array = np.array(im_parse)

        parse_shape = (parse_array > 0).astype(np.float32)
        parse_head = (parse_array == 1).astype(np.float32) + \
                     (parse_array == 2).astype(np.float32) + \
                     (parse_array == 3).astype(np.float32) + \
                     (parse_array == 11).astype(np.float32)

        parser_mask_fixed = (parse_array == label_map["hair"]).astype(np.float32) + \
                            (parse_array == label_map["left_shoe"]).astype(np.float32) + \
                            (parse_array == label_map["right_shoe"]).astype(np.float32) + \
                            (parse_array == label_map["hat"]).astype(np.float32) + \
                            (parse_array == label_map["sunglasses"]).astype(np.float32) + \
                            (parse_array == label_map["scarf"]).astype(np.float32) + \
                            (parse_array == label_map["bag"]).astype(np.float32)

        parser_mask_changeable = (parse_array == label_map["background"]).astype(np.float32)

        arms = (parse_array == 14).astype(np.float32) + (parse_array == 15).astype(np.float32)

        if dataroot.split('/')[-1] == 'dresses':
            label_cat = 7
            parse_cloth = (parse_array == 7).astype(np.float32)
            parse_mask = (parse_array == 7).astype(np.float32) + \
                         (parse_array == 12).astype(np.float32) + \
                         (parse_array == 13).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))
        elif dataroot.split('/')[-1] == 'upper_body':
            label_cat = 4
            parse_cloth = (parse_array == 4).astype(np.float32)
            parse_mask = (parse_array == 4).astype(np.float32)
            parser_mask_fixed += (parse_array == label_map["skirt"]).astype(np.float32) + \
                                 (parse_array == label_map["pants"]).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))
        elif dataroot.split('/')[-1] == 'lower_body':
            label_cat = 6
            parse_cloth = (parse_array == 6).astype(np.float32)
            parse_mask = (parse_array == 6).astype(np.float32) + \
                         (parse_array == 12).astype(np.float32) + \
                         (parse_array == 13).astype(np.float32)
            parser_mask_fixed += (parse_array == label_map["upper_clothes"]).astype(np.float32) + \
                                 (parse_array == 14).astype(np.float32) + \
                                 (parse_array == 15).astype(np.float32)
            parser_mask_changeable += np.logical_and(parse_array, np.logical_not(parser_mask_fixed))

        parse_head = torch.from_numpy(parse_head)  # [0,1]
        parse_cloth = torch.from_numpy(parse_cloth)  # [0,1]
        parse_mask = torch.from_numpy(parse_mask)  # [0,1]
        parser_mask_fixed = torch.from_numpy(parser_mask_fixed)
        parser_mask_changeable = torch.from_numpy(parser_mask_changeable)

        # dilation
        parse_without_cloth = np.logical_and(parse_shape, np.logical_not(parse_mask))
        parse_mask = parse_mask.cpu().numpy()

        # Masked cloth
        im_head = im * parse_head - (1 - parse_head)
        im_cloth = im * parse_cloth + (1 - parse_cloth)

        # Shape
        parse_shape = Image.fromarray((parse_shape * 255).astype(np.uint8))
        parse_shape = parse_shape.resize((self.width // 16, self.height // 16), Image.BILINEAR)
        parse_shape = parse_shape.resize((self.width, self.height), Image.BILINEAR)
        shape = self.transform2D(parse_shape)  # [-1,1]

        # Load pose points
        pose_name = im_name.replace('_0.jpg', '_2.json')
        with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
            pose_label = json.load(f)
            pose_data = pose_label['keypoints']
            pose_data = np.array(pose_data)
            pose_data = pose_data.reshape((-1, 4))

        point_num = pose_data.shape[0]
        pose_map = torch.zeros(point_num, self.height, self.width)
        r = self.radius * (self.height / 512.0)
        im_pose = Image.new('L', (self.width, self.height))
        pose_draw = ImageDraw.Draw(im_pose)
        neck = Image.new('L', (self.width, self.height))
        neck_draw = ImageDraw.Draw(neck)
        for i in range(point_num):
            one_map = Image.new('L', (self.width, self.height))
            draw = ImageDraw.Draw(one_map)
            point_x = np.multiply(pose_data[i, 0], self.width / 384.0)
            point_y = np.multiply(pose_data[i, 1], self.height / 512.0)
            if point_x > 1 and point_y > 1:
                draw.rectangle((point_x - r, point_y - r, point_x + r, point_y + r), 'white', 'white')
                pose_draw.rectangle((point_x - r, point_y - r, point_x + r, point_y + r), 'white', 'white')
                if i == 2 or i == 5:
                    neck_draw.ellipse((point_x - r * 4, point_y - r * 4, point_x + r * 4, point_y + r * 4),
                                      'white', 'white')
            one_map = self.transform2D(one_map)
            pose_map[i] = one_map[0]

        # just for visualization
        im_pose = self.transform2D(im_pose)

        im_arms = Image.new('L', (self.width, self.height))
        arms_draw = ImageDraw.Draw(im_arms)
        if dataroot.split('/')[-1] == 'dresses' or dataroot.split('/')[-1] == 'upper_body':
            with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
                data = json.load(f)
                shoulder_right = np.multiply(tuple(data['keypoints'][2][:2]), self.height / 512.0)
                shoulder_left = np.multiply(tuple(data['keypoints'][5][:2]), self.height / 512.0)
                elbow_right = np.multiply(tuple(data['keypoints'][3][:2]), self.height / 512.0)
                elbow_left = np.multiply(tuple(data['keypoints'][6][:2]), self.height / 512.0)
                wrist_right = np.multiply(tuple(data['keypoints'][4][:2]), self.height / 512.0)
                wrist_left = np.multiply(tuple(data['keypoints'][7][:2]), self.height / 512.0)
                if wrist_right[0] <= 1. and wrist_right[1] <= 1.:
                    if elbow_right[0] <= 1. and elbow_right[1] <= 1.:
                        arms_draw.line(np.concatenate(
                            (wrist_left, elbow_left, shoulder_left, shoulder_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                    else:
                        arms_draw.line(np.concatenate(
                            (wrist_left, elbow_left, shoulder_left, shoulder_right, elbow_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                elif wrist_left[0] <= 1. and wrist_left[1] <= 1.:
                    if elbow_left[0] <= 1. and elbow_left[1] <= 1.:
                        arms_draw.line(np.concatenate(
                            (shoulder_left, shoulder_right, elbow_right, wrist_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                    else:
                        arms_draw.line(np.concatenate(
                            (elbow_left, shoulder_left, shoulder_right, elbow_right, wrist_right)
                        ).astype(np.uint16).tolist(), 'white', 30, 'curve')
                else:
                    arms_draw.line(np.concatenate(
                        (wrist_left, elbow_left, shoulder_left, shoulder_right, elbow_right, wrist_right)
                    ).astype(np.uint16).tolist(), 'white', 30, 'curve')

            if self.args.height > 512:
                im_arms = cv2.dilate(np.float32(im_arms), np.ones((10, 10), np.uint16), iterations=5)
            # elif self.args.height > 256:
            #     im_arms = cv2.dilate(np.float32(im_arms), np.ones((5, 5), np.uint16), iterations=5)
            hands = np.logical_and(np.logical_not(im_arms), arms)
            parse_mask += im_arms
            parser_mask_fixed += hands

        # delete neck
        parse_head_2 = torch.clone(parse_head)
        if dataroot.split('/')[-1] == 'dresses' or dataroot.split('/')[-1] == 'upper_body':
            with open(os.path.join(dataroot, 'keypoints', pose_name), 'r') as f:
                data = json.load(f)
                points = []
                points.append(np.multiply(tuple(data['keypoints'][2][:2]), self.height / 512.0))
                points.append(np.multiply(tuple(data['keypoints'][5][:2]), self.height / 512.0))
                # fit a line through the two shoulder keypoints and zero out
                # everything below it in the head mask
                x_coords, y_coords = zip(*points)
                A = np.vstack([x_coords, np.ones(len(x_coords))]).T
                m, c = lstsq(A, y_coords, rcond=None)[0]
                for i in range(parse_array.shape[1]):
                    y = i * m + c
                    parse_head_2[int(y - 20 * (self.height / 512.0)):, i] = 0

        parser_mask_fixed = np.logical_or(parser_mask_fixed, np.array(parse_head_2, dtype=np.uint16))
        parse_mask += np.logical_or(parse_mask, np.logical_and(
            np.array(parse_head, dtype=np.uint16),
            np.logical_not(np.array(parse_head_2, dtype=np.uint16))))
        if self.args.height > 512:
            parse_mask = cv2.dilate(parse_mask, np.ones((20, 20), np.uint16), iterations=5)
        # elif self.args.height > 256:
        #     parse_mask = cv2.dilate(parse_mask, np.ones((10, 10), np.uint16), iterations=5)
        else:
            parse_mask = cv2.dilate(parse_mask, np.ones((5, 5), np.uint16), iterations=5)
        parse_mask = np.logical_and(parser_mask_changeable, np.logical_not(parse_mask))
        parse_mask_total = np.logical_or(parse_mask, parser_mask_fixed)
        im_mask = im * parse_mask_total
        parse_mask_total = parse_mask_total.numpy()
        parse_mask_total = parse_array * parse_mask_total
        parse_mask_total = torch.from_numpy(parse_mask_total)

        # Dense pose
        uv = np.load(os.path.join(dataroot, 'dense', im_name.replace('_0.jpg', '_5_uv.npz')))
        uv = uv['uv']
        uv = torch.from_numpy(uv)
        uv = transforms.functional.resize(uv, (self.height, self.width))

        labels = Image.open(os.path.join(dataroot, 'dense', im_name.replace('_0.jpg', '_5.png')))
        labels = labels.resize((self.width, self.height), Image.NEAREST)
        labels = np.array(labels)

        result = {
            'c_name': c_name,  # for visualization
            'im_name': im_name,  # for visualization or ground truth
            'cloth': cloth,  # for input
            'image': im,  # for visualization
            'im_cloth': im_cloth,  # for ground truth
            'shape': shape,  # for visualization
            'im_head': im_head,  # for visualization
            'im_pose': im_pose,  # for visualization
            'pose_map': pose_map,
            'parse_array': parse_array,
            'dense_labels': labels,
            'dense_uv': uv,
            'skeleton': skeleton,
            'm': im_mask,  # for input
            'parse_mask_total': parse_mask_total,
        }
        return result

    def __len__(self):
        return len(self.c_names)


================================================
FILE: data/labelmap.py
================================================

label_map = {
    "background": 0,
    "hat": 1,
    "hair": 2,
    "sunglasses": 3,
    "upper_clothes": 4,
    "skirt": 5,
    "pants": 6,
    "dress": 7,
    "belt": 8,
    "left_shoe": 9,
    "right_shoe": 10,
    "head": 11,
    "left_leg": 12,
    "right_leg": 13,
    "left_arm": 14,
    "right_arm": 15,
    "bag": 16,
    "scarf": 17,
}


================================================
FILE: main.py
================================================

import torch
from tqdm import tqdm

import conf
from data import Dataset, DataLoader
from utils import sem2onehot


def test_unpaired(dataloader, model, e, args):
    with tqdm(desc="Iteration %d - images extraction" % e, unit='it',
              total=len(dataloader.data_loader)) as pbar:
        for step in range(0, len(dataloader.data_loader)):
            inputs = dataloader.next_batch()
            with torch.no_grad():
                image_name = inputs['im_name']
                cloth_name = inputs['c_name']
                image = inputs['image'].cuda()
                cloth = inputs['cloth'].cuda()
                cropped_cloth = inputs['im_cloth'].cuda()
                im_head = inputs['im_head'].cuda()
                pose_map = inputs['pose_map'].cuda()
                skeleton = inputs['skeleton'].cuda()
                im_pose = inputs['im_pose'].cuda()
                shape = inputs['shape'].cuda()
                parse_array = inputs['parse_array'].cuda()
                dense_labels = inputs['dense_labels'].cuda()
                dense_uv = inputs['dense_uv'].cuda()
                parse_array = sem2onehot(18, parse_array)

                # model here

            pbar.update()


def training_loop(dataloader, model, e, args):
    with tqdm(desc="Iteration %d - train" % e, unit='it', total=args.display_count) as pbar:
        for step in range(0, args.display_count):
            inputs = dataloader.next_batch()

            image_name = inputs['im_name']
            cloth_name = inputs['c_name']
            image = inputs['image'].cuda()
            cloth = inputs['cloth'].cuda()
            cropped_cloth = inputs['im_cloth'].cuda()
            im_head = inputs['im_head'].cuda()
            pose_map = inputs['pose_map'].cuda()
            skeleton = inputs['skeleton'].cuda()
            im_pose = inputs['im_pose'].cuda()
            shape = inputs['shape'].cuda()
            parse_array = inputs['parse_array'].cuda()
            dense_labels = inputs['dense_labels'].cuda()
            dense_uv = inputs['dense_uv'].cuda()
            parse_array = sem2onehot(18, parse_array)

            # model here

            pbar.update()


def main_worker(args):
    # Dataset & Dataloader
    dataset_train = Dataset(args, dataroot_path=args.dataroot, phase='train', order='paired',
                            size=(int(args.height), int(args.width)))
    dataloader_train = DataLoader(args, dataset_train, dist_sampler=False)

    dataset_test_unpaired = Dataset(args, dataroot_path=args.dataroot, phase='test', order='unpaired',
                                    size=(int(args.height), int(args.width)))
    dataloader_test_unpaired = DataLoader(args, dataset_test_unpaired, dist_sampler=False)

    # Instantiate your model here
    model = None

    # Loop over epochs
    for e in range(0, args.epochs):
        # Training loop
        training_loop(dataloader_train, model, e, args)
        # Test unpaired
        test_unpaired(dataloader_test_unpaired, model, e, args)


if __name__ == '__main__':
    # Get argparser configuration
    args = conf.get_conf()
    print(args.exp_name)
    # Call main worker
    main_worker(args)


================================================
FILE: utils/__init__.py
================================================

from .label_map import sem2onehot


================================================
FILE: utils/label_map.py
================================================

import torch


def sem2onehot(n, labelmap):
    # Convert an integer label map of shape (B, H, W) into a
    # one-hot tensor of shape (B, n, H, W) on the GPU.
    label_map = labelmap.long().unsqueeze(1).cuda()
    bs, _, h, w = label_map.size()
    nc = n
    input_label = torch.FloatTensor(bs, nc, h, w).zero_().cuda()
    input_semantics = input_label.scatter_(1, label_map, 1.0)
    return input_semantics
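For reference, the one-hot conversion performed by `sem2onehot` above can be sketched in a device-agnostic way (this variant, `sem2onehot_cpu`, is ours and not part of the repository; the repository version moves everything to CUDA):

```python
import torch


def sem2onehot_cpu(n, labelmap):
    """One-hot encode an integer label map of shape (B, H, W) into (B, n, H, W).

    Device-agnostic sketch of utils.label_map.sem2onehot: the output is
    created on the same device as the input instead of being moved to CUDA.
    """
    label = labelmap.long().unsqueeze(1)                 # (B, 1, H, W)
    bs, _, h, w = label.size()
    onehot = torch.zeros(bs, n, h, w, device=label.device)
    # scatter_ writes a 1.0 into the channel given by each pixel's label
    return onehot.scatter_(1, label, 1.0)
```

With the 18-class ATR label map used by this dataset, `n` would be 18.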