Repository: yumingj/Talk-to-Edit
Branch: main
Commit: 72c45e109006
Files: 120
Total size: 516.1 KB
Directory structure:
gitextract_av2x8cy4/
├── .gitignore
├── README.md
├── configs/
│ ├── attributes_5.json
│ ├── editing/
│ │ ├── editing_with_dialog.yml
│ │ └── editing_wo_dialog.yml
│ └── train/
│ ├── field_1024_bangs.yml
│ ├── field_1024_beard.yml
│ ├── field_1024_eyeglasses.yml
│ ├── field_1024_smiling.yml
│ ├── field_1024_young.yml
│ ├── field_128_bangs.yml
│ ├── field_128_beard.yml
│ ├── field_128_eyeglasses.yml
│ ├── field_128_smiling.yml
│ └── field_128_young.yml
├── data/
│ ├── __init__.py
│ └── latent_code_dataset.py
├── editing_quantitative.py
├── editing_with_dialog.py
├── editing_wo_dialog.py
├── environment.yml
├── language/
│ ├── accuracy.py
│ ├── build_vocab.py
│ ├── dataset.py
│ ├── generate_feedback.py
│ ├── generate_training_request.py
│ ├── language_utils.py
│ ├── lstm.py
│ ├── preprocess_request.py
│ ├── run_encoder.py
│ ├── templates/
│ │ ├── attr_wise_caption_templates.json
│ │ ├── feedback.json
│ │ ├── gender.json
│ │ ├── metadata_fsm.json
│ │ ├── overall_caption_templates.json
│ │ ├── pool.json
│ │ ├── system_mode.json
│ │ ├── user_fsm.json
│ │ ├── user_old_templates.json
│ │ └── vocab.json
│ ├── train_encoder.py
│ └── utils/
│ ├── __init__.py
│ ├── eval.py
│ ├── logger.py
│ ├── lr_schedule.py
│ ├── misc.py
│ ├── numerical.py
│ ├── progress/
│ │ ├── .gitignore
│ │ ├── LICENSE
│ │ ├── MANIFEST.in
│ │ ├── README.rst
│ │ ├── progress/
│ │ │ ├── __init__.py
│ │ │ ├── bar.py
│ │ │ ├── counter.py
│ │ │ ├── helpers.py
│ │ │ └── spinner.py
│ │ ├── setup.py
│ │ └── test_progress.py
│ ├── setup_logger.py
│ └── visualize.py
├── models/
│ ├── __init__.py
│ ├── archs/
│ │ ├── __init__.py
│ │ ├── attribute_predictor_arch.py
│ │ ├── field_function_arch.py
│ │ └── stylegan2/
│ │ ├── .gitignore
│ │ ├── LICENSE
│ │ ├── LICENSE-FID
│ │ ├── LICENSE-LPIPS
│ │ ├── LICENSE-NVIDIA
│ │ ├── __init__.py
│ │ ├── apply_factor.py
│ │ ├── calc_inception.py
│ │ ├── checkpoint/
│ │ │ └── .gitignore
│ │ ├── convert_weight.py
│ │ ├── dataset.py
│ │ ├── distributed.py
│ │ ├── fid.py
│ │ ├── generate.py
│ │ ├── inception.py
│ │ ├── inversion.py
│ │ ├── lpips/
│ │ │ ├── __init__.py
│ │ │ ├── base_model.py
│ │ │ ├── dist_model.py
│ │ │ ├── networks_basic.py
│ │ │ ├── pretrained_networks.py
│ │ │ └── weights/
│ │ │ ├── v0.0/
│ │ │ │ ├── alex.pth
│ │ │ │ ├── squeeze.pth
│ │ │ │ └── vgg.pth
│ │ │ └── v0.1/
│ │ │ ├── alex.pth
│ │ │ ├── squeeze.pth
│ │ │ └── vgg.pth
│ │ ├── model.py
│ │ ├── non_leaking.py
│ │ ├── op/
│ │ │ ├── __init__.py
│ │ │ ├── fused_act.py
│ │ │ ├── fused_bias_act.cpp
│ │ │ ├── fused_bias_act_kernel.cu
│ │ │ ├── upfirdn2d.cpp
│ │ │ ├── upfirdn2d.py
│ │ │ └── upfirdn2d_kernel.cu
│ │ ├── ppl.py
│ │ ├── sample/
│ │ │ └── .gitignore
│ │ └── train.py
│ ├── base_model.py
│ ├── field_function_model.py
│ ├── losses/
│ │ ├── __init__.py
│ │ ├── arcface_loss.py
│ │ └── discriminator_loss.py
│ └── utils.py
├── quantitative_results.py
├── train.py
└── utils/
├── __init__.py
├── crop_img.py
├── dialog_edit_utils.py
├── editing_utils.py
├── inversion_utils.py
├── logger.py
├── numerical_metrics.py
├── options.py
└── util.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
experiments/
results/
tb_logger/
*.pyc
.vscode/
download
download/*
*.sh
================================================
FILE: README.md
================================================
# Talk-to-Edit (ICCV2021)


This repository contains the implementation of the following paper:
> **Talk-to-Edit: Fine-Grained Facial Editing via Dialog**<br>
> Yuming Jiang<sup>∗</sup>, Ziqi Huang<sup>∗</sup>, Xingang Pan, Chen Change Loy, Ziwei Liu<br>
> IEEE International Conference on Computer Vision (**ICCV**), 2021<br>
[[Paper](https://arxiv.org/abs/2109.04425)]
[[Project Page](https://www.mmlab-ntu.com/project/talkedit/)]
[[CelebA-Dialog Dataset](https://github.com/ziqihuangg/CelebA-Dialog)]
[[Poster](https://drive.google.com/file/d/1KaojezBNqDrkwcT0yOkvAgqW1grwUDed/view?usp=sharing)]
[[Video](https://www.youtube.com/watch?v=ZKMkQhkMXPI)]
You can try our Colab demos here. Enjoy!
1. Editing with dialog: <a href="https://colab.research.google.com/drive/14inhJjrNIj_SdhIA7NEtGS2kKOWXXSjb?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>
1. Editing without dialog: <a href="https://colab.research.google.com/drive/1mO5NmlPi4YV359cPkLZnOpG_kShQi_hN?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>
## Overview

## Dependencies and Installation
1. Clone Repo
```bash
git clone git@github.com:yumingj/Talk-to-Edit.git
```
1. Create Conda Environment and Install Dependencies
```bash
conda env create -f environment.yml
conda activate talk_edit
```
- Python >= 3.7
- PyTorch >= 1.6
- CUDA 10.1
- GCC 5.4.0
## Get Started
## Editing
We provide scripts for editing using our pretrained models.
1. First, download the pretrained models from this [link](https://drive.google.com/drive/folders/1W9dvjz8bUolEIG524o8ZvM62uEWKJ5do?usp=sharing) and put them under `./download/pretrained_models` as follows:
```
./download/pretrained_models
├── 1024_field
│ ├── Bangs.pth
│ ├── Eyeglasses.pth
│ ├── No_Beard.pth
│ ├── Smiling.pth
│ └── Young.pth
├── 128_field
│ ├── Bangs.pth
│ ├── Eyeglasses.pth
│ ├── No_Beard.pth
│ ├── Smiling.pth
│ └── Young.pth
├── arcface_resnet18_110.pth
├── language_encoder.pth.tar
├── predictor_1024.pth.tar
├── predictor_128.pth.tar
├── stylegan2_1024.pth
├── stylegan2_128.pt
├── StyleGAN2_FFHQ1024_discriminator.pth
└── eval_predictor.pth.tar
```
1. You can try pure image editing without dialog instructions:
```bash
python editing_wo_dialog.py \
--opt ./configs/editing/editing_wo_dialog.yml \
--attr 'Bangs' \
--target_val 5
```
The editing results will be saved in `./results`.
You can set `attr` to one of the following attributes: `Bangs`, `Eyeglasses`, `Beard`, `Smiling`, and `Young` (i.e., age), and `target_val` to an integer in `[0, 1, 2, 3, 4, 5]`.
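If you want to sweep every attribute and target value, a shell loop over the same script works. The sketch below is a dry run that only prints the commands (the attribute and value lists are the ones above); remove the `echo` to actually execute them:

```bash
# Dry run: print the editing command for every attribute/target pair.
for attr in Bangs Eyeglasses Beard Smiling Young; do
  for val in 0 1 2 3 4 5; do
    echo python editing_wo_dialog.py \
      --opt ./configs/editing/editing_wo_dialog.yml \
      --attr "$attr" \
      --target_val "$val"
  done
done
```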
1. You can also try dialog-based editing, where you talk to the system through the command prompt:
```bash
python editing_with_dialog.py --opt ./configs/editing/editing_with_dialog.yml
```
The editing results will be saved in `./results`.
**How to talk to the system:**
* Our system can edit five facial attributes: `Bangs`, `Eyeglasses`, `Beard`, `Smiling`, and `Young` (i.e., age).
* When prompted with `"Enter your request (Press enter when you finish):"`, you can enter an editing request about one of the five attributes. For example, you can say `"Make the bangs longer."`
* To respond to the system's feedback, just talk as if you were talking to a real person. For example, if the system asks `"Is the length of the bangs just right?"` after one round of editing, you can say things like `"Yes."` / `"No."` / `"Yes, and I also want her to smile more happily."`
* To end the conversation, just tell the system things like `"That's all"` / `"Nothing else, thank you."`
1. By default, the editing above is performed on the teaser image. You can change the image to be edited in two ways: 1) set `line 11: latent_code_index` to another value in the range `0` to `99`; 2) set `line 10: latent_code_path` to `~`, so that an image is randomly generated.
1. If you want to edit real images, you can download sample real images from this [link](https://drive.google.com/drive/folders/1BunrwvlwCBZJnb9QqeUp_uIXMxeXXJrY?usp=sharing) and put them under `./download/real_images`, or provide real images of your own choice. Set `line 12: img_path` in `editing_with_dialog.yml` or `editing_wo_dialog.yml` to the path of the real image and set `line 11: is_real_image` to `True`.
1. You can switch the default image size to `128 x 128` by setting `line 3: img_res` to `128` in config files.
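If you prefer changing config values programmatically rather than by line number, the YAML files can be loaded and modified with PyYAML (assumed to be available in the conda environment; the fragment below mirrors a few keys from `editing_wo_dialog.yml`):

```python
import yaml  # PyYAML; assumed to be installed in the environment

# A minimal fragment mirroring keys from editing_wo_dialog.yml.
cfg_text = """
img_res: 1024
latent_code_path: ./download/editing_data/teaser_latent_code.npz.npy
latent_code_index: 38
"""
cfg = yaml.safe_load(cfg_text)

# Switch to 128 x 128 editing and a different seed image.
cfg['img_res'] = 128
cfg['latent_code_index'] = 7

print(cfg['img_res'], cfg['latent_code_index'])  # 128 7
```

You could then dump the modified dict back with `yaml.safe_dump` before running the editing script.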
## Train the Semantic Field
1. To train the Semantic Field, we first prepare a set of sampled latent codes and then use the attribute predictor to label the facial attributes of their corresponding images. The attribute predictor is trained on the fine-grained annotations of the [CelebA-Dialog](https://github.com/ziqihuangg/CelebA-Dialog) dataset. Here we provide the latent codes we used. Download the training data from this [link](https://drive.google.com/drive/folders/1CYBpLIwts3ZVFiFAPb4TTnqYH3NBR63p?usp=sharing) and put it under `./download/train_data` as follows:
```
./download/train_data
├── 1024
│ ├── Bangs
│ ├── Eyeglasses
│ ├── No_Beard
│ ├── Smiling
│ └── Young
└── 128
├── Bangs
├── Eyeglasses
├── No_Beard
├── Smiling
└── Young
```
1. We also use a set of editing latent codes to monitor the training process. Download them from this [link](https://drive.google.com/drive/folders/1G-0srCePEXcPq9HY38Il_4FTVHX_rOa-?usp=sharing) and put them under `./download/editing_data` as follows:
```
./download/editing_data
├── 1024
│ ├── Bangs.npz.npy
│ ├── Eyeglasses.npz.npy
│ ├── No_Beard.npz.npy
│ ├── Smiling.npz.npy
│ └── Young.npz.npy
└── 128
├── Bangs.npz.npy
├── Eyeglasses.npz.npy
├── No_Beard.npz.npy
├── Smiling.npz.npy
└── Young.npz.npy
```
1. All logging files produced during training, *e.g.*, log messages, checkpoints, and snapshots, are saved to the `./experiments` and `./tb_logger` directories.
1. There are 10 configuration files under `./configs/train`, named in the format `field_<IMAGE_RESOLUTION>_<ATTRIBUTE_NAME>.yml`.
Choose the configuration file matching the attribute and resolution you want.
1. For example, to train the semantic field that edits the attribute `Bangs` at `128 x 128` resolution, simply run:
```bash
python train.py --opt ./configs/train/field_128_bangs.yml
```
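The naming scheme above can also be reproduced programmatically, e.g. to drive a training sweep over all 10 configs. A minimal sketch (the attribute spellings are taken from the config filenames listed in this repository):

```python
# The 10 training configs under ./configs/train follow the pattern
# field_<IMAGE_RESOLUTION>_<ATTRIBUTE_NAME>.yml, with lowercase
# attribute names as in the repository listing.
resolutions = [128, 1024]
attributes = ['bangs', 'beard', 'eyeglasses', 'smiling', 'young']

configs = [f'field_{res}_{attr}.yml' for res in resolutions for attr in attributes]
print(len(configs))  # 10
print(configs[0])    # field_128_bangs.yml
```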
## Quantitative Results
We provide code for the quantitative results reported in Table 1 of the paper. Here we use `Bangs` at `128 x 128` resolution as an example.
1. Use the trained semantic field to edit images.
```bash
python editing_quantitative.py \
--opt ./configs/train/field_128_bangs.yml \
--pretrained_path ./download/pretrained_models/128_field/Bangs.pth
```
2. Evaluate the edited images using quantitative metrics. Set `image_num` according to the attribute: `Bangs: 148`, `Eyeglasses: 82`, `Beard: 129`, `Smiling: 140`, `Young: 61`.
```bash
python quantitative_results.py \
--attribute Bangs \
--work_dir ./results/field_128_bangs \
--image_dir ./results/field_128_bangs/visualization \
--image_num 148
```
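For scripting across attributes, the `image_num` values above can be kept in a small lookup. A sketch that builds the evaluation command (the `quantitative_cmd` helper is hypothetical, not part of this repository):

```python
# image_num per attribute, copied from the table above.
IMAGE_NUM = {'Bangs': 148, 'Eyeglasses': 82, 'Beard': 129,
             'Smiling': 140, 'Young': 61}

def quantitative_cmd(attr, res=128):
    """Hypothetical helper: assemble the quantitative_results.py command."""
    work_dir = f'./results/field_{res}_{attr.lower()}'
    return (f'python quantitative_results.py --attribute {attr} '
            f'--work_dir {work_dir} --image_dir {work_dir}/visualization '
            f'--image_num {IMAGE_NUM[attr]}')

print(quantitative_cmd('Bangs'))
```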
## Qualitative Results

## CelebA-Dialog Dataset

Our [**CelebA-Dialog Dataset**](https://github.com/ziqihuangg/CelebA-Dialog) is available for [Download](https://drive.google.com/drive/folders/18nejI_hrwNzWyoF6SW8bL27EYnM4STAs?usp=sharing).
**CelebA-Dialog** is a large-scale visual-language face dataset with the following features:
- Facial images are annotated with rich **fine-grained labels**, which classify one attribute into multiple degrees according to its semantic meaning.
- Accompanied with each image, there are **captions** describing the attributes and a **user request** sample.

The dataset can be employed as the training and test sets for the following computer vision tasks: fine-grained facial attribute recognition, fine-grained facial manipulation, text-based facial generation and manipulation, face image captioning, and broader natural language based facial recognition and manipulation tasks.
## Citation
If you find our repo useful for your research, please consider citing our paper:
```bibtex
@inproceedings{jiang2021talk,
title={Talk-to-Edit: Fine-Grained Facial Editing via Dialog},
author={Jiang, Yuming and Huang, Ziqi and Pan, Xingang and Loy, Chen Change and Liu, Ziwei},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={13799--13808},
year={2021}
}
@article{jiang2023talk,
title={Talk-to-edit: Fine-grained 2d and 3d facial editing via dialog},
author={Jiang, Yuming and Huang, Ziqi and Wu, Tianxing and Pan, Xingang and Loy, Chen Change and Liu, Ziwei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2023},
publisher={IEEE}
}
```
## Contact
If you have any questions, please feel free to contact us via `yuming002@ntu.edu.sg` or `hu0007qi@ntu.edu.sg`.
## Acknowledgement
The codebase is maintained by [Yuming Jiang](https://yumingj.github.io/) and [Ziqi Huang](https://ziqihuangg.github.io/).
Part of the code is borrowed from [stylegan2-pytorch](https://github.com/rosinality/stylegan2-pytorch), [IEP](https://github.com/facebookresearch/clevr-iep) and [face-attribute-prediction](https://github.com/d-li14/face-attribute-prediction).
================================================
FILE: configs/attributes_5.json
================================================
{
    "attr_info": {
        "6": {
            "name": "Bangs",
            "value": [0, 1, 2, 3, 4, 5],
            "idx_scale": 1,
            "idx_bias": 0
        },
        "16": {
            "name": "Eyeglasses",
            "value": [0, 1, 2, 3, 4, 5],
            "idx_scale": 1,
            "idx_bias": 0
        },
        "25": {
            "name": "No_Beard",
            "value": [0, 1, 2, 3, 4, 5],
            "idx_scale": -1,
            "idx_bias": 5
        },
        "32": {
            "name": "Smiling",
            "value": [0, 1, 2, 3, 4, 5],
            "idx_scale": 1,
            "idx_bias": 0
        },
        "40": {
            "name": "Young",
            "value": [0, 1, 2, 3, 4, 5],
            "idx_scale": -1,
            "idx_bias": 5
        }
    },
    "newIdx_to_attrIdx": {
        "0": "6",
        "1": "16",
        "2": "25",
        "3": "32",
        "4": "40"
    },
    "newIdx_to_attrName": {
        "0": "Bangs",
        "1": "Eyeglasses",
        "2": "No_Beard",
        "3": "Smiling",
        "4": "Young"
    },
    "attrName_to_newIdx": {
        "Bangs": "0",
        "Eyeglasses": "1",
        "No_Beard": "2",
        "Smiling": "3",
        "Young": "4"
    },
    "attrIdx_to_newIdx": {
        "6": 0,
        "16": 1,
        "25": 2,
        "32": 3,
        "40": 4
    }
}
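One plausible reading of `idx_scale` and `idx_bias` (an interpretation, not stated in the file itself): they remap a raw degree `v` into the annotation's direction via `v * idx_scale + idx_bias`, so `No_Beard` and `Young` (scale `-1`, bias `5`) flip the 0-5 scale, while `Bangs`, `Eyeglasses`, and `Smiling` keep it:

```python
# Hypothetical interpretation of idx_scale / idx_bias from attributes_5.json:
# remapped_degree = raw_degree * idx_scale + idx_bias.
attr_info = {
    "Bangs":    {"idx_scale": 1,  "idx_bias": 0},
    "No_Beard": {"idx_scale": -1, "idx_bias": 5},
}

def remap(attr, v):
    info = attr_info[attr]
    return v * info["idx_scale"] + info["idx_bias"]

# Bangs is used as-is; No_Beard flips the 0-5 scale (0 -> 5, 5 -> 0).
print([remap("Bangs", v) for v in range(6)])     # [0, 1, 2, 3, 4, 5]
print([remap("No_Beard", v) for v in range(6)])  # [5, 4, 3, 2, 1, 0]
```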
================================================
FILE: configs/editing/editing_with_dialog.yml
================================================
name: dialog_editing
img_res: 1024 # 128
# latent code
latent_code_path: ./download/editing_data/teaser_latent_code.npz.npy
latent_code_index: 38
# inversion
inversion:
  is_real_image: False # False
  img_path: ./download/real_images/annehathaway.png
  crop_img: True
  device: cuda
  img_mse_weight: 1.0
  step: 600
  noise: 0.05
  noise_ramp: 0.75
  lr: 0.1
  lr_gen: !!float 1e-4
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Eyeglasses
model_type: FieldFunctionModel
fix_layers: true
replaced_layers_128: 8
replaced_layers_1024: 10
manual_seed: 2021
# editing configs
confidence_thresh: 0
max_cls_num: 5
min_cls_num: 0
max_trials_num: 100
print_every: False
transform_z_to_w: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# predictor
attr_file: ./configs/attributes_5.json
baseline: classification
use_sigmoid: True
gt_remapping_file: ~
predictor_ckpt_128: ./download/pretrained_models/predictor_128.pth.tar
predictor_ckpt_1024: ./download/pretrained_models/predictor_1024.pth.tar
# stylegan configs
latent_dim: 512
n_mlp: 8
channel_multiplier_128: 1
channel_multiplier_1024: 2
generator_ckpt_128: ./download/pretrained_models/stylegan2_128.pt
generator_ckpt_1024: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
# ---------- Dialog Editing -----------
has_dialog: True
device_name: gpu
# pretrained field
pretrained_field_128:
  Bangs: ./download/pretrained_models/128_field/Bangs.pth
  Eyeglasses: ./download/pretrained_models/128_field/Eyeglasses.pth
  No_Beard: ./download/pretrained_models/128_field/No_Beard.pth
  Smiling: ./download/pretrained_models/128_field/Smiling.pth
  Young: ./download/pretrained_models/128_field/Young.pth
pretrained_field_1024:
  Bangs: ./download/pretrained_models/1024_field/Bangs.pth
  Eyeglasses: ./download/pretrained_models/1024_field/Eyeglasses.pth
  No_Beard: ./download/pretrained_models/1024_field/No_Beard.pth
  Smiling: ./download/pretrained_models/1024_field/Smiling.pth
  Young: ./download/pretrained_models/1024_field/Young.pth
attr_to_idx:
  Bangs: 0
  Eyeglasses: 1
  No_Beard: 2
  Smiling: 3
  Young: 4
# language template files set up
feedback_templates_file: ./language/templates/feedback.json
metadata_file: ./language/templates/metadata_fsm.json
pool_file: ./language/templates/pool.json
system_mode_file: ./language/templates/system_mode.json
input_vocab_file: ./language/templates/vocab.json
# dialog setting
postfix_prob: 0.3
whether_enough_general_prob: 0.2
allow_unknown: 1
verbose: 0
# pretrained language encoder
pretrained_language_encoder: ./download/pretrained_models/language_encoder.pth.tar
language_encoder:
  word_embedding_dim: 300
  text_embed_size: 1024
  linear_hidden_size: 256
  linear_dropout_rate: 0
================================================
FILE: configs/editing/editing_wo_dialog.yml
================================================
name: editing_wo_dialog
img_res: 1024 # 128
# latent code
latent_code_path: ./download/editing_data/teaser_latent_code.npz.npy
latent_code_index: 38
# inversion
inversion:
  is_real_image: False # False
  img_path: ./download/real_images/annehathaway.png
  crop_img: True
  device: cuda
  img_mse_weight: 1.0
  step: 600
  noise: 0.05
  noise_ramp: 0.75
  lr: 0.1
  lr_gen: !!float 1e-4
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Eyeglasses
model_type: FieldFunctionModel
fix_layers: true
replaced_layers_128: 8
replaced_layers_1024: 10
manual_seed: 2021
# editing configs
confidence_thresh: 0
max_cls_num: 5
min_cls_num: 0
max_trials_num: 100
print_every: False
transform_z_to_w: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# predictor
attr_file: ./configs/attributes_5.json
baseline: classification
use_sigmoid: True
gt_remapping_file: ~
predictor_ckpt_128: ./download/pretrained_models/predictor_128.pth.tar
predictor_ckpt_1024: ./download/pretrained_models/predictor_1024.pth.tar
# stylegan configs
latent_dim: 512
n_mlp: 8
channel_multiplier_128: 1
channel_multiplier_1024: 2
generator_ckpt_128: ./download/pretrained_models/stylegan2_128.pt
generator_ckpt_1024: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
# ---------- Dialog Editing -----------
has_dialog: False
device_name: gpu
# pretrained field
pretrained_field_128:
  Bangs: ./download/pretrained_models/128_field/Bangs.pth
  Eyeglasses: ./download/pretrained_models/128_field/Eyeglasses.pth
  No_Beard: ./download/pretrained_models/128_field/No_Beard.pth
  Smiling: ./download/pretrained_models/128_field/Smiling.pth
  Young: ./download/pretrained_models/128_field/Young.pth
pretrained_field_1024:
  Bangs: ./download/pretrained_models/1024_field/Bangs.pth
  Eyeglasses: ./download/pretrained_models/1024_field/Eyeglasses.pth
  No_Beard: ./download/pretrained_models/1024_field/No_Beard.pth
  Smiling: ./download/pretrained_models/1024_field/Smiling.pth
  Young: ./download/pretrained_models/1024_field/Young.pth
attr_to_idx:
  Bangs: 0
  Eyeglasses: 1
  No_Beard: 2
  Smiling: 3
  Young: 4
================================================
FILE: configs/train/field_1024_bangs.yml
================================================
name: field_1024_bangs
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Bangs
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 10
# dataset configs
batch_size: 8
num_workers: 8
input_latent_dir: ./download/train_data/1024/Bangs
editing_latent_code_path: ./download/editing_data/1024/Bangs.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 500
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_1024.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/StyleGAN2_FFHQ1024_discriminator.pth
# stylegan configs
img_res: 1024
latent_dim: 512
n_mlp: 8
channel_multiplier: 2
generator_ckpt: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
================================================
FILE: configs/train/field_1024_beard.yml
================================================
name: field_1024_beard
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: No_Beard
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 10
# dataset configs
batch_size: 8
num_workers: 8
input_latent_dir: ./download/train_data/1024/No_Beard
editing_latent_code_path: ./download/editing_data/1024/No_Beard.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_1024.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 10.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/StyleGAN2_FFHQ1024_discriminator.pth
# stylegan configs
img_res: 1024
latent_dim: 512
n_mlp: 8
channel_multiplier: 2
generator_ckpt: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
================================================
FILE: configs/train/field_1024_eyeglasses.yml
================================================
name: field_1024_eyeglasses
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Eyeglasses
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 10
# dataset configs
batch_size: 8
num_workers: 8
input_latent_dir: ./download/train_data/1024/Eyeglasses
editing_latent_code_path: ./download/editing_data/1024/Eyeglasses.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_1024.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 10.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/StyleGAN2_FFHQ1024_discriminator.pth
# stylegan configs
img_res: 1024
latent_dim: 512
n_mlp: 8
channel_multiplier: 2
generator_ckpt: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
================================================
FILE: configs/train/field_1024_smiling.yml
================================================
name: field_1024_smiling
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Smiling
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 10
# dataset configs
batch_size: 8
num_workers: 8
input_latent_dir: ./download/train_data/1024/Smiling
editing_latent_code_path: ./download/editing_data/1024/Smiling.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_1024.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/StyleGAN2_FFHQ1024_discriminator.pth
# stylegan configs
img_res: 1024
latent_dim: 512
n_mlp: 8
channel_multiplier: 2
generator_ckpt: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
================================================
FILE: configs/train/field_1024_young.yml
================================================
name: field_1024_young
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Young
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 10
# dataset configs
batch_size: 8
num_workers: 8
input_latent_dir: ./download/train_data/1024/Young
editing_latent_code_path: ./download/editing_data/1024/Young.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_1024.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 10.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/StyleGAN2_FFHQ1024_discriminator.pth
# stylegan configs
img_res: 1024
latent_dim: 512
n_mlp: 8
channel_multiplier: 2
generator_ckpt: ./download/pretrained_models/stylegan2_1024.pth
latent_space: w
================================================
FILE: configs/train/field_128_bangs.yml
================================================
name: field_128_bangs
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Bangs
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 8
# dataset configs
batch_size: 32
num_workers: 8
input_latent_dir: ./download/train_data/128/Bangs
editing_latent_code_path: ./download/editing_data/128/Bangs.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_128.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/stylegan2_128.pt
# stylegan configs
img_res: 128
latent_dim: 512
n_mlp: 8
channel_multiplier: 1
generator_ckpt: ./download/pretrained_models/stylegan2_128.pt
latent_space: w
================================================
FILE: configs/train/field_128_beard.yml
================================================
name: field_128_beard
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: No_Beard
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 8
# dataset configs
batch_size: 32
num_workers: 8
input_latent_dir: ./download/train_data/128/No_Beard
editing_latent_code_path: ./download/editing_data/128/No_Beard.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_128.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/stylegan2_128.pt
# stylegan configs
img_res: 128
latent_dim: 512
n_mlp: 8
channel_multiplier: 1
generator_ckpt: ./download/pretrained_models/stylegan2_128.pt
latent_space: w
================================================
FILE: configs/train/field_128_eyeglasses.yml
================================================
name: field_128_eyeglasses
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Eyeglasses
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 8
# dataset configs
batch_size: 32
num_workers: 8
input_latent_dir: ./download/train_data/128/Eyeglasses
editing_latent_code_path: ./download/editing_data/128/Eyeglasses.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_128.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/stylegan2_128.pt
# stylegan configs
img_res: 128
latent_dim: 512
n_mlp: 8
channel_multiplier: 1
generator_ckpt: ./download/pretrained_models/stylegan2_128.pt
latent_space: w
================================================
FILE: configs/train/field_128_smiling.yml
================================================
name: field_128_smiling
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Smiling
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 8
# dataset configs
batch_size: 32
num_workers: 8
input_latent_dir: ./download/train_data/128/Smiling
editing_latent_code_path: ./download/editing_data/128/Smiling.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.8
max_cls_num: 5
max_trials_num: 100
print_every: False
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_128.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/stylegan2_128.pt
# stylegan configs
img_res: 128
latent_dim: 512
n_mlp: 8
channel_multiplier: 1
generator_ckpt: ./download/pretrained_models/stylegan2_128.pt
latent_space: w
================================================
FILE: configs/train/field_128_young.yml
================================================
name: field_128_young
use_tb_logger: true
set_CUDA_VISIBLE_DEVICES: ~
gpu_ids: [3]
attribute: Young
model_type: FieldFunctionModel
fix_layers: true
replaced_layers: 8
# dataset configs
batch_size: 32
num_workers: 8
input_latent_dir: ./download/train_data/128/Young
editing_latent_code_path: ./download/editing_data/128/Young.npz.npy
num_attr: 5
val_on_train_subset: true
val_on_valset: true
# training configs
val_freq: 1
print_freq: 100
weight_decay: 0
manual_seed: 2021
num_epochs: 30
lr: !!float 1e-4
lr_decay: step
gamma: 0.1
step: 100
# editing configs
confidence_thresh: 0.5
max_cls_num: 5
max_trials_num: 100
print_every: false
# field_function configs
num_layer: 8
hidden_dim: 512
leaky_relu_neg_slope: 0.2
# loss configs
# predictor loss
edited_attribute_weight: 1.0
attr_file: ./configs/attributes_5.json
predictor_ckpt: ./download/pretrained_models/predictor_128.pth.tar
# arcface loss
pretrained_arcface: ./download/pretrained_models/arcface_resnet18_110.pth
arcface_weight: 5.0
arcface_loss_type: l1
# discriminator loss
disc_weight: 1.0
discriminator_ckpt: ./download/pretrained_models/stylegan2_128.pt
# stylegan configs
img_res: 128
latent_dim: 512
n_mlp: 8
channel_multiplier: 1
generator_ckpt: ./download/pretrained_models/stylegan2_128.pt
latent_space: w
================================================
FILE: data/__init__.py
================================================
================================================
FILE: data/latent_code_dataset.py
================================================
"""
Dataset for field function
"""
import os
import os.path
import random
import numpy as np
import torch
import torch.utils.data as data
class LatentCodeDataset(data.Dataset):
def __init__(self, input_dir, subset_samples=None):
assert os.path.exists(input_dir)
self.latent_codes = np.load(
os.path.join(input_dir, 'selected_latent_code.npy')).astype(float)
self.labels = np.load(
os.path.join(input_dir, 'selected_pred_class.npy')).astype(int)
self.scores = np.load(
os.path.join(input_dir, 'selected_pred_scores.npy')).astype(float)
self.latent_codes = torch.FloatTensor(self.latent_codes)
self.labels = torch.LongTensor(self.labels)
self.scores = torch.FloatTensor(self.scores)
# select a subset from train set
if subset_samples is not None and len(
self.latent_codes) > subset_samples:
idx = list(range(len(self.latent_codes)))
selected_idx = random.sample(idx, subset_samples)
self.latent_codes = [self.latent_codes[i] for i in selected_idx]
self.labels = [self.labels[i] for i in selected_idx]
self.scores = [self.scores[i] for i in selected_idx]
assert len(self.latent_codes) == len(self.labels)
assert len(self.labels) == len(self.scores)
def __getitem__(self, index):
return (self.latent_codes[index], self.labels[index],
self.scores[index])
def __len__(self):
return len(self.latent_codes)
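The subset selection in `__init__` samples indices without replacement and turns the tensors into plain Python lists of per-sample entries. A minimal numpy sketch of that selection step, with hypothetical in-memory arrays standing in for the `selected_*.npy` files:

```python
import random

import numpy as np

# Hypothetical in-memory stand-ins for selected_latent_code.npy,
# selected_pred_class.npy, and selected_pred_scores.npy.
latent_codes = np.random.randn(100, 512).astype(np.float32)
labels = np.random.randint(0, 6, size=(100, 5))
scores = np.random.rand(100, 5).astype(np.float32)

subset_samples = 10
if subset_samples is not None and len(latent_codes) > subset_samples:
    # Same strategy as LatentCodeDataset: draw indices without replacement,
    # then index into every array with the same selection.
    idx = list(range(len(latent_codes)))
    selected_idx = random.sample(idx, subset_samples)
    latent_codes = [latent_codes[i] for i in selected_idx]
    labels = [labels[i] for i in selected_idx]
    scores = [scores[i] for i in selected_idx]
```

After this step each of the three containers is a list of aligned per-sample arrays, which is all `__getitem__` needs.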
================================================
FILE: editing_quantitative.py
================================================
import argparse
import logging
import os
import numpy as np
from models import create_model
from utils.logger import get_root_logger
from utils.numerical_metrics import compute_num_metrics
from utils.options import dict2str, dict_to_nonedict, parse
from utils.util import make_exp_dirs
def main():
# options
parser = argparse.ArgumentParser()
parser.add_argument('--opt', type=str, help='Path to option YAML file.')
parser.add_argument(
'--pretrained_path', type=str, help='Path to pretrained field model')
args = parser.parse_args()
opt = parse(args.opt, is_train=False)
# mkdir and loggers
make_exp_dirs(opt)
# convert to NoneDict, which returns None for missing keys
opt = dict_to_nonedict(opt)
# load editing latent code
editing_latent_codes = np.load(opt['editing_latent_code_path'])
num_latent_codes = editing_latent_codes.shape[0]
save_path = f'{opt["path"]["visualization"]}'
os.makedirs(save_path)
editing_logger = get_root_logger(
logger_name='editing',
log_level=logging.INFO,
log_file=f'{save_path}/editing.log')
editing_logger.info(dict2str(opt))
field_model = create_model(opt)
field_model.load_network(args.pretrained_path)
field_model.continuous_editing(editing_latent_codes, save_path,
editing_logger)
_, _ = compute_num_metrics(save_path, num_latent_codes,
opt['pretrained_arcface'], opt['attr_file'],
opt['predictor_ckpt'],
opt['attr_dict'][opt['attribute']],
editing_logger)
if __name__ == '__main__':
main()
================================================
FILE: editing_with_dialog.py
================================================
import argparse
import json
import logging
import os.path
import numpy as np
import torch
from models import create_model
from utils.dialog_edit_utils import dialog_with_real_user
from utils.inversion_utils import inversion
from utils.logger import get_root_logger
from utils.options import (dict2str, dict_to_nonedict, parse,
parse_args_from_opt, parse_opt_wrt_resolution)
from utils.util import make_exp_dirs
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='')
parser.add_argument(
'--opt', default=None, type=str, help='Path to option YAML file.')
return parser.parse_args()
def main():
# ---------- Set up -----------
args = parse_args()
opt = parse(args.opt, is_train=False)
opt = parse_opt_wrt_resolution(opt)
args = parse_args_from_opt(args, opt)
make_exp_dirs(opt)
# convert to NoneDict, which returns None for missing keys
opt = dict_to_nonedict(opt)
# set up logger
save_log_path = f'{opt["path"]["log"]}'
dialog_logger = get_root_logger(
logger_name='dialog',
log_level=logging.INFO,
log_file=f'{save_log_path}/dialog.log')
dialog_logger.info(dict2str(opt))
save_image_path = f'{opt["path"]["visualization"]}'
os.makedirs(save_image_path)
# ---------- Load files -----------
dialog_logger.info('loading template files')
with open(opt['feedback_templates_file'], 'r') as f:
args.feedback_templates = json.load(f)
args.feedback_replacement = args.feedback_templates['replacement']
with open(opt['pool_file'], 'r') as f:
pool = json.load(f)
args.synonyms_dict = pool["synonyms"]
# ---------- create model ----------
field_model = create_model(opt)
# ---------- load latent code ----------
if opt['inversion']['is_real_image']:
latent_code = inversion(opt, field_model)
else:
if opt['latent_code_path'] is None:
latent_code = torch.randn(1, 512, device=torch.device('cuda'))
with torch.no_grad():
latent_code = field_model.stylegan_gen.get_latent(latent_code)
latent_code = latent_code.cpu().numpy()
np.save(f'{opt["path"]["visualization"]}/latent_code.npz.npy',
latent_code)
else:
i = opt['latent_code_index']
latent_code = np.load(
opt['latent_code_path'],
allow_pickle=True).item()[f"{str(i).zfill(7)}.png"]
latent_code = torch.from_numpy(latent_code).to(
torch.device('cuda'))
with torch.no_grad():
latent_code = field_model.stylegan_gen.get_latent(latent_code)
latent_code = latent_code.cpu().numpy()
np.save(f'{opt["path"]["visualization"]}/latent_code.npz.npy', latent_code)
# ---------- Perform dialog-based editing with user -----------
dialog_overall_log = dialog_with_real_user(field_model, latent_code, opt,
args, dialog_logger)
# ---------- Log the dialog history -----------
for (key, value) in dialog_overall_log.items():
dialog_logger.info(f'{key}: {value}')
dialog_logger.info('successfully end.')
if __name__ == '__main__':
main()
================================================
FILE: editing_wo_dialog.py
================================================
import argparse
import logging
import os
import numpy as np
import torch
from models import create_model
from models.utils import save_image
from utils.editing_utils import edit_target_attribute
from utils.inversion_utils import inversion
from utils.logger import get_root_logger
from utils.options import (dict2str, dict_to_nonedict, parse,
parse_opt_wrt_resolution)
from utils.util import make_exp_dirs
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='')
parser.add_argument('--opt', type=str, help='Path to option YAML file.')
parser.add_argument('--attr', type=str, help='Attribute to be edited.')
parser.add_argument(
'--target_val', type=int, help='Target Attribute Value.')
return parser.parse_args()
def main():
# ---------- Set up -----------
args = parse_args()
opt = parse(args.opt, is_train=False)
opt = parse_opt_wrt_resolution(opt)
# args = parse_args_from_opt(args, opt)
make_exp_dirs(opt)
# convert to NoneDict, which returns None for missing keys
opt = dict_to_nonedict(opt)
# set up logger
save_log_path = f'{opt["path"]["log"]}'
editing_logger = get_root_logger(
logger_name='editing',
log_level=logging.INFO,
log_file=f'{save_log_path}/editing.log')
editing_logger.info(dict2str(opt))
save_image_path = f'{opt["path"]["visualization"]}'
os.makedirs(save_image_path)
# ---------- create model ----------
field_model = create_model(opt)
# ---------- load latent code ----------
if opt['inversion']['is_real_image']:
latent_code = inversion(opt, field_model)
else:
if opt['latent_code_path'] is None:
latent_code = torch.randn(1, 512, device=torch.device('cuda'))
with torch.no_grad():
latent_code = field_model.stylegan_gen.get_latent(latent_code)
latent_code = latent_code.cpu().numpy()
np.save(f'{opt["path"]["visualization"]}/latent_code.npz.npy',
latent_code)
else:
i = opt['latent_code_index']
latent_code = np.load(
opt['latent_code_path'],
allow_pickle=True).item()[f"{str(i).zfill(7)}.png"]
latent_code = torch.from_numpy(latent_code).to(
torch.device('cuda'))
with torch.no_grad():
latent_code = field_model.stylegan_gen.get_latent(latent_code)
latent_code = latent_code.cpu().numpy()
# ---------- synthesize images ----------
with torch.no_grad():
start_image, start_label, start_score = \
field_model.synthesize_and_predict(torch.from_numpy(latent_code).to(torch.device('cuda'))) # noqa
save_image(start_image, f'{opt["path"]["visualization"]}/start_image.png')
    # initialize attribute_dict
attribute_dict = {
"Bangs": start_label[0],
"Eyeglasses": start_label[1],
"No_Beard": start_label[2],
"Smiling": start_label[3],
"Young": start_label[4],
}
edit_label = {'attribute': args.attr, 'target_score': args.target_val}
edited_latent_code = None
print_intermediate_result = True
round_idx = 0
attribute_dict, exception_mode, latent_code, edited_latent_code = edit_target_attribute(
opt, attribute_dict, edit_label, round_idx, latent_code,
edited_latent_code, field_model, editing_logger,
print_intermediate_result)
if exception_mode != 'normal':
if exception_mode == 'already_at_target_class':
editing_logger.info("This attribute is already at the degree that you want. Let's try a different attribute degree or another attribute.")
elif exception_mode == 'max_edit_num_reached':
editing_logger.info("Sorry, we are unable to edit this attribute. Perhaps we can try something else.")
if __name__ == '__main__':
main()
================================================
FILE: environment.yml
================================================
name: talk_edit
channels:
- pytorch
- conda-forge
- anaconda
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- absl-py=0.11.0=pyhd3eb1b0_1
- aiohttp=3.7.3=py37h27cfd23_1
- async-timeout=3.0.1=py37h06a4308_0
- attrs=20.3.0=pyhd3eb1b0_0
- backcall=0.2.0=py_0
- blas=1.0=mkl
- blinker=1.4=py37h06a4308_0
- blosc=1.21.0=h8c45485_0
- brotli=1.0.9=he6710b0_2
- brotlipy=0.7.0=py37h27cfd23_1003
- brunsli=0.1=h2531618_0
- bzip2=1.0.8=h7b6447c_0
- c-ares=1.17.1=h27cfd23_0
- ca-certificates=2021.7.5=h06a4308_1
- cachetools=4.2.1=pyhd3eb1b0_0
- certifi=2021.5.30=py37h06a4308_0
- cffi=1.14.4=py37h261ae71_0
- chardet=3.0.4=py37h06a4308_1003
- charls=2.1.0=he6710b0_2
- click=7.1.2=pyhd3eb1b0_0
- cloudpickle=1.6.0=py_0
- cryptography=2.9.2=py37h1ba5d50_0
- cudatoolkit=10.1.243=h6bb024c_0
- cycler=0.10.0=py_2
- cytoolz=0.11.0=py37h7b6447c_0
- dask-core=2021.3.0=pyhd3eb1b0_0
- decorator=4.4.2=pyhd3eb1b0_0
- freetype=2.10.4=h5ab3b9f_0
- giflib=5.1.4=h14c3975_1
- google-auth=1.24.0=pyhd3eb1b0_0
- google-auth-oauthlib=0.4.2=pyhd3eb1b0_2
- grpcio=1.31.0=py37hf8bcb03_0
- icu=67.1=he1b5a44_0
- idna=2.10=pyhd3eb1b0_0
- imagecodecs=2021.1.11=py37h581e88b_1
- imageio=2.9.0=py_0
- intel-openmp=2020.2=254
- ipython=7.18.1=py37h5ca1d4c_0
- ipython_genutils=0.2.0=py37_0
- jedi=0.18.0=py37h06a4308_1
- joblib=1.0.0=pyhd3eb1b0_0
- jpeg=9b=h024ee3a_2
- jxrlib=1.1=h7b6447c_2
- kiwisolver=1.3.1=py37hc928c03_0
- lcms2=2.11=h396b838_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- lerc=2.2.1=h2531618_0
- libaec=1.0.4=he6710b0_1
- libdeflate=1.7=h27cfd23_5
- libedit=3.1.20191231=h14c3975_1
- libffi=3.3=he6710b0_2
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libprotobuf=3.13.0.1=h8b12597_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.1.0=h2733197_1
- libwebp=1.0.1=h8e7db2f_0
- libzopfli=1.0.3=he6710b0_0
- lz4-c=1.9.3=h2531618_0
- markdown=3.3.3=py37h06a4308_0
- matplotlib=3.2.2=1
- matplotlib-base=3.2.2=py37h1d35a4c_1
- mkl=2020.2=256
- mkl-service=2.3.0=py37he8ac12f_0
- mkl_fft=1.2.0=py37h23d657b_0
- mkl_random=1.1.1=py37h0573a6f_0
- multidict=4.7.6=py37h7b6447c_1
- ncurses=6.2=he6710b0_1
- networkx=2.5=py_0
- ninja=1.10.2=py37hff7bd54_0
- numpy=1.19.2=py37h54aff64_0
- numpy-base=1.19.2=py37hfa32c7d_0
- oauthlib=3.1.0=py_0
- olefile=0.46=py37_0
- openjpeg=2.3.0=h05c96fa_1
- openssl=1.1.1k=h27cfd23_0
- parso=0.8.0=py_0
- pexpect=4.8.0=py37_1
- pickleshare=0.7.5=py37_1001
- pillow=8.2.0=py37he98fc37_0
- pip=20.3.3=py37h06a4308_0
- prompt-toolkit=3.0.8=py_0
- protobuf=3.13.0.1=py37he6710b0_1
- ptyprocess=0.6.0=py37_0
- pyasn1=0.4.8=py_0
- pyasn1-modules=0.2.8=py_0
- pycparser=2.20=py_2
- pygments=2.7.1=py_0
- pyjwt=2.0.1=py37h06a4308_0
- pyopenssl=20.0.1=pyhd3eb1b0_1
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysocks=1.7.1=py37_1
- python=3.7.9=h7579374_0
- python-dateutil=2.8.1=py_0
- python_abi=3.7=1_cp37m
- pytorch=1.6.0=py3.7_cuda10.1.243_cudnn7.6.3_0
- pywavelets=1.1.1=py37h7b6447c_2
- pyyaml=5.4.1=py37h27cfd23_1
- readline=8.0=h7b6447c_0
- requests=2.25.1=pyhd3eb1b0_0
- requests-oauthlib=1.3.0=py_0
- rsa=4.7=pyhd3eb1b0_1
- scikit-image=0.17.2=py37hdf5156a_0
- scikit-learn=0.23.2=py37h0573a6f_0
- scipy=1.6.2=py37h91f5cce_0
- setuptools=52.0.0=py37h06a4308_0
- six=1.15.0=py37h06a4308_0
- snappy=1.1.8=he6710b0_0
- sqlite=3.33.0=h62c20be_0
- tensorboard=2.3.0=pyh4dce500_0
- tensorboard-plugin-wit=1.6.0=py_0
- tensorboardx=2.1=py_0
- threadpoolctl=2.1.0=pyh5ca1d4c_0
- tifffile=2021.3.5=pyhd3eb1b0_1
- tk=8.6.10=hbc83047_0
- toolz=0.11.1=pyhd3eb1b0_0
- torchvision=0.7.0=py37_cu101
- tornado=6.1=py37h4abf009_0
- tqdm=4.55.1=pyhd3eb1b0_0
- traitlets=5.0.5=py_0
- typing-extensions=3.7.4.3=hd3eb1b0_0
- typing_extensions=3.7.4.3=pyh06a4308_0
- urllib3=1.26.3=pyhd3eb1b0_0
- wcwidth=0.2.5=py_0
- werkzeug=1.0.1=pyhd3eb1b0_0
- wheel=0.36.2=pyhd3eb1b0_0
- xz=5.2.5=h7b6447c_0
- yaml=0.2.5=h7b6447c_0
- yarl=1.5.1=py37h7b6447c_0
- zfp=0.5.5=h2531618_4
- zipp=3.4.0=pyhd3eb1b0_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.4.5=h9ceee32_0
- pip:
- cmake==3.21.2
- dlib==19.22.1
- facenet-pytorch==2.5.2
- flake8==3.8.4
- future==0.18.2
- importlib-metadata==3.4.0
- isort==5.7.0
- lpips==0.1.4
- mccabe==0.6.1
- opencv-python==4.5.1.48
- pycodestyle==2.6.0
- pyflakes==2.2.0
- yapf==0.30.0
================================================
FILE: language/accuracy.py
================================================
import torch
def head_accuracy(output, target, unlabeled_value=999):
"""
    Compute top-1 accuracy for a single attribute head over a batch.
    output: batch_size * num_cls scores (for a specific attribute)
    target: batch_size * 1 target classes (for a specific attribute)
    Samples whose target equals `unlabeled_value` are excluded; the
    returned 'acc' is 100 * num_correct / num_labeled for the batch.
"""
with torch.no_grad():
batch_size = target.size(0)
# _ = the largest score, pred = cls_idx with the largest score
_, pred = output.topk(1, 1, True, True)
pred = pred.reshape(-1)
# acc = float(torch.sum(pred == target)) / float(batch_size) * 100
return_dict = {}
if unlabeled_value is not None:
correct_count = torch.sum(
(target != unlabeled_value) * (pred == target))
labeled_count = torch.sum(target != unlabeled_value)
if labeled_count:
labeled_acc = float(correct_count) / float(labeled_count) * 100
else:
labeled_acc = 0
return_dict['acc'] = labeled_acc
return_dict['labeled_count'] = labeled_count
else:
            acc = float(torch.sum(pred == target)) / float(batch_size) * 100
            return_dict['acc'] = acc
return_dict['labeled_count'] = batch_size
return return_dict
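The filtering above can be restated compactly: predictions are compared to targets only where the target is labeled. A small numpy sketch of the same masked top-1 accuracy (not the torch implementation itself), with toy scores:

```python
import numpy as np

def masked_top1_accuracy(output, target, unlabeled_value=999):
    # output: (batch, num_cls) scores; target: (batch,) class indices.
    pred = output.argmax(axis=1)
    labeled = target != unlabeled_value
    labeled_count = int(labeled.sum())
    correct = int((labeled & (pred == target)).sum())
    acc = 100.0 * correct / labeled_count if labeled_count else 0.0
    return {'acc': acc, 'labeled_count': labeled_count}

# Toy batch: two labeled samples (one predicted correctly), one unlabeled.
output = np.array([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])
target = np.array([1, 1, 999])
result = masked_top1_accuracy(output, target)
# → {'acc': 50.0, 'labeled_count': 2}
```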
================================================
FILE: language/build_vocab.py
================================================
import argparse
import json
import os
import sys
sys.path.append('.')
from language_utils import * # noqa
"""
Build vocabulary from all instantiated templates
"""
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='Build vocabulary')
parser.add_argument(
'--input_data_path',
required=True,
type=str,
help='path to the input data file')
parser.add_argument(
'--output_dir',
required=True,
type=str,
help='folder to save the output vocabulary file')
return parser.parse_args()
def main():
args = parse_args()
# prepare output directory
if not os.path.isdir(args.output_dir):
os.makedirs(args.output_dir, exist_ok=True)
# load text data
print("Loading text data from", args.input_data_path)
with open(args.input_data_path, 'r') as f:
input_data = json.load(f)
# gather a list of text
print("Building vocabulary from", len(input_data), "text data samples")
text_list = []
for idx, data_sample in enumerate(input_data):
if idx % 10000 == 0:
print('loaded', idx, '/', len(input_data))
text = data_sample['text']
text_list.append(text)
# build vocabulary
text_token_to_idx = build_vocab(text_list=text_list) # noqa
vocab = {
'text_token_to_idx': text_token_to_idx,
}
# save vocabulary
print("Saving vocabulary file to",
os.path.join(args.output_dir, 'vocab.json'))
with open(os.path.join(args.output_dir, 'vocab.json'), 'w') as f:
json.dump(vocab, f, indent=4)
if __name__ == '__main__':
main()
================================================
FILE: language/dataset.py
================================================
import os.path
import numpy as np
from torch.utils.data import Dataset
class EncoderDataset(Dataset):
def __init__(self, preprocessed_dir):
# load text
text_path = os.path.join(preprocessed_dir, 'text.npy')
self.text = np.load(text_path)
# load system_mode
system_mode_path = os.path.join(preprocessed_dir, 'system_mode.npy')
self.system_mode = np.load(system_mode_path)
# load labels
labels_path = os.path.join(preprocessed_dir, 'labels.npy')
self.labels = np.load(labels_path)
def __getitem__(self, index):
# retrieve text
text = self.text[index]
# retrieve system_mode
system_mode = self.system_mode[index]
# retrieve labels
labels = self.labels[index]
return text, system_mode, labels
def __len__(self):
return len(self.text)
def main():
""" Testing the Dataset"""
encoderdataset = EncoderDataset(
preprocessed_dir= # noqa
'' # noqa
)
print('len(encoderdataset):', len(encoderdataset))
print('encoderdataset[0]:', encoderdataset[0])
if __name__ == '__main__':
main()
================================================
FILE: language/generate_feedback.py
================================================
import argparse
import json
import os.path
import random
import numpy as np
import sys
sys.path.append('.')
from language_utils import proper_capitalize  # noqa
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='')
parser.add_argument(
'--feedback_templates_file',
default='./templates/feedback.json',
type=str,
help='directory to the request templates file')
parser.add_argument(
'--pool_file',
default='./templates/pool.json',
type=str,
help='directory to the word pool file')
parser.add_argument(
'--num_feedback',
default=100,
type=int,
help='number of feedback data to generate')
parser.add_argument(
'--output_file_dir',
required=True,
type=str,
help='folder to save the output request file')
parser.add_argument(
'--output_file_name',
required=True,
type=str,
help='name of the output request file')
parser.add_argument(
'--whether_enough_general_prob',
default=0.2,
type=float,
help='probability of using general templates in whether_enough mode')
return parser.parse_args()
def main():
args = parse_args()
if not os.path.isdir(args.output_file_dir):
os.makedirs(args.output_file_dir, exist_ok=True)
# load template files
print('loading template files')
with open(args.feedback_templates_file, 'r') as f:
args.feedback_templates = json.load(f)
args.feedback_replacement = args.feedback_templates['replacement']
with open(args.pool_file, 'r') as f:
pool = json.load(f)
args.synonyms_dict = pool["synonyms"]
system_mode_list = ['whats_next', 'whether_enough', 'suggestion']
attribute_list = ['Bangs', "Eyeglasses", "No_Beard", "Smiling", "Young"]
feedback_list = []
output_txt = []
# instantiate feedback
for index in range(args.num_feedback):
if index % 1000 == 0:
print('generated', index, '/', args.num_feedback, 'feedback')
# initialize feedback parameters
attribute = None
# randomly choose the feedback parameters
system_mode = random.choice(system_mode_list)
if system_mode == 'whether_enough' or system_mode == 'suggestion':
attribute = random.choice(attribute_list)
feedback = instantiate_feedback(
args, system_mode=system_mode, attribute=attribute)
feedback['index'] = index
feedback_list.append(feedback)
output_txt.append(feedback['text'])
# save feedback dataset
with open(os.path.join(args.output_file_dir, args.output_file_name),
'w') as f:
json.dump(feedback_list, f, indent=4)
np.savetxt(
os.path.join(args.output_file_dir, "feedback.txt"),
output_txt,
fmt='%s',
delimiter='\t')
print('successfully saved.')
def instantiate_feedback(args,
system_mode=None,
attribute=None,
exception_mode='normal'):
"""
Given the feedback mode (i.e. system_mode) and the attribute (if any),
return a feedback.
"""
if exception_mode != 'normal':
candidate_templates = args.feedback_templates[exception_mode]
template = random.choice(candidate_templates)
else:
# ---------- STEP 1: 1st part of feedback: 'ok' template ----------
# instantiate the feedback prefix like "ok"
ok_distribution_prob = random.uniform(0, 1)
ok_template = ''
if ok_distribution_prob < 0.7:
ok_templates = args.feedback_templates['ok']
for idx, templates in enumerate(ok_templates):
if 0.3 < ok_distribution_prob < 0.7 and (idx == 0 or idx == 1):
continue
ok_template += random.choice(templates)
ok_template += ' '
ok_template = ok_template[0].capitalize() + ok_template[1:]
# ---------- STEP 2: 2nd part of feedback: content template ----------
# feedback is trivial like "what's next?"
if system_mode == 'whats_next':
candidate_templates = args.feedback_templates['whats_next']
template = random.choice(candidate_templates)
# feedback asks whether the editing extent is enough
elif system_mode == 'whether_enough':
whether_enough_general_prob = random.uniform(0, 1)
if whether_enough_general_prob < args.whether_enough_general_prob \
or args.feedback_templates[
'whether_enough'][attribute] == []:
candidate_templates = args.feedback_templates[
'whether_enough']['general']
else:
candidate_templates = args.feedback_templates[
'whether_enough'][attribute]
template = random.choice(candidate_templates)
# feedback provides suggestion on the next edit
elif system_mode == 'suggestion':
candidate_templates = args.feedback_templates['suggestion']
template = random.choice(candidate_templates)
else:
raise KeyError('System mode "%s" not recognized' % system_mode)
# ---------- STEP 3: Postprocess the instantiated template sentence ---------- # noqa
# replace the <xxx> in the template with
# proper attribute-specific words.
# this is not applicable to 'whats_next' type of feedback
if system_mode != 'whats_next':
for word in args.feedback_replacement:
new_word_dict = args.feedback_replacement[word]
new_word = new_word_dict[attribute]
template = template.replace(word, new_word)
# to lower case
template = template.lower()
# randomly replace words with synonyms
for word in args.synonyms_dict:
replacing_word = random.choice(args.synonyms_dict[word])
template = template.replace(word, replacing_word)
# capitalize
template = proper_capitalize(template)
if exception_mode != 'normal':
# after given feedback of cannot_edit
# encode user request by pretending that
# the system_mode is 'whats_next'
system_mode = 'whats_next'
else:
template = ok_template + template
# ---------- STEP 4: Return the feedback and its annotations ----------
feedback = {
"text": template,
"system_mode": system_mode,
"attribute": attribute
}
return feedback
if __name__ == '__main__':
main()
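The `ok`-prefix sampling above uses a single random draw to decide between a full prefix (p < 0.3), a partial one that skips the first two template groups (0.3 < p < 0.7), and no prefix at all (p >= 0.7). A sketch of that branching with a hypothetical `ok_templates` structure:

```python
import random

# Hypothetical structure of feedback.json's "ok" section: a list of
# single-phrase groups, with one phrase drawn per group.
ok_templates = [['ok'], ['sure'], ['here you go']]

def sample_ok_prefix(p):
    """Mirror the prefix branching: p >= 0.7 gives no prefix,
    0.3 < p < 0.7 skips the first two groups, otherwise use all groups."""
    if p >= 0.7:
        return ''
    prefix = ''
    for idx, group in enumerate(ok_templates):
        if 0.3 < p < 0.7 and idx in (0, 1):
            continue
        prefix += random.choice(group) + ' '
    # Capitalize only when a prefix was actually built.
    return prefix[0].capitalize() + prefix[1:]
```

With these toy groups, `sample_ok_prefix(0.1)` yields the full prefix, `sample_ok_prefix(0.5)` only the last group, and `sample_ok_prefix(0.9)` the empty string.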
================================================
FILE: language/generate_training_request.py
================================================
import argparse
import json
import os.path
import random
import sys
sys.path.append('.')
from language_utils import proper_capitalize # noqa
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='')
parser.add_argument(
'--num_request',
default=100,
type=int,
help='number of request data to generate')
# template files
parser.add_argument(
'--user_templates_file',
type=str,
default='./templates/user_fsm.json',
help='directory to the request templates file')
parser.add_argument(
'--pool_file',
type=str,
default='./templates/pool.json',
help='directory to the word pool file')
parser.add_argument(
'--metadata_file',
type=str,
default='./templates/metadata_fsm.json',
help='directory to the metadata file')
parser.add_argument(
'--system_mode_file',
type=str,
default='./templates/system_mode.json',
help='directory to the system_mode file')
# output
parser.add_argument(
'--output_file_dir',
required=True,
type=str,
help='folder to save the output request file')
return parser.parse_args()
def main():
args = parse_args()
if not os.path.isdir(args.output_file_dir):
        os.makedirs(args.output_file_dir, exist_ok=True)
# load template files
print('loading template files')
with open(args.user_templates_file, 'r') as f:
args.user_templates = json.load(f)
with open(args.pool_file, 'r') as f:
pool = json.load(f)
args.synonyms_dict = pool["synonyms"]
args.postfix_list = pool["postfix"]
with open(args.metadata_file, 'r') as f:
args.metadata = json.load(f)
with open(args.system_mode_file, 'r') as f:
args.system_mode_dict = json.load(f)
args.system_mode_list = []
for key, value in args.system_mode_dict.items():
args.system_mode_list.append(key)
attribute_list = ['Bangs', "Eyeglasses", "No_Beard", "Smiling", "Young"]
target_score_list = [0, 1, 2, 3, 4, 5]
score_change_direction_list = ['positive', 'negative']
score_change_value_list = [1, 2, 3, 4, 5]
request_list = []
# instantiate requests
for index in range(args.num_request):
if index % 1000 == 0:
print('generated', index, '/', args.num_request, 'requests')
# randomly choose the semantic editing parameters
system_mode = random.choice(args.system_mode_list)
user_mode_list = list(args.metadata[system_mode].keys())
user_mode = random.choice(user_mode_list)
attribute = random.choice(attribute_list)
score_change_value = random.choice(score_change_value_list)
score_change_direction = random.choice(score_change_direction_list)
target_score = random.choice(target_score_list)
# instantiate a request according to the
# chosen semantic editing parameters
request = instantiate_training_request(
args,
attribute=attribute,
user_mode=user_mode,
score_change_direction=score_change_direction,
score_change_value=score_change_value,
target_score=target_score)
request['system_mode'] = system_mode
# assign each system_mode's user_mode
for mode in args.system_mode_list:
if system_mode == mode:
request[mode] = request['user_mode']
else:
request[mode] = None
request['index'] = index
request_list.append(request)
# save request dataset
if not os.path.isdir(args.output_file_dir):
os.makedirs(args.output_file_dir, exist_ok=True)
with open(
os.path.join(args.output_file_dir, 'training_request.json'),
'w') as f:
json.dump(request_list, f, indent=4)
print('successfully saved.')
def instantiate_training_request(
args,
attribute=None,
user_mode=None,
score_change_direction=None,
score_change_value=None,
target_score=None,
):
"""
Given semantic editing parameters, instantiate the request
using the request templates.
"""
request_mode = None
instantiated_sentence = ''
user_sub_mode_list = user_mode.split('_')
for user_sub_mode_idx, user_sub_mode in enumerate(user_sub_mode_list):
sub_mode_template = ''
if user_sub_mode != 'pureRequest':
sub_mode_templates = args.user_templates[user_sub_mode]
for templates in sub_mode_templates:
sub_mode_template += random.choice(templates)
else:
request_mode = random.choice(
['target', 'change_definite', 'change_indefinite'])
request_templates = args.user_templates['pureRequest']
attribute_templates = request_templates[attribute]
# request is the score change direction and value
if request_mode == 'change_definite':
assert score_change_direction is not None
assert score_change_value is not None
target_score = None
candidate_templates = attribute_templates['change'][
score_change_direction]['definite'][str(
score_change_value)]
# request is the score change direction without value
elif request_mode == 'change_indefinite':
assert score_change_direction is not None
score_change_value = None
target_score = None
candidate_templates = attribute_templates['change'][
score_change_direction]['indefinite']
# request is the edit target
elif request_mode == 'target':
score_change_direction = None
score_change_value = None
assert target_score is not None
candidate_templates = attribute_templates['target'][str(
target_score)]
else:
raise KeyError('Request mode "%s" not recognized' %
request_mode)
# randomly choose one request template
sub_mode_template = random.choice(candidate_templates)
if user_sub_mode_idx >= 1:
instantiated_sentence += ' '
instantiated_sentence += sub_mode_template
if 'pureRequest' not in user_sub_mode_list:
score_change_direction = None
score_change_value = None
target_score = None
attribute = None
# to lower case
instantiated_sentence = instantiated_sentence.lower()
# randomly replace words with synonyms
for word in args.synonyms_dict:
new_word = random.choice(args.synonyms_dict[word])
instantiated_sentence = instantiated_sentence.replace(word, new_word)
# capitalize
instantiated_sentence = proper_capitalize(instantiated_sentence)
request = {
"text": instantiated_sentence,
"user_mode": user_mode,
'request_mode': request_mode,
"attribute": attribute,
"score_change_direction": score_change_direction,
"score_change_value": score_change_value,
"target_score": target_score,
}
return request
if __name__ == '__main__':
main()
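The synonym substitution step (used both here and in `generate_feedback.py`) applies `str.replace` sequentially over the whole sentence, so every occurrence of a pool key is replaced, including occurrences inside longer words. A sketch with a hypothetical synonyms entry in the style of pool.json:

```python
import random

random.seed(0)  # deterministic draw for the sketch

# Hypothetical entry in the style of pool.json's "synonyms" section.
synonyms_dict = {'glasses': ['glasses', 'spectacles', 'eyewear']}

sentence = 'please add glasses'
for word in synonyms_dict:
    # str.replace substitutes every occurrence, including substrings
    # (e.g. the 'glasses' inside 'eyeglasses'), so pool keys need to be
    # chosen with that in mind.
    new_word = random.choice(synonyms_dict[word])
    sentence = sentence.replace(word, new_word)
```

Each run leaves `sentence` equal to the original request with `glasses` swapped for one of the three candidates.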
================================================
FILE: language/language_utils.py
================================================
import numpy as np
import torch
# global variables
PUNCTUATION_TO_KEEP = ['?', ';']
PUNCTUATION_TO_REMOVE = ['.', '!', ',']
SPECIAL_TOKENS = {
'<NULL>': 0,
'<START>': 1,
'<END>': 2,
'<UNK>': 3,
}
def build_vocab(text_list,
min_token_count=1,
delimiter=' ',
punct_to_keep=None,
punct_to_remove=None,
print_every=10000):
"""
Build token to index mapping from a list of text strings
-- Input: a list of text string
-- Output: a dict which is a mapping from token to index,
"""
token_to_count = {}
# tokenize text and add tokens to token_to_count dict
for text_idx, text in enumerate(text_list):
if text_idx % print_every == 0:
print('tokenized', text_idx, '/', len(text_list))
text_tokens = tokenize(text=text, delimiter=delimiter)
for token in text_tokens:
if token in token_to_count:
token_to_count[token] += 1
else:
token_to_count[token] = 1
token_to_idx = {}
print('Mapping tokens to indices')
# reserve indices for special tokens (must-have tokens)
for token, idx in SPECIAL_TOKENS.items():
token_to_idx[token] = idx
# assign indices to tokens
for token, count in sorted(token_to_count.items()):
if count >= min_token_count:
token_to_idx[token] = len(token_to_idx)
return token_to_idx
def tokenize(text,
delimiter=' ',
add_start_token=False,
add_end_token=False,
punctuation_to_keep=PUNCTUATION_TO_KEEP,
punctuation_to_remove=PUNCTUATION_TO_REMOVE):
"""
Tokenize a text string
-- Input: a text string
-- Output: a list of tokens,
each token is still a string (usually an english word)
"""
# (1) Optionally keep or remove certain punctuation
if punctuation_to_keep is not None:
for punctuation in punctuation_to_keep:
text = text.replace(punctuation, '%s%s' % (delimiter, punctuation))
if punctuation_to_remove is not None:
for punctuation in punctuation_to_remove:
text = text.replace(punctuation, '')
# (2) Split the text string into a list of tokens
text = text.lower()
tokens = text.split(delimiter)
# (3) Optionally add start and end tokens
if add_start_token:
tokens.insert(0, '<START>')
if add_end_token:
tokens.append('<END>')
return tokens
def encode(text_tokens, token_to_idx, allow_unk=False):
text_encoded = []
for token in text_tokens:
if token not in token_to_idx:
if allow_unk:
token = '<UNK>'
else:
raise KeyError('Token "%s" not in vocab' % token)
text_encoded.append(token_to_idx[token])
return text_encoded
def decode(seq_idx, idx_to_token, delim=None, stop_at_end=True):
tokens = []
for idx in seq_idx:
tokens.append(idx_to_token[idx])
if stop_at_end and tokens[-1] == '<END>':
break
if delim is None:
return tokens
else:
return delim.join(tokens)
def reverse_dict(input_dict):
reversed_dict = {}
for key in input_dict.keys():
val = input_dict[key]
reversed_dict[val] = key
return reversed_dict
def to_long_tensor(dset):
arr = np.asarray(dset, dtype=np.int64)
tensor = torch.LongTensor(arr)
return tensor
def proper_capitalize(text):
if len(text) > 0:
text = text.lower()
text = text[0].capitalize() + text[1:]
for idx, char in enumerate(text):
if char in ['.', '!', '?'] and (idx + 2) < len(text):
text = text[:idx + 2] + text[idx + 2].capitalize() + text[idx + 3:]
text = text.replace(' i ', ' I ')
text = text.replace(',i ', ',I ')
text = text.replace('.i ', '.I ')
text = text.replace('!i ', '!I ')
return text
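A quick round trip through the helpers above (tokenize, encode, decode), written as a self-contained sketch that inlines minimal versions of the functions so it runs standalone:

```python
# Minimal, self-contained mirror of the tokenize -> encode -> decode
# round trip in language_utils.py (same special tokens, same rules).
SPECIAL = {'<NULL>': 0, '<START>': 1, '<END>': 2, '<UNK>': 3}

def tokenize(text):
    # keep '?'/';' as standalone tokens, drop '.', '!', ',', lowercase
    for p in ['?', ';']:
        text = text.replace(p, ' ' + p)
    for p in ['.', '!', ',']:
        text = text.replace(p, '')
    return text.lower().split(' ')

def encode(tokens, token_to_idx, allow_unk=True):
    return [token_to_idx.get(t, SPECIAL['<UNK>']) if allow_unk
            else token_to_idx[t] for t in tokens]

def decode(seq, idx_to_token):
    return [idx_to_token[i] for i in seq]

# build a tiny vocab, then round-trip one request
vocab = dict(SPECIAL)
for tok in tokenize('Make the bangs longer.'):
    vocab.setdefault(tok, len(vocab))
idx_to_token = {v: k for k, v in vocab.items()}

encoded = encode(tokenize('make the bangs longer.'), vocab)
assert decode(encoded, idx_to_token) == ['make', 'the', 'bangs', 'longer']
```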
================================================
FILE: language/lstm.py
================================================
"""
LSTM
Input: batch_size x max_text_length (tokenized questions)
Output: batch_size x lstm_hidden_size (question embedding)
Details:
Tokenized text are first word-embedded (300-D), then passed to
2-layer LSTM, where each cell has is 1024-D. For each text,
output the hidden state of the last non-null token.
"""
from __future__ import print_function
import json
import torch
import torch.nn as nn
from torch.autograd import Variable
class Encoder(nn.Module):
def __init__(self,
token_to_idx,
word_embedding_dim=300,
text_embed_size=1024,
metadata_file='./templates/metadata_fsm.json',
linear_hidden_size=256,
linear_dropout_rate=0):
super(Encoder, self).__init__()
# LSTM (shared)
self.lstm = LSTM(
token_to_idx=token_to_idx,
word_embedding_dim=word_embedding_dim,
lstm_hidden_size=text_embed_size)
# classifiers (not shared)
with open(metadata_file, 'r') as f:
self.metadata = json.load(f)
self.classifier_names = []
for idx, (key, val) in enumerate(self.metadata.items()):
num_val = len(val.items())
classifier_name = key
self.classifier_names.append(classifier_name)
setattr(
self, classifier_name,
nn.Sequential(
fc_block(text_embed_size, linear_hidden_size,
linear_dropout_rate),
nn.Linear(linear_hidden_size, num_val)))
def forward(self, text):
# LSTM (shared)
# Input: batch_size x max_text_length
# Output: batch_size x text_embed_size
text_embedding = self.lstm(text)
# classifiers (not shared)
output = []
for classifier_name in self.classifier_names:
classifier = getattr(self, classifier_name)
output.append(classifier(text_embedding))
return output
class LSTM(nn.Module):
def __init__(self,
token_to_idx,
word_embedding_dim=300,
lstm_hidden_size=1024,
lstm_num_layers=2,
lstm_dropout=0):
super(LSTM, self).__init__()
# token
self.token_to_idx = token_to_idx
self.NULL = token_to_idx['<NULL>']
self.START = token_to_idx['<START>']
self.END = token_to_idx['<END>']
# word embedding
self.word2vec = nn.Embedding(
num_embeddings=len(token_to_idx), embedding_dim=word_embedding_dim)
# LSTM
self.rnn = nn.LSTM(
input_size=word_embedding_dim,
hidden_size=lstm_hidden_size,
num_layers=lstm_num_layers,
bias=True,
batch_first=True,
dropout=lstm_dropout,
bidirectional=False)
def forward(self, x):
batch_size, max_text_length = x.size()
# Find the last non-null element in each sequence, store in idx
idx = torch.LongTensor(batch_size).fill_(max_text_length - 1)
x_cpu = x.data.cpu()
for text_idx in range(batch_size):
for token_idx in range(max_text_length - 1):
if (x_cpu[text_idx, token_idx] != self.NULL
) and x_cpu[text_idx, token_idx + 1] == self.NULL: # noqa
idx[text_idx] = token_idx
break
idx = idx.type_as(x.data).long()
idx = Variable(idx, requires_grad=False)
# reduce memory access time
self.rnn.flatten_parameters()
# hs: all hidden states
# [batch_size x max_text_length x hidden_size]
# h_n: [2 x batch_size x hidden_size]
# c_n: [2 x batch_size x hidden_size]
hidden_states, (_, _) = self.rnn(self.word2vec(x))
idx = idx.view(batch_size, 1, 1).expand(batch_size, 1,
hidden_states.size(2))
hidden_size = hidden_states.size(2)
# only retrieve the hidden state of the last non-null element
# [batch_size x 1 x hidden_size]
hidden_state_at_last_token = hidden_states.gather(1, idx)
# [batch_size x hidden_size]
hidden_state_at_last_token = hidden_state_at_last_token.view(
batch_size, hidden_size)
return hidden_state_at_last_token
class fc_block(nn.Module):
def __init__(self, inplanes, planes, drop_rate=0.15):
super(fc_block, self).__init__()
self.fc = nn.Linear(inplanes, planes)
self.bn = nn.BatchNorm1d(planes)
if drop_rate > 0:
self.dropout = nn.Dropout(drop_rate)
self.relu = nn.ReLU(inplace=True)
self.drop_rate = drop_rate
def forward(self, x):
x = self.fc(x)
x = self.bn(x)
if self.drop_rate > 0:
x = self.dropout(x)
x = self.relu(x)
return x
def main():
""" Test Code """
# ################### LSTM #########################
question_token_to_idx = {
".": 4,
"missing": 34,
"large": 28,
"is": 26,
"cubes": 19,
"cylinder": 21,
"what": 54,
"<START>": 1,
"green": 24,
"<END>": 2,
"object": 35,
"things": 51,
"<UNK>": 3,
"matte": 31,
"rubber": 41,
"tiny": 52,
"yellow": 55,
"red": 40,
"visible": 53,
"color": 17,
"size": 44,
"balls": 11,
"the": 48,
"any": 8,
"blocks": 14,
"ball": 10,
"a": 6,
"it": 27,
"an": 7,
"one": 38,
"purple": 39,
"how": 25,
"thing": 50,
"?": 5,
"objects": 36,
"blue": 15,
"block": 13,
"small": 45,
"shiny": 43,
"material": 30,
"cylinders": 22,
"<NULL>": 0,
"many": 29,
"of": 37,
"cube": 18,
"metallic": 33,
"gray": 23,
"brown": 16,
"spheres": 47,
"there": 49,
"sphere": 46,
"shape": 42,
"are": 9,
"metal": 32,
"cyan": 20,
"big": 12
}
batch_size = 64
print('batch size:', batch_size)
# questions = torch.ones(batch_size, 15, dtype=torch.long)
questions = torch.randint(0, 10, (batch_size, 15), dtype=torch.long)
print('input size:', questions.size())
lstm = LSTM(token_to_idx=question_token_to_idx)
output = lstm(questions)
print('output size:', output.size())
# ################### Language Encoder #########################
encoder = Encoder(
token_to_idx=question_token_to_idx,
metadata_file='./templates/metadata_fsm.json')
output = encoder(questions)
print('output length:', len(output))
for classifier in output:
print('classifier.size():', classifier.size())
if __name__ == '__main__':
main()
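The core trick in `LSTM.forward` above is the scan that locates the last non-null token in each padded sequence; a minimal pure-Python sketch of that scan (no torch required), assuming index 0 is `<NULL>` as in `SPECIAL_TOKENS`:

```python
NULL = 0  # index of '<NULL>' padding, as in SPECIAL_TOKENS

def last_non_null_index(seq):
    # Mirrors the scan in LSTM.forward: default to the final position,
    # otherwise stop at the token just before the first <NULL> pad.
    idx = len(seq) - 1
    for i in range(len(seq) - 1):
        if seq[i] != NULL and seq[i + 1] == NULL:
            idx = i
            break
    return idx

assert last_non_null_index([1, 5, 7, 2, 0, 0]) == 3  # padded sequence
assert last_non_null_index([1, 5, 7, 2, 9, 6]) == 5  # no padding
```

The returned index is what the torch version expands and feeds to `gather` to pick out one hidden state per sequence.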
================================================
FILE: language/preprocess_request.py
================================================
import argparse
import json
import os
import sys
import numpy as np
sys.path.append('.')
from language_utils import * # noqa
"""
Preprocess the text
"""
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser()
parser.add_argument(
'--input_vocab_path',
required=True,
type=str,
help='path to the input vocabulary file')
parser.add_argument(
'--input_data_path',
required=True,
type=str,
help='path to the input data file')
parser.add_argument(
'--metadata_file',
type=str,
default='./templates/metadata_fsm.json',
help='path to the metadata file')
parser.add_argument(
'--system_mode_file',
type=str,
default='./templates/system_mode.json',
help='path to the system_mode file')
parser.add_argument(
'--allow_unknown',
default=0,
type=int,
help='whether to allow unknown tokens (i.e. words)')
parser.add_argument(
'--expand_vocab',
default=0,
type=int,
help='whether expand vocabularies')
parser.add_argument(
'--output_dir',
required=True,
type=str,
help='folder to save the output vocabulary file')
parser.add_argument(
'--unlabeled_value',
default=999,
type=int,
help='sentinel value representing a missing label')
return parser.parse_args()
def main():
args = parse_args()
if not os.path.isdir(args.output_dir):
os.makedirs(args.output_dir, exist_ok=False)
# load vocabulary
print("Loading vocab")
with open(args.input_vocab_path, 'r') as f:
vocab = json.load(f)
text_token_to_idx = vocab['text_token_to_idx']
# load metadata file
with open(args.metadata_file, 'r') as f:
metadata = json.load(f)
# load system_mode file
with open(args.system_mode_file, 'r') as f:
system_mode_file = json.load(f)
# load input data
with open(args.input_data_path, 'r') as f:
input_data = json.load(f)
# initialize lists to store encoded data
text_encoded_list = []
system_mode_encoded_list = []
labels_encoded_list = []
print('Encoding')
for idx, data_sample in enumerate(input_data):
# encode text
text = data_sample['text']
text_tokens = tokenize(text=text) # noqa
text_encoded = encode( # noqa
text_tokens=text_tokens,
token_to_idx=text_token_to_idx,
allow_unk=args.allow_unknown)
text_encoded_list.append(text_encoded)
# encode system_mode
system_mode = data_sample['system_mode']
system_mode_encoded = system_mode_file[system_mode]
system_mode_encoded_list.append(system_mode_encoded)
# encode labels
labels_encoded = []
for key, val in metadata.items():
label = data_sample[key]
if label is None:
# use args.unlabeled_value to represent missing labels
label_encoded = args.unlabeled_value
else:
label_encoded = val[str(label)]
labels_encoded.append(label_encoded)
labels_encoded_list.append(labels_encoded)
# Pad encoded text to equal length
print('Padding tokens')
text_encoded_padded_list = []
max_text_length = max(len(text) for text in text_encoded_list)
for text_encoded in text_encoded_list:
while len(text_encoded) < max_text_length:
text_encoded.append(text_token_to_idx['<NULL>'])
text_encoded_padded_list.append(text_encoded)
# save processed text
np.save(
os.path.join(args.output_dir, 'text.npy'), text_encoded_padded_list)
np.savetxt(
os.path.join(args.output_dir, 'text.txt'),
text_encoded_padded_list,
fmt='%.0f')
# save processed system_mode
np.save(
os.path.join(args.output_dir, 'system_mode.npy'),
system_mode_encoded_list)
np.savetxt(
os.path.join(args.output_dir, 'system_mode.txt'),
system_mode_encoded_list,
fmt='%.0f')
# save processed labels
np.save(os.path.join(args.output_dir, 'labels.npy'), labels_encoded_list)
np.savetxt(
os.path.join(args.output_dir, 'labels.txt'),
labels_encoded_list,
fmt='%.0f')
if __name__ == '__main__':
main()
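The padding step above (right-pad every encoded text with `<NULL>` to the batch maximum before saving `text.npy`) can be sketched standalone:

```python
NULL_IDX = 0  # index of '<NULL>' in the vocabulary

def pad_to_max(encoded_list, null_idx=NULL_IDX):
    # Right-pad every encoded text so all rows share the max length,
    # mirroring the while-loop in preprocess_request.py.
    max_len = max(len(seq) for seq in encoded_list)
    return [seq + [null_idx] * (max_len - len(seq)) for seq in encoded_list]

padded = pad_to_max([[1, 4, 5], [1, 4], [1]])
assert padded == [[1, 4, 5], [1, 4, 0], [1, 0, 0]]
```

Equal-length rows are what make the later `np.save` produce a regular 2-D array instead of an object array.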
================================================
FILE: language/run_encoder.py
================================================
import argparse
import json
import random
import torch
from .language_utils import * # noqa
from .lstm import Encoder
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser()
parser.add_argument(
'--input_vocab_file',
required=True,
type=str,
help='path to the input vocabulary file')
parser.add_argument(
'--allow_unknown',
default=1,
type=int,
help='whether to allow unknown tokens (i.e. words)')
parser.add_argument(
'--pretrained_checkpoint',
default='',
type=str,
help='The pretrained network weights for testing')
parser.add_argument(
'--metadata_file',
default='./templates/metadata_fsm.json',
type=str,
help='path to metadata file.')
parser.add_argument(
'--system_mode_file',
default='./templates/system_mode.json',
type=str,
help='path to system_mode file.')
parser.add_argument(
'--device_name',
default='gpu',
type=str,
)
parser.add_argument(
'--verbose',
default=0,
type=int,
)
# LSTM hyperparameter
parser.add_argument('--word_embedding_dim', default=300, type=int)
parser.add_argument('--text_embed_size', default=1024, type=int)
parser.add_argument('--linear_hidden_size', default=256, type=int)
parser.add_argument('--linear_dropout_rate', default=0, type=float)
return parser.parse_args()
def main():
args = parse_args()
encode_request(args)
def encode_request(args, system_mode=None, dialog_logger=None):
# set up
if args.device_name == 'cpu':
args.device = torch.device('cpu')
elif args.device_name == 'gpu':
args.device = torch.device('cuda')
if dialog_logger is None:
    output_function = print
    # define this in both branches; it is used unconditionally below
    compulsory_output_function = print
else:
    def output_function(*args):
        # suppress verbose output when called by other scripts
        return
    compulsory_output_function = dialog_logger.info
# ---------------- STEP 1: Input the Request ----------------
# choose system_mode
with open(args.system_mode_file, 'r') as f:
system_mode_dict = json.load(f)
system_mode_list = []
for (mode, mode_idx) in system_mode_dict.items():
system_mode_list.append(mode)
if __name__ == '__main__':
assert system_mode is None
system_mode = random.choice(system_mode_list)
output_function(' PREDEFINED system_mode:', system_mode)
else:
assert system_mode is not None
# input request
if True:
compulsory_output_function(
'Enter your request (Press enter when you finish):')
input_text = input()
else:
input_text = 'make the bangs slightly longer.'
compulsory_output_function('USER INPUT >>> ' + input_text)
# ---------------- STEP 2: Preprocess Request ----------------
# output_function(" The system is trying to understand your request:")
# output_function(" ########################################")
# load vocabulary
with open(args.input_vocab_file, 'r') as f:
vocab = json.load(f)
text_token_to_idx = vocab['text_token_to_idx']
text_tokens = tokenize(text=input_text) # noqa
text_encoded = encode( # noqa
text_tokens=text_tokens,
token_to_idx=text_token_to_idx,
allow_unk=args.allow_unknown)
text_encoded = to_long_tensor([text_encoded]).to(args.device) # noqa
# ---------------- STEP 3: Encode Request ----------------
# prepare encoder
encoder = Encoder(
token_to_idx=text_token_to_idx,
word_embedding_dim=args.word_embedding_dim,
text_embed_size=args.text_embed_size,
metadata_file=args.metadata_file,
linear_hidden_size=args.linear_hidden_size,
linear_dropout_rate=args.linear_dropout_rate)
encoder = encoder.to(args.device)
checkpoint = torch.load(args.pretrained_checkpoint, map_location=args.device)
encoder.load_state_dict(checkpoint['state_dict'], True)
encoder.eval()
# forward pass
output = encoder(text_encoded)
# ---------------- STEP 4: Process Encoder Output ----------------
output_labels = []
for head_idx in range(len(output)):
_, pred = torch.max(output[head_idx], 1)
head_label = pred.cpu().numpy()[0]
output_labels.append(head_label)
# load metadata file
with open(args.metadata_file, 'r') as f:
metadata = json.load(f)
# find mapping from value to label
reversed_metadata = {}
for idx, (key, val) in enumerate(metadata.items()):
reversed_val = reverse_dict(val) # noqa
reversed_metadata[key] = reversed_val
if args.verbose:
output_function('reversed_metadata:', reversed_metadata)
# convert predicted values to a dict of predicted labels
output_semantic_labels = {} # from LSTM output
valid_semantic_labels = {} # useful information among LSTM output
for idx, (key, val) in enumerate(reversed_metadata.items()):
output_semantic_labels[key] = val[output_labels[idx]]
valid_semantic_labels[key] = None
if args.verbose:
output_function('output_semantic_labels:', output_semantic_labels)
# extract predicted labels
user_mode = output_semantic_labels[system_mode]
valid_semantic_labels[system_mode] = user_mode
request_mode = output_semantic_labels['request_mode']
attribute = output_semantic_labels['attribute']
score_change_direction = output_semantic_labels['score_change_direction']
if output_semantic_labels['score_change_value'] is None:
score_change_value = None
else:
score_change_value = int(output_semantic_labels['score_change_value'])
if output_semantic_labels['target_score'] is None:
target_score = None
else:
target_score = int(output_semantic_labels['target_score'])
# print to screen
output_function(' ENCODED user_mode:' + ' ' + user_mode)
valid_semantic_labels['user_mode'] = user_mode
if 'pureRequest' in user_mode:
output_function(' ENCODED request_mode: ' + ' ' + request_mode)
valid_semantic_labels['request_mode'] = request_mode
output_function(' ENCODED attribute:' + ' ' + attribute)
valid_semantic_labels['attribute'] = attribute
# only output_function labels valid for this request_mode
if request_mode == 'change_definite':
output_function(' ENCODED score_change_direction:' + ' ' +
(score_change_direction))
valid_semantic_labels[
'score_change_direction'] = score_change_direction
output_function(' ENCODED score_change_value:' + ' ' +
str(score_change_value))
valid_semantic_labels['score_change_value'] = score_change_value
elif request_mode == 'change_indefinite':
output_function(' ENCODED score_change_direction:' + ' ' +
score_change_direction)
valid_semantic_labels[
'score_change_direction'] = score_change_direction
elif request_mode == 'target':
output_function(' ENCODED target_score:' + ' ' +
str(target_score))
valid_semantic_labels['target_score'] = target_score
valid_semantic_labels['text'] = input_text
if args.verbose:
output_function('valid_semantic_labels:' + ' ' +
str(valid_semantic_labels))
# output_function(" ########################################")
return valid_semantic_labels
if __name__ == '__main__':
main()
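Steps 3 and 4 above decode each classifier head's argmax index back to a semantic label through a reversed metadata dict; a minimal sketch using the `attribute` entry from `metadata_fsm.json`:

```python
# Mirrors reverse_dict() in language_utils.py and the label decoding in
# encode_request(): a head's argmax index maps back through the
# reversed metadata entry to a semantic label.
def reverse_dict(input_dict):
    return {val: key for key, val in input_dict.items()}

# The 'attribute' entry from metadata_fsm.json.
attribute_map = {'Bangs': 0, 'Eyeglasses': 1, 'No_Beard': 2,
                 'Smiling': 3, 'Young': 4}
reversed_map = reverse_dict(attribute_map)

predicted_index = 3  # e.g. the argmax of the attribute head
assert reversed_map[predicted_index] == 'Smiling'
```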
================================================
FILE: language/templates/attr_wise_caption_templates.json
================================================
{
"Bangs": {
"0": [
"<He> has no bangs at all.",
"<He> has no bangs at all and <his> forehead is visible.",
"<He> doesn't have any bangs.",
"<He> doesn't have any bangs and <his> forehead is visible.",
"<his> entire forehead is visible.",
"<his> entire forehead is visible without any bangs.",
"<He> shows <his> entire forehead without any bangs."
],
"1": [
"<He> has very short bangs which only covers a tiny portion of <his> forehead."
],
"2": [
"<He> has short bangs that covers a small portion of <his> forehead.",
"<He> has short bangs that only covers a small portion of <his> forehead."
],
"3": [
"<He> has medium bangs that covers half of <his> forehead.",
"<He> has bangs of medium length that covers half of <his> forehead.",
"<He> has bangs of medium length that leaves half of <his> forehead visible."
],
"4": [
"<He> has long bangs that almost covers all of <his> forehead.",
"<He> has long bangs that almost covers This entire forehead."
],
"5": [
"<He> has extremely long bangs that almost covers all of <his> forehead.",
"<He> has extremely long bangs that almost covers This entire forehead."
]
},
"Eyeglasses": {
"0": [
"<He> is not wearing any eyeglasses.",
"There is not any eyeglasses on <his> face."
],
"1": [
"<He> is wearing rimless eyeglasses."
],
"2": [
"<He> is wearing eyeglasses with thin frame.",
"<He> is wearing thin frame eyeglasses."
],
"3": [
"<He> is wearing eyeglasses with thick frame.",
"<He> is wearing thick frame eyeglasses."
],
"4": [
"<He> is wearing sunglasses with thin frame.",
"<He> is wearing thin frame sunglasses."
],
"5": [
"<He> is wearing sunglasses with thick frame.",
"<He> is wearing thick frame sunglasses."
]
},
"No_Beard": {
"0": [
"<He> doesn't have any beard.",
"<He> doesn't have any beard at all."
],
"1": [
"<his> face is covered with short pointed beard.",
"<his> face is covered with his stubble.",
"<his> face has a rough growth of stubble.",
"<He> has a rough growth of stubble.",
"There should be stubble covering <his> cheeks and chin."
],
"2": [
"<his> face is covered with short beard."
],
"3": [
"<his> face is covered with beard of medium length.",
"<He> has beard of medium length."
],
"4": [
"<He> has a big mustache on his face.",
"<his> has a bushy beard."
],
"5": [
"<his> has very long beard.",
"<his> has full beard.",
"<his> has very thick beard.",
"<his> has a very bushy beard."
]
},
"Smiling": {
"0": [
"<He> looks serious with no smile in <his> face."
],
"1": [
"<He> smiles with corners of the mouth turned up.",
"<He> smiles with corners of <his> mouth turned up.",
"<He> turns up the corners of <his> mouth."
],
"2": [
"This corners of <his> mouth curve up and we can see some teeth.",
"<He> smiles broadly and shows some teeth."
],
"3": [
"The entire face of this <man> is beamed with happiness.",
"<He> has a beaming face.",
"<He> is smiling with <his> teeth visible.",
"<his> entire face is beamed with happiness."
],
"4": [
"<He> has a big smile.",
"<He> has a big smile on <his> face.",
"<He> is smiling with <his> mouth slightly open.",
"<He> is smiling with <his> mouth slightly open and teeth visible."
],
"5": [
"This <man> in the image is laughing happily.",
"<He> has a deep rumbling laugh.",
"<He> has a very big smile.",
"<He> has a very big smile on <his> face.",
"<He> is smiling with <his> mouth wide open.",
"<He> is smiling with <his> mouth wide open and teeth visible."
]
},
"Young": {
"0": [
"This is a young kid.",
"This is a young child."
],
"1": [
"<He> is a teenager.",
"<He> looks very young."
],
"2": [
"<He> is a young adult.",
"<He> is in <his> thirties."
],
"3": [
"<He> is in <his> forties.",
"<He> is in <his> middle age."
],
"4": [
"<He> is in <his> sixties.",
"<He> is in <his> fifties.",
"<He> looks like an elderly."
],
"5": [
"<He> is in <his> eighties.",
"This old <man> is in <his> eighties.",
"<He> is in <his> seventies.",
"This old <man> is in <his> seventies.",
"<He> looks very old."
]
}
}
================================================
FILE: language/templates/feedback.json
================================================
{
"replacement": {
"<ATTR_NAME>": {
"Bangs": "bangs",
"Eyeglasses": "glasses",
"No_Beard": "beard",
"Smiling": "smile",
"Young": "age"
},
"<ATTR_PRONOUN>": {
"Bangs": "them",
"Eyeglasses": "them",
"No_Beard": "it",
"Smiling": "it",
"Young": "it"
},
"<ATTR_BE>": {
"Bangs": "are",
"Eyeglasses": "are",
"No_Beard": "is",
"Smiling": "is",
"Young": "is"
},
"<ATTR_DEGREE>": {
"Bangs": "length",
"Eyeglasses": "style",
"No_Beard": "shape",
"Smiling": "degree",
"Young": "level"
}
},
"suggestion": [
"Do you want to try manipulating the <ATTR_NAME>?",
"Do you want to try manipulating the <ATTR_NAME> instead?",
"Do you want to try manipulating the <ATTR_NAME> as well?",
"Do you want to try editing the <ATTR_NAME>?",
"Do you want to try editing the <ATTR_NAME> instead?",
"Do you want to try editing the <ATTR_NAME> as well?",
"What about the <ATTR_NAME>? Do you want to play with <ATTR_PRONOUN>?",
"Do you want to play with the <ATTR_NAME>?",
"What about the <ATTR_NAME>? Do you want to edit <ATTR_PRONOUN>?",
"Do you want to edit the <ATTR_NAME>?",
"What about the <ATTR_NAME>? Do you want to manipulate <ATTR_PRONOUN>?",
"Do you want to manipulate the <ATTR_NAME>?"
],
"whether_enough": {
"general": [
"Is this enough?",
"Is this good enough?",
"<ATTR_BE> the <ATTR_NAME> just right now?",
"<ATTR_BE> the <ATTR_NAME> what you want now?",
"<ATTR_BE> the <ATTR_NAME> of the person just right now?",
"<ATTR_BE> the <ATTR_NAME> of the person what you want now?",
"<ATTR_BE> the <ATTR_NAME> of proper degree now?",
"<ATTR_BE> the <ATTR_DEGREE> of the <ATTR_NAME> ok now?",
"<ATTR_BE> the <ATTR_DEGREE> of the <ATTR_NAME> okay now?"
],
"Bangs": [
"Are the bangs in proper shape now?",
"Is the length of the bangs ok now?"
],
"Eyeglasses": [],
"No_Beard": [],
"Smiling": [],
"Young": [
"Is the age of the person ok now?"
]
},
"whats_next": [
"What's next?",
"What else do you want to play with?",
"What else do you want to manipulate?",
"What else do you want to edit?",
"What else do you want to change?",
"What else do you want to try?"
],
"ok": [
[
"Okay",
"Ok",
"Well",
"Okie"
],
[
" ",
", "
],
[
"done.",
"it's done.",
"bingo.",
"finished.",
"that's it.",
"this is it."
]
],
"max_edit_num_reached": [
"It is infeasible to edit this attribute. Let's try another attribute.",
"We cannot edit this attribute. Let's try something else.",
"Oops, it is hard to edit this attribute. Let's try something else.",
"Sorry, we are unable to edit this attribute. Perhaps we can try something else."
],
"already_at_target_class": [
"This attribute is already at the degree that you want. Let's try a different attribute degree or another attribute."
]
}
================================================
FILE: language/templates/gender.json
================================================
{
"male": {
"<man>": [
"person",
"guy",
"gentleman"
],
"<he>": [
"he",
"he",
"this person",
"this guy",
"this gentleman",
"this man"
],
"<his>": [
"his",
"the"
],
"<him>": [
"him"
],
"<boy>": [
"boy"
]
},
"female": {
"<man>": [
"person",
"lady",
"female"
],
"<he>": [
"she",
"she",
"this lady",
"this person",
"this female",
"this woman"
],
"<his>": [
"her",
"the"
],
"<him>": [
"her"
],
"<boy>": [
"girl"
]
}
}
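These placeholder pools are substituted into the caption templates and the result is then re-capitalized; a simplified, self-contained sketch of that substitution step (a small subset of the pools above, with capitalized placeholder keys for brevity):

```python
import random

# Subset of the female pools from gender.json; substitution mirrors how
# caption templates are instantiated before proper_capitalize() runs.
gender_pool = {
    '<He>': ['she', 'this lady'],
    '<his>': ['her', 'the'],
}
template = "<He> has long bangs that almost covers all of <his> forehead."

random.seed(0)  # deterministic choice for the demo
sentence = template
for placeholder, options in gender_pool.items():
    sentence = sentence.replace(placeholder, random.choice(options))
sentence = sentence[0].upper() + sentence[1:]

assert '<' not in sentence  # all placeholders resolved
```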
================================================
FILE: language/templates/metadata_fsm.json
================================================
{
"start": {
"start_pureRequest": 0
},
"suggestion": {
"yes": 0,
"yes_pureRequest": 1,
"no": 2,
"no_pureRequest": 3,
"no_end": 4
},
"whether_enough": {
"yes": 0,
"yes_pureRequest": 1,
"yes_end": 2,
"no": 3,
"no_pureRequest": 4
},
"whats_next": {
"pureRequest": 0,
"end": 1
},
"attribute": {
"Bangs": 0,
"Eyeglasses": 1,
"No_Beard": 2,
"Smiling": 3,
"Young": 4
},
"score_change_direction": {
"negative": 0,
"positive": 1
},
"score_change_value": {
"1": 0,
"2": 1,
"3": 2,
"4": 3,
"5": 4
},
"target_score": {
"0": 0,
"1": 1,
"2": 2,
"3": 3,
"4": 4,
"5": 5
},
"request_mode": {
"change_definite": 0,
"change_indefinite": 1,
"target": 2,
"end": 3
}
}
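Each top-level key in this file becomes one classifier head in `Encoder` (see `lstm.py`), with the head's output size equal to the number of values under that key; a small sketch computing the head sizes for a subset of the entries:

```python
import json

# Subset of metadata_fsm.json, embedded so the sketch runs standalone.
metadata_json = '''{
  "attribute": {"Bangs": 0, "Eyeglasses": 1, "No_Beard": 2,
                "Smiling": 3, "Young": 4},
  "score_change_direction": {"negative": 0, "positive": 1},
  "target_score": {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5}
}'''

metadata = json.loads(metadata_json)
# Encoder builds one linear head per key, sized len(val) -- see lstm.py.
head_sizes = {key: len(val) for key, val in metadata.items()}
assert head_sizes == {'attribute': 5, 'score_change_direction': 2,
                      'target_score': 6}
```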
================================================
FILE: language/templates/overall_caption_templates.json
================================================
{
"attr_order_mapping": {
"Bangs": {
"0": [
"has",
"sentence"
],
"1": [
"has"
],
"2": [
"has"
],
"3": [
"has"
],
"4": [
"has",
"sentence"
]
},
"No_Beard": {
"0": [
"has",
"sentence"
],
"1": [
"has"
],
"2": [
"has"
],
"3": [
"has"
],
"4": [
"has",
"sentence"
]
},
"Eyeglasses": {
"0": [
"has",
"sentence"
],
"1": [
"has"
],
"2": [
"has"
],
"3": [
"has"
],
"4": [
"has",
"sentence"
]
},
"Smiling": {
"0": [
"has",
"sentence"
],
"1": [
"has"
],
"2": [
"has"
],
"3": [
"has"
],
"4": [
"has",
"sentence"
]
},
"Young": {
"0": [
"start"
],
"1": [
"sentence"
],
"2": [
"sentence"
],
"3": [
"sentence"
],
"4": [
"sentence"
]
}
},
"has": {
"Bangs": {
"0": [
"no bangs"
],
"1": [
"very short bangs",
"very short bangs which only covers a tiny portion of <his> forehead"
],
"2": [
"short bangs",
"short bangs that covers a small portion of <his> forehead",
"short bangs that only covers a small portion of <his> forehead"
],
"3": [
"medium bangs",
"medium bangs that covers half of <his> forehead",
"bangs of medium length that covers half of <his> forehead",
"bangs of medium length that leaves half of <his> forehead visible"
],
"4": [
"long bangs",
"long bangs that almost covers all of <his> forehead",
"long bangs that almost covers This entire forehead"
],
"5": [
"extremely long bangs",
"extremely long bangs that almost covers all of <his> forehead",
"extremely long bangs that almost covers This entire forehead"
]
},
"Eyeglasses": {
"0": [
"no eyeglasses"
],
"1": [
"rimless eyeglasses"
],
"2": [
"eyeglasses with thin frame",
"thin frame eyeglasses"
],
"3": [
"eyeglasses with thick frame",
"thick frame eyeglasses"
],
"4": [
"sunglasses with thin frame",
"thin frame sunglasses"
],
"5": [
"sunglasses with thick frame",
"thick frame sunglasses"
]
},
"No_Beard": {
"0": [
"no beard",
"no beard at all"
],
"1": [
"short pointed beard",
"stubble",
"a rough growth of stubble",
"stubble covering <his> cheeks and chin"
],
"2": [
"short beard"
],
"3": [
"beard of medium length"
],
"4": [
"a big mustache on his face",
"a bushy beard"
],
"5": [
"very long beard",
"full beard",
"very thick beard",
"a very bushy beard"
]
},
"Smiling": {
"0": [
"no smile"
],
"1": [
"a very mild smile"
],
"2": [
"a mild smile"
],
"3": [
"a beaming face",
"a smile with <his> teeth visible",
"a face that is beamed with happiness",
"a smile"
],
"4": [
"a big smile",
"a big smile on <his> face",
"a big smile with <his> mouth slightly open",
"a big smile with <his> mouth slightly open and teeth visible"
],
"5": [
"a deep rumbling laugh",
"a very big smile",
"a very big smile on <his> face",
"a very big smile with <his> mouth wide open",
"a very big smile with <his> mouth wide open and teeth visible"
]
}
},
"start": {
"Young": {
"0": [
"This young kid",
"This young child",
"This little <boy>"
],
"1": [
"This teenager",
"This young <man>",
"This young <boy>"
],
"2": [
"This young adult",
"This <man> in <his> thirties"
],
"3": [
"This <man> in <his> forties",
"This <man> in <his> middle age",
"This middle-aged <man>"
],
"4": [
"This <man> in <his> sixties",
"This <man> in <his> fifties",
"This elderly <man>"
],
"5": [
"This old <man>",
"This <man> in <his> eighties",
"This old <man> in <his> eighties",
"This <man> in <his> seventies",
"This old <man> in <his> seventies",
"This very old <man>"
]
}
},
"has_prefix": [
"This <man> has ",
"<He> has "
]
}
================================================
FILE: language/templates/pool.json
================================================
{
"synonyms": {
" can ": [
" can ",
" could ",
" should "
],
"i'm": [
"i'm",
"i am"
],
"it's": [
"it's",
"it is"
],
"bangs": [
"bangs",
"fringe"
],
"slightly": [
"slightly",
"a little bit",
"a tiny little bit",
"a little",
"a bit",
"only a little",
"just a little bit"
],
"somewhat": [
"somewhat",
"relatively",
"to some extent",
"to some degree",
"moderately",
"partially",
"sort of",
"kind of",
"considerably"
],
"very": [
"very",
"extremely"
],
"entire": [
"entire",
"whole",
"full"
],
"child": [
"child",
"schoolchild"
],
"teenager": [
"teenager",
"teen"
],
"beard": [
"beard",
"mustache"
],
"i think": [
"i think",
"i think that",
"i feel",
"i feel that",
"i kind of think",
"i kind of think that",
"i kind of feel",
"i kind of feel that",
"i guess",
"i guess that"
],
"i want": [
"i want",
"i kind of want",
"i would like"
],
"let's try": [
"let's try",
"how about trying",
"what about trying"
],
"but not too much": [
"but not too much",
"just not too much",
"just that not too much",
"just don't go too much"
],
"only": [
"only",
"simply",
"just"
],
"eyeglasses": [
"eyeglasses",
"glasses"
],
"pokerface": [
"pokerface",
"poker face"
],
"what's": [
"what's",
"what is"
],
"how's": [
"how's",
"how is"
],
"do you want to": [
"do you want to",
"would you like to",
"perhaps you would like to",
"perhaps you might want to",
"maybe you would like to",
"maybe you might want to"
],
"want to": [
"want to",
"would like to"
],
"manipulate": [
"manipulate",
"edit"
],
"manipulating": [
"manipulating",
"editing",
"playing with"
]
},
"prefix": [
"Actually,",
"To be honest,",
"Well,",
"Well",
"Emm",
"Emmm",
"Emmmm",
"Emm,",
"Emmm,",
"Emmmm,",
"Hi,",
"Hello,",
"Let me think about it.",
"I'm not too sure but",
"What about this?",
"Can we try this?",
"It looks okay now but",
"It looks better now, but still,",
"It looks nice, but still,",
"Let me have a look. Well,",
"Let me have a look. Well",
"Let me have a look. Emm,",
"Let me have a look. Emmm,",
"Let me have a look. Emmmm,",
"Let me have a look. Emm",
"Let me have a look. Emmm",
"Let me have a look. Emmmm",
"Let me take a look. Well,",
"Let me take a look. Well",
"Let me take a look. Emm,",
"Let me take a look. Emmm,",
"Let me take a look. Emmmm,",
"Let me take a look. Emm",
"Let me take a look. Emmm",
"Let me take a look. Emmmm"
],
"postfix": [
"Thanks!",
"Thank you!",
"Is that possible?",
"and emmm... well let's try this first.",
"I guess it will probably get better this way.",
"I'm not too sure, let's see how it goes first.",
"It would be nicer in that way.",
"It would be nicer in that way, I think.",
"It would be nicer in that way, I guess.",
"I think it would be nicer in that way.",
"I guess it would be nicer in that way.",
"It would be nicer this way.",
"It would be nicer this way, I think.",
"It would be nicer this way, I guess.",
"I think it would be nicer this way.",
"I guess it would be nicer this way.",
"It might be nicer in that way.",
"It might be nicer in that way, I think.",
"It might be nicer in that way, I guess.",
"I think it might be nicer in that way.",
"I guess it might be nicer in that way.",
"It might be nicer this way.",
"It might be nicer this way, I think.",
"It might be nicer this way, I guess.",
"I think it might be nicer this way.",
"I guess it might be nicer this way.",
"It would look better in that way.",
"It would look better in that way, I think.",
"It would look better in that way, I guess.",
"I think it would look better in that way.",
"I guess it would look better in that way.",
"It would look better this way.",
"It would look better this way, I think.",
"It would look better this way, I guess.",
"I think it would look better this way.",
"I guess it would look better this way.",
"It might look better in that way.",
"It might look better in that way, I think.",
"It might look better in that way, I guess.",
"I think it might look better in that way.",
"I guess it might look better in that way.",
"It might look better this way.",
"It might look better this way, I think.",
"It might look better this way, I guess.",
"I think it might look better this way.",
"I guess it might look better this way."
]
}
================================================
FILE: language/templates/system_mode.json
================================================
{
"start": 0,
"suggestion": 1,
"whether_enough": 2,
"whats_next": 3
}
================================================
FILE: language/templates/user_fsm.json
================================================
{
"start": [
[
"Hi.",
"Hello."
]
],
"pureRequest": {
"Bangs": {
"target": {
"0": [
"No bangs.",
"Remove all the bangs.",
"Cut off all the bangs.",
"I don't want the bangs at all.",
"I don't want any bangs.",
"I don't want any bangs visible.",
"The bangs doesn't look good, let's remove it.",
"The bangs covers the forehead, but I want the entire forehead visible."
],
"1": [
"Add very short bangs.",
"I want very short bangs.",
"Add very short bangs that leaves most of the forehead uncovered.",
"I want very short bangs that leaves most of the forehead uncovered."
],
"2": [
"Add short bangs.",
"Let's try short bangs.",
"Add short bangs that covers only a small portion of the forehead.",
"Let's try short bangs that covers only a small portion of the forehead."
],
"3": [
"Add medium bangs.",
"Add bangs of medium length.",
"Let's try bangs of medium length.",
"Let's try bangs that leaves half of the forehead visible."
],
"4": [
"Add long bangs.",
"Let's try long bangs.",
"Add long bangs but don't cover the entire forehead.",
"Let's try long bangs but don't cover the entire forehead."
],
"5": [
"Add extremely long bangs.",
"Let's try extremely long bangs.",
"Add extremely long bangs that covers the entire forehead.",
"Let's try extremely long bangs that covers the entire forehead.",
"Indeed, the bangs can be much longer. Let's cover the eyebrows."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The bangs can be slightly longer.",
"Make the bangs slightly longer."
],
"2": [
"The bangs can be somewhat longer, but not too much.",
"Make the bangs somewhat longer, but not too much."
],
"3": [
"Make the bangs longer, but not too much."
],
"4": [
"The bangs can be longer.",
"Make the bangs longer."
],
"5": [
"The bangs can be much longer.",
"Make the bangs much longer."
]
},
"indefinite": [
"Longer bangs.",
"Add bangs.",
"The bangs can be longer.",
"Let's add some bangs.",
"Maybe the bangs can be longer.",
"Let's try adding longer bangs.",
"What about adding longer bangs?",
"Emm, I think the bangs can be longer.",
"Let's make the bangs longer.",
"Hi, I want to see how my friend looks like with some bangs."
]
},
"negative": {
"definite": {
"1": [
"The bangs can be slightly shorter.",
"Make the bangs slightly shorter."
],
"2": [
"The bangs can be somewhat shorter, but not too much.",
"Make the bangs somewhat shorter, but not too much."
],
"3": [
"The bangs can be shorter.",
"Make the bangs shorter."
],
"4": [
"The bangs can be much shorter.",
"Make the bangs much shorter."
],
"5": [
"Remove all the bangs.",
"I don't want the bangs at all.",
"I don't want any bangs at all."
]
},
"indefinite": [
"Less bangs",
"Remove bangs.",
"Remove the bangs.",
"Let's cut off the bangs.",
"Let's cut the bangs short.",
"Let's cut the bangs off.",
"I don't like the bangs, let's remove it.",
"I don't like the bangs, let's cut it off.",
"The bangs is too long, let's remove it.",
"The bangs is too long, let's cut it off."
]
}
}
},
"Eyeglasses": {
"target": {
"0": [
"No eyeglass",
"No eyeglasses please.",
"No eyeglasses.",
"Remove eyeglasses.",
"Remove the eyeglasses.",
"I don't want to see the eyeglasses.",
"I think there shouldn't be any eyeglasses."
],
"1": [
"The eyeglasses should be rimless.",
"Let's try rimless eyeglasses."
],
"2": [
"The eyeglasses should have thin frame.",
"Let's try thin frame eyeglasses."
],
"3": [
"The eyeglasses should have thick frame.",
"Let's try thick frame eyeglasses."
],
"4": [
"Let's try thin frame sunglasses.",
"It should be sunglasses with thin frame."
],
"5": [
"Let's try thick frame sunglasses.",
"It should be sunglasses with thick frame."
]
},
"change": {
"positive": {
"definite": {
"1": [
"Make the eyeglasses slightly more obvious.",
"The eyeglasses can be slightly more obvious."
],
"2": [
"Make the eyeglasses somewhat more obvious.",
"The eyeglasses can be somewhat more obvious."
],
"3": [
"Make the eyeglasses more obvious.",
"The eyeglasses can be more obvious."
],
"4": [
"Let's try eyeglasses with thicker frame and darker color."
],
"5": [
"Let's try thick frame sunglasses.",
"It should be sunglasses with thick frame."
]
},
"indefinite": [
"Add glasses",
"Use eyeglasses",
"Try eyeglasses.",
"Add eyeglasses.",
"Add eyeglasses to the face.",
"Add eyeglasses please.",
"Let's add eyeglasses.",
"The eyeglasses can be more obvious.",
"The eyeglasses are not obvious enough.",
"I can't see the eyeglasses clearly, let's make them more obvious.",
"The eyeglasses frame can be thicker.",
"The glass color can be darker."
]
},
"negative": {
"definite": {
"1": [
"Make the eyeglasses slightly less obvious.",
"The eyeglasses can be slightly less obvious."
],
"2": [
"Make the eyeglasses somewhat less obvious.",
"The eyeglasses can be somewhat less obvious."
],
"3": [
"Make the eyeglasses less obvious.",
"The eyeglasses can be less obvious."
],
"4": [
"The eyeglasses are too obvious, let's make it much less obvious.",
"The eyeglasses are too obvious, let's try make it much less obvious."
],
"5": [
"Remove eyeglasses.",
"Remove the eyeglasses.",
"I don't like the eyeglasses.",
"I don't want to see the eyeglasses.",
"There shouldn't be any eyeglasses."
]
},
"indefinite": [
"Remove eyeglasses.",
"No eyeglasses.",
"The eyeglasses can be less obvious.",
"The eyeglasses are too obvious.",
"Let's make the eyeglasses less obvious.",
"The eyeglasses frame can be thinner.",
"The glass color can be lighter."
]
}
}
},
"No_Beard": {
"target": {
"0": [
"Let's see what he looks like without his beard.",
"Let's shave the beard off.",
"No beard"
],
"1": [
"His face should be covered with short pointed beard.",
"His face should be covered with the stubble.",
"His face has a rough growth of stubble.",
"There should be stubble covering his cheeks and chin."
],
"2": [
"His face should be covered with short beard.",
"Let's add short beard to his face.",
"Let's try short beard on his face."
],
"3": [
"His face should be covered with beard of medium length.",
"Let's add medium-length beard to his face.",
"Let's try medium-length beard on his face."
],
"4": [
"Let's try a big mustache on his face.",
"He should have a bushy beard."
],
"5": [
"Let's add very long beard.",
"Let's add a full beard.",
"He should have very thick beard.",
"He should have a very bushy beard."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The beard can be slightly longer.",
"Make the beard slightly longer.",
"Slightly add more beard."
],
"2": [
"The beard can be somewhat longer, but not too much.",
"Make the beard somewhat longer, but not too much."
],
"3": [
"The beard can be longer.",
"Make the beard longer."
],
"4": [
"The beard can be much longer.",
"Make the beard much longer."
],
"5": [
"Let's add very long beard.",
"Let's add a full beard.",
"He should have very thick beard",
"He has a very bushy beard."
]
},
"indefinite": [
"Add beard.",
"Add some beard.",
"Longer beard.",
"Let's add more beard.",
"I want some more beard on the face."
]
},
"negative": {
"definite": {
"1": [
"The beard can be slightly shorter.",
"Make the beard slightly shorter.",
"Slightly remove some beard."
],
"2": [
"The beard can be somewhat shorter, but not too much.",
"Make the beard somewhat shorter, but not too much."
],
"3": [
"The beard can be shorter.",
"Make the beard shorter."
],
"4": [
"The beard can be much shorter.",
"Make the beard much shorter."
],
"5": [
"Let's see what he looks like without his beard.",
"Let's shave the beard off."
]
},
"indefinite": [
"Less beard.",
"Remove beard.",
"Remove the beard.",
"The beard should be gone.",
"Let's try to remove the beard.",
"I don't like the beard.",
"Let's try shorter beard."
]
}
}
},
"Smiling": {
"target": {
"0": [
"I think the person shouldn't be smiling.",
"I don't like the smile.",
"I don't want the smile.",
"No smile.",
"Remove the smile."
],
"1": [
"Turn up the corners of the mouth.",
"The corners of the mouth should curve up."
],
"2": [
"The corners of the mouth should curve up and show some teeth.",
"Smile broadly and show some teeth."
],
"3": [
"I want a beaming face.",
"I want the face to be smiling with teeth visible.",
"The entire face should be beamed with happiness."
],
"4": [
"It can be a big smile.",
"I want a big smile on the face.",
"I want the face to be smiling with the mouth slightly open.",
"I want the face to be smiling with the mouth slightly open. We should be able to see the teeth.",
"I want the face to be smiling with the mouth slightly open so that we can see the teeth."
],
"5": [
"I want a deep rumbling laugh.",
"It can be laughing happily.",
"It can be a very big smile.",
"I want a very big smile on the face.",
"I want the face to be smiling with the mouth wide open.",
"I want the face to be smiling with the mouth wide open. We should be able to see the teeth."
]
},
"change": {
"positive": {
"definite": {
"1": [
"Smile slightly more.",
"The smile can be slightly bigger.",
"Make the smile slightly bigger.",
"The person can look slightly happier.",
"The person can smile slightly more happily."
],
"2": [
"The smile can be somewhat bigger, but not too much.",
"Make the smile somewhat bigger, but not too much.",
"The person can look somewhat happier.",
"The person can smile somewhat more happily."
],
"3": [
"Smile more.",
"The smile can be bigger.",
"Make the smile bigger.",
"The person can be happier.",
"The person can smile more happily."
],
"4": [
"The smile can be much bigger.",
"Make the smile much bigger.",
"The person can be a lot happier.",
"The person can smile a lot more happily."
],
"5": [
"I want a deep rumbling laugh.",
"It can be laughing happily.",
"It can be a very big smile.",
"I want a very big smile on the face.",
"I want the face to be smiling with the mouth wide open.",
"I want the face to be smiling with the mouth wide open. We should be able to see the teeth.",
"The person can smile very happily."
]
},
"indefinite": [
"Look not so serious.",
"Look less serious.",
"Too serious, be happier.",
"Add smile.",
"Add some smiling please.",
"The smile is not big enough.",
"I want a bigger smile.",
"I want the face to smile more.",
"I want to change the pokerface face to a smiling face.",
"The person can smile more happily.",
"Can look happier."
]
},
"negative": {
"definite": {
"1": [
"I want the smile to be slightly less obvious.",
"The smile can be slightly less obvious.",
"The person can smile slightly less happily."
],
"2": [
"I want the smile to be less obvious.",
"The smile can be less obvious.",
"The person can smile somewhat less happily."
],
"3": [
"I want the smile to be much less obvious.",
"The smile can be much less obvious.",
"The person can smile less happily."
],
"4": [
"I want to make the smile almost vanish.",
"The person can smile a lot less happily."
],
"5": [
"I want the smile to vanish.",
"I don't like the smile, let's remove it."
]
},
"indefinite": [
"Not serious enough.",
"More serious.",
"No smiling.",
"No smile.",
"Remove smiling.",
"Remove the smiling.",
"Remove smile.",
"Remove the smile.",
"Smile less happily.",
"Don't be so happy.",
"The smile is too much.",
"Can we have a gentler smile? This smile is too big.",
"I want to change the smiling face to a pokerface."
]
}
}
},
"Young": {
"target": {
"0": [
"Let's make the face a child one.",
"Let's make the face very young."
],
"1": [
"Let's make the face a teenager one.",
"Let's make the face relatively young.",
"The person should be in the twenties."
],
"2": [
"Let's make the face a young one.",
"It should be a young adult.",
"The person should be in the thirties."
],
"3": [
"Let's make the face a middle age one.",
"The person should be in the forties."
],
"4": [
"Let's make the face slightly older than middle age.",
"Let's make the face the one of a senior.",
"Let's make the face the one of an elderly.",
"The person should be in the sixties.",
"The person should be in the fifties."
],
"5": [
"Let's make the face a very old one.",
"The person should be in the seventies.",
"The person should be in the eighties."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The face can be slightly older.",
"Make the face slightly older."
],
"2": [
"Somewhat older",
"The face can be somewhat older, just not too much.",
"Make the face somewhat older, but not too much."
],
"3": [
"Make the face older, but not too much."
],
"4": [
"The face can be older.",
"Make the face older."
],
"5": [
"The face can be much older.",
"Make the face much older.",
"Let's make the face a very old one."
]
},
"indefinite": [
"Older.",
"Make it older.",
"The face can be older.",
"This face is too young, let's make it older.",
"Let's make the face older.",
"What about making the face look older?"
]
},
"negative": {
"definite": {
"1": [
"The face can be slightly younger.",
"Make the face slightly younger."
],
"2": [
"Somewhat younger.",
"The face can be somewhat younger, but not too much.",
"Make the face somewhat younger, but not too much."
],
"3": [
"The face can be younger.",
"Make the face younger.",
"Younger face."
],
"4": [
"Much younger.",
"The face can be much younger.",
"Make the face much younger."
],
"5": [
"Let's make the face a child one."
]
},
"indefinite": [
"Younger face.",
"Younger.",
"Look younger",
"Make it younger.",
"Be younger.",
"Less old.",
"The face can be younger.",
"This face is too old, let's make it younger.",
"Let's make the face younger.",
"What about making it younger?",
"Can you make the person look younger?"
]
}
}
}
},
"yes": [
[
"Yes",
"Yep",
"Yeep",
"Yep sure",
"Yes sure",
"Sure",
"Ok"
],
[
"."
]
],
"no": [
[
"No",
"Nope"
],
[
"."
]
],
"end": [
[
"End.",
"Nothing.",
"Nothing else.",
"Nothing else for now.",
"It's all good now.",
"I don't want any further edits.",
"Actually it's all good now.",
"No need for further edits.",
"I don't need any further edits.",
"That's all.",
"This is it.",
"That is it.",
"That is all.",
"No."
],
[
" Thanks!",
" Thank you!",
" Thanks a lot!",
""
]
]
}
================================================
FILE: language/templates/user_old_templates.json
================================================
{
"start": [
[
"Hi.",
"Hello."
],
[
" "
]
],
"requests": {
"Bangs": {
"target": {
"0": [
"No bangs.",
"Remove all the bangs.",
"Cut off all the bangs.",
"I don't want the bangs at all.",
"I don't want any bangs.",
"I don't want any bangs visible.",
"The bangs doesn't look good, let's remove it.",
"The bangs covers the forehead, but I want the entire forehead visible."
],
"1": [
"Add very short bangs.",
"I want very short bangs.",
"Add very short bangs that leaves most of the forehead uncovered.",
"I want very short bangs that leaves most of the forehead uncovered."
],
"2": [
"Add short bangs.",
"Let's try short bangs.",
"Add short bangs that covers only a small portion of the forehead.",
"Let's try short bangs that covers only a small portion of the forehead."
],
"3": [
"Add medium bangs.",
"Add bangs of medium length.",
"Let's try bangs of medium length.",
"Let's try bangs that leaves half of the forehead visible."
],
"4": [
"Add long bangs.",
"Let's try long bangs.",
"Add long bangs but don't cover the entire forehead.",
"Let's try long bangs but don't cover the entire forehead."
],
"5": [
"Add extremely long bangs.",
"Let's try extremely long bangs.",
"Add extremely long bangs that covers the entire forehead.",
"Let's try extremely long bangs that covers the entire forehead.",
"Indeed, the bangs can be much longer. Let's cover the eyebrows."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The bangs can be slightly longer.",
"Make the bangs slightly longer."
],
"2": [
"The bangs can be somewhat longer, but not too much.",
"Make the bangs somewhat longer, but not too much."
],
"3": [
"Make the bangs longer, but not too much."
],
"4": [
"The bangs can be longer.",
"Make the bangs longer."
],
"5": [
"The bangs can be much longer.",
"Make the bangs much longer."
]
},
"indefinite": [
"The bangs can be longer.",
"Let's add some bangs.",
"Maybe the bangs can be longer.",
"Let's try adding longer bangs.",
"What about adding longer bangs?",
"Emm, I think the bangs can be longer.",
"Let's make the bangs longer.",
"Hi, I want to see how my friend looks like with some bangs."
]
},
"negative": {
"definite": {
"1": [
"The bangs can be slightly shorter.",
"Make the bangs slightly shorter."
],
"2": [
"The bangs can be somewhat shorter, but not too much.",
"Make the bangs somewhat shorter, but not too much."
],
"3": [
"The bangs can be shorter.",
"Make the bangs shorter."
],
"4": [
"The bangs can be much shorter.",
"Make the bangs much shorter."
],
"5": [
"Remove all the bangs.",
"I don't want the bangs at all.",
"I don't want any bangs at all."
]
},
"indefinite": [
"Remove bangs.",
"Remove the bangs.",
"Let's cut off the bangs.",
"Let's cut the bangs short.",
"Let's cut the bangs off.",
"I don't like the bangs, let's remove it.",
"I don't like the bangs, let's cut it off.",
"The bangs is too long, let's remove it.",
"The bangs is too long, let's cut it off."
]
}
}
},
"Eyeglasses": {
"target": {
"0": [
"No eyeglasses please.",
"No eyeglasses.",
"Remove eyeglasses.",
"Remove the eyeglasses.",
"I don't want to see the eyeglasses.",
"I think there shouldn't be any eyeglasses."
],
"1": [
"The eyeglasses should be rimless.",
"Let's try rimless eyeglasses."
],
"2": [
"The eyeglasses should have thin frame.",
"Let's try thin frame eyeglasses."
],
"3": [
"The eyeglasses should have thick frame.",
"Let's try thick frame eyeglasses."
],
"4": [
"Let's try thin frame sunglasses.",
"It should be sunglasses with thin frame."
],
"5": [
"Let's try thick frame sunglasses.",
"It should be sunglasses with thick frame."
]
},
"change": {
"positive": {
"definite": {
"1": [
"Make the eyeglasses slightly more obvious.",
"The eyeglasses can be slightly more obvious."
],
"2": [
"Make the eyeglasses somewhat more obvious.",
"The eyeglasses can be somewhat more obvious."
],
"3": [
"Make the eyeglasses more obvious.",
"The eyeglasses can be more obvious."
],
"4": [
"Let's try eyeglasses with thicker frame and darker color."
],
"5": [
"Let's try thick frame sunglasses.",
"It should be sunglasses with thick frame."
]
},
"indefinite": [
"Try eyeglasses.",
"Add eyeglasses.",
"Add eyeglasses to the face.",
"Add eyeglasses please.",
"Let's add eyeglasses.",
"The eyeglasses can be more obvious.",
"The eyeglasses are not obvious enough.",
"I can't see the eyeglasses clearly, let's make them more obvious.",
"The eyeglasses frame can be thicker.",
"The glass color can be darker."
]
},
"negative": {
"definite": {
"1": [
"Make the eyeglasses slightly less obvious.",
"The eyeglasses can be slightly less obvious."
],
"2": [
"Make the eyeglasses somewhat less obvious.",
"The eyeglasses can be somewhat less obvious."
],
"3": [
"Make the eyeglasses less obvious.",
"The eyeglasses can be less obvious."
],
"4": [
"The eyeglasses are too obvious, let's make it much less obvious.",
"The eyeglasses are too obvious, let's try make it much less obvious."
],
"5": [
"Remove eyeglasses.",
"Remove the eyeglasses.",
"I don't like the eyeglasses.",
"I don't want to see the eyeglasses.",
"There shouldn't be any eyeglasses."
]
},
"indefinite": [
"The eyeglasses can be less obvious.",
"The eyeglasses are too obvious.",
"Let's make the eyeglasses less obvious.",
"The eyeglasses frame can be thinner.",
"The glass color can be lighter."
]
}
}
},
"No_Beard": {
"target": {
"0": [
"Let's see what he looks like without his beard.",
"Let's shave the beard off."
],
"1": [
"His face should be covered with short pointed beard.",
"His face should be covered with the stubble.",
"His face has a rough growth of stubble.",
"There should be stubble covering his cheeks and chin."
],
"2": [
"His face should be covered with short beard.",
"Let's add short beard to his face.",
"Let's try short beard on his face."
],
"3": [
"His face should be covered with beard of medium length.",
"Let's add medium-length beard to his face.",
"Let's try medium-length beard on his face."
],
"4": [
"Let's try a big mustache on his face.",
"He should have a bushy beard."
],
"5": [
"Let's add very long beard.",
"Let's add a full beard.",
"He should have very thick beard.",
"He should have a very bushy beard."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The beard can be slightly longer.",
"Make the beard slightly longer.",
"Slightly add more beard."
],
"2": [
"The beard can be somewhat longer, but not too much.",
"Make the beard somewhat longer, but not too much."
],
"3": [
"The beard can be longer.",
"Make the beard longer."
],
"4": [
"The beard can be much longer.",
"Make the beard much longer."
],
"5": [
"Let's add very long beard.",
"Let's add a full beard.",
"He should have very thick beard",
"He has a very bushy beard."
]
},
"indefinite": [
"Add beard.",
"Add some beard.",
"Longer beard.",
"Let's add more beard.",
"I want some more beard on the face."
]
},
"negative": {
"definite": {
"1": [
"The beard can be slightly shorter.",
"Make the beard slightly shorter.",
"Slightly remove some beard."
],
"2": [
"The beard can be somewhat shorter, but not too much.",
"Make the beard somewhat shorter, but not too much."
],
"3": [
"The beard can be shorter.",
"Make the beard shorter."
],
"4": [
"The beard can be much shorter.",
"Make the beard much shorter."
],
"5": [
"Let's see what he looks like without his beard.",
"Let's shave the beard off."
]
},
"indefinite": [
"Remove beard.",
"Remove the beard.",
"The beard should be gone.",
"Let's try to remove the beard.",
"I don't like the beard.",
"Let's try shorter beard."
]
}
}
},
"Smiling": {
"target": {
"0": [
"I think the person shouldn't be smiling.",
"I don't like the smile.",
"I don't want the smile."
],
"1": [
"Turn up the corners of the mouth.",
"The corners of the mouth curve up."
],
"2": [
"The corners of the mouth curve up and show some teeth.",
"Smile broadly and show some teeth."
],
"3": [
"I want a beaming face.",
"I want the face to be smiling with teeth visible.",
"The entire face should be beamed with happiness."
],
"4": [
"It can be a big smile.",
"I want a big smile on the face.",
"I want the face to be smiling with the mouth slightly open.",
"I want the face to be smiling with the mouth slightly open. We should be able to see the teeth.",
"I want the face to be smiling with the mouth slightly open so that we can see the teeth."
],
"5": [
"I want a deep rumbling laugh.",
"It can be laughing happily.",
"It can be a very big smile.",
"I want a very big smile on the face.",
"I want the face to be smiling with the mouth wide open.",
"I want the face to be smiling with the mouth wide open. We should be able to see the teeth."
]
},
"change": {
"positive": {
"definite": {
"1": [
"Smile slightly more.",
"The smile can be slightly bigger.",
"Make the smile slightly bigger.",
"The person can look slightly happier.",
"The person can smile slightly more happily."
],
"2": [
"The smile can be somewhat bigger, but not too much.",
"Make the smile somewhat bigger, but not too much.",
"The person can look somewhat happier.",
"The person can smile somewhat more happily."
],
"3": [
"Smile more.",
"The smile can be bigger.",
"Make the smile bigger.",
"The person can be happier.",
"The person can smile more happily."
],
"4": [
"The smile can be much bigger.",
"Make the smile much bigger.",
"The person can be a lot happier.",
"The person can smile a lot more happily."
],
"5": [
"I want a deep rumbling laugh.",
"It can be laughing happily.",
"It can be a very big smile.",
"I want a very big smile on the face.",
"I want the face to be smiling with the mouth wide open.",
"I want the face to be smiling with the mouth wide open. We should be able to see the teeth.",
"The person can smile very happily."
]
},
"indefinite": [
"Add some smiling please.",
"The smile is not big enough.",
"I want a bigger smile.",
"I want the face to smile more.",
"I want to change the pokerface face to a smiling face.",
"The person can smile more happily.",
"Can look happier."
]
},
"negative": {
"definite": {
"1": [
"I want the smile to be slightly less obvious.",
"The smile can be slightly less obvious.",
"The person can smile slightly less happily."
],
"2": [
"I want the smile to be less obvious.",
"The smile can be less obvious.",
"The person can smile somewhat less happily."
],
"3": [
"I want the smile to be much less obvious.",
"The smile can be much less obvious.",
"The person can smile less happily."
],
"4": [
"I want to make the smile almost vanish.",
"The person can smile a lot less happily."
],
"5": [
"I want the smile to vanish.",
"I don't like the smile, let's remove it."
]
},
"indefinite": [
"No smiling.",
"No smile.",
"Remove smiling.",
"Remove the smiling.",
"Remove smile.",
"Remove the smile.",
"Smile less happily.",
"Don't be so happy.",
"The smile is too much.",
"Can we have a gentler smile? This smile is too big.",
"I want to change the smiling face to a pokerface."
]
}
}
},
"Young": {
"target": {
"0": [
"Let's make the face a child one.",
"Let's make the face very young."
],
"1": [
"Let's make the face a teenager one.",
"Let's make the face relatively young.",
"The person should be in the twenties."
],
"2": [
"Let's make the face a young one.",
"It should be a young adult.",
"The person should be in the thirties."
],
"3": [
"Let's make the face a middle age one.",
"The person should be in the forties."
],
"4": [
"Let's make the face slightly older than middle age.",
"Let's make the face the one of a senior.",
"Let's make the face the one of an elderly.",
"The person should be in the sixties.",
"The person should be in the fifties."
],
"5": [
"Let's make the face a very old one.",
"The person should be in the seventies.",
"The person should be in the eighties."
]
},
"change": {
"positive": {
"definite": {
"1": [
"The face can be slightly older.",
"Make the face slightly older."
],
"2": [
"Somewhat older",
"The face can be somewhat older, just not too much.",
"Make the face somewhat older, but not too much."
],
"3": [
"Make the face older, but not too much."
],
"4": [
"The face can be older.",
"Make the face older."
],
"5": [
"The face can be much older.",
"Make the face much older.",
"Let's make the face a very old one."
]
},
"indefinite": [
"Older.",
"Make it older.",
"The face can be older.",
"This face is too young, let's make it older.",
"Let's make the face older.",
"What about making the face look older?"
]
},
"negative": {
"definite": {
"1": [
"The face can be slightly younger.",
"Make the face slightly younger."
],
"2": [
"Somewhat younger.",
"The face can be somewhat younger, but not too much.",
"Make the face somewhat younger, but not too much."
],
"3": [
"The face can be younger.",
"Make the face younger.",
"Younger face."
],
"4": [
"Much younger.",
"The face can be much younger.",
"Make the face much younger."
],
"5": [
"Let's make the face a child one."
]
},
"indefinite": [
"Younger face.",
"Younger.",
"Make it younger.",
"Be younger.",
"Less old.",
"The face can be younger.",
"This face is too old, let's make it younger.",
"Let's make the face younger.",
"What about making it younger?"
]
}
}
}
},
"yes_enough": [
[
"Emmm, yep",
"Emmm, yes",
"Emmm, yeep",
"Yes",
"Yep",
"Yeep",
"Yep sure"
],
[
", ",
". ",
"! "
],
[
"That's good enough now.",
"That's nice.",
"That's perfect.",
"This is great."
],
[
" "
]
],
"no_enough": [
[
"Actually,",
"To be honest,",
"Well,",
"Well",
"Emm",
"Emmm",
"Emmmm",
"Emm,",
"Emmm,",
"Emmmm,",
"I'm not too sure but",
"It looks okay now but",
"It looks better now, but still,",
"It looks nice, but still,",
"Let me have a look. Well,",
"Let me have a look. Well",
"Let me have a look. Emm,",
"Let me have a look. Emmm,",
"Let me have a look. Emmmm,",
"Let me have a look. Emm",
"Let me have a look. Emmm",
"Let me have a look. Emmmm",
"Let me take a look. Well,",
"Let me take a look. Well",
"Let me take a look. Emm,",
"Let me take a look. Emmm,",
"Let me take a look. Emmmm,",
"Let me take a look. Emm",
"Let me take a look. Emmm",
"Let me take a look. Emmmm"
],
[
" "
]
],
"yes_suggestion": [
[
"Emmm, yep",
"Emmm, yes",
"Emmm, yeep",
"Yes",
"Yep",
"Yeep",
"Yep sure",
"Yes sure"
],
[
",",
".",
"!"
],
[
" "
]
],
"no_suggestion": [
[
"Well,",
"Well",
"Emm",
"Emmm",
"Emmmm",
"Emm,",
"Emmm,",
"Emmmm,",
"I'm not too sure so",
"It looks okay now so",
"It looks nice, so,",
"Let me have a look. Well,",
"Let me have a look. Well",
"Let me have a look. Emm,",
"Let me have a look. Emmm,",
"Let me have a look. Emmmm,",
"Let me have a look. Emm",
"Let me have a look. Emmm",
"Let me have a look. Emmmm",
"Let me take a look. Well,",
"Let me take a look. Well",
"Let me take a look. Emm,",
"Let me take a look. Emmm,",
"Let me take a look. Emmmm,",
"Let me take a look. Emm",
"Let me take a look. Emmm",
"Let me take a look. Emmmm"
],
[
" "
],
[
"Not really.",
"Not really actually.",
"No actually."
],
[
" "
]
],
"end": [
[
"Nothing else.",
"Nothing else for now.",
"It's all good now.",
"I don't want any further edits.",
"Actually it's all good now.",
"No need for further edits.",
"I don't need any further edits.",
"That's all.",
"This is it.",
"That is it.",
"That is all.",
"No."
],
[
" "
],
[
"Thanks!",
"Thank you!",
"Thanks a lot!"
]
]
}
================================================
FILE: language/templates/vocab.json
================================================
{
"text_token_to_idx": {
"<NULL>": 0,
"<START>": 1,
"<END>": 2,
"<UNK>": 3,
"?": 4,
"a": 5,
"able": 6,
"about": 7,
"actually": 8,
"add": 9,
"adding": 10,
"adult": 11,
"age": 12,
"all": 13,
"almost": 14,
"an": 15,
"and": 16,
"any": 17,
"are": 18,
"at": 19,
"bangs": 20,
"be": 21,
"beamed": 22,
"beaming": 23,
"beard": 24,
"big": 25,
"bigger": 26,
"bit": 27,
"broadly": 28,
"bushy": 29,
"but": 30,
"can": 31,
"can't": 32,
"change": 33,
"cheeks": 34,
"child": 35,
"chin": 36,
"clearly": 37,
"color": 38,
"considerably": 39,
"corners": 40,
"could": 41,
"cover": 42,
"covered": 43,
"covering": 44,
"covers": 45,
"curve": 46,
"cut": 47,
"darker": 48,
"deep": 49,
"degree": 50,
"doesn't": 51,
"don't": 52,
"edits": 53,
"eighties": 54,
"elderly": 55,
"else": 56,
"emm": 57,
"end": 58,
"enough": 59,
"entire": 60,
"extent": 61,
"extremely": 62,
"eyebrows": 63,
"eyeglass": 64,
"eyeglasses": 65,
"face": 66,
"feel": 67,
"fifties": 68,
"for": 69,
"forehead": 70,
"forties": 71,
"frame": 72,
"friend": 73,
"fringe": 74,
"full": 75,
"further": 76,
"gentler": 77,
"glass": 78,
"glasses": 79,
"go": 80,
"gone": 81,
"good": 82,
"growth": 83,
"guess": 84,
"half": 85,
"happier": 86,
"happily": 87,
"happiness": 88,
"happy": 89,
"has": 90,
"have": 91,
"he": 92,
"hello": 93,
"hi": 94,
"his": 95,
"how": 96,
"i": 97,
"in": 98,
"indeed": 99,
"is": 100,
"it": 101,
"it's": 102,
"just": 103,
"kind": 104,
"laugh": 105,
"laughing": 106,
"leaves": 107,
"length": 108,
"less": 109,
"let's": 110,
"lighter": 111,
"like": 112,
"little": 113,
"long": 114,
"longer": 115,
"look": 116,
"looks": 117,
"lot": 118,
"make": 119,
"making": 120,
"maybe": 121,
"medium": 122,
"medium-length": 123,
"middle": 124,
"moderately": 125,
"more": 126,
"most": 127,
"mouth": 128,
"much": 129,
"mustache": 130,
"my": 131,
"need": 132,
"no": 133,
"nope": 134,
"not": 135,
"nothing": 136,
"now": 137,
"obvious": 138,
"of": 139,
"off": 140,
"ok": 141,
"old": 142,
"older": 143,
"on": 144,
"one": 145,
"only": 146,
"open": 147,
"partially": 148,
"person": 149,
"please": 150,
"pointed": 151,
"poker": 152,
"pokerface": 153,
"portion": 154,
"relatively": 155,
"remove": 156,
"rimless": 157,
"rough": 158,
"rumbling": 159,
"schoolchild": 160,
"see": 161,
"senior": 162,
"serious": 163,
"seventies": 164,
"shave": 165,
"short": 166,
"shorter": 167,
"should": 168,
"shouldn't": 169,
"show": 170,
"simply": 171,
"sixties": 172,
"slightly": 173,
"small": 174,
"smile": 175,
"smiling": 176,
"so": 177,
"some": 178,
"somewhat": 179,
"sort": 180,
"stubble": 181,
"sunglasses": 182,
"sure": 183,
"teen": 184,
"teenager": 185,
"teeth": 186,
"than": 187,
"thank": 188,
"thanks": 189,
"that": 190,
"that's": 191,
"the": 192,
"them": 193,
"there": 194,
"thick": 195,
"thicker": 196,
"thin": 197,
"think": 198,
"thinner": 199,
"thirties": 200,
"this": 201,
"tiny": 202,
"to": 203,
"too": 204,
"try": 205,
"trying": 206,
"turn": 207,
"twenties": 208,
"uncovered": 209,
"up": 210,
"use": 211,
"vanish": 212,
"very": 213,
"visible": 214,
"want": 215,
"we": 216,
"what": 217,
"whole": 218,
"wide": 219,
"with": 220,
"without": 221,
"would": 222,
"yeep": 223,
"yep": 224,
"yes": 225,
"you": 226,
"young": 227,
"younger": 228
}
}
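The token-to-index mapping above can be exercised with a short sketch. This is illustrative only: the repository's real helpers are `tokenize()` and `encode()` in `language/language_utils.py`, and `encode_request` plus the tiny `vocab` dict below are hypothetical names that copy just a few entries from `text_token_to_idx`.

```python
# Illustrative sketch only; see language/language_utils.py for the repo's
# actual tokenize()/encode(). The dict below copies a few entries from
# the full text_token_to_idx mapping above.
vocab = {
    "<NULL>": 0, "<START>": 1, "<END>": 2, "<UNK>": 3,
    "make": 119, "the": 192, "bangs": 20, "longer": 115,
}

def encode_request(text, token_to_idx):
    """Lower-case, whitespace-tokenize, and map words to indices,
    falling back to <UNK> for out-of-vocabulary tokens."""
    idxs = [token_to_idx["<START>"]]
    for token in text.lower().split():
        idxs.append(token_to_idx.get(token, token_to_idx["<UNK>"]))
    idxs.append(token_to_idx["<END>"])
    return idxs

print(encode_request("make the bangs longer", vocab))
# [1, 119, 192, 20, 115, 2]
```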
================================================
FILE: language/train_encoder.py
================================================
import argparse
import json
import sys
import time
import torch
import torch.nn as nn
import torch.utils.data
sys.path.append('.')
from accuracy import head_accuracy # noqa
from dataset import EncoderDataset # noqa
from lstm import Encoder # noqa
from utils import AverageMeter, dict2str, save_checkpoint # noqa
from utils.setup_logger import setup_logger # noqa
def parse_args():
"""Parses arguments."""
parser = argparse.ArgumentParser(description='Train the language encoder')
# mode
parser.add_argument('--debug', type=int, default=0)
# training
parser.add_argument('--batch_size', type=int, default=2048)
parser.add_argument('--val_batch', type=int, default=1024)
# learning rate scheme
parser.add_argument('--num_epochs', default=20, type=int)
parser.add_argument('--lr', default=1e-3, type=float)
parser.add_argument('--weight_decay', default=0, type=float)
# LSTM hyperparameter
parser.add_argument('--word_embedding_dim', default=300, type=int)
parser.add_argument('--text_embed_size', default=1024, type=int)
parser.add_argument('--linear_hidden_size', default=256, type=int)
parser.add_argument('--linear_dropout_rate', default=0, type=float)
# input directories
parser.add_argument(
'--vocab_file', required=True, type=str, help='path to vocab file.')
parser.add_argument(
'--metadata_file',
default='./templates/metadata_fsm.json',
type=str,
help='path to metadata file.')
parser.add_argument(
'--train_set_dir', required=True, type=str, help='path to train data.')
parser.add_argument(
'--val_set_dir', required=True, type=str, help='path to val data.')
# output directories
parser.add_argument(
'--work_dir',
required=True,
type=str,
help='path to save checkpoint and log files.')
# misc
parser.add_argument(
'--unlabeled_value',
default=999,
type=int,
help='value to represent unlabeled value')
parser.add_argument('--num_workers', default=8, type=int)
return parser.parse_args()
best_val_acc, best_epoch, current_iters = 0, 0, 0
def main():
"""Main function."""
# ################### Set Up #######################
global args, best_val_acc, best_epoch
args = parse_args()
logger = setup_logger(
args.work_dir, logger_name='train.txt', debug=args.debug)
args.device = torch.device('cuda')
logger.info('Saving arguments.')
logger.info(dict2str(args.__dict__))
# ################### Metadata #######################
with open(args.metadata_file, 'r') as f:
args.metadata = json.load(f)
args.num_head = len(args.metadata.items())
logger.info(f'args.num_head: {args.num_head}, ')
logger.info(f'args.metadata: {args.metadata}.')
# ################### Language Encoder #######################
# load vocab file
with open(args.vocab_file, 'r') as f:
vocab = json.load(f)
text_token_to_idx = vocab['text_token_to_idx']
encoder = Encoder(
token_to_idx=text_token_to_idx,
word_embedding_dim=args.word_embedding_dim,
text_embed_size=args.text_embed_size,
metadata_file=args.metadata_file,
linear_hidden_size=args.linear_hidden_size,
linear_dropout_rate=args.linear_dropout_rate)
encoder = encoder.to(args.device)
# ################### DataLoader #######################
logger.info('Preparing train_dataset')
train_dataset = EncoderDataset(preprocessed_dir=args.train_set_dir)
logger.info('Preparing train_loader')
train_loader = torch.utils.data.DataLoader(
train_dataset,
batch_size=args.batch_size,
shuffle=True,
num_workers=args.num_workers,
pin_memory=False,
sampler=None)
logger.info('Preparing val_dataset')
val_dataset = EncoderDataset(preprocessed_dir=args.val_set_dir)
logger.info('Preparing val_loader')
val_loader = torch.utils.data.DataLoader(
val_dataset,
batch_size=args.val_batch,
shuffle=False,
num_workers=args.num_workers,
pin_memory=False)
logger.info(f'Number of train text: {len(train_dataset)}, '
f'Number of val text: {len(val_dataset)}.')
data_loader = {
'train': train_loader,
'val': val_loader,
}
# ################### Optimizer #######################
optimizer = torch.optim.Adam(
encoder.parameters(), args.lr, weight_decay=args.weight_decay)
# ################### Loss Function #######################
criterion = nn.CrossEntropyLoss(
reduction='mean', ignore_index=args.unlabeled_value)
# ################### Epochs #######################
for epoch in range(args.num_epochs):
logger.info(
'----------- Training: Epoch '
f'({epoch + 1} / {args.num_epochs}), LR: {args.lr:.4f}. ---------'
)
train_per_head_acc_avg, train_overall_acc = train(
args,
'train',
encoder,
data_loader['train'],
criterion,
optimizer,
logger,
)
logger.info(
'Train accuracy '
f'({epoch + 1} / {args.num_epochs}), '
f'{[str(round(i, 2))+"%" for i in train_per_head_acc_avg]}')
val_per_head_acc_avg, val_overall_acc = train(
args,
'val',
encoder,
data_loader['val'],
criterion,
optimizer,
logger,
)
logger.info('Validation accuracy '
f'({epoch + 1} / {args.num_epochs}), '
f'{[str(round(i, 2))+"%" for i in val_per_head_acc_avg]}')
# whether this epoch has the highest val acc so far
is_best = val_overall_acc > best_val_acc
if is_best:
best_epoch = epoch + 1
best_val_acc = val_overall_acc
logger.info(
f'Best Epoch: {best_epoch}, best acc: {best_val_acc: .4f}.')
save_checkpoint(
args, {
'epoch': epoch + 1,
'best_epoch_so_far': best_epoch,
'state_dict': encoder.state_dict(),
'best_val_acc': best_val_acc,
'optimizer': optimizer.state_dict(),
},
is_best,
checkpoint=args.work_dir)
    logger.info('Training finished successfully.')
def train(args, phase, encoder, data_loader, criterion, optimizer, logger):
if phase == 'train':
encoder.train()
else:
encoder.eval()
# record time
batch_time = AverageMeter()
data_time = AverageMeter()
end = time.time()
# record accuracy
per_head_acc_list = [AverageMeter() for _ in range(args.num_head)]
for batch_idx, batch_data in enumerate(data_loader):
data_time.update(time.time() - end)
text, system_mode, labels = batch_data
text = text.to(args.device)
system_mode = system_mode.to(args.device)
labels = labels.to(args.device)
if phase == 'train':
output = encoder(text)
else:
with torch.no_grad():
output = encoder(text)
loss_list = []
# Labels: loss and acc
for head_idx, (key, val) in enumerate(args.metadata.items()):
loss = criterion(output[head_idx], labels[:, head_idx])
loss_list.append(loss)
acc_dict = head_accuracy(
output=output[head_idx],
target=labels[:, head_idx],
unlabeled_value=args.unlabeled_value)
acc = acc_dict['acc']
labeled_count = int(acc_dict['labeled_count'])
if labeled_count > 0:
per_head_acc_list[head_idx].update(acc, labeled_count)
loss_avg = sum(loss_list) / len(loss_list)
if phase == 'train':
optimizer.zero_grad()
loss_avg.backward()
optimizer.step()
# measure elapsed time
batch_time.update(time.time() - end)
end = time.time()
logger.info(
f'Batch: {batch_idx+1}, '
f'Data time: {data_time.avg:.3f}s, Batch time: {batch_time.avg:.3f}s, ' # noqa
f'loss: {loss_avg:.4f}.')
overall_acc = 0
per_head_acc_avg = []
for head_idx in range(args.num_head):
per_head_acc_avg.append(per_head_acc_list[head_idx].avg)
overall_acc += per_head_acc_list[head_idx].avg
overall_acc = overall_acc / args.num_head
return per_head_acc_avg, overall_acc
if __name__ == '__main__':
main()
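The trainer above builds one `CrossEntropyLoss` per attribute head with `ignore_index=999` (`args.unlabeled_value`), so samples without a label for a given attribute contribute neither loss nor gradient to that head. A minimal sketch, with hypothetical shapes, shows the effect:

```python
import torch
import torch.nn as nn

# Minimal sketch (hypothetical shapes): one attribute head, batch of 3,
# 4 classes, with the middle sample unlabeled (999 = args.unlabeled_value).
criterion = nn.CrossEntropyLoss(reduction='mean', ignore_index=999)
logits = torch.randn(3, 4)
labels = torch.tensor([2, 999, 0])

# The mean is taken over the two labeled samples only; the ignored sample
# contributes neither loss nor gradient.
loss = criterion(logits, labels)
```

This is also why the loop updates `per_head_acc_list` only when `labeled_count > 0`: a head with no labeled samples in a batch carries no signal.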
================================================
FILE: language/utils/__init__.py
================================================
"""Useful utils
"""
# progress bar
import os
import sys
from .eval import * # noqa
from .logger import * # noqa
from .lr_schedule import * # noqa
from .misc import * # noqa
from .numerical import * # noqa
from .visualize import * # noqa
sys.path.append(os.path.join(os.path.dirname(__file__), "progress"))
from progress.bar import Bar as Bar # noqa
================================================
FILE: language/utils/eval.py
================================================
from __future__ import absolute_import, print_function
import torch
__all__ = ['classification_accuracy', 'regression_accuracy']
def classification_accuracy(output,
target,
class_wise=False,
num_cls=6,
excluded_cls_idx=None):
"""
Computes the precision@k for the specified values of k
output: batch_size * num_cls (for a specific attribute)
target: batch_size * 1 (for a specific attribute)
return res: res = 100 * num_correct / batch_size, for a specific attribute
for a batch
"""
with torch.no_grad():
batch_size = target.size(0)
# _ = the largest score, pred = cls_idx with the largest score
_, pred = output.topk(1, 1, True, True)
pred = pred.reshape(-1)
acc = float(torch.sum(pred == target)) / float(batch_size) * 100
return_dict = {'acc': acc}
if excluded_cls_idx is not None:
correct_count = torch.sum(
(pred == target) * (target != excluded_cls_idx))
labeled_count = torch.sum(target != excluded_cls_idx)
if labeled_count:
labeled_acc = float(correct_count) / float(labeled_count) * 100
else:
labeled_acc = 0
return_dict['labeled_acc'] = labeled_acc
return_dict['labeled_count'] = labeled_count
else:
return_dict['labeled_acc'] = acc
return_dict['labeled_count'] = batch_size
if class_wise:
acc_class_wise = []
per_class_count = []
# actual number of classes <= num_cls=6
for i in range(num_cls):
total_sample_cls_i = torch.sum(target == i)
if total_sample_cls_i:
correct_samples_cls_i = torch.sum(
(pred == i) * (target == i))
acc_class_wise.append(
float(correct_samples_cls_i) /
float(total_sample_cls_i) * 100)
else:
acc_class_wise.append(0)
per_class_count.append(total_sample_cls_i)
return_dict['acc_class_wise'] = acc_class_wise
return_dict['per_class_count'] = per_class_count
return return_dict
def regression_accuracy(output,
target,
margin=0.2,
uni_neg=True,
class_wise=False,
num_cls=6,
excluded_cls_idx=None,
max_cls_value=5):
"""
Computes the regression accuracy
if predicted score is less than one margin from the ground-truth score, we
consider it as correct otherwise it is incorrect, the acc is the
percentage of correct regression
class_wise: if True, then report overall accuracy and class-wise accuracy
else, then only report overall accuracy
"""
output = output.clone().reshape(-1)
if uni_neg:
output[(output <= 0 + margin) * (target == 0)] = 0
output[(output >= max_cls_value - margin) *
(target == max_cls_value)] = max_cls_value
distance = torch.absolute(target - output)
distance = distance - margin
predicted_class = torch.zeros_like(target)
# if distance <= 0, assign ground truth class
predicted_class[distance <= 0] = target[distance <= 0]
# if distance > 0, assign an invalid value
predicted_class[distance > 0] = -1
acc = float(torch.sum(predicted_class == target)) / float(
target.size(0)) * 100
return_dict = {'acc': acc}
if excluded_cls_idx is not None:
correct_count = torch.sum(
(predicted_class == target) * (target != excluded_cls_idx))
labeled_count = torch.sum(target != excluded_cls_idx)
if labeled_count:
labeled_acc = float(correct_count) / float(labeled_count) * 100
else:
labeled_acc = 0
return_dict['labeled_acc'] = labeled_acc
return_dict['labeled_count'] = labeled_count
    else:
        return_dict['labeled_acc'] = acc
        return_dict['labeled_count'] = target.size(0)
if class_wise:
acc_class_wise = []
per_class_count = []
for i in range(num_cls):
total_sample_cls_i = torch.sum(target == i)
if total_sample_cls_i:
correct_samples_cls_i = torch.sum(
(predicted_class == i) * (target == i))
acc_class_wise.append(
float(correct_samples_cls_i) / float(total_sample_cls_i) *
100)
else:
acc_class_wise.append(0)
per_class_count.append(total_sample_cls_i)
return_dict['acc_class_wise'] = acc_class_wise
return_dict['per_class_count'] = per_class_count
return return_dict
def main():
l1 = [
0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 2, 2, 2, 1.7, 0, 3, 3, 2.79, 3.3, 0, 4,
2, 5, 3, 0, 6, 6, 4.78, 6, 0
]
l2 = [
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4,
4, 5, 5, 5, 5, 5
]
output = torch.FloatTensor(l1)
target = torch.LongTensor(l2)
    result = regression_accuracy(output, target, margin=0.2)
    print('acc:', result['acc'])
    print()
    # regression_accuracy returns a dict, so read the class-wise stats by key
    result = regression_accuracy(output, target, margin=0.2, class_wise=True)
    print('acc:', result['acc'])
    print('acc_class_wise_list:', result['acc_class_wise'])
    print('per_class_count: ', result['per_class_count'])
if __name__ == '__main__':
main()
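A quick toy walk-through of what `classification_accuracy` computes, using made-up logits and 999 as the unlabeled marker passed via `excluded_cls_idx`. The snippet mirrors the function's own top-1 prediction and labeled-only bookkeeping:

```python
import torch

# Toy data (made up for illustration): 4 samples, 3 classes; the last target
# uses 999 as an "unlabeled" marker to be excluded from labeled accuracy.
logits = torch.tensor([[2.0, 0.1, 0.1],
                       [0.1, 2.0, 0.1],
                       [0.1, 0.1, 2.0],
                       [2.0, 0.1, 0.1]])
target = torch.tensor([0, 1, 0, 999])

pred = logits.topk(1, 1, True, True)[1].reshape(-1)         # tensor([0, 1, 2, 0])
acc = float((pred == target).sum()) / target.size(0) * 100  # 50.0: 2 of 4 match
labeled = target != 999
labeled_acc = float(((pred == target) & labeled).sum()) / float(labeled.sum()) * 100
print(acc, round(labeled_acc, 1))                           # 50.0 66.7
```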
================================================
FILE: language/utils/logger.py
================================================
from __future__ import absolute_import
import datetime
import logging
import time
import matplotlib.pyplot as plt
import matplotlib.ticker as plticker
import numpy as np
# from mmcv.runner import get_dist_info, master_only
__all__ = [
'Logger', 'LoggerMonitor', 'savefig', 'MessageLogger', 'init_tb_logger',
'get_root_logger', 'dict2str'
]
def savefig(fname, dpi=None):
dpi = 150 if dpi is None else dpi
plt.savefig(fname, dpi=dpi)
def plot_overlap(logger, names=None):
names = logger.names if names is None else names
numbers = logger.numbers
for _, name in enumerate(names):
x = np.arange(len(numbers[name]))
plt.plot(x, np.asarray(numbers[name]))
return [logger.title + '(' + name + ')' for name in names]
class Logger(object):
'''Save training process to log file with simple plot function.'''
def __init__(self, fpath, title=None, resume=False):
self.file = None
self.resume = resume
self.title = '' if title is None else title
if fpath is not None:
if resume:
self.file = open(fpath, 'r')
name = self.file.readline()
self.names = name.rstrip().split('\t')
self.numbers = {}
for _, name in enumerate(self.names):
self.numbers[name] = []
for numbers in self.file:
numbers = numbers.rstrip().split('\t')
for i in range(0, len(numbers)):
self.numbers[self.names[i]].append(numbers[i])
self.file.close()
self.file = open(fpath, 'a')
else:
self.file = open(fpath, 'w')
def set_names(self, names):
if self.resume:
pass
# initialize numbers as empty list
self.numbers = {}
self.names = names
for _, name in enumerate(self.names):
self.file.write(name)
self.file.write('\t')
self.numbers[name] = []
self.file.write('\n')
self.file.flush()
def append(self, numbers):
assert len(self.names) == len(numbers), 'Numbers do not match names'
for index, num in enumerate(numbers):
if type(num) == int:
self.file.write(str(num))
elif type(num) == float:
self.file.write("{0:.6f}".format(num))
else: # str
self.file.write(str(num))
self.file.write('\t')
self.numbers[self.names[index]].append(num)
self.file.write('\n')
self.file.flush()
def plot(self, out_file, names=None):
names = self.names if names is None else names
numbers = self.numbers
fig, ax = plt.subplots(1, 1)
for _, name in enumerate(names):
x = np.arange(len(numbers[name]))
ax.plot(x, numbers[name])
            # optionally add data labels to each point in the plot
            annotate = False
            if annotate:
                for i in range(len(x)):
                    y = numbers[name][i]
                    if type(y) == int or type(y) == float:
                        text = round(y, 2)
                    else:
                        text = y
                    ax.text(x[i], y, text)
ax.legend([self.title + '(' + name + ')' for name in names])
        # put x-axis ticks at regular intervals
        loc = plticker.MultipleLocator(base=1.0)
        ax.xaxis.set_major_locator(loc)
ax.grid(True)
plt.savefig(out_file)
plt.close()
def close(self):
if self.file is not None:
self.file.close()
def get_numbers(self):
stats = {}
for name in self.names:
stats[name] = self.numbers[name]
return stats
class LoggerMonitor(object):
'''Load and visualize multiple logs.'''
def __init__(self, paths):
        '''paths is a dictionary with {name: filepath} pairs'''
self.loggers = []
for title, path in paths.items():
logger = Logger(path, title=title, resume=True)
self.loggers.append(logger)
def plot(self, names=None):
plt.figure()
plt.subplot(121)
legend_text = []
for logger in self.loggers:
legend_text += plot_overlap(logger, names)
plt.legend(
legend_text, bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.grid(True)
class MessageLogger():
"""Message logger for printing.
Args:
opt (dict): Config. It contains the following keys:
name (str): Exp name.
logger (dict): Contains 'print_freq' (str) for logger interval.
train (dict): Contains 'niter' (int) for total iters.
use_tb_logger (bool): Use tensorboard logger.
start_iter (int): Start iter. Default: 1.
tb_logger (obj:`tb_logger`): Tensorboard logger. Default: None.
"""
def __init__(self, opt, start_iter=1, tb_logger=None):
self.exp_name = opt['name']
self.interval = opt['logger']['print_freq']
self.start_iter = start_iter
self.max_iters = opt['train']['niter']
self.use_tb_logger = opt['use_tb_logger']
self.tb_logger = tb_logger
self.start_time = time.time()
self.logger = get_root_logger()
# @master_only
def __call__(self, log_vars):
"""Format logging message.
Args:
log_vars (dict): It contains the following keys:
epoch (int): Epoch number.
iter (int): Current iter.
lrs (list): List for learning rates.
time (float): Iter time.
data_time (float): Data time for each iter.
"""
# epoch, iter, learning rates
epoch = log_vars.pop('epoch')
current_iter = log_vars.pop('iter')
lrs = log_vars.pop('lrs')
message = (f'[{self.exp_name[:5]}..][epoch:{epoch:3d}, '
f'iter:{current_iter:8,d}, lr:(')
for v in lrs:
message += f'{v:.3e},'
message += ')] '
# time and estimated time
if 'time' in log_vars.keys():
iter_time = log_vars.pop('time')
data_time = log_vars.pop('data_time')
total_time = time.time() - self.start_time
time_sec_avg = total_time / (current_iter - self.start_iter + 1)
eta_sec = time_sec_avg * (self.max_iters - current_iter - 1)
eta_str = str(datetime.timedelta(seconds=int(eta_sec)))
message += f'[eta: {eta_str}, '
message += f'time: {iter_time:.3f}, data_time: {data_time:.3f}] '
# other items, especially losses
for k, v in log_vars.items():
message += f'{k}: {v:.4e} '
# tensorboard logger
if self.use_tb_logger and 'debug' not in self.exp_name:
self.tb_logger.add_scalar(k, v, current_iter)
self.logger.info(message)
# @master_only
def init_tb_logger(log_dir):
from torch.utils.tensorboard import SummaryWriter
tb_logger = SummaryWriter(log_dir=log_dir)
return tb_logger
def get_root_logger(logger_name='base', log_level=logging.INFO, log_file=None):
"""Get the root logger.
The logger will be initialized if it has not been initialized. By default a
StreamHandler will be added. If `log_file` is specified, a FileHandler will
also be added.
Args:
logger_name (str): root logger name. Default: base.
log_file (str | None): The log filename. If specified, a FileHandler
will be added to the root logger.
log_level (int): The root logger level. Note that only the process of
rank 0 is affected, while other processes will set the level to
"Error" and be silent most of the time.
Returns:
logging.Logger: The root logger.
"""
logger = logging.getLogger(logger_name)
# if the logger has been initialized, just return it
if logger.hasHandlers():
return logger
format_str = '%(asctime)s.%(msecs)03d - %(levelname)s: %(message)s'
    logging.basicConfig(format=format_str, level=log_level)
    if log_file is not None:
        file_handler = logging.FileHandler(log_file, 'w')
        file_handler.setFormatter(logging.Formatter(format_str))
        file_handler.setLevel(log_level)
        logger.addHandler(file_handler)
    return logger
SYMBOL INDEX (515 symbols across 66 files)
FILE: data/latent_code_dataset.py
class LatentCodeDataset (line 14) | class LatentCodeDataset(data.Dataset):
method __init__ (line 16) | def __init__(self, input_dir, subset_samples=None):
method __getitem__ (line 42) | def __getitem__(self, index):
method __len__ (line 46) | def __len__(self):
FILE: editing_quantitative.py
function main (line 14) | def main():
FILE: editing_with_dialog.py
function parse_args (line 18) | def parse_args():
function main (line 26) | def main():
FILE: editing_wo_dialog.py
function parse_args (line 18) | def parse_args():
function main (line 29) | def main():
FILE: language/accuracy.py
function head_accuracy (line 4) | def head_accuracy(output, target, unlabeled_value=999):
FILE: language/build_vocab.py
function parse_args (line 13) | def parse_args():
function main (line 31) | def main():
FILE: language/dataset.py
class EncoderDataset (line 7) | class EncoderDataset(Dataset):
method __init__ (line 9) | def __init__(self, preprocessed_dir):
method __getitem__ (line 21) | def __getitem__(self, index):
method __len__ (line 31) | def __len__(self):
function main (line 35) | def main():
FILE: language/generate_feedback.py
function parse_args (line 11) | def parse_args():
function main (line 49) | def main():
function instantiate_feedback (line 105) | def instantiate_feedback(args,
FILE: language/generate_training_request.py
function parse_args (line 11) | def parse_args():
function main (line 53) | def main():
function instantiate_training_request (line 130) | def instantiate_training_request(
FILE: language/language_utils.py
function build_vocab (line 15) | def build_vocab(text_list,
function tokenize (line 54) | def tokenize(text,
function encode (line 88) | def encode(text_tokens, token_to_idx, allow_unk=False):
function decode (line 100) | def decode(seq_idx, idx_to_token, delim=None, stop_at_end=True):
function reverse_dict (line 112) | def reverse_dict(input_dict):
function to_long_tensor (line 121) | def to_long_tensor(dset):
function proper_capitalize (line 127) | def proper_capitalize(text):
FILE: language/lstm.py
class Encoder (line 22) | class Encoder(nn.Module):
method __init__ (line 24) | def __init__(self,
method forward (line 55) | def forward(self, text):
class LSTM (line 71) | class LSTM(nn.Module):
method __init__ (line 73) | def __init__(self,
method forward (line 102) | def forward(self, x):
class fc_block (line 142) | class fc_block(nn.Module):
method __init__ (line 144) | def __init__(self, inplanes, planes, drop_rate=0.15):
method forward (line 153) | def forward(self, x):
function main (line 162) | def main():
FILE: language/preprocess_request.py
function parse_args (line 15) | def parse_args():
function main (line 63) | def main():
FILE: language/run_encoder.py
function parse_args (line 11) | def parse_args():
function main (line 60) | def main():
function encode_request (line 66) | def encode_request(args, system_mode=None, dialog_logger=None):
FILE: language/train_encoder.py
function parse_args (line 18) | def parse_args():
function main (line 74) | def main():
function train (line 203) | def train(args, phase, encoder, data_loader, criterion, optimizer, logger):
FILE: language/utils/eval.py
function classification_accuracy (line 8) | def classification_accuracy(output,
function regression_accuracy (line 68) | def regression_accuracy(output,
function main (line 144) | def main():
FILE: language/utils/logger.py
function savefig (line 19) | def savefig(fname, dpi=None):
function plot_overlap (line 24) | def plot_overlap(logger, names=None):
class Logger (line 33) | class Logger(object):
method __init__ (line 36) | def __init__(self, fpath, title=None, resume=False):
method set_names (line 58) | def set_names(self, names):
method append (line 71) | def append(self, numbers):
method plot (line 85) | def plot(self, out_file, names=None):
method close (line 113) | def close(self):
method get_numbers (line 117) | def get_numbers(self):
class LoggerMonitor (line 124) | class LoggerMonitor(object):
method __init__ (line 127) | def __init__(self, paths):
method plot (line 134) | def plot(self, names=None):
class MessageLogger (line 145) | class MessageLogger():
method __init__ (line 158) | def __init__(self, opt, start_iter=1, tb_logger=None):
method __call__ (line 169) | def __call__(self, log_vars):
function init_tb_logger (line 215) | def init_tb_logger(log_dir):
function get_root_logger (line 221) | def get_root_logger(logger_name='base', log_level=logging.INFO, log_file...
function dict2str (line 255) | def dict2str(opt, indent_level=1):
FILE: language/utils/lr_schedule.py
function adjust_learning_rate (line 6) | def adjust_learning_rate(args, optimizer, epoch):
FILE: language/utils/misc.py
function get_mean_and_std (line 19) | def get_mean_and_std(dataset):
function init_params (line 36) | def init_params(net):
function mkdir_p (line 52) | def mkdir_p(path):
function save_checkpoint (line 63) | def save_checkpoint(args,
class AverageMeter (line 84) | class AverageMeter(object):
method __init__ (line 91) | def __init__(self):
method reset (line 94) | def reset(self):
method update (line 100) | def update(self, val, n=1):
FILE: language/utils/numerical.py
function get_weight (line 8) | def get_weight(args):
function transpose_and_format (line 84) | def transpose_and_format(args, input):
FILE: language/utils/progress/progress/__init__.py
class Infinite (line 27) | class Infinite(object):
method __init__ (line 31) | def __init__(self, *args, **kwargs):
method __getitem__ (line 40) | def __getitem__(self, key):
method elapsed (line 46) | def elapsed(self):
method elapsed_td (line 50) | def elapsed_td(self):
method update_avg (line 53) | def update_avg(self, n, dt):
method update (line 58) | def update(self):
method start (line 61) | def start(self):
method finish (line 64) | def finish(self):
method next (line 67) | def next(self, n=1):
method iter (line 75) | def iter(self, it):
class Progress (line 84) | class Progress(Infinite):
method __init__ (line 85) | def __init__(self, *args, **kwargs):
method eta (line 90) | def eta(self):
method eta_td (line 94) | def eta_td(self):
method percent (line 98) | def percent(self):
method progress (line 102) | def progress(self):
method remaining (line 106) | def remaining(self):
method start (line 109) | def start(self):
method goto (line 112) | def goto(self, index):
method iter (line 116) | def iter(self, it):
FILE: language/utils/progress/progress/bar.py
class Bar (line 22) | class Bar(WritelnMixin, Progress):
method update (line 32) | def update(self):
class ChargingBar (line 45) | class ChargingBar(Bar):
class FillingSquaresBar (line 53) | class FillingSquaresBar(ChargingBar):
class FillingCirclesBar (line 58) | class FillingCirclesBar(ChargingBar):
class IncrementalBar (line 63) | class IncrementalBar(Bar):
method update (line 66) | def update(self):
class PixelBar (line 83) | class PixelBar(IncrementalBar):
class ShadyBar (line 87) | class ShadyBar(IncrementalBar):
FILE: language/utils/progress/progress/counter.py
class Counter (line 22) | class Counter(WriteMixin, Infinite):
method update (line 26) | def update(self):
class Countdown (line 30) | class Countdown(WriteMixin, Progress):
method update (line 33) | def update(self):
class Stack (line 37) | class Stack(WriteMixin, Progress):
method update (line 41) | def update(self):
class Pie (line 47) | class Pie(Stack):
FILE: language/utils/progress/progress/helpers.py
class WriteMixin (line 22) | class WriteMixin(object):
method __init__ (line 25) | def __init__(self, message=None, **kwargs):
method write (line 37) | def write(self, s):
method finish (line 45) | def finish(self):
class WritelnMixin (line 50) | class WritelnMixin(object):
method __init__ (line 53) | def __init__(self, message=None, **kwargs):
method clearln (line 61) | def clearln(self):
method writeln (line 65) | def writeln(self, line):
method finish (line 71) | def finish(self):
class SigIntMixin (line 82) | class SigIntMixin(object):
method __init__ (line 85) | def __init__(self, *args, **kwargs):
method _sigint_handler (line 89) | def _sigint_handler(self, signum, frame):
FILE: language/utils/progress/progress/spinner.py
class Spinner (line 22) | class Spinner(WriteMixin, Infinite):
method update (line 27) | def update(self):
class PieSpinner (line 32) | class PieSpinner(Spinner):
class MoonSpinner (line 36) | class MoonSpinner(Spinner):
class LineSpinner (line 40) | class LineSpinner(Spinner):
class PixelSpinner (line 43) | class PixelSpinner(Spinner):
FILE: language/utils/progress/test_progress.py
function sleep (line 16) | def sleep():
FILE: language/utils/setup_logger.py
function setup_logger (line 11) | def setup_logger(work_dir=None,
FILE: language/utils/visualize.py
function make_image (line 12) | def make_image(img, mean=(0,0,0), std=(1,1,1)):
function gauss (line 18) | def gauss(x,a,b,c):
function colorize (line 21) | def colorize(x):
function show_batch (line 38) | def show_batch(images, Mean=(2, 2, 2), Std=(0.5,0.5,0.5)):
function show_mask_single (line 44) | def show_mask_single(images, mask, Mean=(2, 2, 2), Std=(0.5,0.5,0.5)):
function show_mask (line 73) | def show_mask(images, masklist, Mean=(2, 2, 2), Std=(0.5,0.5,0.5)):
FILE: models/__init__.py
function create_model (line 21) | def create_model(opt):
FILE: models/archs/attribute_predictor_arch.py
function conv3x3 (line 17) | def conv3x3(in_planes, out_planes, stride=1):
function conv1x1 (line 28) | def conv1x1(in_planes, out_planes, stride=1):
class BasicBlock (line 34) | class BasicBlock(nn.Module):
method __init__ (line 37) | def __init__(self, inplanes, planes, stride=1, downsample=None):
method forward (line 47) | def forward(self, x):
class Bottleneck (line 66) | class Bottleneck(nn.Module):
method __init__ (line 69) | def __init__(self, inplanes, planes, stride=1, downsample=None):
method forward (line 81) | def forward(self, x):
class fc_block (line 104) | class fc_block(nn.Module):
method __init__ (line 106) | def __init__(self, inplanes, planes, drop_rate=0.15):
method forward (line 115) | def forward(self, x):
class ResNet (line 124) | class ResNet(nn.Module):
method __init__ (line 126) | def __init__(self,
method _make_layer (line 179) | def _make_layer(self, block, planes, blocks, stride=1):
method forward (line 195) | def forward(self, x):
function resnet50 (line 220) | def resnet50(pretrained=True, **kwargs):
function init_pretrained_weights (line 232) | def init_pretrained_weights(model, model_url):
FILE: models/archs/field_function_arch.py
class FieldFunction (line 5) | class FieldFunction(nn.Module):
method __init__ (line 7) | def __init__(
method forward (line 43) | def forward(self, x):
class LinearLayer (line 48) | class LinearLayer(nn.Module):
method __init__ (line 50) | def __init__(
method forward (line 68) | def forward(self, x):
class Normalization (line 75) | class Normalization(nn.Module):
method __init__ (line 77) | def __init__(self, ):
method forward (line 87) | def forward(self, x):
FILE: models/archs/stylegan2/calc_inception.py
class Inception3Feature (line 18) | class Inception3Feature(Inception3):
method forward (line 19) | def forward(self, x):
function load_patched_inception_v3 (line 51) | def load_patched_inception_v3():
function extract_features (line 61) | def extract_features(loader, inception, device):
FILE: models/archs/stylegan2/convert_weight.py
function convert_modconv (line 14) | def convert_modconv(vars, source_name, target_name, flip=False):
function convert_conv (line 41) | def convert_conv(vars, source_name, target_name, bias=True, start=0):
function convert_torgb (line 61) | def convert_torgb(vars, source_name, target_name):
function convert_dense (line 82) | def convert_dense(vars, source_name, target_name):
function update (line 96) | def update(state_dict, new):
function discriminator_fill_statedict (line 108) | def discriminator_fill_statedict(statedict, vars, size):
function fill_statedict (line 148) | def fill_statedict(state_dict, vars, size, n_mlp):
FILE: models/archs/stylegan2/dataset.py
class MultiResolutionDataset (line 8) | class MultiResolutionDataset(Dataset):
method __init__ (line 9) | def __init__(self, path, transform, resolution=256):
method __len__ (line 28) | def __len__(self):
method __getitem__ (line 31) | def __getitem__(self, index):
FILE: models/archs/stylegan2/distributed.py
function get_rank (line 9) | def get_rank():
function synchronize (line 19) | def synchronize():
function get_world_size (line 34) | def get_world_size():
function reduce_sum (line 44) | def reduce_sum(tensor):
function gather_grad (line 57) | def gather_grad(params):
function all_gather (line 69) | def all_gather(data):
function reduce_loss_dict (line 104) | def reduce_loss_dict(loss_dict):
FILE: models/archs/stylegan2/fid.py
function extract_feature_from_samples (line 15) | def extract_feature_from_samples(
function calc_fid (line 34) | def calc_fid(sample_mean, sample_cov, real_mean, real_cov, eps=1e-6):
FILE: models/archs/stylegan2/generate.py
function generate (line 14) | def generate(args, g_ema, device, mean_latent):
FILE: models/archs/stylegan2/inception.py
class InceptionV3 (line 16) | class InceptionV3(nn.Module):
method __init__ (line 31) | def __init__(self,
method forward (line 129) | def forward(self, inp):
function fid_inception_v3 (line 166) | def fid_inception_v3():
class FIDInceptionA (line 193) | class FIDInceptionA(models.inception.InceptionA):
method __init__ (line 195) | def __init__(self, in_channels, pool_features):
method forward (line 198) | def forward(self, x):
class FIDInceptionC (line 218) | class FIDInceptionC(models.inception.InceptionC):
method __init__ (line 220) | def __init__(self, in_channels, channels_7x7):
method forward (line 223) | def forward(self, x):
class FIDInceptionE_1 (line 246) | class FIDInceptionE_1(models.inception.InceptionE):
method __init__ (line 248) | def __init__(self, in_channels):
method forward (line 251) | def forward(self, x):
class FIDInceptionE_2 (line 279) | class FIDInceptionE_2(models.inception.InceptionE):
method __init__ (line 281) | def __init__(self, in_channels):
method forward (line 284) | def forward(self, x):
FILE: models/archs/stylegan2/inversion.py
function noise_regularize (line 17) | def noise_regularize(noises):
function noise_normalize_ (line 39) | def noise_normalize_(noises):
function get_lr (line 47) | def get_lr(t, initial_lr, rampdown=0.25, rampup=0.05):
function latent_noise (line 55) | def latent_noise(latent, strength):
function make_image (line 61) | def make_image(tensor):
FILE: models/archs/stylegan2/lpips/__init__.py
class PerceptualLoss (line 9) | class PerceptualLoss(torch.nn.Module):
method __init__ (line 11) | def __init__(
method forward (line 35) | def forward(self, pred, target, normalize=False):
function normalize_tensor (line 52) | def normalize_tensor(in_feat, eps=1e-10):
function l2 (line 57) | def l2(p0, p1, range=255.):
function psnr (line 61) | def psnr(p0, p1, peak=255.):
function dssim (line 65) | def dssim(p0, p1, range=255.):
function rgb2lab (line 69) | def rgb2lab(in_img, mean_cent=False):
function tensor2np (line 77) | def tensor2np(tensor_obj):
function np2tensor (line 82) | def np2tensor(np_obj):
function tensor2tensorlab (line 87) | def tensor2tensorlab(image_tensor, to_norm=True, mc_only=False):
function tensorlab2tensor (line 102) | def tensorlab2tensor(lab_tensor, return_inbnd=False):
function rgb2lab (line 122) | def rgb2lab(input):
function tensor2im (line 127) | def tensor2im(image_tensor, imtype=np.uint8, cent=1., factor=255. / 2.):
function im2tensor (line 133) | def im2tensor(image, imtype=np.uint8, cent=1., factor=255. / 2.):
function tensor2vec (line 138) | def tensor2vec(vector_tensor):
function voc_ap (line 142) | def voc_ap(rec, prec, use_07_metric=False):
function tensor2im (line 176) | def tensor2im(image_tensor, imtype=np.uint8, cent=1., factor=255. / 2.):
function im2tensor (line 183) | def im2tensor(image, imtype=np.uint8, cent=1., factor=255. / 2.):
FILE: models/archs/stylegan2/lpips/base_model.py
class BaseModel (line 7) | class BaseModel():
method __init__ (line 9) | def __init__(self):
method name (line 12) | def name(self):
method initialize (line 15) | def initialize(self, use_gpu=True, gpu_ids=[0]):
method forward (line 19) | def forward(self):
method get_image_paths (line 22) | def get_image_paths(self):
method optimize_parameters (line 25) | def optimize_parameters(self):
method get_current_visuals (line 28) | def get_current_visuals(self):
method get_current_errors (line 31) | def get_current_errors(self):
method save (line 34) | def save(self, label):
method save_network (line 38) | def save_network(self, network, path, network_label, epoch_label):
method load_network (line 44) | def load_network(self, network, network_label, epoch_label):
method update_learning_rate (line 50) | def update_learning_rate():
method get_image_paths (line 53) | def get_image_paths(self):
method save_done (line 56) | def save_done(self, flag=False):
FILE: models/archs/stylegan2/lpips/dist_model.py
class DistModel (line 17) | class DistModel(BaseModel):
method name (line 19) | def name(self):
method initialize (line 22) | def initialize(self,
method forward (line 130) | def forward(self, in0, in1, retPerLayer=False):
method optimize_parameters (line 141) | def optimize_parameters(self):
method clamp_weights (line 148) | def clamp_weights(self):
method set_input (line 153) | def set_input(self, data):
method forward_train (line 169) | def forward_train(self): # run forward pass
method backward_train (line 184) | def backward_train(self):
method compute_accuracy (line 187) | def compute_accuracy(self, d0, d1, judge):
method get_current_errors (line 193) | def get_current_errors(self):
method get_current_visuals (line 203) | def get_current_visuals(self):
method save (line 217) | def save(self, path, label):
method update_learning_rate (line 224) | def update_learning_rate(self, nepoch_decay):
function score_2afc_dataset (line 235) | def score_2afc_dataset(data_loader, func, name=''):
function score_jnd_dataset (line 273) | def score_jnd_dataset(data_loader, func, name=''):
FILE: models/archs/stylegan2/lpips/networks_basic.py
function spatial_average (line 11) | def spatial_average(in_tens, keepdim=True):
function upsample (line 15) | def upsample(in_tens, out_H=64): # assumes scale factor is same for H a...
class PNetLin (line 25) | class PNetLin(nn.Module):
method __init__ (line 27) | def __init__(self,
method forward (line 71) | def forward(self, in0, in1, retPerLayer=False):
class ScalingLayer (line 121) | class ScalingLayer(nn.Module):
method __init__ (line 123) | def __init__(self):
method forward (line 132) | def forward(self, inp):
class NetLinLayer (line 136) | class NetLinLayer(nn.Module):
method __init__ (line 139) | def __init__(self, chn_in, chn_out=1, use_dropout=False):
class Dist2LogitLayer (line 151) | class Dist2LogitLayer(nn.Module):
method __init__ (line 154) | def __init__(self, chn_mid=32, use_sigmoid=True):
method forward (line 178) | def forward(self, d0, d1, eps=0.1):
class BCERankingLoss (line 184) | class BCERankingLoss(nn.Module):
method __init__ (line 186) | def __init__(self, chn_mid=32):
method forward (line 192) | def forward(self, d0, d1, judge):
class FakeNet (line 199) | class FakeNet(nn.Module):
method __init__ (line 201) | def __init__(self, use_gpu=True, colorspace='Lab'):
class L2 (line 207) | class L2(FakeNet):
method forward (line 209) | def forward(self, in0, in1, retPerLayer=None):
class DSSIM (line 231) | class DSSIM(FakeNet):
method forward (line 233) | def forward(self, in0, in1, retPerLayer=None):
function print_network (line 252) | def print_network(net):
FILE: models/archs/stylegan2/lpips/pretrained_networks.py
class squeezenet (line 7) | class squeezenet(torch.nn.Module):
method __init__ (line 9) | def __init__(self, requires_grad=False, pretrained=True):
method forward (line 38) | def forward(self, X):
class alexnet (line 62) | class alexnet(torch.nn.Module):
method __init__ (line 64) | def __init__(self, requires_grad=False, pretrained=True):
method forward (line 88) | def forward(self, X):
class vgg16 (line 106) | class vgg16(torch.nn.Module):
method __init__ (line 108) | def __init__(self, requires_grad=False, pretrained=True):
method forward (line 131) | def forward(self, X):
class resnet (line 151) | class resnet(torch.nn.Module):
method __init__ (line 153) | def __init__(self, requires_grad=False, pretrained=True, num=18):
method forward (line 176) | def forward(self, X):
FILE: models/archs/stylegan2/model.py
class PixelNorm (line 15) | class PixelNorm(nn.Module):
method __init__ (line 17) | def __init__(self):
method forward (line 20) | def forward(self, input):
function make_kernel (line 25) | def make_kernel(k):
class Upsample (line 36) | class Upsample(nn.Module):
method __init__ (line 38) | def __init__(self, kernel, factor=2):
method forward (line 52) | def forward(self, input):
class Downsample (line 59) | class Downsample(nn.Module):
method __init__ (line 61) | def __init__(self, kernel, factor=2):
method forward (line 75) | def forward(self, input):
class Blur (line 82) | class Blur(nn.Module):
method __init__ (line 84) | def __init__(self, kernel, pad, upsample_factor=1):
method forward (line 96) | def forward(self, input):
class EqualConv2d (line 102) | class EqualConv2d(nn.Module):
method __init__ (line 104) | def __init__(self,
method forward (line 126) | def forward(self, input):
method __repr__ (line 137) | def __repr__(self):
class EqualLinear (line 144) | class EqualLinear(nn.Module):
method __init__ (line 146) | def __init__(self,
method forward (line 168) | def forward(self, input):
method __repr__ (line 179) | def __repr__(self):
class ModulatedConv2d (line 185) | class ModulatedConv2d(nn.Module):
method __init__ (line 187) | def __init__(
method __repr__ (line 235) | def __repr__(self):
method forward (line 240) | def forward(self, input, style):
class NoiseInjection (line 284) | class NoiseInjection(nn.Module):
method __init__ (line 286) | def __init__(self):
method forward (line 291) | def forward(self, image, noise=None):
class ConstantInput (line 299) | class ConstantInput(nn.Module):
method __init__ (line 301) | def __init__(self, channel, size=4):
method forward (line 306) | def forward(self, input):
class StyledConv (line 313) | class StyledConv(nn.Module):
method __init__ (line 315) | def __init__(
method forward (line 342) | def forward(self, input, style, noise=None):
class ToRGB (line 351) | class ToRGB(nn.Module):
method __init__ (line 353) | def __init__(self,
method forward (line 367) | def forward(self, input, style, skip=None):
class Generator (line 379) | class Generator(nn.Module):
method __init__ (line 381) | def __init__(
method make_noise (line 473) | def make_noise(self):
method mean_latent (line 484) | def mean_latent(self, n_latent):
method get_latent (line 491) | def get_latent(self, input):
method style_forward (line 497) | def style_forward(self, input, skip_norm=False):
method forward (line 505) | def forward(
class ConvLayer (line 580) | class ConvLayer(nn.Sequential):
method __init__ (line 582) | def __init__(
class ResBlock (line 625) | class ResBlock(nn.Module):
method __init__ (line 627) | def __init__(self, in_channel, out_channel, blur_kernel=[1, 3, 3, 1]):
method forward (line 641) | def forward(self, input):
class Discriminator (line 651) | class Discriminator(nn.Module):
method __init__ (line 653) | def __init__(self, size, channel_multiplier=2, blur_kernel=[1, 3, 3, 1]):
method forward (line 693) | def forward(self, input):
FILE: models/archs/stylegan2/non_leaking.py
class AdaptiveAugment (line 10) | class AdaptiveAugment:
method __init__ (line 11) | def __init__(self, ada_aug_target, ada_aug_len, update_every, device):
method tune (line 21) | def tune(self, real_pred):
function translate_mat (line 62) | def translate_mat(t_x, t_y):
function rotate_mat (line 72) | def rotate_mat(theta):
function scale_mat (line 84) | def scale_mat(s_x, s_y):
function translate3d_mat (line 94) | def translate3d_mat(t_x, t_y, t_z):
function rotate3d_mat (line 104) | def rotate3d_mat(axis, theta):
function scale3d_mat (line 125) | def scale3d_mat(s_x, s_y, s_z):
function luma_flip_mat (line 136) | def luma_flip_mat(axis, i):
function saturation_mat (line 146) | def saturation_mat(axis, i):
function lognormal_sample (line 157) | def lognormal_sample(size, mean=0, std=1):
function category_sample (line 161) | def category_sample(size, categories):
function uniform_sample (line 168) | def uniform_sample(size, low, high):
function normal_sample (line 172) | def normal_sample(size, mean=0, std=1):
function bernoulli_sample (line 176) | def bernoulli_sample(size, p):
function random_mat_apply (line 180) | def random_mat_apply(p, transform, prev, eye):
function sample_affine (line 188) | def sample_affine(p, size, height, width):
function sample_color (line 247) | def sample_color(p, size):
function make_grid (line 281) | def make_grid(shape, x0, x1, y0, y1, device):
function affine_grid (line 291) | def affine_grid(grid, mat):
function get_padding (line 296) | def get_padding(G, height, width):
function try_sample_affine_and_pad (line 325) | def try_sample_affine_and_pad(img, p, pad_k, G=None):
function random_apply_affine (line 353) | def random_apply_affine(img, p, G=None, antialiasing_kernel=SYM6):
function apply_color (line 411) | def apply_color(img, mat):
function random_apply_color (line 422) | def random_apply_color(img, p, C=None):
function augment (line 431) | def augment(img, p, transform_matrix=(None, None)):
FILE: models/archs/stylegan2/op/fused_act.py
class FusedLeakyReLUFunctionBackward (line 20) | class FusedLeakyReLUFunctionBackward(Function):
method forward (line 22) | def forward(ctx, grad_output, out, bias, negative_slope, scale):
method backward (line 47) | def backward(ctx, gradgrad_input, gradgrad_bias):
class FusedLeakyReLUFunction (line 56) | class FusedLeakyReLUFunction(Function):
method forward (line 58) | def forward(ctx, input, bias, negative_slope, scale):
method backward (line 74) | def backward(ctx, grad_output):
class FusedLeakyReLU (line 87) | class FusedLeakyReLU(nn.Module):
method __init__ (line 88) | def __init__(self, channel, bias=True, negative_slope=0.2, scale=2 ** ...
method forward (line 100) | def forward(self, input):
function fused_leaky_relu (line 104) | def fused_leaky_relu(input, bias=None, negative_slope=0.2, scale=2 ** 0.5):
FILE: models/archs/stylegan2/op/fused_bias_act.cpp
function fused_bias_act (line 11) | torch::Tensor fused_bias_act(const torch::Tensor& input, const torch::Te...
function PYBIND11_MODULE (line 19) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: models/archs/stylegan2/op/upfirdn2d.cpp
function upfirdn2d (line 12) | torch::Tensor upfirdn2d(const torch::Tensor& input, const torch::Tensor&...
function PYBIND11_MODULE (line 21) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: models/archs/stylegan2/op/upfirdn2d.py
class UpFirDn2dBackward (line 19) | class UpFirDn2dBackward(Function):
method forward (line 21) | def forward(
method backward (line 63) | def backward(ctx, gradgrad_input):
class UpFirDn2d (line 88) | class UpFirDn2d(Function):
method forward (line 90) | def forward(ctx, input, kernel, up, down, pad):
method backward (line 127) | def backward(ctx, grad_output):
function upfirdn2d (line 145) | def upfirdn2d(input, kernel, up=1, down=1, pad=(0, 0)):
function upfirdn2d_native (line 159) | def upfirdn2d_native(
FILE: models/archs/stylegan2/ppl.py
function normalize (line 12) | def normalize(x):
function slerp (line 16) | def slerp(a, b, t):
function lerp (line 27) | def lerp(a, b, t):
FILE: models/archs/stylegan2/train.py
function data_sampler (line 33) | def data_sampler(dataset, shuffle, distributed):
function requires_grad (line 44) | def requires_grad(model, flag=True):
function accumulate (line 49) | def accumulate(model1, model2, decay=0.999):
function sample_data (line 57) | def sample_data(loader):
function d_logistic_loss (line 63) | def d_logistic_loss(real_pred, fake_pred):
function d_r1_loss (line 70) | def d_r1_loss(real_pred, real_img):
function g_nonsaturating_loss (line 79) | def g_nonsaturating_loss(fake_pred):
function g_path_regularize (line 85) | def g_path_regularize(fake_img, latents, mean_path_length, decay=0.01):
function make_noise (line 101) | def make_noise(batch, latent_dim, n_noise, device):
function mixing_noise (line 110) | def mixing_noise(batch, latent_dim, prob, device):
function set_grad_none (line 118) | def set_grad_none(model, targets):
function train (line 124) | def train(args, loader, generator, discriminator, g_optim, d_optim, g_em...
FILE: models/base_model.py
class BaseModel (line 22) | class BaseModel():
method __init__ (line 26) | def __init__(self, opt):
method init_training_settings (line 89) | def init_training_settings(self):
method feed_data (line 114) | def feed_data(self, data):
method optimize_parameters (line 121) | def optimize_parameters(self):
method get_current_log (line 174) | def get_current_log(self):
method update_learning_rate (line 177) | def update_learning_rate(self, epoch):
method save_network (line 214) | def save_network(self, net, save_path):
method load_network (line 225) | def load_network(self, pretrained_field):
method synthesize_image (line 231) | def synthesize_image(self, sample_latent_code):
method synthesize_and_predict (line 241) | def synthesize_and_predict(self, sample_latent_code):
method inference (line 251) | def inference(self, batch_idx, epoch, save_dir):
method continuous_editing (line 285) | def continuous_editing(self, latent_codes, save_dir, editing_logger):
method continuous_editing_with_target (line 423) | def continuous_editing_with_target(self,
FILE: models/field_function_model.py
class FieldFunctionModel (line 10) | class FieldFunctionModel(BaseModel):
method __init__ (line 12) | def __init__(self, opt):
method modify_latent_code (line 17) | def modify_latent_code(self, latent_code_w, latent_code_w_plus=None):
method modify_latent_code_bidirection (line 42) | def modify_latent_code_bidirection(self,
FILE: models/losses/arcface_loss.py
function conv3x3 (line 6) | def conv3x3(in_planes, out_planes, stride=1):
class BasicBlock (line 17) | class BasicBlock(nn.Module):
method __init__ (line 20) | def __init__(self, inplanes, planes, stride=1, downsample=None):
method forward (line 30) | def forward(self, x):
class IRBlock (line 49) | class IRBlock(nn.Module):
method __init__ (line 52) | def __init__(self,
method forward (line 71) | def forward(self, x):
class Bottleneck (line 92) | class Bottleneck(nn.Module):
method __init__ (line 95) | def __init__(self, inplanes, planes, stride=1, downsample=None):
method forward (line 114) | def forward(self, x):
class SEBlock (line 137) | class SEBlock(nn.Module):
method __init__ (line 139) | def __init__(self, channel, reduction=16):
method forward (line 146) | def forward(self, x):
class ResNetFace (line 153) | class ResNetFace(nn.Module):
method __init__ (line 155) | def __init__(self, block, layers, use_se=True):
method _make_layer (line 183) | def _make_layer(self, block, planes, blocks, stride=1):
method forward (line 205) | def forward(self, x):
function resnet_face18 (line 224) | def resnet_face18(use_se=True, **kwargs):
class ArcFaceLoss (line 229) | class ArcFaceLoss(nn.Module):
method __init__ (line 231) | def __init__(self, pretrained_model, loss_type, use_se=False):
method forward (line 250) | def forward(self, original_imgs, edited_imgs, resize=False):
FILE: models/losses/discriminator_loss.py
class DiscriminatorLoss (line 7) | class DiscriminatorLoss(nn.Module):
method __init__ (line 9) | def __init__(self, pretrained_model, img_res):
method forward (line 24) | def forward(self, generated_images):
FILE: models/utils.py
function postprocess (line 9) | def postprocess(images, channel_order='BGR', min_val=-1.0, max_val=1.0):
function transform_image (line 46) | def transform_image(image, resize=False):
function set_random_seed (line 65) | def set_random_seed(seed):
function output_to_label (line 74) | def output_to_label(output):
function predictor_to_label (line 101) | def predictor_to_label(predictor_output):
function save_image (line 119) | def save_image(img, save_path, need_post_process=True):
FILE: quantitative_results.py
function parse_args (line 21) | def parse_args():
function get_edited_images_list (line 51) | def get_edited_images_list(img_dir, img_idx):
function load_face_image (line 69) | def load_face_image(img_path):
function load_image_predictor (line 83) | def load_image_predictor(img_path,
function predictor_score (line 98) | def predictor_score(predictor_output, gt_label, target_attr_idx,
function compute_num_metrics (line 117) | def compute_num_metrics(image_dir, image_num, target_attr_idx, logger):
function main (line 185) | def main():
FILE: train.py
function main (line 19) | def main():
FILE: utils/crop_img.py
function crop_img (line 27) | def crop_img(img_size, input_img_path, cropped_output_path, device='cuda'):
function crop_img_128 (line 36) | def crop_img_128(input_img_path, cropped_output_path, device='cuda'):
function get_landmark (line 68) | def get_landmark(filepath):
function crop_img_1024 (line 98) | def crop_img_1024(input_img_path, cropped_output_path):
FILE: utils/dialog_edit_utils.py
function dialog_with_real_user (line 13) | def dialog_with_real_user(field_model,
function decide_next_state (line 165) | def decide_next_state(state, system_mode, user_mode):
function decide_next_edit (line 243) | def decide_next_edit(edit_log, system_labels, user_labels, state,
function decide_next_feedback (line 360) | def decide_next_feedback(system_labels, user_labels, state, edit_labels,
FILE: utils/editing_utils.py
function edit_target_attribute (line 1) | def edit_target_attribute(opt,
FILE: utils/inversion_utils.py
function noise_regularize (line 15) | def noise_regularize(noises):
function noise_normalize_ (line 37) | def noise_normalize_(noises):
function get_lr (line 45) | def get_lr(t, initial_lr, rampdown=0.25, rampup=0.05):
function latent_noise (line 53) | def latent_noise(latent, strength):
function make_image (line 59) | def make_image(tensor):
function inversion (line 64) | def inversion(opt, field_model):
FILE: utils/logger.py
class MessageLogger (line 6) | class MessageLogger():
method __init__ (line 19) | def __init__(self, opt, start_iter=1, tb_logger=None):
method __call__ (line 29) | def __call__(self, log_vars):
function init_tb_logger (line 74) | def init_tb_logger(log_dir):
function get_root_logger (line 80) | def get_root_logger(logger_name='base', log_level=logging.INFO, log_file...
FILE: utils/numerical_metrics.py
function parse_args (line 16) | def parse_args():
function get_edited_images_list (line 69) | def get_edited_images_list(img_dir, img_idx):
function load_image_predictor (line 87) | def load_image_predictor(img_path,
function load_image_arcface (line 106) | def load_image_arcface(img_path):
function cosin_metric (line 125) | def cosin_metric(x1, x2):
function predictor_score (line 129) | def predictor_score(predictor_output, gt_label, target_attr_idx,
function compute_num_metrics (line 148) | def compute_num_metrics(image_dir, image_num, pretrained_arcface, attr_f...
FILE: utils/options.py
function ordered_yaml (line 8) | def ordered_yaml():
function parse (line 33) | def parse(opt_path, is_train=True):
function dict2str (line 116) | def dict2str(opt, indent_level=1):
class NoneDict (line 137) | class NoneDict(dict):
method __missing__ (line 140) | def __missing__(self, key):
function dict_to_nonedict (line 144) | def dict_to_nonedict(opt):
function parse_args_from_opt (line 164) | def parse_args_from_opt(args, opt):
function parse_opt_wrt_resolution (line 178) | def parse_opt_wrt_resolution(opt):
FILE: utils/util.py
function make_exp_dirs (line 14) | def make_exp_dirs(opt):
function set_random_seed (line 25) | def set_random_seed(seed):
class ProgressBar (line 34) | class ProgressBar(object):
method __init__ (line 41) | def __init__(self, task_num=0, bar_width=50, start=True):
method _get_max_bar_width (line 50) | def _get_max_bar_width(self):
method start (line 60) | def start(self):
method update (line 69) | def update(self, msg='In progress...'):
Condensed preview — 120 files, each showing path, character count, and a content snippet (560K chars of structured content in total).
[
{
"path": ".gitignore",
"chars": 73,
"preview": "experiments/\nresults/\ntb_logger/\n*.pyc\n.vscode/\ndownload\ndownload/*\n*.sh\n"
},
{
"path": "README.md",
"chars": 10010,
"preview": "# Talk-to-Edit (ICCV2021)\n\n\n![pytorch 1.6."
},
{
"path": "configs/attributes_5.json",
"chars": 1350,
"preview": "\n{\n \"attr_info\":{\n \"6\": {\n \"name\": \"Bangs\",\n \"value\":[0, 1, 2, 3, 4, 5],\n \"id"
},
{
"path": "configs/editing/editing_with_dialog.yml",
"chars": 2781,
"preview": "name: dialog_editing\n\nimg_res: 1024 # 128\n\n# latent code\nlatent_code_path: ./download/editing_data/teaser_latent_code.np"
},
{
"path": "configs/editing/editing_wo_dialog.yml",
"chars": 2159,
"preview": "name: editing_wo_dialog\n\nimg_res: 1024 # 128\n\n# latent code\nlatent_code_path: ./download/editing_data/teaser_latent_code"
},
{
"path": "configs/train/field_1024_bangs.yml",
"chars": 1309,
"preview": "name: field_1024_bangs\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Bangs\n\nmodel_type: Field"
},
{
"path": "configs/train/field_1024_beard.yml",
"chars": 1318,
"preview": "name: field_1024_beard\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: No_Beard\n\nmodel_type: Fi"
},
{
"path": "configs/train/field_1024_eyeglasses.yml",
"chars": 1330,
"preview": "name: field_1024_eyeglasses\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Eyeglasses\n\nmodel_t"
},
{
"path": "configs/train/field_1024_smiling.yml",
"chars": 1316,
"preview": "name: field_1024_smiling\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Smiling\n\nmodel_type: F"
},
{
"path": "configs/train/field_1024_young.yml",
"chars": 1309,
"preview": "name: field_1024_young\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Young\n\nmodel_type: Field"
},
{
"path": "configs/train/field_128_bangs.yml",
"chars": 1281,
"preview": "name: field_128_bangs\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Bangs\n\nmodel_type: FieldF"
},
{
"path": "configs/train/field_128_beard.yml",
"chars": 1290,
"preview": "name: field_128_beard\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: No_Beard\n\nmodel_type: Fie"
},
{
"path": "configs/train/field_128_eyeglasses.yml",
"chars": 1301,
"preview": "name: field_128_eyeglasses\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Eyeglasses\n\nmodel_ty"
},
{
"path": "configs/train/field_128_smiling.yml",
"chars": 1273,
"preview": "name: field_128_smiling\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Smiling\n\nmodel_type: Fi"
},
{
"path": "configs/train/field_128_young.yml",
"chars": 1281,
"preview": "name: field_128_young\nuse_tb_logger: true\nset_CUDA_VISIBLE_DEVICES: ~\ngpu_ids: [3]\n\nattribute: Young\n\nmodel_type: FieldF"
},
{
"path": "data/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "data/latent_code_dataset.py",
"chars": 1555,
"preview": "\"\"\"\nDataset for field function\n\"\"\"\n\nimport os\nimport os.path\nimport random\n\nimport numpy as np\nimport torch\nimport torch"
},
{
"path": "editing_quantitative.py",
"chars": 1723,
"preview": "import argparse\nimport logging\nimport os\n\nimport numpy as np\n\nfrom models import create_model\nfrom utils.logger import g"
},
{
"path": "editing_with_dialog.py",
"chars": 3327,
"preview": "import argparse\nimport json\nimport logging\nimport os.path\n\nimport numpy as np\nimport torch\n\nfrom models import create_mo"
},
{
"path": "editing_wo_dialog.py",
"chars": 3972,
"preview": "import argparse\nimport logging\nimport os\n\nimport numpy as np\nimport torch\n\nfrom models import create_model\nfrom models.u"
},
{
"path": "environment.yml",
"chars": 4553,
"preview": "name: talk_edit\nchannels:\n - pytorch\n - conda-forge\n - anaconda\n - defaults\ndependencies:\n - _libgcc_mutex=0.1=main"
},
{
"path": "language/accuracy.py",
"chars": 1288,
"preview": "import torch\n\n\ndef head_accuracy(output, target, unlabeled_value=999):\n \"\"\"\n Computes the precision@k for the spec"
},
{
"path": "language/build_vocab.py",
"chars": 1675,
"preview": "import argparse\nimport json\nimport os\nimport sys\n\nsys.path.append('.')\nfrom language_utils import * # noqa\n\"\"\"\nBuild vo"
},
{
"path": "language/dataset.py",
"chars": 1174,
"preview": "import os.path\n\nimport numpy as np\nfrom torch.utils.data import Dataset\n\n\nclass EncoderDataset(Dataset):\n\n def __init"
},
{
"path": "language/generate_feedback.py",
"chars": 6686,
"preview": "import argparse\nimport json\nimport os.path\nimport random\n\nimport numpy as np\n\nfrom .language_utils import proper_capital"
},
{
"path": "language/generate_training_request.py",
"chars": 7416,
"preview": "import argparse\nimport json\nimport os.path\nimport random\nimport sys\n\nsys.path.append('.')\nfrom language_utils import pro"
},
{
"path": "language/language_utils.py",
"chars": 4075,
"preview": "import numpy as np\nimport torch\n\n# global variables\nPUNCTUATION_TO_KEEP = ['?', ';']\nPUNCTUATION_TO_REMOVE = ['.', '!', "
},
{
"path": "language/lstm.py",
"chars": 7013,
"preview": "\"\"\"\nLSTM\n\nInput: batch_size x max_text_length (tokenized questions)\nOutput: batch_size x lstm_hidden_size (question embe"
},
{
"path": "language/preprocess_request.py",
"chars": 4452,
"preview": "import argparse\nimport json\nimport os\nimport sys\n\nimport numpy as np\n\nsys.path.append('.')\nfrom language_utils import * "
},
{
"path": "language/run_encoder.py",
"chars": 7815,
"preview": "import argparse\nimport json\nimport random\n\nimport torch\n\nfrom .language_utils import * # noqa\nfrom .lstm import Encoder"
},
{
"path": "language/templates/attr_wise_caption_templates.json",
"chars": 5402,
"preview": "{\n \"Bangs\": {\n \"0\": [\n \"<He> has no bangs at all.\",\n \"<He> has no bangs at all and <his>"
},
{
"path": "language/templates/feedback.json",
"chars": 3588,
"preview": "{\n \"replacement\": {\n \"<ATTR_NAME>\": {\n \"Bangs\": \"bangs\",\n \"Eyeglasses\": \"glasses\",\n "
},
{
"path": "language/templates/gender.json",
"chars": 899,
"preview": "{\n \"male\": {\n \"<man>\": [\n \"person\",\n \"guy\",\n \"gentleman\"\n ],\n \""
},
{
"path": "language/templates/metadata_fsm.json",
"chars": 998,
"preview": "{\n \"start\": {\n \"start_pureRequest\": 0\n },\n \"suggestion\": {\n \"yes\": 0,\n \"yes_pureRequest\": "
},
{
"path": "language/templates/overall_caption_templates.json",
"chars": 6613,
"preview": "{\n \"attr_order_mapping\": {\n \"Bangs\": {\n \"0\": [\n \"has\",\n \"sentence\"\n "
},
{
"path": "language/templates/pool.json",
"chars": 6147,
"preview": "{\n \"synonyms\": {\n \" can \": [\n \" can \",\n \" could \",\n \" should \"\n ],\n "
},
{
"path": "language/templates/system_mode.json",
"chars": 91,
"preview": "\n{\n \"start\": 0,\n \"suggestion\": 1,\n \"whether_enough\": 2,\n \"whats_next\": 3\n} \n"
},
{
"path": "language/templates/user_fsm.json",
"chars": 27122,
"preview": "{\n \"start\": [\n [\n \"Hi.\",\n \"Hello.\"\n ]\n ],\n \"pureRequest\": {\n \"Bangs\""
},
{
"path": "language/templates/user_old_templates.json",
"chars": 29001,
"preview": "{\n \"start\": [\n [\n \"Hi.\",\n \"Hello.\"\n ],\n [\n \" \"\n ]\n ],"
},
{
"path": "language/templates/vocab.json",
"chars": 4998,
"preview": "{\n \"text_token_to_idx\": {\n \"<NULL>\": 0,\n \"<START>\": 1,\n \"<END>\": 2,\n \"<UNK>\": 3,\n "
},
{
"path": "language/train_encoder.py",
"chars": 8683,
"preview": "import argparse\nimport json\nimport sys\nimport time\n\nimport torch\nimport torch.nn as nn\nimport torch.utils.data\n\nsys.path"
},
{
"path": "language/utils/__init__.py",
"chars": 358,
"preview": "\"\"\"Useful utils\n\"\"\"\n# progress bar\nimport os\nimport sys\n\nfrom .eval import * # noqa\nfrom .logger import * # noqa\nfrom "
},
{
"path": "language/utils/eval.py",
"chars": 5720,
"preview": "from __future__ import absolute_import, print_function\n\nimport torch\n\n__all__ = ['classification_accuracy', 'regression_"
},
{
"path": "language/utils/logger.py",
"chars": 9143,
"preview": "from __future__ import absolute_import\n\nimport datetime\nimport logging\nimport time\n\nimport matplotlib.pyplot as plt\nimpo"
},
{
"path": "language/utils/lr_schedule.py",
"chars": 1085,
"preview": "import math\n\n__all__ = ['adjust_learning_rate']\n\n\ndef adjust_learning_rate(args, optimizer, epoch):\n lr = optimizer.p"
},
{
"path": "language/utils/misc.py",
"chars": 3411,
"preview": "'''Some helper functions for PyTorch, including:\n - get_mean_and_std: calculate the mean and std value of dataset.\n "
},
{
"path": "language/utils/numerical.py",
"chars": 3954,
"preview": "import json\n\nimport numpy as np\n\n__all__ = ['get_weight', 'transpose_and_format']\n\n\ndef get_weight(args):\n \"\"\"\n re"
},
{
"path": "language/utils/progress/.gitignore",
"chars": 0,
"preview": ""
},
{
"path": "language/utils/progress/LICENSE",
"chars": 776,
"preview": "# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify, and distribute this soft"
},
{
"path": "language/utils/progress/MANIFEST.in",
"chars": 27,
"preview": "include README.rst LICENSE\n"
},
{
"path": "language/utils/progress/README.rst",
"chars": 2944,
"preview": "Easy progress reporting for Python\n==================================\n\n|pypi|\n\n|demo|\n\n.. |pypi| image:: https://img.shi"
},
{
"path": "language/utils/progress/progress/__init__.py",
"chars": 3188,
"preview": "# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify, and distribute this soft"
},
{
"path": "language/utils/progress/progress/bar.py",
"chars": 2784,
"preview": "# -*- coding: utf-8 -*-\n\n# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify,"
},
{
"path": "language/utils/progress/progress/counter.py",
"chars": 1502,
"preview": "# -*- coding: utf-8 -*-\n\n# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify,"
},
{
"path": "language/utils/progress/progress/helpers.py",
"chars": 2854,
"preview": "# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify, and distribute this soft"
},
{
"path": "language/utils/progress/progress/spinner.py",
"chars": 1395,
"preview": "# -*- coding: utf-8 -*-\n\n# Copyright (c) 2012 Giorgos Verigakis <verigak@gmail.com>\n#\n# Permission to use, copy, modify,"
},
{
"path": "language/utils/progress/setup.py",
"chars": 843,
"preview": "#!/usr/bin/env python\n\nfrom setuptools import setup\n\nimport progress\n\n\nsetup(\n name='progress',\n version=progress."
},
{
"path": "language/utils/progress/test_progress.py",
"chars": 1461,
"preview": "#!/usr/bin/env python\n\nfrom __future__ import print_function\n\nimport random\nimport time\n\nfrom progress.bar import (Bar, "
},
{
"path": "language/utils/setup_logger.py",
"chars": 2354,
"preview": "# python3.7\n\"\"\"Utility functions for logging.\"\"\"\n\nimport logging\nimport os\nimport sys\n\n__all__ = ['setup_logger']\n\n\ndef "
},
{
"path": "language/utils/visualize.py",
"chars": 3795,
"preview": "import matplotlib.pyplot as plt\nimport torch\nimport torch.nn as nn\nimport torchvision\nimport torchvision.transforms as t"
},
{
"path": "models/__init__.py",
"chars": 1110,
"preview": "import glob\nimport importlib\nimport logging\nimport os.path as osp\n\n# automatically scan and import model modules\n# scan "
},
{
"path": "models/archs/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "models/archs/attribute_predictor_arch.py",
"chars": 7766,
"preview": "import json\n\nimport torch.nn as nn\nimport torch.utils.model_zoo as model_zoo\n\n__all__ = ['ResNet', 'resnet50']\n\nmodel_ur"
},
{
"path": "models/archs/field_function_arch.py",
"chars": 2220,
"preview": "import torch\nimport torch.nn as nn\n\n\nclass FieldFunction(nn.Module):\n\n def __init__(\n self,\n num_layer="
},
{
"path": "models/archs/stylegan2/.gitignore",
"chars": 1821,
"preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
},
{
"path": "models/archs/stylegan2/LICENSE",
"chars": 1071,
"preview": "MIT License\n\nCopyright (c) 2019 Kim Seonghyeon\n\nPermission is hereby granted, free of charge, to any person obtaining a "
},
{
"path": "models/archs/stylegan2/LICENSE-FID",
"chars": 11357,
"preview": " Apache License\n Version 2.0, January 2004\n "
},
{
"path": "models/archs/stylegan2/LICENSE-LPIPS",
"chars": 1381,
"preview": "Copyright (c) 2018, Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, Oliver Wang\r\nAll rights reserved.\r\n\r\nR"
},
{
"path": "models/archs/stylegan2/LICENSE-NVIDIA",
"chars": 4767,
"preview": "Copyright (c) 2019, NVIDIA Corporation. All rights reserved.\r\n\r\n\r\nNvidia Source Code License-NC\r\n\r\n====================="
},
{
"path": "models/archs/stylegan2/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "models/archs/stylegan2/apply_factor.py",
"chars": 2652,
"preview": "import argparse\n\nimport torch\nfrom torchvision import utils\n\nfrom model import Generator\n\n\nif __name__ == \"__main__\":\n "
},
{
"path": "models/archs/stylegan2/calc_inception.py",
"chars": 4006,
"preview": "import argparse\nimport pickle\nimport os\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\nfrom tor"
},
{
"path": "models/archs/stylegan2/checkpoint/.gitignore",
"chars": 5,
"preview": "*.pt\n"
},
{
"path": "models/archs/stylegan2/convert_weight.py",
"chars": 8886,
"preview": "import argparse\nimport math\nimport os\nimport pickle\nimport sys\n\nimport numpy as np\nimport torch\nfrom torchvision import "
},
{
"path": "models/archs/stylegan2/dataset.py",
"chars": 1049,
"preview": "from io import BytesIO\n\nimport lmdb\nfrom PIL import Image\nfrom torch.utils.data import Dataset\n\n\nclass MultiResolutionDa"
},
{
"path": "models/archs/stylegan2/distributed.py",
"chars": 2715,
"preview": "import math\nimport pickle\n\nimport torch\nfrom torch import distributed as dist\nfrom torch.utils.data.sampler import Sampl"
},
{
"path": "models/archs/stylegan2/fid.py",
"chars": 3640,
"preview": "import argparse\nimport pickle\n\nimport torch\nfrom torch import nn\nimport numpy as np\nfrom scipy import linalg\nfrom tqdm i"
},
{
"path": "models/archs/stylegan2/generate.py",
"chars": 3276,
"preview": "import argparse\nimport os\nimport sys\n\nimport numpy as np\nimport torch\nfrom torchvision import utils\nfrom tqdm import tqd"
},
{
"path": "models/archs/stylegan2/inception.py",
"chars": 11623,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom torchvision import models\n\ntry:\n from torchvi"
},
{
"path": "models/archs/stylegan2/inversion.py",
"chars": 11376,
"preview": "import argparse\nimport math\nimport os\n\nimport numpy as np\nimport torch\nfrom PIL import Image\nfrom torch import optim\nfro"
},
{
"path": "models/archs/stylegan2/lpips/__init__.py",
"chars": 5760,
"preview": "from __future__ import absolute_import, division, print_function\n\nimport numpy as np\nimport torch\nfrom models.archs.styl"
},
{
"path": "models/archs/stylegan2/lpips/base_model.py",
"chars": 1566,
"preview": "import os\n\nimport numpy as np\nimport torch\n\n\nclass BaseModel():\n\n def __init__(self):\n pass\n\n def name(self"
},
{
"path": "models/archs/stylegan2/lpips/dist_model.py",
"chars": 12392,
"preview": "from __future__ import absolute_import\n\nimport os\nfrom collections import OrderedDict\n\nimport models.archs.stylegan2.lpi"
},
{
"path": "models/archs/stylegan2/lpips/networks_basic.py",
"chars": 8485,
"preview": "from __future__ import absolute_import\n\nimport models.archs.stylegan2.lpips as util\nimport torch\nimport torch.nn as nn\nf"
},
{
"path": "models/archs/stylegan2/lpips/pretrained_networks.py",
"chars": 6702,
"preview": "from collections import namedtuple\n\nimport torch\nfrom torchvision import models as tv\n\n\nclass squeezenet(torch.nn.Module"
},
{
"path": "models/archs/stylegan2/model.py",
"chars": 19073,
"preview": "import functools\nimport math\nimport operator\nimport random\nimport sys\n\nimport torch\nfrom models.archs.stylegan2.op impor"
},
{
"path": "models/archs/stylegan2/non_leaking.py",
"chars": 11696,
"preview": "import math\r\n\r\nimport torch\r\nfrom torch.nn import functional as F\r\n\r\nfrom distributed import reduce_sum\r\nfrom op import "
},
{
"path": "models/archs/stylegan2/op/__init__.py",
"chars": 89,
"preview": "from .fused_act import FusedLeakyReLU, fused_leaky_relu\nfrom .upfirdn2d import upfirdn2d\n"
},
{
"path": "models/archs/stylegan2/op/fused_act.py",
"chars": 3262,
"preview": "import os\r\n\r\nimport torch\r\nfrom torch import nn\r\nfrom torch.nn import functional as F\r\nfrom torch.autograd import Functi"
},
{
"path": "models/archs/stylegan2/op/fused_bias_act.cpp",
"chars": 846,
"preview": "#include <torch/extension.h>\r\n\r\n\r\ntorch::Tensor fused_bias_act_op(const torch::Tensor& input, const torch::Tensor& bias,"
},
{
"path": "models/archs/stylegan2/op/fused_bias_act_kernel.cu",
"chars": 2875,
"preview": "// Copyright (c) 2019, NVIDIA Corporation. All rights reserved.\r\n//\r\n// This work is made available under the Nvidia Sou"
},
{
"path": "models/archs/stylegan2/op/upfirdn2d.cpp",
"chars": 988,
"preview": "#include <torch/extension.h>\r\n\r\n\r\ntorch::Tensor upfirdn2d_op(const torch::Tensor& input, const torch::Tensor& kernel,\r\n "
},
{
"path": "models/archs/stylegan2/op/upfirdn2d.py",
"chars": 5872,
"preview": "import os\r\n\r\nimport torch\r\nfrom torch.nn import functional as F\r\nfrom torch.autograd import Function\r\nfrom torch.utils.c"
},
{
"path": "models/archs/stylegan2/op/upfirdn2d_kernel.cu",
"chars": 12079,
"preview": "// Copyright (c) 2019, NVIDIA Corporation. All rights reserved.\r\n//\r\n// This work is made available under the Nvidia Sou"
},
{
"path": "models/archs/stylegan2/ppl.py",
"chars": 3852,
"preview": "import argparse\r\n\r\nimport torch\r\nfrom torch.nn import functional as F\r\nimport numpy as np\r\nfrom tqdm import tqdm\r\n\r\nimpo"
},
{
"path": "models/archs/stylegan2/sample/.gitignore",
"chars": 6,
"preview": "*.png\n"
},
{
"path": "models/archs/stylegan2/train.py",
"chars": 15875,
"preview": "import argparse\r\nimport math\r\nimport random\r\nimport os\r\n\r\nimport numpy as np\r\nimport torch\r\nfrom torch import nn, autogr"
},
{
"path": "models/base_model.py",
"chars": 30011,
"preview": "import logging\nimport math\nfrom collections import OrderedDict\n\nimport cv2\nimport matplotlib.image as mpimg\nimport matpl"
},
{
"path": "models/field_function_model.py",
"chars": 2338,
"preview": "import logging\n\nimport torch\n\nfrom models.base_model import BaseModel\n\nlogger = logging.getLogger('base')\n\n\nclass FieldF"
},
{
"path": "models/losses/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "models/losses/arcface_loss.py",
"chars": 8543,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\ndef conv3x3(in_planes, out_planes, stride=1):\n \""
},
{
"path": "models/losses/discriminator_loss.py",
"chars": 1004,
"preview": "import torch\nimport torch.nn as nn\nfrom models.archs.stylegan2.model import Discriminator\nfrom torch.nn import functiona"
},
{
"path": "models/utils.py",
"chars": 3925,
"preview": "import random\n\nimport cv2\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\n\n\ndef postprocess(images, chan"
},
{
"path": "quantitative_results.py",
"chars": 7001,
"preview": "import argparse\nimport glob\nimport logging\n\nimport cv2\nimport numpy as np\nimport torch\nimport torch.nn as nn\nimport torc"
},
{
"path": "train.py",
"chars": 6482,
"preview": "import argparse\nimport logging\nimport os\nimport os.path as osp\nimport random\nimport time\n\nimport numpy as np\nimport torc"
},
{
"path": "utils/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "utils/crop_img.py",
"chars": 6508,
"preview": "\"\"\"\nbrief: face alignment with FFHQ method (https://github.com/NVlabs/ffhq-dataset)\nauthor: lzhbrian (https://lzhbrian.m"
},
{
"path": "utils/dialog_edit_utils.py",
"chars": 15586,
"preview": "import random\n\nimport matplotlib.image as mpimg\nimport matplotlib.pyplot as plt\nimport torch\nfrom language.generate_feed"
},
{
"path": "utils/editing_utils.py",
"chars": 1854,
"preview": "def edit_target_attribute(opt,\n attribute_dict,\n edit_labels,\n "
},
{
"path": "utils/inversion_utils.py",
"chars": 4836,
"preview": "import math\n\nimport models.archs.stylegan2.lpips as lpips\nimport numpy as np\nimport torch\nfrom PIL import Image\nfrom tor"
},
{
"path": "utils/logger.py",
"chars": 3998,
"preview": "import datetime\nimport logging\nimport time\n\n\nclass MessageLogger():\n \"\"\"Message logger for printing.\n\n Args:\n "
},
{
"path": "utils/numerical_metrics.py",
"chars": 7791,
"preview": "import argparse\nimport glob\n\nimport cv2\nimport numpy as np\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional"
},
{
"path": "utils/options.py",
"chars": 6144,
"preview": "import os\nimport os.path as osp\nfrom collections import OrderedDict\n\nimport yaml\n\n\ndef ordered_yaml():\n \"\"\"Support Or"
},
{
"path": "utils/util.py",
"chars": 3057,
"preview": "import logging\nimport os\nimport random\nimport sys\nimport time\nfrom shutil import get_terminal_size\n\nimport numpy as np\ni"
}
]
// ... and 6 more files (download for full content)
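The index above is plain JSON: each entry records a `path`, a `chars` character count, and a truncated `preview`. That makes it easy to post-process, for example to see which directories dominate the repository. A minimal sketch (the three sample entries are copied from the index; in practice you would load the full array from the downloaded file):

```python
import json
from collections import defaultdict

# Each manifest entry has "path", "chars" (file size in characters),
# and a truncated "preview" string. Sample entries from the index above.
manifest = json.loads("""
[
  {"path": "data/__init__.py", "chars": 0, "preview": ""},
  {"path": "data/latent_code_dataset.py", "chars": 1555, "preview": "..."},
  {"path": "train.py", "chars": 6482, "preview": "..."}
]
""")

# Aggregate character counts by top-level path component
# (a directory such as "data/", or a root-level file like "train.py").
sizes = defaultdict(int)
for entry in manifest:
    top = entry["path"].split("/")[0]
    sizes[top] += entry["chars"]

for top, total in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{top}: {total} chars")
```

The same loop generalizes to filtering by extension or searching previews before deciding which files to pull from the full download.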
About this extraction
This page contains the full source code of the yumingj/Talk-to-Edit GitHub repository, extracted and formatted as plain text: 120 files (516.1 KB, approximately 130.3k tokens), plus a symbol index of 515 extracted functions, classes, methods, constants, and types. The output can be copied to the clipboard or downloaded as a .txt file for use with any AI tool that accepts text input.
Extracted by GitExtract, a GitHub-repo-to-text converter built by Nikandr Surkov.