Repository: carpedm20/DCGAN-tensorflow Branch: master Commit: 62c9a2a7f745 Files: 16 Total size: 32.4 MB Directory structure: gitextract_xriuzdbv/ ├── .gitignore ├── LICENSE ├── README.md ├── download.py ├── main.py ├── model.py ├── ops.py ├── utils.py └── web/ ├── app.py ├── css/ │ ├── fakeLoader.css │ └── main.css ├── fonts/ │ └── FontAwesome.otf ├── index.html └── js/ ├── app.js ├── convnet.js └── layers.js ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # Data data samples *.zip logs test* web/js/gen_layers.js # checkpoint checkpoint # trash .dropbox .DS_Store # Created by https://www.gitignore.io/api/python,vim ### Python ### # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python env/ build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ *.egg-info/ .installed.cfg *.egg # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

### Vim ###
[._]*.s[a-w][a-z]
[._]s[a-w][a-z]
*.un~
Session.vim
.netrwhist
*~

================================================
FILE: LICENSE
================================================
The MIT License (MIT)

Copyright (c) 2016 Taehoon Kim

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: README.md
================================================
# DCGAN in Tensorflow

Tensorflow implementation of [Deep Convolutional Generative Adversarial Networks](http://arxiv.org/abs/1511.06434), which stabilizes the training of Generative Adversarial Networks. The referenced torch code can be found [here](https://github.com/soumith/dcgan.torch).
![alt tag](DCGAN.png)

* [Brandon Amos](http://bamos.github.io/) wrote an excellent [blog post](http://bamos.github.io/2016/08/09/deep-completion/) and [image completion code](https://github.com/bamos/dcgan-completion.tensorflow) based on this repo.
* *To keep the D (discriminator) network from converging too quickly, the G (generator) network is updated twice for each D network update, which differs from the original paper.*

## Online Demo

[link](http://carpedm20.github.io/faces/)

## Prerequisites

- Python 2.7 or Python 3.3+
- [Tensorflow 0.12.1](https://github.com/tensorflow/tensorflow/tree/r0.12)
- [SciPy](http://www.scipy.org/install.html)
- [pillow](https://github.com/python-pillow/Pillow)
- [tqdm](https://pypi.org/project/tqdm/)
- (Optional) [moviepy](https://github.com/Zulko/moviepy) (for visualization)
- (Optional) [Align&Cropped Images.zip](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) : Large-scale CelebFaces Dataset

## Usage

First, download the datasets with:

    $ python download.py mnist celebA

To train a model with a downloaded dataset:

    $ python main.py --dataset mnist --input_height=28 --output_height=28 --train
    $ python main.py --dataset celebA --input_height=108 --train --crop

To test with an existing model:

    $ python main.py --dataset mnist --input_height=28 --output_height=28
    $ python main.py --dataset celebA --input_height=108 --crop

Or, you can use your own dataset (without central crop) by:

    $ mkdir data/DATASET_NAME
    ... add images to data/DATASET_NAME ...
    $ python main.py --dataset DATASET_NAME --train
    $ python main.py --dataset DATASET_NAME
    $ # example
    $ python main.py --dataset=eyes --input_fname_pattern="*_cropped.png" --train

If your dataset is located in a different root directory:

    $ python main.py --dataset DATASET_NAME --data_dir DATASET_ROOT_DIR --train
    $ python main.py --dataset DATASET_NAME --data_dir DATASET_ROOT_DIR
    $ # example
    $ python main.py --dataset=eyes --data_dir ../datasets/ --input_fname_pattern="*_cropped.png" --train

## Results

![result](assets/training.gif)

### celebA

After 6th epoch:

![result3](assets/result_16_01_04_.png)

After 10th epoch:

![result4](assets/test_2016-01-27%2015:08:54.png)

### Asian face dataset

![custom_result1](web/img/change5.png)
![custom_result1](web/img/change2.png)
![custom_result2](web/img/change4.png)

### MNIST

The MNIST code was written by [@PhoenixDai](https://github.com/PhoenixDai).
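
The MNIST path is label-conditioned: `main.py` builds the model with `y_dim=10`, and `load_mnist` in `model.py` turns each digit label into a one-hot vector. A minimal NumPy sketch of that encoding is shown here; the helper name `one_hot` is illustrative and does not exist in this repo.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float array with a single 1.0 per row.

    `one_hot` is a hypothetical helper that mirrors the y_vec construction in
    load_mnist; it is not a function defined in this repository.
    """
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set column `label` to 1.0 in each row.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


y = one_hot([3, 0, 9])
print(y.shape)  # (3, 10)
```

Each row of this array is what the `y` placeholder receives for one image when `--dataset mnist` is used.
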
![mnist_result1](assets/mnist1.png)
![mnist_result2](assets/mnist2.png)
![mnist_result3](assets/mnist3.png)

More results can be found [here](./assets/) and [here](./web/img/).

## Training details

Discriminator and generator losses (on a custom dataset, not celebA).

![d_loss](assets/d_loss.png)
![g_loss](assets/g_loss.png)

Histograms of the discriminator's outputs on real and fake inputs (on a custom dataset, not celebA).

![d_hist](assets/d_hist.png)
![d__hist](assets/d__hist.png)

## Related works

- [BEGAN-tensorflow](https://github.com/carpedm20/BEGAN-tensorflow)
- [DiscoGAN-pytorch](https://github.com/carpedm20/DiscoGAN-pytorch)
- [simulated-unsupervised-tensorflow](https://github.com/carpedm20/simulated-unsupervised-tensorflow)

## Author

Taehoon Kim / [@carpedm20](http://carpedm20.github.io/)

================================================
FILE: download.py
================================================
"""
Modification of https://github.com/stanfordnlp/treelstm/blob/master/scripts/download.py

Downloads the following:
- Celeb-A dataset
- LSUN dataset
- MNIST dataset
"""

from __future__ import print_function
import os
import sys
import gzip
import json
import shutil
import zipfile
import argparse
import requests
import subprocess
from tqdm import tqdm
from six.moves import urllib

parser = argparse.ArgumentParser(description='Download dataset for DCGAN.')
parser.add_argument('datasets', metavar='N', type=str, nargs='+', choices=['celebA', 'lsun', 'mnist'],
           help='name of dataset to download [celebA, lsun, mnist]')

def download(url, dirpath):
  # Stream `url` into `dirpath` while printing a text progress bar.
  filename = url.split('/')[-1]
  filepath = os.path.join(dirpath, filename)
  u = urllib.request.urlopen(url)
  f = open(filepath, 'wb')
  filesize = int(u.headers["Content-Length"])
  print("Downloading: %s Bytes: %s" % (filename, filesize))

  downloaded = 0
  block_sz = 8192
  status_width = 70
  while True:
    buf = u.read(block_sz)
    if not buf:
      print('')
      break
    else:
      print('', end='\r')
    downloaded += len(buf)
    f.write(buf)
    status = (("[%-" + str(status_width + 1) + "s] %3.2f%%") %
      ('=' * int(float(downloaded) / filesize * status_width) + '>', downloaded * 100. / filesize))
    print(status, end='')
    sys.stdout.flush()
  f.close()
  return filepath

def download_file_from_google_drive(id, destination):
  URL = "https://docs.google.com/uc?export=download"
  session = requests.Session()

  response = session.get(URL, params={ 'id': id }, stream=True)
  token = get_confirm_token(response)

  # Large files need a second request carrying the confirmation token.
  if token:
    params = { 'id' : id, 'confirm' : token }
    response = session.get(URL, params=params, stream=True)

  save_response_content(response, destination)

def get_confirm_token(response):
  for key, value in response.cookies.items():
    if key.startswith('download_warning'):
      return value
  return None

def save_response_content(response, destination, chunk_size=32*1024):
  total_size = int(response.headers.get('content-length', 0))
  with open(destination, "wb") as f:
    for chunk in tqdm(response.iter_content(chunk_size), total=total_size,
              unit='B', unit_scale=True, desc=destination):
      if chunk: # filter out keep-alive new chunks
        f.write(chunk)

def unzip(filepath):
  print("Extracting: " + filepath)
  dirpath = os.path.dirname(filepath)
  with zipfile.ZipFile(filepath) as zf:
    zf.extractall(dirpath)
  os.remove(filepath)

def download_celeb_a(dirpath):
  data_dir = 'celebA'
  if os.path.exists(os.path.join(dirpath, data_dir)):
    print('Found Celeb-A - skip')
    return

  filename, drive_id = "img_align_celeba.zip", "0B7EVK8r0v71pZjFTYXZWM3FlRnM"
  save_path = os.path.join(dirpath, filename)

  if os.path.exists(save_path):
    print('[*] {} already exists'.format(save_path))
  else:
    download_file_from_google_drive(drive_id, save_path)

  # Extract, then rename the archive's top-level folder to `celebA`.
  zip_dir = ''
  with zipfile.ZipFile(save_path) as zf:
    zip_dir = zf.namelist()[0]
    zf.extractall(dirpath)
  os.remove(save_path)
  os.rename(os.path.join(dirpath, zip_dir), os.path.join(dirpath, data_dir))

def _list_categories(tag):
  url = 'http://lsun.cs.princeton.edu/htbin/list.cgi?tag=' + tag
  f = urllib.request.urlopen(url)
  return json.loads(f.read())

def
_download_lsun(out_dir, category, set_name, tag):
  url = 'http://lsun.cs.princeton.edu/htbin/download.cgi?tag={tag}' \
      '&category={category}&set={set_name}'.format(**locals())
  print(url)
  if set_name == 'test':
    out_name = 'test_lmdb.zip'
  else:
    out_name = '{category}_{set_name}_lmdb.zip'.format(**locals())
  out_path = os.path.join(out_dir, out_name)
  cmd = ['curl', url, '-o', out_path]
  print('Downloading', category, set_name, 'set')
  subprocess.call(cmd)

def download_lsun(dirpath):
  data_dir = os.path.join(dirpath, 'lsun')
  if os.path.exists(data_dir):
    print('Found LSUN - skip')
    return
  else:
    os.mkdir(data_dir)

  tag = 'latest'
  #categories = _list_categories(tag)
  categories = ['bedroom']

  for category in categories:
    _download_lsun(data_dir, category, 'train', tag)
    _download_lsun(data_dir, category, 'val', tag)
  # The test set is not split by category.
  _download_lsun(data_dir, '', 'test', tag)

def download_mnist(dirpath):
  data_dir = os.path.join(dirpath, 'mnist')
  if os.path.exists(data_dir):
    print('Found MNIST - skip')
    return
  else:
    os.mkdir(data_dir)
  url_base = 'http://yann.lecun.com/exdb/mnist/'
  file_names = ['train-images-idx3-ubyte.gz',
                'train-labels-idx1-ubyte.gz',
                't10k-images-idx3-ubyte.gz',
                't10k-labels-idx1-ubyte.gz']
  for file_name in file_names:
    # The URL has no placeholders, so the original .format(**locals()) call was a no-op.
    url = url_base + file_name
    print(url)
    out_path = os.path.join(data_dir, file_name)
    cmd = ['curl', url, '-o', out_path]
    print('Downloading ', file_name)
    subprocess.call(cmd)
    cmd = ['gzip', '-d', out_path]
    print('Decompressing ', file_name)
    subprocess.call(cmd)

def prepare_data_dir(path = './data'):
  if not os.path.exists(path):
    os.mkdir(path)

if __name__ == '__main__':
  args = parser.parse_args()
  prepare_data_dir()

  # argparse `choices` only admits the spelling 'celebA', so one check is enough.
  if 'celebA' in args.datasets:
    download_celeb_a('./data')
  if 'lsun' in args.datasets:
    download_lsun('./data')
  if 'mnist' in args.datasets:
    download_mnist('./data')

================================================
FILE: main.py
================================================
import os
import
scipy.misc
import numpy as np
import json

from model import DCGAN
from utils import pp, visualize, to_json, show_all_variables, expand_path, timestamp

import tensorflow as tf

flags = tf.app.flags
flags.DEFINE_integer("epoch", 25, "Epoch to train [25]")
flags.DEFINE_float("learning_rate", 0.0002, "Learning rate for adam [0.0002]")
flags.DEFINE_float("beta1", 0.5, "Momentum term of adam [0.5]")
flags.DEFINE_float("train_size", np.inf, "The size of train images [np.inf]")
flags.DEFINE_integer("batch_size", 64, "The size of batch images [64]")
flags.DEFINE_integer("input_height", 108, "The size of image to use (will be center cropped). [108]")
flags.DEFINE_integer("input_width", None, "The size of image to use (will be center cropped). If None, same value as input_height [None]")
flags.DEFINE_integer("output_height", 64, "The size of the output images to produce [64]")
flags.DEFINE_integer("output_width", None, "The size of the output images to produce. If None, same value as output_height [None]")
flags.DEFINE_string("dataset", "celebA", "The name of dataset [celebA, mnist, lsun]")
flags.DEFINE_string("input_fname_pattern", "*.jpg", "Glob pattern of filename of input images [*.jpg]")
flags.DEFINE_string("data_dir", "./data", "path to datasets [e.g. $HOME/data]")
flags.DEFINE_string("out_dir", "./out", "Root directory for outputs [e.g. $HOME/out]")
flags.DEFINE_string("out_name", "", "Folder (under out_dir) for all outputs. Generated automatically if left blank []")
flags.DEFINE_string("checkpoint_dir", "checkpoint", "Folder (under out_dir/out_name) to save checkpoints [checkpoint]")
flags.DEFINE_string("sample_dir", "samples", "Folder (under out_dir/out_name) to save samples [samples]")
flags.DEFINE_boolean("train", False, "True for training, False for testing [False]")
flags.DEFINE_boolean("crop", False, "True to center-crop input images, False to use them uncropped [False]")
flags.DEFINE_boolean("visualize", False, "True for visualizing, False for nothing [False]")
flags.DEFINE_boolean("export", False, "True for exporting with new batch size")
flags.DEFINE_boolean("freeze", False, "True for exporting a frozen graph (.pb) [False]")
flags.DEFINE_integer("max_to_keep", 1, "maximum number of checkpoints to keep")
flags.DEFINE_integer("sample_freq", 200, "sample every this many iterations")
flags.DEFINE_integer("ckpt_freq", 200, "save checkpoint every this many iterations")
flags.DEFINE_integer("z_dim", 100, "dimensions of z")
flags.DEFINE_string("z_dist", "uniform_signed", "'normal01' or 'uniform_unsigned' or 'uniform_signed'")
flags.DEFINE_boolean("G_img_sum", False, "Save generator image summaries in log")
#flags.DEFINE_integer("generate_test_images", 100, "Number of images to generate during test. 
[100]") FLAGS = flags.FLAGS def main(_): pp.pprint(flags.FLAGS.__flags) # expand user name and environment variables FLAGS.data_dir = expand_path(FLAGS.data_dir) FLAGS.out_dir = expand_path(FLAGS.out_dir) FLAGS.out_name = expand_path(FLAGS.out_name) FLAGS.checkpoint_dir = expand_path(FLAGS.checkpoint_dir) FLAGS.sample_dir = expand_path(FLAGS.sample_dir) if FLAGS.output_height is None: FLAGS.output_height = FLAGS.input_height if FLAGS.input_width is None: FLAGS.input_width = FLAGS.input_height if FLAGS.output_width is None: FLAGS.output_width = FLAGS.output_height # output folders if FLAGS.out_name == "": FLAGS.out_name = '{} - {} - {}'.format(timestamp(), FLAGS.data_dir.split('/')[-1], FLAGS.dataset) # penultimate folder of path if FLAGS.train: FLAGS.out_name += ' - x{}.z{}.{}.y{}.b{}'.format(FLAGS.input_width, FLAGS.z_dim, FLAGS.z_dist, FLAGS.output_width, FLAGS.batch_size) FLAGS.out_dir = os.path.join(FLAGS.out_dir, FLAGS.out_name) FLAGS.checkpoint_dir = os.path.join(FLAGS.out_dir, FLAGS.checkpoint_dir) FLAGS.sample_dir = os.path.join(FLAGS.out_dir, FLAGS.sample_dir) if not os.path.exists(FLAGS.checkpoint_dir): os.makedirs(FLAGS.checkpoint_dir) if not os.path.exists(FLAGS.sample_dir): os.makedirs(FLAGS.sample_dir) with open(os.path.join(FLAGS.out_dir, 'FLAGS.json'), 'w') as f: flags_dict = {k:FLAGS[k].value for k in FLAGS} json.dump(flags_dict, f, indent=4, sort_keys=True, ensure_ascii=False) #gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333) run_config = tf.ConfigProto() run_config.gpu_options.allow_growth=True with tf.Session(config=run_config) as sess: if FLAGS.dataset == 'mnist': dcgan = DCGAN( sess, input_width=FLAGS.input_width, input_height=FLAGS.input_height, output_width=FLAGS.output_width, output_height=FLAGS.output_height, batch_size=FLAGS.batch_size, sample_num=FLAGS.batch_size, y_dim=10, z_dim=FLAGS.z_dim, dataset_name=FLAGS.dataset, input_fname_pattern=FLAGS.input_fname_pattern, crop=FLAGS.crop, checkpoint_dir=FLAGS.checkpoint_dir, 
sample_dir=FLAGS.sample_dir, data_dir=FLAGS.data_dir, out_dir=FLAGS.out_dir, max_to_keep=FLAGS.max_to_keep) else: dcgan = DCGAN( sess, input_width=FLAGS.input_width, input_height=FLAGS.input_height, output_width=FLAGS.output_width, output_height=FLAGS.output_height, batch_size=FLAGS.batch_size, sample_num=FLAGS.batch_size, z_dim=FLAGS.z_dim, dataset_name=FLAGS.dataset, input_fname_pattern=FLAGS.input_fname_pattern, crop=FLAGS.crop, checkpoint_dir=FLAGS.checkpoint_dir, sample_dir=FLAGS.sample_dir, data_dir=FLAGS.data_dir, out_dir=FLAGS.out_dir, max_to_keep=FLAGS.max_to_keep) show_all_variables() if FLAGS.train: dcgan.train(FLAGS) else: load_success, load_counter = dcgan.load(FLAGS.checkpoint_dir) if not load_success: raise Exception("Checkpoint not found in " + FLAGS.checkpoint_dir) # to_json("./web/js/layers.js", [dcgan.h0_w, dcgan.h0_b, dcgan.g_bn0], # [dcgan.h1_w, dcgan.h1_b, dcgan.g_bn1], # [dcgan.h2_w, dcgan.h2_b, dcgan.g_bn2], # [dcgan.h3_w, dcgan.h3_b, dcgan.g_bn3], # [dcgan.h4_w, dcgan.h4_b, None]) # Below is codes for visualization if FLAGS.export: export_dir = os.path.join(FLAGS.checkpoint_dir, 'export_b'+str(FLAGS.batch_size)) dcgan.save(export_dir, load_counter, ckpt=True, frozen=False) if FLAGS.freeze: export_dir = os.path.join(FLAGS.checkpoint_dir, 'frozen_b'+str(FLAGS.batch_size)) dcgan.save(export_dir, load_counter, ckpt=False, frozen=True) if FLAGS.visualize: OPTION = 1 visualize(sess, dcgan, FLAGS, OPTION, FLAGS.sample_dir) if __name__ == '__main__': tf.app.run() ================================================ FILE: model.py ================================================ from __future__ import division from __future__ import print_function import os import time import math from glob import glob import tensorflow as tf import numpy as np from six.moves import xrange from ops import * from utils import * def conv_out_size_same(size, stride): return int(math.ceil(float(size) / float(stride))) def gen_random(mode, size): if mode=='normal01': return 
np.random.normal(0,1,size=size) if mode=='uniform_signed': return np.random.uniform(-1,1,size=size) if mode=='uniform_unsigned': return np.random.uniform(0,1,size=size) class DCGAN(object): def __init__(self, sess, input_height=108, input_width=108, crop=True, batch_size=64, sample_num = 64, output_height=64, output_width=64, y_dim=None, z_dim=100, gf_dim=64, df_dim=64, gfc_dim=1024, dfc_dim=1024, c_dim=3, dataset_name='default', max_to_keep=1, input_fname_pattern='*.jpg', checkpoint_dir='ckpts', sample_dir='samples', out_dir='./out', data_dir='./data'): """ Args: sess: TensorFlow session batch_size: The size of batch. Should be specified before training. y_dim: (optional) Dimension of dim for y. [None] z_dim: (optional) Dimension of dim for Z. [100] gf_dim: (optional) Dimension of gen filters in first conv layer. [64] df_dim: (optional) Dimension of discrim filters in first conv layer. [64] gfc_dim: (optional) Dimension of gen units for for fully connected layer. [1024] dfc_dim: (optional) Dimension of discrim units for fully connected layer. [1024] c_dim: (optional) Dimension of image color. For grayscale input, set to 1. 
[3] """ self.sess = sess self.crop = crop self.batch_size = batch_size self.sample_num = sample_num self.input_height = input_height self.input_width = input_width self.output_height = output_height self.output_width = output_width self.y_dim = y_dim self.z_dim = z_dim self.gf_dim = gf_dim self.df_dim = df_dim self.gfc_dim = gfc_dim self.dfc_dim = dfc_dim # batch normalization : deals with poor initialization helps gradient flow self.d_bn1 = batch_norm(name='d_bn1') self.d_bn2 = batch_norm(name='d_bn2') if not self.y_dim: self.d_bn3 = batch_norm(name='d_bn3') self.g_bn0 = batch_norm(name='g_bn0') self.g_bn1 = batch_norm(name='g_bn1') self.g_bn2 = batch_norm(name='g_bn2') if not self.y_dim: self.g_bn3 = batch_norm(name='g_bn3') self.dataset_name = dataset_name self.input_fname_pattern = input_fname_pattern self.checkpoint_dir = checkpoint_dir self.data_dir = data_dir self.out_dir = out_dir self.max_to_keep = max_to_keep if self.dataset_name == 'mnist': self.data_X, self.data_y = self.load_mnist() self.c_dim = self.data_X[0].shape[-1] else: data_path = os.path.join(self.data_dir, self.dataset_name, self.input_fname_pattern) self.data = glob(data_path) if len(self.data) == 0: raise Exception("[!] No data found in '" + data_path + "'") np.random.shuffle(self.data) imreadImg = imread(self.data[0]) if len(imreadImg.shape) >= 3: #check if image is a non-grayscale image by checking channel number self.c_dim = imread(self.data[0]).shape[-1] else: self.c_dim = 1 if len(self.data) < self.batch_size: raise Exception("[!] 
Entire dataset size is less than the configured batch_size") self.grayscale = (self.c_dim == 1) self.build_model() def build_model(self): if self.y_dim: self.y = tf.placeholder(tf.float32, [self.batch_size, self.y_dim], name='y') else: self.y = None if self.crop: image_dims = [self.output_height, self.output_width, self.c_dim] else: image_dims = [self.input_height, self.input_width, self.c_dim] self.inputs = tf.placeholder( tf.float32, [self.batch_size] + image_dims, name='real_images') inputs = self.inputs self.z = tf.placeholder( tf.float32, [None, self.z_dim], name='z') self.z_sum = histogram_summary("z", self.z) self.G = self.generator(self.z, self.y) self.D, self.D_logits = self.discriminator(inputs, self.y, reuse=False) self.sampler = self.sampler(self.z, self.y) self.D_, self.D_logits_ = self.discriminator(self.G, self.y, reuse=True) self.d_sum = histogram_summary("d", self.D) self.d__sum = histogram_summary("d_", self.D_) self.G_sum = image_summary("G", self.G) def sigmoid_cross_entropy_with_logits(x, y): try: return tf.nn.sigmoid_cross_entropy_with_logits(logits=x, labels=y) except: return tf.nn.sigmoid_cross_entropy_with_logits(logits=x, targets=y) self.d_loss_real = tf.reduce_mean( sigmoid_cross_entropy_with_logits(self.D_logits, tf.ones_like(self.D))) self.d_loss_fake = tf.reduce_mean( sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_))) self.g_loss = tf.reduce_mean( sigmoid_cross_entropy_with_logits(self.D_logits_, tf.ones_like(self.D_))) self.d_loss_real_sum = scalar_summary("d_loss_real", self.d_loss_real) self.d_loss_fake_sum = scalar_summary("d_loss_fake", self.d_loss_fake) self.d_loss = self.d_loss_real + self.d_loss_fake self.g_loss_sum = scalar_summary("g_loss", self.g_loss) self.d_loss_sum = scalar_summary("d_loss", self.d_loss) t_vars = tf.trainable_variables() self.d_vars = [var for var in t_vars if 'd_' in var.name] self.g_vars = [var for var in t_vars if 'g_' in var.name] self.saver = 
tf.train.Saver(max_to_keep=self.max_to_keep) def train(self, config): d_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \ .minimize(self.d_loss, var_list=self.d_vars) g_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \ .minimize(self.g_loss, var_list=self.g_vars) try: tf.global_variables_initializer().run() except: tf.initialize_all_variables().run() if config.G_img_sum: self.g_sum = merge_summary([self.z_sum, self.d__sum, self.G_sum, self.d_loss_fake_sum, self.g_loss_sum]) else: self.g_sum = merge_summary([self.z_sum, self.d__sum, self.d_loss_fake_sum, self.g_loss_sum]) self.d_sum = merge_summary( [self.z_sum, self.d_sum, self.d_loss_real_sum, self.d_loss_sum]) self.writer = SummaryWriter(os.path.join(self.out_dir, "logs"), self.sess.graph) sample_z = gen_random(config.z_dist, size=(self.sample_num , self.z_dim)) if config.dataset == 'mnist': sample_inputs = self.data_X[0:self.sample_num] sample_labels = self.data_y[0:self.sample_num] else: sample_files = self.data[0:self.sample_num] sample = [ get_image(sample_file, input_height=self.input_height, input_width=self.input_width, resize_height=self.output_height, resize_width=self.output_width, crop=self.crop, grayscale=self.grayscale) for sample_file in sample_files] if (self.grayscale): sample_inputs = np.array(sample).astype(np.float32)[:, :, :, None] else: sample_inputs = np.array(sample).astype(np.float32) counter = 1 start_time = time.time() could_load, checkpoint_counter = self.load(self.checkpoint_dir) if could_load: counter = checkpoint_counter print(" [*] Load SUCCESS") else: print(" [!] 
Load failed...") for epoch in xrange(config.epoch): if config.dataset == 'mnist': batch_idxs = min(len(self.data_X), config.train_size) // config.batch_size else: self.data = glob(os.path.join( config.data_dir, config.dataset, self.input_fname_pattern)) np.random.shuffle(self.data) batch_idxs = min(len(self.data), config.train_size) // config.batch_size for idx in xrange(0, int(batch_idxs)): if config.dataset == 'mnist': batch_images = self.data_X[idx*config.batch_size:(idx+1)*config.batch_size] batch_labels = self.data_y[idx*config.batch_size:(idx+1)*config.batch_size] else: batch_files = self.data[idx*config.batch_size:(idx+1)*config.batch_size] batch = [ get_image(batch_file, input_height=self.input_height, input_width=self.input_width, resize_height=self.output_height, resize_width=self.output_width, crop=self.crop, grayscale=self.grayscale) for batch_file in batch_files] if self.grayscale: batch_images = np.array(batch).astype(np.float32)[:, :, :, None] else: batch_images = np.array(batch).astype(np.float32) batch_z = gen_random(config.z_dist, size=[config.batch_size, self.z_dim]) \ .astype(np.float32) if config.dataset == 'mnist': # Update D network _, summary_str = self.sess.run([d_optim, self.d_sum], feed_dict={ self.inputs: batch_images, self.z: batch_z, self.y:batch_labels, }) self.writer.add_summary(summary_str, counter) # Update G network _, summary_str = self.sess.run([g_optim, self.g_sum], feed_dict={ self.z: batch_z, self.y:batch_labels, }) self.writer.add_summary(summary_str, counter) # Run g_optim twice to make sure that d_loss does not go to zero (different from paper) _, summary_str = self.sess.run([g_optim, self.g_sum], feed_dict={ self.z: batch_z, self.y:batch_labels }) self.writer.add_summary(summary_str, counter) errD_fake = self.d_loss_fake.eval({ self.z: batch_z, self.y:batch_labels }) errD_real = self.d_loss_real.eval({ self.inputs: batch_images, self.y:batch_labels }) errG = self.g_loss.eval({ self.z: batch_z, self.y: batch_labels }) 
else: # Update D network _, summary_str = self.sess.run([d_optim, self.d_sum], feed_dict={ self.inputs: batch_images, self.z: batch_z }) self.writer.add_summary(summary_str, counter) # Update G network _, summary_str = self.sess.run([g_optim, self.g_sum], feed_dict={ self.z: batch_z }) self.writer.add_summary(summary_str, counter) # Run g_optim twice to make sure that d_loss does not go to zero (different from paper) _, summary_str = self.sess.run([g_optim, self.g_sum], feed_dict={ self.z: batch_z }) self.writer.add_summary(summary_str, counter) errD_fake = self.d_loss_fake.eval({ self.z: batch_z }) errD_real = self.d_loss_real.eval({ self.inputs: batch_images }) errG = self.g_loss.eval({self.z: batch_z}) print("[%8d Epoch:[%2d/%2d] [%4d/%4d] time: %4.4f, d_loss: %.8f, g_loss: %.8f" \ % (counter, epoch, config.epoch, idx, batch_idxs, time.time() - start_time, errD_fake+errD_real, errG)) if np.mod(counter, config.sample_freq) == 0: if config.dataset == 'mnist': samples, d_loss, g_loss = self.sess.run( [self.sampler, self.d_loss, self.g_loss], feed_dict={ self.z: sample_z, self.inputs: sample_inputs, self.y:sample_labels, } ) save_images(samples, image_manifold_size(samples.shape[0]), './{}/train_{:08d}.png'.format(config.sample_dir, counter)) print("[Sample] d_loss: %.8f, g_loss: %.8f" % (d_loss, g_loss)) else: try: samples, d_loss, g_loss = self.sess.run( [self.sampler, self.d_loss, self.g_loss], feed_dict={ self.z: sample_z, self.inputs: sample_inputs, }, ) save_images(samples, image_manifold_size(samples.shape[0]), './{}/train_{:08d}.png'.format(config.sample_dir, counter)) print("[Sample] d_loss: %.8f, g_loss: %.8f" % (d_loss, g_loss)) except: print("one pic error!...") if np.mod(counter, config.ckpt_freq) == 0: self.save(config.checkpoint_dir, counter) counter += 1 def discriminator(self, image, y=None, reuse=False): with tf.variable_scope("discriminator") as scope: if reuse: scope.reuse_variables() if not self.y_dim: h0 = lrelu(conv2d(image, self.df_dim, 
name='d_h0_conv')) h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim*2, name='d_h1_conv'))) h2 = lrelu(self.d_bn2(conv2d(h1, self.df_dim*4, name='d_h2_conv'))) h3 = lrelu(self.d_bn3(conv2d(h2, self.df_dim*8, name='d_h3_conv'))) h4 = linear(tf.reshape(h3, [self.batch_size, -1]), 1, 'd_h4_lin') return tf.nn.sigmoid(h4), h4 else: yb = tf.reshape(y, [self.batch_size, 1, 1, self.y_dim]) x = conv_cond_concat(image, yb) h0 = lrelu(conv2d(x, self.c_dim + self.y_dim, name='d_h0_conv')) h0 = conv_cond_concat(h0, yb) h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim + self.y_dim, name='d_h1_conv'))) h1 = tf.reshape(h1, [self.batch_size, -1]) h1 = concat([h1, y], 1) h2 = lrelu(self.d_bn2(linear(h1, self.dfc_dim, 'd_h2_lin'))) h2 = concat([h2, y], 1) h3 = linear(h2, 1, 'd_h3_lin') return tf.nn.sigmoid(h3), h3 def generator(self, z, y=None): with tf.variable_scope("generator") as scope: if not self.y_dim: s_h, s_w = self.output_height, self.output_width s_h2, s_w2 = conv_out_size_same(s_h, 2), conv_out_size_same(s_w, 2) s_h4, s_w4 = conv_out_size_same(s_h2, 2), conv_out_size_same(s_w2, 2) s_h8, s_w8 = conv_out_size_same(s_h4, 2), conv_out_size_same(s_w4, 2) s_h16, s_w16 = conv_out_size_same(s_h8, 2), conv_out_size_same(s_w8, 2) # project `z` and reshape self.z_, self.h0_w, self.h0_b = linear( z, self.gf_dim*8*s_h16*s_w16, 'g_h0_lin', with_w=True) self.h0 = tf.reshape( self.z_, [-1, s_h16, s_w16, self.gf_dim * 8]) h0 = tf.nn.relu(self.g_bn0(self.h0)) self.h1, self.h1_w, self.h1_b = deconv2d( h0, [self.batch_size, s_h8, s_w8, self.gf_dim*4], name='g_h1', with_w=True) h1 = tf.nn.relu(self.g_bn1(self.h1)) h2, self.h2_w, self.h2_b = deconv2d( h1, [self.batch_size, s_h4, s_w4, self.gf_dim*2], name='g_h2', with_w=True) h2 = tf.nn.relu(self.g_bn2(h2)) h3, self.h3_w, self.h3_b = deconv2d( h2, [self.batch_size, s_h2, s_w2, self.gf_dim*1], name='g_h3', with_w=True) h3 = tf.nn.relu(self.g_bn3(h3)) h4, self.h4_w, self.h4_b = deconv2d( h3, [self.batch_size, s_h, s_w, self.c_dim], name='g_h4', 
with_w=True) return tf.nn.tanh(h4) else: s_h, s_w = self.output_height, self.output_width s_h2, s_h4 = int(s_h/2), int(s_h/4) s_w2, s_w4 = int(s_w/2), int(s_w/4) # yb = tf.expand_dims(tf.expand_dims(y, 1),2) yb = tf.reshape(y, [self.batch_size, 1, 1, self.y_dim]) z = concat([z, y], 1) h0 = tf.nn.relu( self.g_bn0(linear(z, self.gfc_dim, 'g_h0_lin'))) h0 = concat([h0, y], 1) h1 = tf.nn.relu(self.g_bn1( linear(h0, self.gf_dim*2*s_h4*s_w4, 'g_h1_lin'))) h1 = tf.reshape(h1, [self.batch_size, s_h4, s_w4, self.gf_dim * 2]) h1 = conv_cond_concat(h1, yb) h2 = tf.nn.relu(self.g_bn2(deconv2d(h1, [self.batch_size, s_h2, s_w2, self.gf_dim * 2], name='g_h2'))) h2 = conv_cond_concat(h2, yb) return tf.nn.sigmoid( deconv2d(h2, [self.batch_size, s_h, s_w, self.c_dim], name='g_h3')) def sampler(self, z, y=None): with tf.variable_scope("generator") as scope: scope.reuse_variables() if not self.y_dim: s_h, s_w = self.output_height, self.output_width s_h2, s_w2 = conv_out_size_same(s_h, 2), conv_out_size_same(s_w, 2) s_h4, s_w4 = conv_out_size_same(s_h2, 2), conv_out_size_same(s_w2, 2) s_h8, s_w8 = conv_out_size_same(s_h4, 2), conv_out_size_same(s_w4, 2) s_h16, s_w16 = conv_out_size_same(s_h8, 2), conv_out_size_same(s_w8, 2) # project `z` and reshape h0 = tf.reshape( linear(z, self.gf_dim*8*s_h16*s_w16, 'g_h0_lin'), [-1, s_h16, s_w16, self.gf_dim * 8]) h0 = tf.nn.relu(self.g_bn0(h0, train=False)) h1 = deconv2d(h0, [self.batch_size, s_h8, s_w8, self.gf_dim*4], name='g_h1') h1 = tf.nn.relu(self.g_bn1(h1, train=False)) h2 = deconv2d(h1, [self.batch_size, s_h4, s_w4, self.gf_dim*2], name='g_h2') h2 = tf.nn.relu(self.g_bn2(h2, train=False)) h3 = deconv2d(h2, [self.batch_size, s_h2, s_w2, self.gf_dim*1], name='g_h3') h3 = tf.nn.relu(self.g_bn3(h3, train=False)) h4 = deconv2d(h3, [self.batch_size, s_h, s_w, self.c_dim], name='g_h4') return tf.nn.tanh(h4) else: s_h, s_w = self.output_height, self.output_width s_h2, s_h4 = int(s_h/2), int(s_h/4) s_w2, s_w4 = int(s_w/2), int(s_w/4) # yb = 
tf.reshape(y, [-1, 1, 1, self.y_dim]) yb = tf.reshape(y, [self.batch_size, 1, 1, self.y_dim]) z = concat([z, y], 1) h0 = tf.nn.relu(self.g_bn0(linear(z, self.gfc_dim, 'g_h0_lin'), train=False)) h0 = concat([h0, y], 1) h1 = tf.nn.relu(self.g_bn1( linear(h0, self.gf_dim*2*s_h4*s_w4, 'g_h1_lin'), train=False)) h1 = tf.reshape(h1, [self.batch_size, s_h4, s_w4, self.gf_dim * 2]) h1 = conv_cond_concat(h1, yb) h2 = tf.nn.relu(self.g_bn2( deconv2d(h1, [self.batch_size, s_h2, s_w2, self.gf_dim * 2], name='g_h2'), train=False)) h2 = conv_cond_concat(h2, yb) return tf.nn.sigmoid(deconv2d(h2, [self.batch_size, s_h, s_w, self.c_dim], name='g_h3')) def load_mnist(self): data_dir = os.path.join(self.data_dir, self.dataset_name) fd = open(os.path.join(data_dir,'train-images-idx3-ubyte')) loaded = np.fromfile(file=fd,dtype=np.uint8) trX = loaded[16:].reshape((60000,28,28,1)).astype(np.float) fd = open(os.path.join(data_dir,'train-labels-idx1-ubyte')) loaded = np.fromfile(file=fd,dtype=np.uint8) trY = loaded[8:].reshape((60000)).astype(np.float) fd = open(os.path.join(data_dir,'t10k-images-idx3-ubyte')) loaded = np.fromfile(file=fd,dtype=np.uint8) teX = loaded[16:].reshape((10000,28,28,1)).astype(np.float) fd = open(os.path.join(data_dir,'t10k-labels-idx1-ubyte')) loaded = np.fromfile(file=fd,dtype=np.uint8) teY = loaded[8:].reshape((10000)).astype(np.float) trY = np.asarray(trY) teY = np.asarray(teY) X = np.concatenate((trX, teX), axis=0) y = np.concatenate((trY, teY), axis=0).astype(np.int) seed = 547 np.random.seed(seed) np.random.shuffle(X) np.random.seed(seed) np.random.shuffle(y) y_vec = np.zeros((len(y), self.y_dim), dtype=np.float) for i, label in enumerate(y): y_vec[i,y[i]] = 1.0 return X/255.,y_vec @property def model_dir(self): return "{}_{}_{}_{}".format( self.dataset_name, self.batch_size, self.output_height, self.output_width) def save(self, checkpoint_dir, step, filename='model', ckpt=True, frozen=False): # model_name = "DCGAN.model" # checkpoint_dir = 
os.path.join(checkpoint_dir, self.model_dir) filename += '.b' + str(self.batch_size) if not os.path.exists(checkpoint_dir): os.makedirs(checkpoint_dir) if ckpt: self.saver.save(self.sess, os.path.join(checkpoint_dir, filename), global_step=step) if frozen: tf.train.write_graph( tf.graph_util.convert_variables_to_constants(self.sess, self.sess.graph_def, ["generator_1/Tanh"]), checkpoint_dir, '{}-{:06d}_frz.pb'.format(filename, step), as_text=False) def load(self, checkpoint_dir): #import re print(" [*] Reading checkpoints...", checkpoint_dir) # checkpoint_dir = os.path.join(checkpoint_dir, self.model_dir) # print(" ->", checkpoint_dir) ckpt = tf.train.get_checkpoint_state(checkpoint_dir) if ckpt and ckpt.model_checkpoint_path: ckpt_name = os.path.basename(ckpt.model_checkpoint_path) self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name)) #counter = int(next(re.finditer("(\d+)(?!.*\d)",ckpt_name)).group(0)) counter = int(ckpt_name.split('-')[-1]) print(" [*] Success to read {}".format(ckpt_name)) return True, counter else: print(" [*] Failed to find a checkpoint") return False, 0 ================================================ FILE: ops.py ================================================ import math import numpy as np import tensorflow as tf from tensorflow.python.framework import ops from utils import * try: image_summary = tf.image_summary scalar_summary = tf.scalar_summary histogram_summary = tf.histogram_summary merge_summary = tf.merge_summary SummaryWriter = tf.train.SummaryWriter except: image_summary = tf.summary.image scalar_summary = tf.summary.scalar histogram_summary = tf.summary.histogram merge_summary = tf.summary.merge SummaryWriter = tf.summary.FileWriter if "concat_v2" in dir(tf): def concat(tensors, axis, *args, **kwargs): return tf.concat_v2(tensors, axis, *args, **kwargs) else: def concat(tensors, axis, *args, **kwargs): return tf.concat(tensors, axis, *args, **kwargs) class batch_norm(object): def __init__(self, epsilon=1e-5, 
momentum = 0.9, name="batch_norm"): with tf.variable_scope(name): self.epsilon = epsilon self.momentum = momentum self.name = name def __call__(self, x, train=True): return tf.contrib.layers.batch_norm(x, decay=self.momentum, updates_collections=None, epsilon=self.epsilon, scale=True, is_training=train, scope=self.name) def conv_cond_concat(x, y): """Concatenate conditioning vector on feature map axis.""" x_shapes = x.get_shape() y_shapes = y.get_shape() return concat([ x, y*tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3) def conv2d(input_, output_dim, k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02, name="conv2d"): with tf.variable_scope(name): w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim], initializer=tf.truncated_normal_initializer(stddev=stddev)) conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME') biases = tf.get_variable('biases', [output_dim], initializer=tf.constant_initializer(0.0)) conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape()) return conv def deconv2d(input_, output_shape, k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02, name="deconv2d", with_w=False): with tf.variable_scope(name): # filter : [height, width, output_channels, in_channels] w = tf.get_variable('w', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]], initializer=tf.random_normal_initializer(stddev=stddev)) try: deconv = tf.nn.conv2d_transpose(input_, w, output_shape=output_shape, strides=[1, d_h, d_w, 1]) # Support for verisons of TensorFlow before 0.7.0 except AttributeError: deconv = tf.nn.deconv2d(input_, w, output_shape=output_shape, strides=[1, d_h, d_w, 1]) biases = tf.get_variable('biases', [output_shape[-1]], initializer=tf.constant_initializer(0.0)) deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape()) if with_w: return deconv, w, biases else: return deconv def lrelu(x, leak=0.2, name="lrelu"): return tf.maximum(x, leak*x) def linear(input_, output_size, scope=None, stddev=0.02, bias_start=0.0, 
with_w=False): shape = input_.get_shape().as_list() with tf.variable_scope(scope or "Linear"): try: matrix = tf.get_variable("Matrix", [shape[1], output_size], tf.float32, tf.random_normal_initializer(stddev=stddev)) except ValueError as err: msg = "NOTE: Usually, this is due to an issue with the image dimensions. Did you correctly set '--crop' or '--input_height' or '--output_height'?" err.args = err.args + (msg,) raise bias = tf.get_variable("bias", [output_size], initializer=tf.constant_initializer(bias_start)) if with_w: return tf.matmul(input_, matrix) + bias, matrix, bias else: return tf.matmul(input_, matrix) + bias ================================================ FILE: utils.py ================================================ """ Some codes from https://github.com/Newmu/dcgan_code """ from __future__ import division import math import json import random import pprint import scipy.misc import cv2 import numpy as np import os import time import datetime from time import gmtime, strftime from six.moves import xrange from PIL import Image import tensorflow as tf import tensorflow.contrib.slim as slim pp = pprint.PrettyPrinter() get_stddev = lambda x, k_h, k_w: 1/math.sqrt(k_w*k_h*x.get_shape()[-1]) def expand_path(path): return os.path.expanduser(os.path.expandvars(path)) def timestamp(s='%Y%m%d.%H%M%S', ts=None): if not ts: ts = time.time() st = datetime.datetime.fromtimestamp(ts).strftime(s) return st def show_all_variables(): model_vars = tf.trainable_variables() slim.model_analyzer.analyze_vars(model_vars, print_info=True) def get_image(image_path, input_height, input_width, resize_height=64, resize_width=64, crop=True, grayscale=False): image = imread(image_path, grayscale) return transform(image, input_height, input_width, resize_height, resize_width, crop) def save_images(images, size, image_path): return imsave(inverse_transform(images), size, image_path) def imread(path, grayscale = False): if (grayscale): return scipy.misc.imread(path, flatten = 
True).astype(np.float)
  else:
    # Reference: https://github.com/carpedm20/DCGAN-tensorflow/issues/162#issuecomment-315519747
    img_bgr = cv2.imread(path)
    # Reference: https://stackoverflow.com/a/15074748/
    # OpenCV loads images as BGR; reversing the last axis gives RGB.
    img_rgb = img_bgr[..., ::-1]
    return img_rgb.astype(np.float)

def merge_images(images, size):
  return inverse_transform(images)

def merge(images, size):
  # Tile a batch of images into one grid of size[0] rows by size[1] columns.
  h, w = images.shape[1], images.shape[2]
  if (images.shape[3] in (3,4)):
    c = images.shape[3]
    img = np.zeros((h * size[0], w * size[1], c))
    for idx, image in enumerate(images):
      i = idx % size[1]
      j = idx // size[1]
      img[j * h:j * h + h, i * w:i * w + w, :] = image
    return img
  elif images.shape[3]==1:
    img = np.zeros((h * size[0], w * size[1]))
    for idx, image in enumerate(images):
      i = idx % size[1]
      j = idx // size[1]
      img[j * h:j * h + h, i * w:i * w + w] = image[:,:,0]
    return img
  else:
    raise ValueError('in merge(images,size) images parameter '
                     'must have dimensions: HxW or HxWx3 or HxWx4')

def imsave(images, size, path):
  image = np.squeeze(merge(images, size))
  return scipy.misc.imsave(path, image)

def center_crop(x, crop_h, crop_w, resize_h=64, resize_w=64):
  if crop_w is None:
    crop_w = crop_h
  h, w = x.shape[:2]
  j = int(round((h - crop_h)/2.))
  i = int(round((w - crop_w)/2.))
  # imread() returns floats; PIL needs uint8 pixel data.
  im = Image.fromarray(x[j:j+crop_h, i:i+crop_w].astype(np.uint8))
  # PIL's resize takes (width, height); the filter is an argument of resize.
  return np.array(im.resize((resize_w, resize_h), Image.BILINEAR))

def transform(image, input_height, input_width,
              resize_height=64, resize_width=64, crop=True):
  if crop:
    cropped_image = center_crop(
      image, input_height, input_width,
      resize_height, resize_width)
  else:
    # No crop: resize the whole image.
    im = Image.fromarray(image.astype(np.uint8))
    cropped_image = np.array(
      im.resize((resize_width, resize_height), Image.BILINEAR))
  # Scale pixel values from [0, 255] to [-1, 1].
  return np.array(cropped_image)/127.5 - 1.

def inverse_transform(images):
  # Map values from [-1, 1] back to [0, 1].
  return (images+1.)/2.
def to_json(output_path, *layers): with open(output_path, "w") as layer_f: lines = "" for w, b, bn in layers: layer_idx = w.name.split('/')[0].split('h')[1] B = b.eval() if "lin/" in w.name: W = w.eval() depth = W.shape[1] else: W = np.rollaxis(w.eval(), 2, 0) depth = W.shape[0] biases = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(B)]} if bn != None: gamma = bn.gamma.eval() beta = bn.beta.eval() gamma = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(gamma)]} beta = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(beta)]} else: gamma = {"sy": 1, "sx": 1, "depth": 0, "w": []} beta = {"sy": 1, "sx": 1, "depth": 0, "w": []} if "lin/" in w.name: fs = [] for w in W.T: fs.append({"sy": 1, "sx": 1, "depth": W.shape[0], "w": ['%.2f' % elem for elem in list(w)]}) lines += """ var layer_%s = { "layer_type": "fc", "sy": 1, "sx": 1, "out_sx": 1, "out_sy": 1, "stride": 1, "pad": 0, "out_depth": %s, "in_depth": %s, "biases": %s, "gamma": %s, "beta": %s, "filters": %s };""" % (layer_idx.split('_')[0], W.shape[1], W.shape[0], biases, gamma, beta, fs) else: fs = [] for w_ in W: fs.append({"sy": 5, "sx": 5, "depth": W.shape[3], "w": ['%.2f' % elem for elem in list(w_.flatten())]}) lines += """ var layer_%s = { "layer_type": "deconv", "sy": 5, "sx": 5, "out_sx": %s, "out_sy": %s, "stride": 2, "pad": 1, "out_depth": %s, "in_depth": %s, "biases": %s, "gamma": %s, "beta": %s, "filters": %s };""" % (layer_idx, 2**(int(layer_idx)+2), 2**(int(layer_idx)+2), W.shape[0], W.shape[3], biases, gamma, beta, fs) layer_f.write(" ".join(lines.replace("'","").split())) def make_gif(images, fname, duration=2, true_image=False): import moviepy.editor as mpy def make_frame(t): try: x = images[int(len(images)/duration*t)] except: x = images[-1] if true_image: return x.astype(np.uint8) else: return ((x+1)/2*255).astype(np.uint8) clip = mpy.VideoClip(make_frame, duration=duration) clip.write_gif(fname, fps = len(images) / duration) 
def visualize(sess, dcgan, config, option, sample_dir='samples'): image_frame_dim = int(math.ceil(config.batch_size**.5)) if option == 0: z_sample = np.random.uniform(-0.5, 0.5, size=(config.batch_size, dcgan.z_dim)) samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample}) save_images(samples, [image_frame_dim, image_frame_dim], os.path.join(sample_dir, 'test_%s.png' % strftime("%Y%m%d%H%M%S", gmtime() ))) elif option == 1: values = np.arange(0, 1, 1./config.batch_size) for idx in xrange(dcgan.z_dim): print(" [*] %d" % idx) z_sample = np.random.uniform(-1, 1, size=(config.batch_size , dcgan.z_dim)) for kdx, z in enumerate(z_sample): z[idx] = values[kdx] if config.dataset == "mnist": y = np.random.choice(10, config.batch_size) y_one_hot = np.zeros((config.batch_size, 10)) y_one_hot[np.arange(config.batch_size), y] = 1 samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample, dcgan.y: y_one_hot}) else: samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample}) save_images(samples, [image_frame_dim, image_frame_dim], os.path.join(sample_dir, 'test_arange_%s.png' % (idx))) elif option == 2: values = np.arange(0, 1, 1./config.batch_size) for idx in [random.randint(0, dcgan.z_dim - 1) for _ in xrange(dcgan.z_dim)]: print(" [*] %d" % idx) z = np.random.uniform(-0.2, 0.2, size=(dcgan.z_dim)) z_sample = np.tile(z, (config.batch_size, 1)) #z_sample = np.zeros([config.batch_size, dcgan.z_dim]) for kdx, z in enumerate(z_sample): z[idx] = values[kdx] if config.dataset == "mnist": y = np.random.choice(10, config.batch_size) y_one_hot = np.zeros((config.batch_size, 10)) y_one_hot[np.arange(config.batch_size), y] = 1 samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample, dcgan.y: y_one_hot}) else: samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample}) try: make_gif(samples, './samples/test_gif_%s.gif' % (idx)) except: save_images(samples, [image_frame_dim, image_frame_dim], os.path.join(sample_dir, 'test_%s.png' % strftime("%Y%m%d%H%M%S", 
gmtime() ))) elif option == 3: values = np.arange(0, 1, 1./config.batch_size) for idx in xrange(dcgan.z_dim): print(" [*] %d" % idx) z_sample = np.zeros([config.batch_size, dcgan.z_dim]) for kdx, z in enumerate(z_sample): z[idx] = values[kdx] samples = sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample}) make_gif(samples, os.path.join(sample_dir, 'test_gif_%s.gif' % (idx))) elif option == 4: image_set = [] values = np.arange(0, 1, 1./config.batch_size) for idx in xrange(dcgan.z_dim): print(" [*] %d" % idx) z_sample = np.zeros([config.batch_size, dcgan.z_dim]) for kdx, z in enumerate(z_sample): z[idx] = values[kdx] image_set.append(sess.run(dcgan.sampler, feed_dict={dcgan.z: z_sample})) make_gif(image_set[-1], os.path.join(sample_dir, 'test_gif_%s.gif' % (idx))) new_image_set = [merge(np.array([images[idx] for images in image_set]), [10, 10]) \ for idx in range(64) + range(63, -1, -1)] make_gif(new_image_set, './samples/test_gif_merged.gif', duration=8) def image_manifold_size(num_images): manifold_h = int(np.floor(np.sqrt(num_images))) manifold_w = int(np.ceil(np.sqrt(num_images))) assert manifold_h * manifold_w == num_images return manifold_h, manifold_w ================================================ FILE: web/app.py ================================================ from flask import Flask from flask import render_template app = Flask(__name__, template_folder="./", static_folder='./', static_url_path='') @app.route('/') def index(): return render_template('index.html') if __name__ == '__main__': app.debug=True app.run(host='0.0.0.0') ================================================ FILE: web/css/fakeLoader.css ================================================ /********************** *CSS Animations by: *http://codepen.io/vivinantony ***********************/ .spinner1 { width: 40px; height: 40px; position: relative; } .double-bounce1, .double-bounce2 { width: 100%; height: 100%; border-radius: 50%; background-color: #fff; opacity: 0.6; position: absolute; top: 
0; left: 0; -webkit-animation: bounce 2.0s infinite ease-in-out; animation: bounce 2.0s infinite ease-in-out; } .double-bounce2 { -webkit-animation-delay: -1.0s; animation-delay: -1.0s; } @-webkit-keyframes bounce { 0%, 100% { -webkit-transform: scale(0.0) } 50% { -webkit-transform: scale(1.0) } } @keyframes bounce { 0%, 100% { transform: scale(0.0); -webkit-transform: scale(0.0); } 50% { transform: scale(1.0); -webkit-transform: scale(1.0); } } .spinner2 { width: 40px; height: 40px; position: relative; } .container1 > div, .container2 > div, .container3 > div { width: 6px; height: 6px; background-color: #fff; border-radius: 100%; position: absolute; -webkit-animation: bouncedelay 1.2s infinite ease-in-out; animation: bouncedelay 1.2s infinite ease-in-out; /* Prevent first frame from flickering when animation starts */ -webkit-animation-fill-mode: both; animation-fill-mode: both; } .spinner2 .spinner-container { position: absolute; width: 100%; height: 100%; } .container2 { -webkit-transform: rotateZ(45deg); transform: rotateZ(45deg); } .container3 { -webkit-transform: rotateZ(90deg); transform: rotateZ(90deg); } .circle1 { top: 0; left: 0; } .circle2 { top: 0; right: 0; } .circle3 { right: 0; bottom: 0; } .circle4 { left: 0; bottom: 0; } .container2 .circle1 { -webkit-animation-delay: -1.1s; animation-delay: -1.1s; } .container3 .circle1 { -webkit-animation-delay: -1.0s; animation-delay: -1.0s; } .container1 .circle2 { -webkit-animation-delay: -0.9s; animation-delay: -0.9s; } .container2 .circle2 { -webkit-animation-delay: -0.8s; animation-delay: -0.8s; } .container3 .circle2 { -webkit-animation-delay: -0.7s; animation-delay: -0.7s; } .container1 .circle3 { -webkit-animation-delay: -0.6s; animation-delay: -0.6s; } .container2 .circle3 { -webkit-animation-delay: -0.5s; animation-delay: -0.5s; } .container3 .circle3 { -webkit-animation-delay: -0.4s; animation-delay: -0.4s; } .container1 .circle4 { -webkit-animation-delay: -0.3s; animation-delay: -0.3s; } .container2 
.circle4 { -webkit-animation-delay: -0.2s; animation-delay: -0.2s; } .container3 .circle4 { -webkit-animation-delay: -0.1s; animation-delay: -0.1s; } @-webkit-keyframes bouncedelay { 0%, 80%, 100% { -webkit-transform: scale(0.0) } 40% { -webkit-transform: scale(1.0) } } @keyframes bouncedelay { 0%, 80%, 100% { transform: scale(0.0); -webkit-transform: scale(0.0); } 40% { transform: scale(1.0); -webkit-transform: scale(1.0); } } .spinner3 { width: 40px; height: 40px; position: relative; -webkit-animation: rotate 2.0s infinite linear; animation: rotate 2.0s infinite linear; } .dot1, .dot2 { width: 60%; height: 60%; display: inline-block; position: absolute; top: 0; background-color: #fff; border-radius: 100%; -webkit-animation: bounce 2.0s infinite ease-in-out; animation: bounce 2.0s infinite ease-in-out; } .dot2 { top: auto; bottom: 0px; -webkit-animation-delay: -1.0s; animation-delay: -1.0s; } @-webkit-keyframes rotate { 100% { -webkit-transform: rotate(360deg) }} @keyframes rotate { 100% { transform: rotate(360deg); -webkit-transform: rotate(360deg) }} @-webkit-keyframes bounce { 0%, 100% { -webkit-transform: scale(0.0) } 50% { -webkit-transform: scale(1.0) } } @keyframes bounce { 0%, 100% { transform: scale(0.0); -webkit-transform: scale(0.0); } 50% { transform: scale(1.0); -webkit-transform: scale(1.0); } } .spinner4 { width: 30px; height: 30px; background-color: #fff; -webkit-animation: rotateplane 1.2s infinite ease-in-out; animation: rotateplane 1.2s infinite ease-in-out; } @-webkit-keyframes rotateplane { 0% { -webkit-transform: perspective(120px) } 50% { -webkit-transform: perspective(120px) rotateY(180deg) } 100% { -webkit-transform: perspective(120px) rotateY(180deg) rotateX(180deg) } } @keyframes rotateplane { 0% { transform: perspective(120px) rotateX(0deg) rotateY(0deg); -webkit-transform: perspective(120px) rotateX(0deg) rotateY(0deg) } 50% { transform: perspective(120px) rotateX(-180.1deg) rotateY(0deg); -webkit-transform: perspective(120px) 
rotateX(-180.1deg) rotateY(0deg) } 100% { transform: perspective(120px) rotateX(-180deg) rotateY(-179.9deg); -webkit-transform: perspective(120px) rotateX(-180deg) rotateY(-179.9deg); } } @charset 'UTF-8'; /* Slider */ .slick-loading .slick-list { background: #fff url('./ajax-loader.gif') center center no-repeat; } /* Icons */ @font-face { font-family: 'slick'; font-weight: normal; font-style: normal; src: url('../fonts/slick.eot'); src: url('../fonts/slick.eot?#iefix') format('embedded-opentype'), url('../fonts/slick.woff') format('woff'), url('../fonts/slick.ttf') format('truetype'), url('../fonts/slick.svg#slick') format('svg'); } /* Arrows */ .slick-prev, .slick-next { font-size: 0; line-height: 0; position: absolute; top: 50%; display: block; width: 20px; height: 20px; margin-top: -10px; padding: 0; cursor: pointer; color: transparent; border: none; outline: none; background: transparent; } .slick-prev:hover, .slick-prev:focus, .slick-next:hover, .slick-next:focus { color: transparent; outline: none; background: transparent; } .slick-prev:hover:before, .slick-prev:focus:before, .slick-next:hover:before, .slick-next:focus:before { opacity: 1; } .slick-prev.slick-disabled:before, .slick-next.slick-disabled:before { opacity: .25; } .slick-prev:before, .slick-next:before { font-family: 'slick'; font-size: 20px; line-height: 1; opacity: .75; color: white; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } .slick-prev { left: -25px; } [dir='rtl'] .slick-prev { right: -25px; left: auto; } .slick-prev:before { content: '←'; } [dir='rtl'] .slick-prev:before { content: '→'; } .slick-next { right: -25px; } [dir='rtl'] .slick-next { right: auto; left: -25px; } .slick-next:before { content: '→'; } [dir='rtl'] .slick-next:before { content: '←'; } /* Dots */ .slick-slider { margin-bottom: 30px; } .slick-dots { position: absolute; bottom: -45px; display: block; width: 100%; padding: 0; list-style: none; text-align: center; } .slick-dots li { position: 
relative; display: inline-block; width: 20px; height: 20px; margin: 0 5px; padding: 0; cursor: pointer; } .slick-dots li button { font-size: 0; line-height: 0; display: block; width: 20px; height: 20px; padding: 5px; cursor: pointer; color: transparent; border: 0; outline: none; background: transparent; } .slick-dots li button:hover, .slick-dots li button:focus { outline: none; } .slick-dots li button:hover:before, .slick-dots li button:focus:before { opacity: 1; } .slick-dots li button:before { font-family: 'slick'; font-size: 6px; line-height: 20px; position: absolute; top: 0; left: 0; width: 20px; height: 20px; content: '•'; text-align: center; opacity: .25; color: white; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } .slick-dots li.slick-active button:before { opacity: .75; color: white; } ================================================ FILE: web/css/main.css ================================================ canvas { background-color: white; } #colors { overflow: hidden; width: 90px; margin-left: 30px; float: left; } #colors .color { float: left; width: 40px; height: 40px; } #colors .color div { width: 100%; height: 100%; cursor: pointer; } #colors .color #color_1::before { background-color: #000000; } #colors .color #color_1::after { background-color: #000000; } #colors .color #color_2::before { background-color: #444444; } #colors .color #color_2::after { background-color: #303030; } #colors .color #color_3::before { background-color: #000088; } #colors .color #color_3::after { background-color: #00005f; } #colors .color #color_4::before { background-color: #1111ff; } #colors .color #color_4::after { background-color: #0000be; } #colors .color #color_5::before { background-color: #008800; } #colors .color #color_5::after { background-color: #005f00; } #colors .color #color_6::before { background-color: #11ff11; } #colors .color #color_6::after { background-color: #00be00; } #colors .color #color_7::before { background-color: 
#008888; } #colors .color #color_7::after { background-color: #005f5f; } #colors .color #color_8::before { background-color: #11ffff; } #colors .color #color_8::after { background-color: #00bebe; } #colors .color #color_9::before { background-color: #880000; } #colors .color #color_9::after { background-color: #5f0000; } #colors .color #color_10::before { background-color: #ff1111; } #colors .color #color_10::after { background-color: #be0000; } #colors .color #color_11::before { background-color: #880088; } #colors .color #color_11::after { background-color: #5f005f; } #colors .color #color_12::before { background-color: #ff11ff; } #colors .color #color_12::after { background-color: #be00be; } #colors .color #color_13::before { background-color: #884400; } #colors .color #color_13::after { background-color: #5f3000; } #colors .color #color_14::before { background-color: #ffff11; } #colors .color #color_14::after { background-color: #bebe00; } #colors .color #color_15::before { background-color: #888888; } #colors .color #color_15::after { background-color: #5f5f5f; } #colors .color #color_16::before { background-color: #cccccc; } #colors .color #color_16::after { background-color: #8f8f8f; } #colors .color .isometric::after { width: 40px; height: 10px; } #colors .color .isometric::before { width: 10px; height: 40px; } #colors .color.active div { top: 10px; left: 10px; } /* CANVAS */ #canvas { float: left; cursor: crosshair; width: 320px; height: 320px; margin: 0 auto; } #canvas.isometric::after { width: 320px; height: 10px; } #canvas.isometric::before { width: 10px; height: 320px; } /*! * Start Bootstrap - Grayscale Bootstrap Theme (http://startbootstrap.com) * Code licensed under the Apache License v2.0. * For details, see http://www.apache.org/licenses/LICENSE-2.0. 
*/ body { width: 100%; height: 100%; font-family: Lora,"Helvetica Neue",Helvetica,Arial,sans-serif; color: #fff; background-color: #000; } html { width: 100%; height: 100%; } h1, h2, h3, h4, h5, h6 { margin: 0 0 35px; text-transform: uppercase; font-family: Montserrat,"Helvetica Neue",Helvetica,Arial,sans-serif; font-weight: 700; letter-spacing: 1px; } p { margin: 0 0 25px; font-size: 18px; line-height: 1.5; } @media(min-width:768px) { p { margin: 0 0 35px; font-size: 20px; line-height: 1.6; } } b { color: #3A8BB2; } a { color: #42dca3; -webkit-transition: all .2s ease-in-out; -moz-transition: all .2s ease-in-out; transition: all .2s ease-in-out; } a:hover, a:focus { text-decoration: none; color: #1d9b6c; } .light { font-weight: 400; } .navbar-custom { margin-bottom: 0; border-bottom: 1px solid rgba(255,255,255,.3); text-transform: uppercase; font-family: Montserrat,"Helvetica Neue",Helvetica,Arial,sans-serif; background-color: #000; } .navbar-custom .navbar-brand { font-weight: 700; } .navbar-custom .navbar-brand:focus { outline: 0; } .navbar-custom .navbar-brand .navbar-toggle { padding: 4px 6px; font-size: 16px; color: #fff; } .navbar-custom .navbar-brand .navbar-toggle:focus, .navbar-custom .navbar-brand .navbar-toggle:active { outline: 0; } .navbar-custom a { color: #fff; } .navbar-custom .nav li a { -webkit-transition: background .3s ease-in-out; -moz-transition: background .3s ease-in-out; transition: background .3s ease-in-out; } .navbar-custom .nav li a:hover { outline: 0; color: rgba(255,255,255,.8); background-color: transparent; } .navbar-custom .nav li a:focus, .navbar-custom .nav li a:active { outline: 0; background-color: transparent; } .navbar-custom .nav li.active { outline: 0; } .navbar-custom .nav li.active a { background-color: rgba(255,255,255,.3); } .navbar-custom .nav li.active a:hover { color: #fff; } @media(min-width:768px) { .navbar-custom { padding: 20px 0; border-bottom: 0; letter-spacing: 1px; background: 0 0; -webkit-transition: 
background .5s ease-in-out,padding .5s ease-in-out; -moz-transition: background .5s ease-in-out,padding .5s ease-in-out; transition: background .5s ease-in-out,padding .5s ease-in-out; } .navbar-custom.top-nav-collapse { padding: 0; border-bottom: 1px solid rgba(255,255,255,.3); background: #000; } } #background { position: absolute; top: 0; left: 0; min-width: 100%; min-height: 100%; width: auto; height: auto; } .intro { overflow: hidden; position: relative; display: table; width: 100%; height: auto; min-height: 100%; padding: 100px 0; text-align: center; color: #fff; background-color: #000; -webkit-background-size: cover; -moz-background-size: cover; background-size: cover; -o-background-size: cover; } .intro .intro-body { display: table-cell; vertical-align: middle; } .intro .intro-body .brand-heading { font-size: 40px; } .intro .intro-body .intro-text { font-size: 18px; } @media(min-width:768px) { .intro { height: 100%; padding: 0; } .intro .intro-body .brand-heading { font-size: 100px; } .intro .intro-body .intro-text { font-size: 26px; } } .btn-circle { width: 70px; height: 70px; margin-top: 15px; padding: 7px 16px; border: 2px solid #fff; border-radius: 100%!important; font-size: 40px; color: #fff; background: 0 0; -webkit-transition: background .3s ease-in-out; -moz-transition: background .3s ease-in-out; transition: background .3s ease-in-out; } .btn-circle:hover, .btn-circle:focus { outline: 0; color: #fff; background: rgba(255,255,255,.1); } .btn-circle i.animated { -webkit-transition-property: -webkit-transform; -webkit-transition-duration: 1s; -moz-transition-property: -moz-transform; -moz-transition-duration: 1s; } .btn-circle:hover i.animated { -webkit-animation-name: pulse; -moz-animation-name: pulse; -webkit-animation-duration: 1.5s; -moz-animation-duration: 1.5s; -webkit-animation-iteration-count: infinite; -moz-animation-iteration-count: infinite; -webkit-animation-timing-function: linear; -moz-animation-timing-function: linear; } 
@-webkit-keyframes pulse { 0% { -webkit-transform: scale(1); transform: scale(1); } 50% { -webkit-transform: scale(1.2); transform: scale(1.2); } 100% { -webkit-transform: scale(1); transform: scale(1); } } @-moz-keyframes pulse { 0% { -moz-transform: scale(1); transform: scale(1); } 50% { -moz-transform: scale(1.2); transform: scale(1.2); } 100% { -moz-transform: scale(1); transform: scale(1); } } .content-section { padding-top: 100px; } .download-section { width: 100%; padding: 50px 0; color: #fff; background: url(../img/downloads-bg.jpg) no-repeat center center scroll; background-color: #000; -webkit-background-size: cover; -moz-background-size: cover; background-size: cover; -o-background-size: cover; } @media(min-width:767px) { .content-section { padding-top: 250px; } .download-section { padding: 100px 0; } } .btn { border-radius: 0; text-transform: uppercase; font-family: Montserrat,"Helvetica Neue",Helvetica,Arial,sans-serif; font-weight: 400; -webkit-transition: all .3s ease-in-out; -moz-transition: all .3s ease-in-out; transition: all .3s ease-in-out; } .btn-default { border: 1px solid #42dca3; color: #42dca3; background-color: transparent; } .btn-default:hover, .btn-default:focus { border: 1px solid #42dca3; outline: 0; color: #000; background-color: #42dca3; } ul.banner-social-buttons { margin-top: 0; } @media(max-width:1199px) { ul.banner-social-buttons { margin-top: 15px; } } @media(max-width:767px) { ul.banner-social-buttons li { display: block; margin-bottom: 20px; padding: 0; } ul.banner-social-buttons li:last-child { margin-bottom: 0; } } footer { padding: 50px 0; } footer p { margin: 0; } ::-moz-selection { text-shadow: none; background: #fcfcfc; background: rgba(255,255,255,.2); } ::selection { text-shadow: none; background: #fcfcfc; background: rgba(255,255,255,.2); } img::selection { background: 0 0; } img::-moz-selection { background: 0 0; } body { webkit-tap-highlight-color: rgba(255,255,255,.2); } .logo { background-color: rgba(0,0,0,0.7); 
padding: 30px; } .tooltip > .tooltip-inner { background-color: white; color: black; } .tooltip > .tooltip-arrow { border-bottom-color: white; color: black; } .tooltip { opacity: 0.95 !important; } .pixel-tooltip { padding: 0px !important; background-color: transparent !important; } .fa-5 { font-size: 40px !important; } .ico { padding-right: 10px; } .sk-wave { margin: 40px auto; width: 50px; height: 40px; text-align: center; font-size: 10px; } .sk-wave .sk-rect { background-color: #eeeeee; height: 100%; width: 6px; display: inline-block; -webkit-animation: sk-waveStretchDelay 1.2s infinite ease-in-out; animation: sk-waveStretchDelay 1.2s infinite ease-in-out; } .sk-wave .sk-rect1 { -webkit-animation-delay: -1.2s; animation-delay: -1.2s; } .sk-wave .sk-rect2 { -webkit-animation-delay: -1.1s; animation-delay: -1.1s; } .sk-wave .sk-rect3 { -webkit-animation-delay: -1s; animation-delay: -1s; } .sk-wave .sk-rect4 { -webkit-animation-delay: -0.9s; animation-delay: -0.9s; } .sk-wave .sk-rect5 { -webkit-animation-delay: -0.8s; animation-delay: -0.8s; } @-webkit-keyframes sk-waveStretchDelay { 0%, 40%, 100% { -webkit-transform: scaleY(0.4); transform: scaleY(0.4); } 20% { -webkit-transform: scaleY(1); transform: scaleY(1); } } @keyframes sk-waveStretchDelay { 0%, 40%, 100% { -webkit-transform: scaleY(0.4); transform: scaleY(0.4); } 20% { -webkit-transform: scaleY(1); transform: scaleY(1); } } ================================================ FILE: web/index.html ================================================ Neural Face | 프사 뉴럴

프사 뉴럴

Neural Face

인공지능이 만든 얼굴들

프사 뉴럴

Neural Face

프사 뉴럴은 Facebook AI Research에서 개발한 Deep Convolutional Generative Adversarial Networks (DCGAN) 이라는 기계 학습 모델을 사용해 만들어졌습니다.

프사 뉴럴은 얼굴 사진을 만드는 인공 지능이며
이 페이지에 나오는 모든 사람들은 이 세상에 존재하지 않습니다.

Neural Face is built on Deep Convolutional Generative Adversarial Networks (DCGAN), a machine learning model developed by Facebook AI Research.

Neural Face is an artificial intelligence that generates face images,
and none of the people shown on this page are REAL.

Image Generation

프사 뉴럴은 0에서 1 사이의 100개의 숫자 z로 사람의 이미지를 만들어내는 인공지능입니다.

1. 아래에 보이는 100개의 픽셀은 z의 각 숫자를 나타냅니다.
2. 만들어진 사진 위에 마우스를 올리면 사진에 사용된 z가 보입니다.
3. 만들어진 이미지를 누르시면 그 이미지의 z가 복사됩니다.

Neural Face generates a face image from a vector z of 100 real numbers, each ranging from 0 to 1.

1. Each pixel in the palette below represents one value in z.
2. If you hover your mouse over an image, the z used for that image will be displayed.
3. If you click an image, its z will be copied to the palette.
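
The relationship between z and the palette can be sketched in Python. This is an illustration only, not the page's own JavaScript: the 10x10 grid layout, the 0 to 255 grayscale encoding, and the helper names `z_to_palette` and `palette_to_z` are all assumptions made for the example.

```python
import random

Z_DIM = 100   # length of z, as stated on the page
GRID = 10     # assumption: the 100 values are laid out as a 10x10 palette


def z_to_palette(z):
    """Map each z value in [0, 1] to a grayscale intensity in 0..255 (assumed encoding)."""
    if len(z) != Z_DIM:
        raise ValueError("z must have %d values" % Z_DIM)
    # Clamp to [0, 1] first so out-of-range values cannot produce invalid pixels.
    pixels = [round(min(max(v, 0.0), 1.0) * 255) for v in z]
    # Split the flat list into GRID rows of GRID pixels each.
    return [pixels[r * GRID:(r + 1) * GRID] for r in range(GRID)]


def palette_to_z(palette):
    """Recover an approximate z from the palette (quantised to steps of 1/255)."""
    return [p / 255.0 for row in palette for p in row]


rng = random.Random(0)
z = [rng.random() for _ in range(Z_DIM)]
palette = z_to_palette(z)
z_back = palette_to_z(palette)
```

Because each pixel stores only 256 levels, copying z through the palette is lossy: under these assumptions the recovered value differs from the original by at most half a grayscale step.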

(브라우저 성능에 따라 1~10초가 걸립니다)

(This may take 1 to 10 seconds, depending on your browser)

프사 뉴럴을 불러오고 있습니다...

Neural Face is preparing to draw...

알고리즘

Algorithm

프사 뉴럴의 핵심 모델인 DCGAN은 두 개의 인공 신경망으로 구성되어 있으며, 각각

1. 사진을 만들어내는 생성자 (G)
2. 진짜 사진과 생성자가 만든 사진을 구분하는 구분자 (D)

라고 부릅니다.

두 신경망은 수많은 이미지를 반복적으로 보면서 생성자는 구분자를 속이기 위해, 구분자는 생성자가 만든 사진을 판별하기 위해 학습합니다. 이러한 학습 방법을 적대적 학습 (Adversarial Learning)이라고 하며, 생성자와 구분자를 도둑과 경찰로 비유하기도 합니다.

DCGAN, the core model of Neural Face, consists of two neural networks:

1. Generator (G), which generates images
2. Discriminator (D), which discriminates real images from generated images

The two networks look at a large number of images over and over and compete: Generator learns to deceive Discriminator, while Discriminator learns to detect the images Generator makes. This kind of learning is called Adversarial Learning. Because of this, Generator and Discriminator are often compared to a thief and the police, respectively.

생성자와 구분자는 여러 가지 인공 신경망 종류 중에서 각각 Deconvolutional Network (DNN)과 Convolutional Neural Network (CNN)로 구현되어 있습니다. CNN은 수백 개의 픽셀로 이루어진 이미지를 작은 차원의 숫자들 (z)로 잘 요약할 수 있는 필터를 배우는 인공 신경망이며, DNN은 이렇게 작아진 차원의 숫자들로 원래 이미지를 복원하는 필터를 배우는 신경망입니다.

구분자는 인공 신경망에 실제 이미지를 넣은 결과를 1로, 만들어진 이미지의 결과는 0으로 구분하도록 학습합니다. 반대로 생성자는 Gaussian Distribution을 따르는 z라는 확률 변수를 두고, 사람의 이미지의 확률 분포를 z를 사용해 계산합니다. 이렇게 만들어진 이미지를 구분자가 실제 이미지라고 잘못 판단하도록 계속 학습합니다.

Generator and Discriminator are implemented as a Deconvolutional Network (DNN) and a Convolutional Neural Network (CNN), respectively. A CNN is a neural network that learns filters to encode the hundreds of pixels of an image into a low-dimensional vector (z) that summarizes the image. A DNN is a network that learns filters to recover the original image from z.
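
As a rough illustration of how a deconvolutional generator grows z into an image, the sketch below traces the spatial size through the five deconv layers declared in `layer_defs` in web/js/app.js. The size formula is the standard one for a transposed convolution without output padding; whether the bundled `DeconvLayer` computes exactly this is an assumption, and `deconv_out_size` and `trace_shapes` are hypothetical helper names.

```python
def deconv_out_size(in_size, kernel, stride, pad):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (in_size - 1) * stride - 2 * pad + kernel


# (kernel, stride, pad, filters) per deconv layer, copied from layer_defs in web/js/app.js
LAYERS = [
    (4, 1, 0, 512),
    (4, 2, 1, 256),
    (4, 2, 1, 128),
    (4, 2, 1, 64),
    (4, 2, 1, 3),
]


def trace_shapes(in_size=1, in_depth=100):
    """Return the (width, height, depth) of the input and of each layer's output."""
    shapes = [(in_size, in_size, in_depth)]
    size = in_size
    for kernel, stride, pad, filters in LAYERS:
        size = deconv_out_size(size, kernel, stride, pad)
        shapes.append((size, size, filters))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)
```

Under this formula the 1x1x100 input becomes 4x4x512 after the first layer and then doubles in width and height at each stride-2 layer, ending at a 64x64 image with 3 colour channels.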

Discriminator is trained to output 1 for a real image and 0 for an image produced by Generator. In contrast, Generator takes a random variable z that follows a Gaussian Distribution, maps it to an image, and in doing so learns the distribution of human face images. In this way, Generator keeps learning to fool Discriminator into judging generated images as real.
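
The two objectives can be written down as binary cross-entropy losses. The sketch below is the standard GAN formulation (with the common non-saturating generator loss), not code taken from this repository; the scalar probabilities stand in for Discriminator outputs, and `bce`, `discriminator_loss`, and `generator_loss` are hypothetical names.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy between a probability and a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """D wants to output 1 on real images and 0 on generated ones."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    """G wants D to output 1 on generated images."""
    return bce(d_fake, 1.0)


# d_real = D(x) for a real image x, d_fake = D(G(z)) for a generated image.
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
fooled = discriminator_loss(d_real=0.5, d_fake=0.5)
print(confident < fooled)                          # D's loss rises when it is fooled
print(generator_loss(0.9) < generator_loss(0.1))   # G's loss falls when D is fooled
```

The two losses pull in opposite directions on `d_fake`, which is the thief-and-police competition described above expressed as arithmetic.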

Results

프사 뉴럴을 학습시키기 위해 인터넷에서 10만 개 이상의 사진들을 모았고 이 사진들에서 얼굴 사진만 잘라서 얼굴 데이터 셋을 만들었습니다. 코드는 최근에 구글에서 공개한 TensorFlow로 구현했으며 GTX 980 Ti를 사용하여 이틀간 학습시켰습니다.

아래는 초기 학습 단계에서 프사 뉴럴이 정해진 z로 얼굴 사진을 만들어 가는 과정을 보여줍니다.

More than 100K images were crawled from online communities, and the faces were cropped out of them with openface, a face recognition framework, to build a face dataset. Neural Face is implemented in TensorFlow and was trained for two days on a GTX 980 Ti.

Below is a series of images generated by Generator from a fixed z between the first and the fifth epoch of training.

생성자가 사용하는 z는 -1에서 1 사이의 Gaussian Distribution을 따르는 확률 변수이며, 평균값인 0으로 이미지를 만들게 되면, 프사 뉴럴이 생각하는 평균적인 얼굴을 알 수 있습니다.

Each value of z is a real number between -1 and 1 that follows a Gaussian Distribution. Setting every value of z to 0, the mean, shows the most average face as Neural Face sees it.

평균값 0에서 랜덤한 차원의 값을 조금씩 바꾸면 아래와 같은 변화를 볼 수 있습니다.

The images below are generated by changing randomly chosen values of z little by little, starting from the average value (0) and moving toward -1 or 1.

아래의 사진들은 100차원의 z 값 중에서 임의의 차원들을 -1부터 1까지 바꾸면서 생성자 신경망에 넣은 결과이며, 점점 미소를 짓거나, 안경이 생기거나, 흑백 사진이 되거나, 성별이 바뀌는 등의 결과를 확인하실 수 있습니다.

The images below are generated by varying ten different values of z from -1 to 1 and feeding each result to Generator. The people in the images change in characteristics such as starting to smile, gaining glasses, turning black and white, or changing sex.
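
A minimal sketch of such a sweep, assuming z starts at the average value 0 in every dimension and a single dimension is moved in equal steps from -1 to 1. The function name `sweep_dimension`, the step count, and the chosen dimension are hypothetical; each resulting vector would be fed to Generator to render one frame.

```python
Z_DIM = 100  # length of z, as stated on the page


def sweep_dimension(dim, steps=9, base=None, low=-1.0, high=1.0):
    """Return `steps` copies of `base` in which only z[dim] moves from low to high."""
    if base is None:
        base = [0.0] * Z_DIM  # the "average" z described above
    if not 0 <= dim < len(base):
        raise IndexError("dim out of range")
    if steps < 2:
        raise ValueError("need at least two steps")
    frames = []
    for i in range(steps):
        z = list(base)  # copy so every frame is an independent vector
        z[dim] = low + (high - low) * i / (steps - 1)
        frames.append(z)
    return frames


frames = sweep_dimension(dim=3, steps=5)
print([z[3] for z in frames])  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Keeping every other dimension fixed is what makes the visible change attributable to the one value being varied.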

프사 뉴럴의 코드는 이곳에 공개되어 있습니다.

The code of Neural Face can be found here.

Misc.

마지막으로 튜링 테스트를 해 보겠습니다 :)
아래 사진 중에서 진짜 사진은 무엇일까요?

(마우스로 클릭하면 정답이 보입니다)

Lastly, let's conduct a Turing Test :)
Can you guess which are the real images?

(The answer will be shown if you click an image)

Other Projects

Taehoon Kim

@carpedm20

================================================ FILE: web/js/app.js ================================================ window.mobilecheck = function() { var check = false; (function(a){if(/(android|bb\d+|meego).+mobile|avantgo|bada\/|blackberry|blazer|compal|elaine|fennec|hiptop|iemobile|ip(hone|od)|iris|kindle|lge |maemo|midp|mmp|mobile.+firefox|netfront|opera m(ob|in)i|palm( os)?|phone|p(ixi|re)\/|plucker|pocket|psp|series(4|6)0|symbian|treo|up\.(browser|link)|vodafone|wap|windows ce|xda|xiino/i.test(a)||/1207|6310|6590|3gso|4thp|50[1-6]i|770s|802s|a wa|abac|ac(er|oo|s\-)|ai(ko|rn)|al(av|ca|co)|amoi|an(ex|ny|yw)|aptu|ar(ch|go)|as(te|us)|attw|au(di|\-m|r |s )|avan|be(ck|ll|nq)|bi(lb|rd)|bl(ac|az)|br(e|v)w|bumb|bw\-(n|u)|c55\/|capi|ccwa|cdm\-|cell|chtm|cldc|cmd\-|co(mp|nd)|craw|da(it|ll|ng)|dbte|dc\-s|devi|dica|dmob|do(c|p)o|ds(12|\-d)|el(49|ai)|em(l2|ul)|er(ic|k0)|esl8|ez([4-7]0|os|wa|ze)|fetc|fly(\-|_)|g1 u|g560|gene|gf\-5|g\-mo|go(\.w|od)|gr(ad|un)|haie|hcit|hd\-(m|p|t)|hei\-|hi(pt|ta)|hp( i|ip)|hs\-c|ht(c(\-| |_|a|g|p|s|t)|tp)|hu(aw|tc)|i\-(20|go|ma)|i230|iac( |\-|\/)|ibro|idea|ig01|ikom|im1k|inno|ipaq|iris|ja(t|v)a|jbro|jemu|jigs|kddi|keji|kgt( |\/)|klon|kpt |kwc\-|kyo(c|k)|le(no|xi)|lg( g|\/(k|l|u)|50|54|\-[a-w])|libw|lynx|m1\-w|m3ga|m50\/|ma(te|ui|xo)|mc(01|21|ca)|m\-cr|me(rc|ri)|mi(o8|oa|ts)|mmef|mo(01|02|bi|de|do|t(\-| |o|v)|zz)|mt(50|p1|v )|mwbp|mywa|n10[0-2]|n20[2-3]|n30(0|2)|n50(0|2|5)|n7(0(0|1)|10)|ne((c|m)\-|on|tf|wf|wg|wt)|nok(6|i)|nzph|o2im|op(ti|wv)|oran|owg1|p800|pan(a|d|t)|pdxg|pg(13|\-([1-8]|c))|phil|pire|pl(ay|uc)|pn\-2|po(ck|rt|se)|prox|psio|pt\-g|qa\-a|qc(07|12|21|32|60|\-[2-7]|i\-)|qtek|r380|r600|raks|rim9|ro(ve|zo)|s55\/|sa(ge|ma|mm|ms|ny|va)|sc(01|h\-|oo|p\-)|sdk\/|se(c(\-|0|1)|47|mc|nd|ri)|sgh\-|shar|sie(\-|m)|sk\-0|sl(45|id)|sm(al|ar|b3|it|t5)|so(ft|ny)|sp(01|h\-|v\-|v 
)|sy(01|mb)|t2(18|50)|t6(00|10|18)|ta(gt|lk)|tcl\-|tdg\-|tel(i|m)|tim\-|t\-mo|to(pl|sh)|ts(70|m\-|m3|m5)|tx\-9|up(\.b|g1|si)|utst|v400|v750|veri|vi(rg|te)|vk(40|5[0-3]|\-v)|vm40|voda|vulc|vx(52|53|60|61|70|80|81|83|85|98)|w3c(\-| )|webc|whit|wi(g |nc|nw)|wmlb|wonu|x700|yas\-|your|zeto|zte\-/i.test(a.substr(0,4)))check = true})(navigator.userAgent||navigator.vendor||window.opera); return check; } function rgb2hex(rgb){ rgb = rgb.match(/^rgba?[\s+]?\([\s+]?(\d+)[\s+]?,[\s+]?(\d+)[\s+]?,[\s+]?(\d+)[\s+]?/i); return (rgb && rgb.length === 4) ? ("0" + parseInt(rgb[1],10).toString(16)).slice(-2) + ("0" + parseInt(rgb[2],10).toString(16)).slice(-2) + ("0" + parseInt(rgb[3],10).toString(16)).slice(-2) : ""; } var make_z = function(max_length, scale) { var z = [] var scale = scale | 2; while(z.length < max_length){ var randomnumber=Math.random() * scale; var found=false; for(var i=0;i1) return 255; else if(x<-1) return 0; else return 255*(x+1.0)/2.0; } function cloneCanvas(oldCanvas) { //create a new canvas var newCanvas = document.createElement('canvas'); var context = newCanvas.getContext('2d'); //set dimensions newCanvas.width = oldCanvas.width; newCanvas.height = oldCanvas.height; //apply the old canvas to the new one context.drawImage(oldCanvas, 0, 0); //return the new canvas return newCanvas; } $( document ).ready(function() { draw_pixels(make_z(100, 255)); $('.slick').slick({ slidesToShow: 2, autoplay: true, dots: true, autoplaySpeed: 3000, responsive: [ { breakpoint: 980, settings: { slidesToShow: 1, slidesToScroll: 1 } } ] }); $('.turing-slick').slick({ slidesToShow: 6, autoplay: true, dots: true, autoplaySpeed: 3000, responsive: [ { breakpoint: 1200, settings: { slidesToShow: 5, slidesToScroll: 5 } }, { breakpoint: 980, settings: { slidesToShow: 3, slidesToScroll: 3 } } ] }); $("[data-toggle=tooltip]").tooltip(); var layer_defs = []; layer_defs.push({type:"input", out_sx:1, out_sy:1, out_depth:100}); layer_defs.push({type:"deconv", sx:4, filters:512, stride:1, 
pad:0, bn:true, activation:"relu"}); layer_defs.push({type:"deconv", sx:4, filters:256, stride:2, pad:1, bn:true, activation:"relu"}); layer_defs.push({type:"deconv", sx:4, filters:128, stride:2, pad:1, bn:true, activation:"relu"}); layer_defs.push({type:"deconv", sx:4, filters:64, stride:2, pad:1, bn:true, activation:"relu"}); layer_defs.push({type:"deconv", sx:4, filters:3, stride:2, pad:1, activation:"tanh"}); var net = new convnetjs.Net(); net.makeLayers(layer_defs); net.layers[1].fromJSON(layer_0); net.layers[3].fromJSON(layer_1); net.layers[5].fromJSON(layer_2); net.layers[7].fromJSON(layer_3); net.layers[9].fromJSON(layer_4); var input = new convnetjs.Vol(1, 1, 100, 0.0); var duplicates = []; var pixels = []; var draw = function() { cur_pixel = get_pixels(); input.w = cur_pixel; var output = net.forward(input); var scale = 2; var W = output.sx * scale; var H = output.sy * scale; var canv = document.createElement("canvas"); canv.width = W; canv.height = H; var ctx = canv.getContext("2d"); var g = ctx.createImageData(W, H); for(var d=0; d < 3; d++) { for(var x=0; x < output.sx; x++) { for(var y=0; y < output.sy; y++) { var dval = clip_pixel(output.get(x,y,d)); for(var dx = 0; dx < scale; dx++) { for(var dy =0 ;dy < scale; dy++) { var pp = ((W * (y*scale + dy)) + (dx + x*scale)) * 4; g.data[pp + d] = dval; if(d===0) g.data[pp+3] = 255; // alpha channel } } } } } ctx.putImageData(g, 0, 0); document.getElementById("images").appendChild(canv); duplicates.push(cloneCanvas(document.getElementById("pixel"))); pixels.push(cur_pixel); $(canv).tooltip({ html: true, template: '', title: function(e) { var duplicated = duplicates[parseInt($(this).attr("id")) - 1]; return duplicated; }, }).hide() .attr("id", duplicates.length) .fadeIn(1000) .click(function() { draw_pixels(recover_pixels(pixels[parseInt($(this).attr("id")) - 1])); }); } $(".draw").click(draw); $(".shuffle").click(function() { draw_pixels(make_z(100, 255)); draw(); }); $("#loading").hide(); 
$("#draw-btn").show(); if (!mobilecheck()) { draw(); } $("#fakeLoader").fadeOut(3000); }); // deactivate element function deactivate($el) { return $el.removeClass("active"); } // activate element function activate($el) { return $el.addClass("active"); } // disable element function disable($el) { return $el.addClass("disabled"); } // enable element function enable($el) { return $el.removeClass("disabled"); } // is element enabled? function isEnabled($el) { return $el.size() > 0 && !$el.hasClass("disabled"); } // track event function trackEvent(e, url) { _gaq.push(['_trackEvent', 'Drawings', e, url]); } // returns mouse or tap event relative coordinates function getCoordinates(e) { var x, y; x = e.offsetX ? e.offsetX : e.pageX - e.target.parentNode.offsetLeft; y = e.offsetY ? e.offsetY : e.pageY - e.target.parentNode.offsetTop; return {x: x, y: y}; } var currentColor = "#000000", copyFrameIndex = -1, tips = true; // mouse down event callback function mouseDownCallback(e) { PIXEL.setDraw(true); var coordinates = getCoordinates(e); PIXEL.doAction(coordinates.x, coordinates.y, currentColor); } // mouse move event callback function mouseMoveCallback(e) { var coordinates = getCoordinates(e); PIXEL.doAction(coordinates.x, coordinates.y, currentColor); e.preventDefault(); } // mouse up event callback function mouseUpCallback() { PIXEL.setDraw(false); } var canvas = $("#pixel"); PIXEL.init(canvas[0], true); // set drawing on mousedown canvas.mousedown(mouseDownCallback).mousemove(mouseMoveCallback); canvas.bind('touchstart', mouseDownCallback).bind('touchmove', mouseMoveCallback); // reset drawing on mouseup $(document).mouseup(mouseUpCallback); $(document).bind('touchend', mouseUpCallback); $(".action.selectable").click(function() { PIXEL.setAction($(this).data('action')); deactivate($(".action.selectable.active")); activate($(this)); }); // colors $(".color").click(function() { currentColor = $(this).data('color'); deactivate($(".color.active")); activate($(this)); }); // 
undo $(".undo").click(function() { PIXEL.undo(); }); // copy $(".copy").click(function() { copyFrameIndex = PIXEL.getCurrentFrameId(); }); $(".paste").click(function() { if(copyFrameIndex > -1 && copyFrameIndex < PIXEL.getFramesLength()) { PIXEL.pasteFrameAt(copyFrameIndex); } }); $(".rotate").click(function() { PIXEL.rotate(); }); // jQuery to collapse the navbar on scroll function collapseNavbar() { if ($(".navbar").offset().top > 50) { $(".navbar-fixed-top").addClass("top-nav-collapse"); } else { $(".navbar-fixed-top").removeClass("top-nav-collapse"); } } $(window).scroll(collapseNavbar); $(document).ready(collapseNavbar); // jQuery for page scrolling feature - requires jQuery Easing plugin $(function() { $('a.page-scroll').bind('click', function(event) { var $anchor = $(this); $('html, body').stop().animate({ scrollTop: $($anchor.attr('href')).offset().top }, 1500, 'easeInOutExpo'); event.preventDefault(); }); }); // Closes the Responsive Menu on Menu Item Click $('.navbar-collapse ul li a').click(function() { if ($(this).attr('class') != 'dropdown-toggle active' && $(this).attr('class') != 'dropdown-toggle') { $('.navbar-toggle:visible').click(); } }); ================================================ FILE: web/js/convnet.js ================================================ var convnetjs = convnetjs || { REVISION: 'ALPHA' }; (function(global) { "use strict"; // Random number utilities var return_v = false; var v_val = 0.0; var gaussRandom = function() { if(return_v) { return_v = false; return v_val; } var u = 2*Math.random()-1; var v = 2*Math.random()-1; var r = u*u + v*v; if(r == 0 || r > 1) return gaussRandom(); var c = Math.sqrt(-2*Math.log(r)/r); v_val = v*c; // cache this return_v = true; return u*c; } var randf = function(a, b) { return Math.random()*(b-a)+a; } var randi = function(a, b) { return Math.floor(Math.random()*(b-a)+a); } var randn = function(mu, std){ return mu+gaussRandom()*std; } // Array utilities var zeros = function(n) { 
if(typeof(n)==='undefined' || isNaN(n)) { return []; } if(typeof ArrayBuffer === 'undefined') { // lacking browser support var arr = new Array(n); for(var i=0;i maxv) { maxv = w[i]; maxi = i; } if(w[i] < minv) { minv = w[i]; mini = i; } } return {maxi: maxi, maxv: maxv, mini: mini, minv: minv, dv:maxv-minv}; } // create random permutation of numbers, in range [0...n-1] var randperm = function(n) { var i = n, j = 0, temp; var array = []; for(var q=0;qright var augment = function(V, crop, dx, dy, fliplr) { // note assumes square outputs of size crop x crop if(typeof(fliplr)==='undefined') var fliplr = false; if(typeof(dx)==='undefined') var dx = global.randi(0, V.sx - crop); if(typeof(dy)==='undefined') var dy = global.randi(0, V.sy - crop); // randomly sample a crop in the input volume var W; if(crop !== V.sx || dx!==0 || dy!==0) { W = new Vol(crop, crop, V.depth, 0.0); for(var x=0;x=V.sx || y+dy<0 || y+dy>=V.sy) continue; // oob for(var d=0;d=0 && oy=0 && ox=0 && oy=0 && ox=0 && iy=0 && ix=0 && oy=0 && ox a) { a = v; winx=ox; winy=oy;} } } } this.switchx[n] = winx; this.switchy[n] = winy; n++; A.set(ax, ay, d, a); } } } this.out_act = A; return this.out_act; }, backward: function() { // pooling layers have no parameters, so simply compute // gradient wrt data here var V = this.in_act; V.dw = global.zeros(V.w.length); // zero out gradient wrt data var A = this.out_act; // computed in forward pass var n = 0; for(var d=0;d amax) amax = as[i]; } // compute exponentials (carefully to not blow up) var es = global.zeros(this.out_depth); var esum = 0.0; for(var i=0;i 0) { // violating dimension, apply loss x.dw[i] += 1; x.dw[y] -= 1; loss += ydiff; } } return loss; }, getParamsAndGrads: function() { return []; }, toJSON: function() { var json = {}; json.out_depth = this.out_depth; json.out_sx = this.out_sx; json.out_sy = this.out_sy; json.layer_type = this.layer_type; json.num_inputs = this.num_inputs; return json; }, fromJSON: function(json) { this.out_depth = 
json.out_depth; this.out_sx = json.out_sx; this.out_sy = json.out_sy; this.layer_type = json.layer_type; this.num_inputs = json.num_inputs; } } global.RegressionLayer = RegressionLayer; global.SoftmaxLayer = SoftmaxLayer; global.SVMLayer = SVMLayer; })(convnetjs); (function(global) { "use strict"; var Vol = global.Vol; // convenience // Implements ReLU nonlinearity elementwise // x -> max(0, x) // the output is in [0, inf) var ReluLayer = function(opt) { var opt = opt || {}; // computed this.out_sx = opt.in_sx; this.out_sy = opt.in_sy; this.out_depth = opt.in_depth; this.layer_type = 'relu'; } ReluLayer.prototype = { forward: function(V, is_training) { this.in_act = V; var V2 = V.clone(); var N = V.w.length; var V2w = V2.w; for(var i=0;i 1/(1+e^(-x)) // so the output is between 0 and 1. var SigmoidLayer = function(opt) { var opt = opt || {}; // computed this.out_sx = opt.in_sx; this.out_sy = opt.in_sy; this.out_depth = opt.in_depth; this.layer_type = 'sigmoid'; } SigmoidLayer.prototype = { forward: function(V, is_training) { this.in_act = V; var V2 = V.cloneAndZero(); var N = V.w.length; var V2w = V2.w; var Vw = V.w; for(var i=0;i max(x) // where x is a vector of size group_size. Ideally of course, // the input size should be exactly divisible by group_size var MaxoutLayer = function(opt) { var opt = opt || {}; // required this.group_size = typeof opt.group_size !== 'undefined' ? opt.group_size : 2; // computed this.out_sx = opt.in_sx; this.out_sy = opt.in_sy; this.out_depth = Math.floor(opt.in_depth / this.group_size); this.layer_type = 'maxout'; this.switches = global.zeros(this.out_sx*this.out_sy*this.out_depth); // useful for backprop } MaxoutLayer.prototype = { forward: function(V, is_training) { this.in_act = V; var N = this.out_depth; var V2 = new Vol(this.out_sx, this.out_sy, this.out_depth, 0.0); // optimization branch. If we're operating on 1D arrays we dont have // to worry about keeping track of x,y,d coordinates inside // input volumes. 
In convnets we do :( if(this.out_sx === 1 && this.out_sy === 1) { for(var i=0;i a) { a = a2; ai = j; } } V2.w[i] = a; this.switches[i] = ix + ai; } } else { var n=0; // counter for switches for(var x=0;x a) { a = a2; ai = j; } } V2.set(x,y,i,a); this.switches[n] = ix + ai; n++; } } } } this.out_act = V2; return this.out_act; }, backward: function() { var V = this.in_act; // we need to set dw of this var V2 = this.out_act; var N = this.out_depth; V.dw = global.zeros(V.w.length); // zero out gradient wrt data // pass the gradient through the appropriate switch if(this.out_sx === 1 && this.out_sy === 1) { for(var i=0;i tanh(x) // so the output is between -1 and 1. var TanhLayer = function(opt) { var opt = opt || {}; // computed this.out_sx = opt.in_sx; this.out_sy = opt.in_sy; this.out_depth = opt.in_depth; this.layer_type = 'tanh'; } TanhLayer.prototype = { forward: function(V, is_training) { this.in_act = V; var V2 = V.cloneAndZero(); var N = V.w.length; for(var i=0;i= 2, 'Error! At least one input layer and one loss layer are required.'); assert(defs[0].type === 'input', 'Error! 
First layer must be the input layer, to declare size of inputs'); // desugar layer_defs for adding activation, dropout layers etc var desugar = function() { var new_defs = []; for(var i=0;i0) { var prev = this.layers[i-1]; def.in_sx = prev.out_sx; def.in_sy = prev.out_sy; def.in_depth = prev.out_depth; } switch(def.type) { case 'fc': this.layers.push(new global.FullyConnLayer(def)); break; case 'lrn': this.layers.push(new global.LocalResponseNormalizationLayer(def)); break; case 'dropout': this.layers.push(new global.DropoutLayer(def)); break; case 'input': this.layers.push(new global.InputLayer(def)); break; case 'softmax': this.layers.push(new global.SoftmaxLayer(def)); break; case 'regression': this.layers.push(new global.RegressionLayer(def)); break; case 'conv': this.layers.push(new global.ConvLayer(def)); break; case 'deconv': this.layers.push(new global.DeconvLayer(def)); break; case 'pool': this.layers.push(new global.PoolLayer(def)); break; case 'relu': this.layers.push(new global.ReluLayer(def)); break; case 'sigmoid': this.layers.push(new global.SigmoidLayer(def)); break; case 'tanh': this.layers.push(new global.TanhLayer(def)); break; case 'maxout': this.layers.push(new global.MaxoutLayer(def)); break; case 'svm': this.layers.push(new global.SVMLayer(def)); break; default: console.log('ERROR: UNRECOGNIZED LAYER TYPE: ' + def.type); } } }, // forward prop the network. 
// The trainer class passes is_training = true, but when this function is // called from outside (not from the trainer), it defaults to prediction mode forward: function(V, is_training) { if(typeof(is_training) === 'undefined') is_training = false; var act = this.layers[0].forward(V, is_training); for(var i=1;i=0;i--) { // first layer assumed input this.layers[i].backward(); } return loss; }, getParamsAndGrads: function() { // accumulate parameters and gradients for the entire network var response = []; for(var i=0;i maxv) { maxv = p[i]; maxi = i;} } return maxi; // return index of the class with highest class probability }, toJSON: function() { var json = {}; json.layers = []; for(var i=0;i 0.0)) { // only vanilla sgd doesnt need either lists // momentum needs gsum // adagrad needs gsum // adam and adadelta needs gsum and xsum for(var i=0;i 0 ? 1 : -1); var l2grad = l2_decay * (p[j]); var gij = (l2grad + l1grad + g[j]) / this.batch_size; // raw batch gradient var gsumi = this.gsum[i]; var xsumi = this.xsum[i]; if(this.method === 'adam') { // adam update gsumi[j] = gsumi[j] * this.beta1 + (1- this.beta1) * gij; // update biased first moment estimate xsumi[j] = xsumi[j] * this.beta2 + (1-this.beta2) * gij * gij; // update biased second moment estimate var biasCorr1 = gsumi[j] * (1 - Math.pow(this.beta1, this.k)); // correct bias first moment estimate var biasCorr2 = xsumi[j] * (1 - Math.pow(this.beta2, this.k)); // correct bias second moment estimate var dx = - this.learning_rate * biasCorr1 / (Math.sqrt(biasCorr2) + this.eps); p[j] += dx; } else if(this.method === 'adagrad') { // adagrad update gsumi[j] = gsumi[j] + gij * gij; var dx = - this.learning_rate / Math.sqrt(gsumi[j] + this.eps) * gij; p[j] += dx; } else if(this.method === 'windowgrad') { // this is adagrad but with a moving window weighted average // so the gradient is not accumulated over the entire history of the run. // it's also referred to as Idea #1 in Zeiler paper on Adadelta. 
Seems reasonable to me! gsumi[j] = this.ro * gsumi[j] + (1-this.ro) * gij * gij; var dx = - this.learning_rate / Math.sqrt(gsumi[j] + this.eps) * gij; // eps added for better conditioning p[j] += dx; } else if(this.method === 'adadelta') { gsumi[j] = this.ro * gsumi[j] + (1-this.ro) * gij * gij; var dx = - Math.sqrt((xsumi[j] + this.eps)/(gsumi[j] + this.eps)) * gij; xsumi[j] = this.ro * xsumi[j] + (1-this.ro) * dx * dx; // yes, xsum lags behind gsum by 1. p[j] += dx; } else if(this.method === 'nesterov') { var dx = gsumi[j]; gsumi[j] = gsumi[j] * this.momentum + this.learning_rate * gij; dx = this.momentum * dx - (1.0 + this.momentum) * gsumi[j]; p[j] += dx; } else { // assume SGD if(this.momentum > 0.0) { // momentum update var dx = this.momentum * gsumi[j] - this.learning_rate * gij; // step gsumi[j] = dx; // back this up for next iteration of momentum p[j] += dx; // apply corrected gradient } else { // vanilla sgd p[j] += - this.learning_rate * gij; } } g[j] = 0.0; // zero out gradient so that we can begin accumulating anew } } } // appending softmax_loss for backwards compatibility, but from now on we will always use cost_loss // in future, TODO: have to completely redo the way loss is done around the network as currently // loss is a bit of a hack. Ideally, user should specify arbitrary number of loss functions on any layer // and it should all be computed correctly and automatically. 
return {fwd_time: fwd_time, bwd_time: bwd_time, l2_decay_loss: l2_decay_loss, l1_decay_loss: l1_decay_loss, cost_loss: cost_loss, softmax_loss: cost_loss, loss: cost_loss + l1_decay_loss + l2_decay_loss} } } global.Trainer = Trainer; global.SGDTrainer = Trainer; // backwards compatibility })(convnetjs); (function(global) { "use strict"; // used utilities, make explicit local references var randf = global.randf; var randi = global.randi; var Net = global.Net; var Trainer = global.Trainer; var maxmin = global.maxmin; var randperm = global.randperm; var weightedSample = global.weightedSample; var getopt = global.getopt; var arrUnique = global.arrUnique; /* A MagicNet takes data: a list of convnetjs.Vol(), and labels which for now are assumed to be class indeces 0..K. MagicNet then: - creates data folds for cross-validation - samples candidate networks - evaluates candidate networks on all data folds - produces predictions by model-averaging the best networks */ var MagicNet = function(data, labels, opt) { var opt = opt || {}; if(typeof data === 'undefined') { data = []; } if(typeof labels === 'undefined') { labels = []; } // required inputs this.data = data; // store these pointers to data this.labels = labels; // optional inputs this.train_ratio = getopt(opt, 'train_ratio', 0.7); this.num_folds = getopt(opt, 'num_folds', 10); this.num_candidates = getopt(opt, 'num_candidates', 50); // we evaluate several in parallel // how many epochs of data to train every network? for every fold? // higher values mean higher accuracy in final results, but more expensive this.num_epochs = getopt(opt, 'num_epochs', 50); // number of best models to average during prediction. 
Usually higher = better this.ensemble_size = getopt(opt, 'ensemble_size', 10); // candidate parameters this.batch_size_min = getopt(opt, 'batch_size_min', 10); this.batch_size_max = getopt(opt, 'batch_size_max', 300); this.l2_decay_min = getopt(opt, 'l2_decay_min', -4); this.l2_decay_max = getopt(opt, 'l2_decay_max', 2); this.learning_rate_min = getopt(opt, 'learning_rate_min', -4); this.learning_rate_max = getopt(opt, 'learning_rate_max', 0); this.momentum_min = getopt(opt, 'momentum_min', 0.9); this.momentum_max = getopt(opt, 'momentum_max', 0.9); this.neurons_min = getopt(opt, 'neurons_min', 5); this.neurons_max = getopt(opt, 'neurons_max', 30); // computed this.folds = []; // data fold indices, gets filled by sampleFolds() this.candidates = []; // candidate networks that are being currently evaluated this.evaluated_candidates = []; // history of all candidates that were fully evaluated on all folds this.unique_labels = arrUnique(labels); this.iter = 0; // iteration counter, goes from 0 -> num_epochs * num_training_data this.foldix = 0; // index of active fold // callbacks this.finish_fold_callback = null; this.finish_batch_callback = null; // initializations if(this.data.length > 0) { this.sampleFolds(); this.sampleCandidates(); } }; MagicNet.prototype = { // sets this.folds to a sampling of this.num_folds folds sampleFolds: function() { var N = this.data.length; var num_train = Math.floor(this.train_ratio * N); this.folds = []; // flush folds, if any for(var i=0;i