Repository: forcecore/Keras-GAN-Animeface-Character
Branch: master
Commit: eef29baec4c8
Files: 11
Total size: 35.7 KB

Directory structure:
gitextract_c27pvakc/
├── .gitignore
├── LICENSE
├── README.md
├── args.py
├── data.py
├── discrimination.py
├── gan.py
├── layers.py
├── make_mp4.sh
├── nets.py
└── requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2017

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute,
sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

================================================
FILE: README.md
================================================
# Keras-GAN-Animeface-Character

GAN example for Keras. Cuz MNIST is too small and there should be an example on something more realistic.

## Some results

### Training for 22 epochs

Youtube Video, click on the image

[![Training for 22 epochs](http://img.youtube.com/vi/YuGFmgCQV8I/0.jpg)](https://www.youtube.com/watch?v=YuGFmgCQV8I)

### Loss graph for 5000 mini-batches

![Loss graph](html_data/loss.png)

1 mini-batch = 64 images. Dataset = 14490, hence 5000 mini-batches is approximately 22 epochs.

### Some outputs of 5000th mini-batch

![Some outputs of 5000th mini-batch](html_data/frame_00000499.png)

### Some training images

![Some inputs](html_data/reals.png)

## Useful resources, before you go on

* There are great examples on MNIST already. Be sure to check them out.
    * https://oshearesearch.com/index.php/2016/07/01/mnist-generative-adversarial-model-in-keras/
    * https://github.com/osh/KerasGAN
    * https://medium.com/towards-data-science/gan-by-example-using-keras-on-tensorflow-backend-1a6d515a60d0
* "How to Train a GAN? Tips and tricks to make GANs work" is a must read!
(GAN Hacks)
    * The advice was extremely helpful in making this example.
    * https://github.com/soumith/ganhacks
* Projects doing the same thing:
    * https://github.com/jayleicn/animeGAN
    * https://github.com/tdrussell/IllustrationGAN
* I used a slow implementation for the sake of simplicity. However, the correct way is:
    * https://ctmakro.github.io/site/on_learning/fast_gan_in_keras.html
* https://github.com/shekkizh/neuralnetworks.thought-experiments/blob/master/Generative%20Models/GAN/Readme.md

## How to run this example

### Setup

* My environment: Python 3.6 + Keras 2.0.4 + Tensorflow 1.x
    * If you are on Keras 2.0.0, you need to update it; otherwise BatchNormalization() will cause a bug, saying "you need to pass float to input" or something like that from the Tensorflow back end.
* Use [virtualenv](https://docs.python.org/3/tutorial/venv.html) to initialize a similar environment (python and dependencies):

```bash
pip install virtualenv
virtualenv -p /python3.6 venv
source venv/bin/activate
pip install -r requirements.txt
```

* I HATE making a program that has so many command line parameters to pass. Many of the parameters are there in the scripts. Adjust the script as you need. The "main()" function is at the bottom of the script, as people do in C/C++.
* Most global parameters are defined in args.py.
    * They are defined as class variables, not instance variables, so you may have trouble running/training multiple instances of the GAN with different parameters (which is very unlikely to happen).
* Download the dataset from http://www.nurs.or.jp/~nagadomi/animeface-character-dataset/
    * Extract it to this directory so that the script can find ./animeface-character-dataset/thumb/
    * Any dataset should work in principle, but GAN is sensitive to hyperparameters and may not work on yours. I tuned the parameters for animeface-character-dataset.

### Preprocessing

* Run the preprocessing script.
Resizing and scaling the input once up front saves training time compared with doing those tasks on the fly in the training loop.
    * ./data.py
    * When loaded from PNG files, the RGB values are in [0, 255] (uint8 type). data.py will collect the images, resize them to 64x64 and scale the RGB values so that they will be in the [-1.0, 1.0] range.
    * data.py will only sample a subset of the dataset if configured to do so. The size of the subset is determined by dataset_sz defined in args.py.
    * The images will be written to data.hdf5.
        * Made it small to verify the training is working.
        * You can increase it but you need to adjust the network sizes accordingly.
    * Again, which files to read is defined in the script at the bottom, not by sys.argv.
* You need a large enough dataset. Otherwise the discriminator will sort of "memorize" the true data and reject all that's generated.

### Training

* Open gan.py, then at the bottom, uncomment train\_autoenc() if you wish.
    * This is useful for seeing the generator network's capability to reproduce the input.
    * The auto-encoder will be trained on input images.
    * The output will be blurry, because the auto-encoder uses a mean-squared-error loss. (This is why GAN got invented in the first place!)
* To run training, modify main() so that train\_gan() is uncommented.
    * The script will dump reals.png and fakes.png every 10 mini-batches so that you can see how the training is going.
    * The training takes a while. For this example on the Anime Face dataset, it took about 10000 mini-batches to get good results.
        * If you see only uniform color or "modern art" until 2000, then the training is not working!
    * The script also dumps weights every 10 batches. Utilize them to save training time. Weights from before divergence are preferred :) Uncomment load\_weights() in train\_gan().

### Training tips

What I experienced during my training of GAN.

* As described in GAN Hacks, the discriminator should be ahead of the generator so that the generator can be "guided" by the discriminator.
    * If you look at the loss graph at https://github.com/osh/KerasGAN, they had gen loss in the range of 2 to 4. Their training worked well. The discriminator loss is low, around 0.1.
    * You'll need trial and error to get the hyper-parameters right so that the training stays in the stable, balanced zone. That includes the learning rates of D and G, momentums, etc.
* The convergence is quite sensitive to LR, beware!
* If things go well, the discriminator losses for detecting real/fake (d\_loss1 for real, d\_loss0 for fake in the script) should be less than or around 0.1, which means it is good at telling whether the input is real or fake.
    * If the learning rate is too high, the discriminator will diverge and one of the losses will get high and will not fall. Training fails in this case.
    * If you make LR too small, it will only slow the learning and will not prevent other issues such as oscillation. It only needs to be lower than a certain threshold that is data dependent.
    * If adjusting LR doesn't work, it could be a lack of complexity in the discriminator. Add more layers, or change some other parameters. It could be anything :( Good luck!
* On the other hand, generator loss will be relatively higher than discriminator loss. In this script, it oscillates in the range 0.1 to 4.
* If you see any of the D losses staying > 15 (when batch size is 32), the training is screwed.
* In case of G loss > 15, see if it escapes within 30 batches. If it stays there for too long, it isn't good, I think.
* In case you're seeing high G loss, it could mean it can't keep up with the discriminator. You might need to increase its LR. (It must stay lower than the discriminator's LR though.)
* One final piece of the training I was missing was the momentum parameter in BatchNormalization. I found out about it in this link: https://github.com/shekkizh/neuralnetworks.thought-experiments/blob/master/Generative%20Models/GAN/Readme.md
    * Sort of interesting, in PyTorch, momentum parameter for BatchNorm is 0.1, according to the API documents, while in Keras it is 0.99.
I'm not sure if 0.1 in PyTorch actually means 1 - 0.1. I didn't look into the PyTorch backend implementation.

================================================
FILE: args.py
================================================
#!/usr/bin/env python3

class Args :
    # dataset size... Use positive number to sample subset of the full dataset.
    dataset_sz = -1

    # Archive outputs of training here for animating later.
    anim_dir = "anim"

    # images size we will work on. (sz, sz, 3)
    sz = 64

    # alpha, used by leaky relu of D and G networks.
    alpha_D = 0.2
    alpha_G = 0.2

    # batch size, during training.
    batch_sz = 64

    # Length of the noise vector to generate the faces from.
    # Latent space z
    noise_shape = (1, 1, 100)

    # GAN training can be ruined any moment if not careful.
    # Archive some snapshots in this directory.
    snapshot_dir = "./snapshots"

    # dropout probability
    dropout = 0.3

    # noisy label magnitude
    label_noise = 0.1

    # history to keep. Slower training but higher quality.
    history_sz = 8

    genw = "gen.hdf5"
    discw = "disc.hdf5"

    # Weight initialization function.
    #kernel_initializer = 'Orthogonal'
    #kernel_initializer = 'RandomNormal'
    # Same as default in Keras, but good for GAN, says
    # https://github.com/gheinrich/DIGITS-GAN/blob/master/examples/weight-init/README.md#experiments-with-lenet-on-mnist
    kernel_initializer = 'glorot_uniform'

    # Since DCGAN paper, everybody uses 0.5 and for me, it works the best too.
    # I tried 0.9, 0.1.
    adam_beta = 0.5

    # BatchNormalization matters too.
    bn_momentum = 0.3

================================================
FILE: data.py
================================================
#!/usr/bin/env python3

import glob
import h5py
import numpy as np
import scipy.misc
import random
import cv2

from args import Args



def normalize4gan(im):
    '''
    Convert colorspace and scale the input in [-1, 1] range,
    as described in ganhacks
    '''
    #im = cv2.cvtColor(im, cv2.COLOR_RGB2YCR_CB).astype(np.float32)
    # HSV... not helpful.
    im = im.astype(np.float32)
    # 127.5 maps [0, 255] onto [-1, 1] exactly and mirrors denormalize4gan.
    # (Dividing by 128 here and multiplying by 127 there did not round-trip.)
    im /= 127.5
    im -= 1.0 # now in [-1, 1]
    return im



def denormalize4gan(im):
    '''
    Does opposite of normalize4gan:
    [-1, 1] to [0, 255].
    Warning: input im is modified in-place!
    '''
    im += 1.0 # in [0, 2]
    im *= 127.5 # in [0, 255]
    return im.astype(np.uint8)



def make_hdf5(ofname, wildcard):
    '''
    Preprocess files given by wildcard and save them in hdf5 file, as ofname.
    '''
    pool = list(glob.glob(wildcard))
    if Args.dataset_sz <= 0:
        fnames = pool
    else:
        fnames = []
        for i in range(Args.dataset_sz):
            # possible duplicate but don't care
            fnames.append(random.choice(pool))

    with h5py.File(ofname, "w") as f:
        faces = f.create_dataset("faces", (len(fnames), Args.sz, Args.sz, 3), dtype='f')

        for i, fname in enumerate(fnames):
            print(fname)
            im = scipy.misc.imread(fname, mode='RGB') # some have alpha channel
            im = scipy.misc.imresize(im, (Args.sz, Args.sz))
            faces[i] = normalize4gan(im)



def test(hdff):
    '''
    Reads in hdf file and check if pixels are scaled in [-1, 1] range.
    '''
    with h5py.File(hdff, "r") as f:
        X = f.get("faces")
        print(np.min(X[:,:,:,0]))
        print(np.max(X[:,:,:,0]))
        print(np.min(X[:,:,:,1]))
        print(np.max(X[:,:,:,1]))
        print(np.min(X[:,:,:,2]))
        print(np.max(X[:,:,:,2]))
        print("Dataset size:", len(X))
        assert np.max(X) <= 1.0
        assert np.min(X) >= -1.0



if __name__ == "__main__" :
    # Thankfully the dataset is in PNG, not JPEG.
    # Anime style suffers from significant quality degradation in JPEG.
    make_hdf5("data.hdf5", "animeface-character-dataset/thumb/*/*.png")
    #make_hdf5("data.hdf5", "animeface-character-dataset/thumb/025*/*.png")

    # Uncomment and run test, if you want.
    test("data.hdf5")

================================================
FILE: discrimination.py
================================================
from keras import backend as K
from keras.engine import InputSpec, Layer
from keras import initializers, regularizers, constraints


# From a PR that is not pulled into Keras
# https://github.com/fchollet/keras/pull/3677
# I updated the code to work on Keras 2.x

class MinibatchDiscrimination(Layer):
    """Concatenates to each sample information about how different the input
    features for that sample are from features of other samples in the same
    minibatch, as described in Salimans et al. (2016). Useful for preventing
    GANs from collapsing to a single output. When using this layer, generated
    samples and reference samples should be in separate batches.

    # Example

    ```python
        # apply a convolution 1d of length 3 to a sequence with 10 timesteps,
        # with 64 output filters
        model = Sequential()
        model.add(Convolution1D(64, 3, border_mode='same', input_shape=(10, 32)))
        # now model.output_shape == (None, 10, 64)

        # flatten the output so it can be fed into a minibatch discrimination layer
        model.add(Flatten())
        # now model.output_shape == (None, 640)

        # add the minibatch discrimination layer
        model.add(MinibatchDiscrimination(5, 3))
        # now model.output_shape = (None, 645)
    ```

    # Arguments
        nb_kernels: Number of discrimination kernels to use
            (dimensionality concatenated to output).
        kernel_dim: The dimensionality of the space where closeness of samples
            is calculated.
        init: name of initialization function for the weights of the layer
            (see [initializations](../initializations.md)),
            or alternatively, Theano function to use for weights initialization.
            This parameter is only relevant if you don't pass a `weights` argument.
        weights: list of numpy arrays to set as initial weights.
        W_regularizer: instance of [WeightRegularizer](../regularizers.md)
            (eg. L1 or L2 regularization), applied to the main weights matrix.
        activity_regularizer: instance of [ActivityRegularizer](../regularizers.md),
            applied to the network output.
        W_constraint: instance of the [constraints](../constraints.md) module
            (eg. maxnorm, nonneg), applied to the main weights matrix.
        input_dim: Number of channels/dimensions in the input.
            Either this argument or the keyword argument `input_shape` must be
            provided when using this layer as the first layer in a model.

    # Input shape
        2D tensor with shape: `(samples, input_dim)`.

    # Output shape
        2D tensor with shape: `(samples, input_dim + nb_kernels)`.

    # References
        - [Improved Techniques for Training GANs](https://arxiv.org/abs/1606.03498)
    """

    def __init__(self, nb_kernels, kernel_dim, init='glorot_uniform', weights=None,
                 W_regularizer=None, activity_regularizer=None,
                 W_constraint=None, input_dim=None, **kwargs):
        self.init = initializers.get(init)
        self.nb_kernels = nb_kernels
        self.kernel_dim = kernel_dim
        self.input_dim = input_dim

        self.W_regularizer = regularizers.get(W_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)

        self.W_constraint = constraints.get(W_constraint)

        self.initial_weights = weights
        self.input_spec = [InputSpec(ndim=2)]

        if self.input_dim:
            kwargs['input_shape'] = (self.input_dim,)
        super(MinibatchDiscrimination, self).__init__(**kwargs)

    def build(self, input_shape):
        assert len(input_shape) == 2

        input_dim = input_shape[1]
        self.input_spec = [InputSpec(dtype=K.floatx(),
                                     shape=(None, input_dim))]

        self.W = self.add_weight(shape=(self.nb_kernels, input_dim, self.kernel_dim),
            initializer=self.init,
            name='kernel',
            regularizer=self.W_regularizer,
            trainable=True,
            constraint=self.W_constraint)

        # Set built to true.
        super(MinibatchDiscrimination, self).build(input_shape)

    def call(self, x, mask=None):
        activation = K.reshape(K.dot(x, self.W), (-1, self.nb_kernels, self.kernel_dim))
        diffs = K.expand_dims(activation, 3) - K.expand_dims(K.permute_dimensions(activation, [1, 2, 0]), 0)
        abs_diffs = K.sum(K.abs(diffs), axis=2)
        minibatch_features = K.sum(K.exp(-abs_diffs), axis=2)
        return K.concatenate([x, minibatch_features], 1)

    def compute_output_shape(self, input_shape):
        assert input_shape and len(input_shape) == 2
        return input_shape[0], input_shape[1]+self.nb_kernels

    def get_config(self):
        # initializers.get() returns an Initializer instance in Keras 2.x,
        # which has no __name__, so serialize it instead.
        config = {'nb_kernels': self.nb_kernels,
                  'kernel_dim': self.kernel_dim,
                  'init': initializers.serialize(self.init),
                  'W_regularizer': self.W_regularizer.get_config() if self.W_regularizer else None,
                  'activity_regularizer': self.activity_regularizer.get_config() if self.activity_regularizer else None,
                  'W_constraint': self.W_constraint.get_config() if self.W_constraint else None,
                  'input_dim': self.input_dim}
        base_config = super(MinibatchDiscrimination, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

================================================
FILE: gan.py
================================================
#!/usr/bin/env python3

import os
import sys
import numpy as np
import random
from keras import models
from keras import optimizers
from keras.layers import Input
from keras.optimizers import Adam, Adagrad, Adadelta, Adamax, SGD
from keras.callbacks import CSVLogger
import scipy.misc # imsave lives in the scipy.misc submodule
import h5py
from args import Args
from data import denormalize4gan
#from layers import bilinear2x
from discrimination import MinibatchDiscrimination
from nets import build_discriminator, build_gen, build_enc

#import tensorflow as tf
#import keras
#keras.backend.get_session().run(tf.initialize_all_variables())



def sample_faces( faces ):
    reals = []
    for i in range( Args.batch_sz ) :
        j = random.randrange( len(faces) )
        face = faces[ j ]
        reals.append( face )
    reals = np.array(reals)
    return reals



def binary_noise(cnt):
    # Distribution of noise matters.
    # If you use single ranf that spans [0, 1], training will not work.
    # Well, for me at least.
    # Either normal or ranf works for me but be sure to use them with randrange(2) or something.
    #noise = np.random.normal( scale=Args.label_noise, size=((Args.batch_sz,) + Args.noise_shape) )

    # Note about noise range.
    # 0, 1 noise vs -1, 1 noise. -1, 1 seems to be better and stable.

    noise = Args.label_noise * np.random.ranf((cnt,) + Args.noise_shape) # [0, 0.1]
    noise -= 0.05 # [-0.05, 0.05]
    noise += np.random.randint(0, 2, size=((cnt,) + Args.noise_shape))

    noise -= 0.5
    noise *= 2
    return noise



def sample_fake( gen ) :
    noise = binary_noise(Args.batch_sz)
    fakes = gen.predict(noise)
    return fakes, noise



def dump_batch(imgs, cnt, ofname):
    '''
    Merges cnt x cnt generated images into one big image.
    Use the command
    $ feh dump.png --reload 1
    to refresh image periodically during training!
    '''
    assert Args.batch_sz >= cnt * cnt

    rows = []

    for i in range( cnt ) :
        cols = []
        for j in range(cnt*i, cnt*i+cnt):
            cols.append( imgs[j] )
        rows.append( np.concatenate(cols, axis=1) )

    alles = np.concatenate( rows, axis=0 )
    alles = denormalize4gan( alles )
    #alles = scipy.misc.imresize(alles, 200) # uncomment to scale
    scipy.misc.imsave( ofname, alles )



def build_networks():
    shape = (Args.sz, Args.sz, 3)

    # Learning rate is important.
    # Optimizers are important too, try experimenting them yourself to fit your dataset.
    # I recommend you read DCGAN paper.

    # Unlike gan hacks, sgd doesn't seem to work well.
    # DCGAN paper states that they used Adam for both G and D.
    #opt  = optimizers.SGD(lr=0.0001, decay=0.0, momentum=0.9, nesterov=True)
    #dopt = optimizers.SGD(lr=0.0001, decay=0.0, momentum=0.9, nesterov=True)

    # lr=0.010. Looks good, statistically (low d loss, higher g loss)
    # but too much for the G to create face.
    # If you see only one color 'flood fill' during training for about 10 batches or so,
    # training is failing.
    # If you see only a few colors (instead of colorful noise)
    # then lr is too high for the opt and G will not have chance to form face.
    #dopt = Adam(lr=0.010, beta_1=0.5)
    #opt  = Adam(lr=0.001, beta_1=0.5)

    # vague faces @ 500
    # Still can't get higher frequency component.
    #dopt = Adam(lr=0.0010, beta_1=0.5)
    #opt  = Adam(lr=0.0001, beta_1=0.5)

    # better faces @ 500
    # but mode collapse after that, probably due to learning rate being too high.
    # opt.lr = dopt.lr / 10 works nicely. I found this with trial and error.
    # now same lr, as we are using history to train D multiple times.
    # I don't exactly understand how decay parameter in Adam works. Certainly not exponential.
    # (In Keras 2.x it is inverse-time decay: lr / (1 + decay * iterations).)
    dopt = Adam(lr=0.0002, beta_1=Args.adam_beta)
    opt  = Adam(lr=0.0001, beta_1=Args.adam_beta)

    # too slow
    # Another thing about LR.
    # If you make it small, it will only optimize slowly.
    # LR only has to be smaller than certain threshold that is data dependent.
    # (related to the largest gradient that prevents optimization)
    #dopt = Adam(lr=0.000010, beta_1=0.5)
    #opt  = Adam(lr=0.000001, beta_1=0.5)

    # generator part
    gen = build_gen( shape )
    # loss function doesn't seem to matter for this one, as it is not directly trained
    gen.compile(optimizer=opt, loss='binary_crossentropy')
    gen.summary()

    # discriminator part
    disc = build_discriminator( shape )
    disc.compile(optimizer=dopt, loss='binary_crossentropy')
    disc.summary()

    # GAN stack
    # https://ctmakro.github.io/site/on_learning/fast_gan_in_keras.html is the faster way.
    # Here, for simplicity, I use slower way (slower due to duplicate computation).
    noise = Input( shape=Args.noise_shape )
    gened = gen( noise )
    result = disc( gened )
    gan = models.Model( inputs=noise, outputs=result )
    gan.compile(optimizer=opt, loss='binary_crossentropy')
    gan.summary()

    return gen, disc, gan



def train_autoenc( dataf ):
    '''
    Train an autoencoder first to see if your network is large enough.
    '''
    f = h5py.File( dataf, 'r' )
    faces = f.get( 'faces' )

    opt = Adam(lr=0.001)

    shape = (Args.sz, Args.sz, 3)
    enc = build_enc( shape )
    enc.compile(optimizer=opt, loss='mse')
    enc.summary()

    # generator part
    gen = build_gen( shape )
    # generator is not directly trained. Optimizer and loss doesn't matter too much.
    gen.compile(optimizer=opt, loss='mse')
    gen.summary()

    face = Input( shape=shape )
    vector = enc(face)
    recons = gen(vector)
    autoenc = models.Model( inputs=face, outputs=recons )
    autoenc.compile(optimizer=opt, loss='mse')

    epoch = 0
    while epoch < 200 :
        for i in range(10) :
            reals = sample_faces( faces )
            fakes, noises = sample_fake( gen )
            loss = autoenc.train_on_batch( reals, reals )
        epoch += 1
        print(epoch, loss)
        fakes = autoenc.predict(reals)
        dump_batch(fakes, 4, "fakes.png")
        dump_batch(reals, 4, "reals.png")
    gen.save_weights(Args.genw)
    enc.save_weights(Args.discw)
    print("Saved", Args.genw, Args.discw)



def load_weights(model, wf):
    '''
    I find error message in load_weights hard to understand sometimes.
    '''
    try:
        model.load_weights(wf)
    except Exception:
        # Exception (not a bare except) so that Ctrl+C still propagates.
        print("failed to load weight, network changed or corrupt hdf5", wf, file=sys.stderr)
        sys.exit(1)



def train_gan( dataf ) :
    gen, disc, gan = build_networks()

    # Uncomment these, if you want to continue training from some snapshot.
    # (or load pretrained generator weights)
    #load_weights(gen, Args.genw)
    #load_weights(disc, Args.discw)

    logger = CSVLogger('loss.csv') # yeah, you can use callbacks independently
    logger.on_train_begin() # initialize csv file
    with h5py.File( dataf, 'r' ) as f :
        faces = f.get( 'faces' )
        run_batches(gen, disc, gan, faces, logger, range(5000))
    logger.on_train_end()



def run_batches(gen, disc, gan, faces, logger, itr_generator):
    history = [] # need this to prevent G from shifting from mode to mode to trick D.
    train_disc = True
    for batch in itr_generator:
        # Using soft labels here.
        lbl_fake = Args.label_noise * np.random.ranf(Args.batch_sz)
        lbl_real = 1 - Args.label_noise * np.random.ranf(Args.batch_sz)

        fakes, noises = sample_fake( gen )
        reals = sample_faces( faces )
        # Add noise...
        # My dataset works without this.
        #reals += 0.5 * np.exp(-batch/100) * np.random.normal( size=reals.shape )

        if batch % 10 == 0 :
            if len(history) > Args.history_sz:
                history.pop(0) # evict oldest
            history.append( (reals, fakes) )

        gen.trainable = False
        #for reals, fakes in history:
        d_loss1 = disc.train_on_batch( reals, lbl_real )
        d_loss0 = disc.train_on_batch( fakes, lbl_fake )
        gen.trainable = True

        #if d_loss1 > 15.0 or d_loss0 > 15.0 :
        # artificial training of one of G or D based on
        # statistics is not good at all.

        # pretrain train discriminator only
        if batch < 20 :
            print( batch, "d0:{} d1:{}".format( d_loss0, d_loss1 ) )
            continue

        disc.trainable = False
        g_loss = gan.train_on_batch( noises, lbl_real ) # try to trick the classifier.
        disc.trainable = True

        # To escape this loop, both D and G should be trained so that
        # D begins to mark everything that's wrong that G has done.
        # Otherwise G will only change locally and fail to escape the minima.
        #train_disc = True if g_loss < 15 else False

        print( batch, "d0:{} d1:{} g:{}".format( d_loss0, d_loss1, g_loss ) )

        # save weights every 10 batches
        if batch % 10 == 0 and batch != 0 :
            end_of_batch_task(batch, gen, disc, reals, fakes)
            row = {"d_loss0": d_loss0, "d_loss1": d_loss1, "g_loss": g_loss}
            logger.on_epoch_end(batch, row)



_bits = binary_noise(Args.batch_sz)

def end_of_batch_task(batch, gen, disc, reals, fakes):
    try :
        # Dump how the generator is doing.
        # Animation dump
        dump_batch(reals, 4, "reals.png")
        dump_batch(fakes, 4, "fakes.png") # to check how noisy the image is
        frame = gen.predict(_bits)
        animf = os.path.join(Args.anim_dir, "frame_{:08d}.png".format(int(batch/10)))
        dump_batch(frame, 4, animf)
        dump_batch(frame, 4, "frame.png")

        serial = int(batch / 10) % 10
        prefix = os.path.join(Args.snapshot_dir, str(serial) + ".")

        print("Saving weights", serial)
        gen.save_weights(prefix + Args.genw)
        disc.save_weights(prefix + Args.discw)
    except KeyboardInterrupt :
        # Print batch, not serial: serial is unset if the interrupt
        # arrives before it is assigned above.
        print("Saving, don't interrupt with Ctrl+C!", batch)
        # recursion to surely save everything haha
        end_of_batch_task(batch, gen, disc, reals, fakes)
        raise



def generate( genw, cnt ):
    shape = (Args.sz, Args.sz, 3)
    gen = build_gen( shape )
    gen.compile(optimizer='sgd', loss='mse')
    # Load the weight file that was passed in (was hard-coded to Args.genw).
    load_weights(gen, genw)

    # Generate cnt samples; using Args.batch_sz here made the loop below
    # run out of images whenever cnt > batch_sz.
    generated = gen.predict(binary_noise(cnt))
    # Unoffset, in batch.
    # Must convert back to uint8 to stop color distortion.
    generated = denormalize4gan(generated)

    for i in range(cnt):
        ofname = "{:04d}.png".format(i)
        scipy.misc.imsave( ofname, generated[i] )



def main( argv ) :
    if not os.path.exists(Args.snapshot_dir) :
        os.mkdir(Args.snapshot_dir)
    if not os.path.exists(Args.anim_dir) :
        os.mkdir(Args.anim_dir)

    # test the capability of generator network through autoencoder test.
    # The argument is that if the generator network can memorize the inputs then
    # it should be enough to GAN-generate stuff.
    # Pretraining gen isn't that useful in gan training as
    # the untrained discriminator will soon ruin everything.
    #train_autoenc( "data.hdf5" )

    # train GAN with inputs in data.hdf5
    train_gan( "data.hdf5" )

    # Lets generate stuff
    #generate( "gen.hdf5", 256 )



if __name__ == '__main__':
    main(sys.argv)

================================================
FILE: layers.py
================================================
#!/usr/bin/env python3

from keras.layers.convolutional import Conv2DTranspose
from keras.initializers import Constant
import numpy as np



def upsample_filt(size):
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)



def bilinear_upsample_weights(factor, number_of_classes):
    filter_size = factor*2 - factor%2
    weights = np.zeros((filter_size, filter_size, number_of_classes, number_of_classes),
                       dtype=np.float32)
    upsample_kernel = upsample_filt(filter_size)
    for i in range(number_of_classes):
        weights[:, :, i, i] = upsample_kernel
    return weights



def bilinear2x(x, nfilters):
    '''
    Ugh, I don't like making layers.
    My credit goes to: https://kivantium.net/keras-bilinear
    '''
    return Conv2DTranspose(nfilters, (4, 4),
        strides=(2, 2),
        padding='same',
        kernel_initializer=Constant(bilinear_upsample_weights(2, nfilters)))(x)

================================================
FILE: make_mp4.sh
================================================
#!/bin/bash

# -r specifies FPS, of the video.
# -s specifies video size.
# framerate is the rate the next image will show... sort of confusing with -r but
# if you want to show one image long, use this option.
# I might have misunderstood, refer to man page for the options.
#-framerate 2 \
ffmpeg \
    -f image2 \
    -i anim/frame_%08d.png \
    -r 30 \
    -crf 5 -pix_fmt yuv420p \
    -vcodec libx264 \
    anim.mp4

================================================
FILE: nets.py
================================================
#!/usr/bin/env python3

import os
import sys
import numpy as np
import random
from keras import models
from keras import optimizers
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D, Conv2DTranspose, UpSampling2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.core import Dense, Activation, Flatten, Reshape, Dropout
from keras.layers import Input
from keras.optimizers import Adam, Adagrad, Adadelta, Adamax, SGD
from keras.callbacks import CSVLogger
# GAN doesn't like sparse gradients (says ganhack). LeakyReLU better.
from keras.layers.advanced_activations import LeakyReLU
import scipy
import h5py
from args import Args
from data import denormalize4gan
from layers import bilinear2x
from discrimination import MinibatchDiscrimination

#import tensorflow as tf
#import keras
#keras.backend.get_session().run(tf.initialize_all_variables())



def build_enc( shape ) :
    return build_discriminator(shape, build_disc=False)



def build_discriminator( shape, build_disc=True ) :
    '''
    Build discriminator.
    Set build_disc=False to build an encoder network to test
    the encoding/discrimination capability with autoencoder...
    '''
    def conv2d( x, filters, shape=(4, 4), **kwargs ) :
        '''
        I don't want to write lengthy parameters so I made a short hand function.
        '''
        x = Conv2D( filters, shape, strides=(2, 2),
            padding='same',
            kernel_initializer=Args.kernel_initializer,
            **kwargs )( x )
        #x = MaxPooling2D()( x )
        x = BatchNormalization(momentum=Args.bn_momentum)( x )
        x = LeakyReLU(alpha=Args.alpha_D)( x )
        return x

    # https://github.com/tdrussell/IllustrationGAN
    # As proposed by them, unlike GAN hacks, MaxPooling works better for anime dataset it seems.
    # However, animeGAN doesn't use it so I'll keep it more similar to DCGAN.

    face = Input( shape=shape )
    x = face

    # Warning: Don't batchnorm the first set of Conv2D.
    x = Conv2D( 64, (4, 4), strides=(2, 2),
        padding='same',
        kernel_initializer=Args.kernel_initializer )( x )
    x = LeakyReLU(alpha=Args.alpha_D)( x )
    # 32x32

    x = conv2d( x, 128 )
    # 16x16

    x = conv2d( x, 256 )
    # 8x8

    x = conv2d( x, 512 )
    # 4x4

    if build_disc:
        x = Flatten()(x)
        # add 16 features. Run 1D conv of size 3.
        #x = MinibatchDiscrimination(16, 3)( x )

        #x = Dense(1024, kernel_initializer=Args.kernel_initializer)( x )
        #x = LeakyReLU(alpha=Args.alpha_D)( x )

        # 1 when "real", 0 when "fake".
        x = Dense(1, activation='sigmoid',
            kernel_initializer=Args.kernel_initializer)( x )
        return models.Model( inputs=face, outputs=x )
    else:
        # build encoder.
        x = Conv2D(Args.noise_shape[2], (4, 4), activation='tanh')(x)
        return models.Model( inputs=face, outputs=x )



def build_gen( shape ) :
    def deconv2d( x, filters, shape=(4, 4) ) :
        '''
        Conv2DTransposed gives me checkerboard artifact...
        Select one of the 3.
        '''
        # Simple Conv2DTranspose
        # Not good, compared to upsample + conv2d below.
        x= Conv2DTranspose( filters, shape, padding='same',
            strides=(2, 2), kernel_initializer=Args.kernel_initializer )(x)

        # simple and works
        #x = UpSampling2D( (2, 2) )( x )
        #x = Conv2D( filters, shape, padding='same' )( x )

        # Bilinear2x... Not sure if it is without bug, not tested yet.
        # Tend to make output blurry though
        #x = bilinear2x( x, filters )
        #x = Conv2D( filters, shape, padding='same' )( x )

        x = BatchNormalization(momentum=Args.bn_momentum)( x )
        x = LeakyReLU(alpha=Args.alpha_G)( x )
        return x

    # https://github.com/tdrussell/IllustrationGAN z predictor...?
    # might help. Not sure.

    noise = Input( shape=Args.noise_shape )
    x = noise
    # 1x1x100 (Args.noise_shape)
    # noise is not useful for generating images.

    x= Conv2DTranspose( 512, (4, 4),
        kernel_initializer=Args.kernel_initializer )(x)
    x = BatchNormalization(momentum=Args.bn_momentum)( x )
    x = LeakyReLU(alpha=Args.alpha_G)( x )
    # 4x4
    x = deconv2d( x, 256 )
    # 8x8
    x = deconv2d( x, 128 )
    # 16x16
    x = deconv2d( x, 64 )
    # 32x32

    # Extra layer
    x = Conv2D( 64, (3, 3), padding='same',
        kernel_initializer=Args.kernel_initializer )( x )
    x = BatchNormalization(momentum=Args.bn_momentum)( x )
    x = LeakyReLU(alpha=Args.alpha_G)( x )
    # 32x32

    x= Conv2DTranspose( 3, (4, 4), padding='same', activation='tanh',
        strides=(2, 2), kernel_initializer=Args.kernel_initializer )(x)
    # 64x64

    return models.Model( inputs=noise, outputs=x )

================================================
FILE: requirements.txt
================================================
absl-py==0.3.0
astor==0.7.1
backports.weakref==1.0rc1
bleach==1.5.0
enum34==1.1.6
gast==0.2.0
grpcio==1.13.0
h5py==2.8.0
html5lib==0.9999999
Keras==2.0.4
Markdown==2.6.11
numpy==1.15.0
opencv-python==3.3.0.9
Pillow==5.2.0
protobuf==3.6.0
PyYAML==3.13
scipy==1.1.0
six==1.11.0
tensorboard==1.9.0
tensorflow==1.9.0
tensorflow-tensorboard==1.5.1
termcolor==1.1.0
Theano==1.0.2
Werkzeug==0.14.1