[
  {
    "path": "README.md",
    "content": "All have been tested with python2.7+ and tensorflow1.0+ in linux.  \n\n* Samples: save generated data, each folder contains a figure to show the results.  \n* utils: contains 2 files  \n\t* data.py: prepreocessing data.  \n\t* nets.py: Generator and Discriminator are saved here.   \n\n\nFor research purpose,   \n**Network architecture**: all GANs used the same network architecture(the Discriminator of EBGAN and BEGAN are the combination of traditional D and G)  \n**Learning rate**: all initialized by 1e-4 and decayed by a factor of 2 each 5000 epoches (Maybe it is unfair for some GANs, but the influences are small, so I ignored)  \n**Dataset**: celebA cropped with 128 and resized to 64, users should copy all celebA images to `./Datas/celebA` for training  \n\n- [x] DCGAN  \n- [x] EBGAN  \n- [x] WGAN  \n- [x] BEGAN  \nAnd for comparsion, I added VAE here.  \n- [x] VAE  \n\nThe generated results are shown in the end of this page.   \n\n ***************\n \n \n# Theories\n \n:sparkles:DCGAN  \n--------\n**Main idea: Techniques(of architecture) to stabilize GAN**  \n[Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/pdf/1511.06434.pdf)[2015]  \n\n**Loss Function** (the same as Vanilla GAN)   \n\n![DCGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/GAN_loss.png)    \n\n\n**Architecture guidelines for stable Deep Convolutional GANs**  \n\n* Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).  \n* Use batchnorm in both the generator and the discriminator  \n* Remove fully connected hidden layers for deeper architectures. Just use average pooling at the end.  \n* Use ReLU activation in generator for all layers except for the output, which uses Tanh.  \n* Use LeakyReLU activation in the discriminator for all layers.  
\n\n ***************\n\n:sparkles:EBGAN  \n--------\n**Main idea: Views the discriminator as an energy function**\n[Energy-based Generative Adversarial Network](https://arxiv.org/pdf/1609.03126.pdf)[2016]   \n(Here introduce EBGAN just for BEGAN, they use the same network structure)  \n\nWhat is energy function?  \n![EBGAN_structure](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/Energy_based_model.png)   \nThe figure is from [LeCun, Yann, et al. \"A tutorial on energy-based learning.\" ](http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf)  \n\nIn EBGAN, we want the Discriminator to distinguish the real images and the generated(fake) images. How? A simple idea is to set X as the real image and Y as the reconstructed image, and then minimize the energy of X and Y. So we need a auto-encoder to get Y from X, and a measure to calcuate the energy (here are MSE, so simple).   \nFinally we get the structure of Discriminator as shown below.  \n\n![EBGAN_structure](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/EBGAN_structure.png)    \n\nSo the task of D is to minimize the MSE of real image and the corresponding reconstructed image, and maximize the MSE of fake image from the G and the corresponding reconstructed fake image. And G is to do the adversarial task: minimize the MSE of fake images...   \nThen obviously the loss function can be written as:  \n![EBGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/EBGAN_loss1.png)  \n\nAnd for comparison with BEGAN, we can set the D only as the auto-encoder and L(*) for the MSE loss.\n**Loss Function**\n![EBGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/EBGAN_loss2.png)  \n\nm is a positive margin here, when L(G(z)) is close to zero, the L_D is L(x) + m, which means to train D more heavily, and on the contrary, when L(G(z))>m, the L_D is L(x), which means the the D loosens the judgement of the fake images.  
\n\n\n Finally, there is a quetion for EBGAN, why use auto-encoder in D instead of the traditonal one? What are the benifits?  \n I have not read the paper carefully, but one reason I think is that (said in the paper) auto-encoders have the ability to learn an energy manifold without supervision or negative examples.  So, rather than simply judge the real or fake of images, the new D can catch the primary distribution of data then distinguish them. And the generated result shown in EBGAN also illustrated that(my understanding): the generated images of celebA from dcgan can hardly distinguish the face and the complex background, but the images from EBGAN focus more heavily on generating faces.  \n\n ***************\n\n:sparkles:Wasserstein GAN\n--------\n**Main idea: Stabilize the training by using Wasserstein-1 distance instead of Jenson-Shannon(JS) divergence**  \nGAN before using JS divergence has the problem of non-overlapping, leading to mode collapse and convergence difficulty.   \nUse EM distance or Wasserstein-1 distance, so GAN can solve the two problems above without particular architecture (like dcgan).   \n[Wasserstein GAN](https://arxiv.org/pdf/1701.07875.pdf)[2017]   \n\n**Mathmatics Analysis**  \nWhy JS divergence has problems? pleas see [Towards Principled Methods for Training Generative Adversarial Networks](https://arxiv.org/pdf/1701.04862.pdf)  \n\nAnyway, this highlights the fact that **the KL, JS, and TV distances are not sensible\ncost functions** when learning distributions supported by low dimensional manifolds.   \n\nso the author use Wasserstein distance  \n![WGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/WGAN_loss1.png)  \nApparently, the G is to maximize the distance, while the D is to minimize the distance. \n\nHowever,  it is difficult to directly calculate the original formula, ||f||_L<=1 is hard to express. 
So, after some mathematical analysis, the authors replace it by clipping the variables of D, and the Wasserstein version of the GAN loss function becomes:\n**Loss Function**  \n![WGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/WGAN_loss2.png)  \n\n\n**Algorithm guidelines for stable GANs**  \n\n* No log in the loss. The output of D is no longer a probability, hence we do not apply a sigmoid at the output of D\n>\n\t\tG_loss = -tf.reduce_mean(D_fake)\n\t\tD_loss = tf.reduce_mean(D_fake) - tf.reduce_mean(D_real) \n* Clip the weights of D (to [-0.01, 0.01])\n>\n\t\tself.clip_D = [var.assign(tf.clip_by_value(var, -0.01, 0.01)) for var in self.discriminator.vars]\n* Train D more often than G (5:1)\n* Use RMSProp instead of Adam\n* Use a lower learning rate (0.00005)\n\n\n****************\n\n:sparkles: BEGAN\n--------\n**Main idea: Match auto-encoder loss distributions using a loss derived from the Wasserstein distance**  \n[BEGAN: Boundary Equilibrium Generative Adversarial Networks](https://arxiv.org/pdf/1703.10717.pdf)[2017]   \n\n**Mathematical Analysis**  \nWe have already introduced the structure of EBGAN, which is also used in BEGAN.   \nThen, instead of calculating the Wasserstein distance between sample distributions as in WGAN, BEGAN calculates the Wasserstein distance between the loss distributions.   \n(I think the mathematical analysis in BEGAN is clearer and more intuitive than in WGAN)  \nSo, simply replacing the samples with the reconstruction losses, we get the loss function:  \n ![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss1.png)  \n\nThen comes the most interesting part:  \na new hyper-parameter γ to control the trade-off between image diversity and visual quality.   \n ![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss2.png)  \nLower values of γ lead to lower image diversity because the discriminator focuses more heavily on auto-encoding real images.  
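This trade-off is enforced through the control variable k_t. A NumPy sketch of the k_t update (gamma = 0.75 and lambda_k = 0.001 match began.py in this repo; the clip to [0, 1] mirrors the min/max clamp applied in its feed_dict):

```python
import numpy as np

def update_k(k_t, L_real, L_fake, gamma=0.75, lambda_k=0.001):
    """One BEGAN control step:
    k_{t+1} = clip(k_t + lambda_k * (gamma * L_real - L_fake), 0, 1)."""
    return float(np.clip(k_t + lambda_k * (gamma * L_real - L_fake), 0., 1.))
```

k grows while D reconstructs real images better than fakes (gamma·L_real > L_fake), shifting D's attention toward discriminating the fakes, and shrinks toward 0 otherwise.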
\n\nThe final loss function is:  \n**Loss Function**  \n ![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss3.png)  \n \n The intuition behind the function is easy to understand:  \n (Here I describe my understanding roughly...)   \n(1). In the beginning, G and D are initialized randomly and k_0 = 0, so L_real is larger than L_fake, leading to a short increase of k.   \n(2). After several iterations, D has easily learned how to reconstruct the real data, so gamma x L_real - L_fake is negative and k decreases to 0; now D only reconstructs the real data, and G learns the real data distribution so as to minimize the reconstruction error in D.   \n(3). As G gets better at generating images that look like the real data, L_fake becomes smaller and k becomes larger, so D focuses more on discriminating the real and fake data, and G keeps training to follow.   \n(4). In the end, k becomes a constant, which means gamma x L_real - L_fake = 0, so the optimization is done.  \n\n \n And the global loss is defined as the sum of L_real (how well D learns the distribution of the real data) and |gamma*L_real - L_fake| (how close the generated data from G is to the real data)   \n  ![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss4.png)  \n\n\nI set gamma = 0.75 and the learning rate of k to 0.001; the learning curves of the loss and k are shown below.   
\n  ![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_curve.png)  \n\n\n\n# Results\n\nDCGAN  \n![DCGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/dcgan/497.png)  \n\nEBGAN (not trained enough)  \n![EBGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/ebgan/109_r.png)  \n\nWGAN (not trained enough)  \n![WGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/wgan/260.png)  \n\nBEGAN: gamma = 0.75, learning rate of k = 0.001  \n![BEGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/began_n/369_r.png)  \n\nBEGAN: gamma = 0.5, learning rate of k = 0.002  \n![BEGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/began/228_r.png)  \n\nVAE  \n![VAE_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/vae/499_s.png)   \n\n\n# References\nhttp://wiseodd.github.io/techblog/2016/12/10/variational-autoencoder/ (a good blog introducing VAEs)  \nhttps://github.com/wiseodd/generative-models/tree/master/GAN  \nhttps://github.com/artcg/BEGAN  \n\n\n# Others\nTensorFlow style guide: https://www.tensorflow.org/community/style_guide  \n\n\n\nA good website for converting LaTeX equations to images (then insert them into the README):\nhttp://www.sciweavers.org/free-online-latex-equation-editor \n"
  },
  {
    "path": "began.py",
    "content": "import tensorflow as tf\nfrom tensorflow.examples.tutorials.mnist import input_data\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\nimport os,sys\n\nsys.path.append('utils')\nfrom nets import *\nfrom datas import *\n\ndef sample_z(m, n):\n\treturn np.random.uniform(-1., 1., size=[m, n])\n\nclass BEGAN():\n\tdef __init__(self, generator, discriminator, data):\n\t\tself.generator = generator\n\t\tself.discriminator = discriminator\n\t\tself.data = data\n\n\t\t# data\n\t\tself.z_dim = self.data.z_dim\n\t\tself.size = self.data.size\n\t\tself.channel = self.data.channel\n\n\t\tself.X = tf.placeholder(tf.float32, shape=[None, self.size, self.size, self.channel])\n\t\tself.z = tf.placeholder(tf.float32, shape=[None, self.z_dim])\n\n\t\t# began parameters\n\t\tself.k_t =  tf.placeholder(tf.float32, shape=[]) # weighting parameter which constantly updates during training\n\t\tgamma = 0.75  # diversity ratio, used to control model equibilibrium.\n\t\tlambda_k = 0.001 # learning rate for k. Berthelot et al. 
use 0.001\n\n\t\t# nets\n\t\tself.G_sample = self.generator(self.z)\n\n\t\tself.D_real = self.discriminator(self.X)\n\t\tself.D_fake = self.discriminator(self.G_sample, reuse = True)\n\t\t\n\t\t# loss\n\t\tL_real = tf.reduce_mean(tf.abs(self.X - self.D_real))\n\t\tL_fake = tf.reduce_mean(tf.abs(self.G_sample - self.D_fake))\n\n\t\tself.D_loss = L_real - self.k_t * L_fake\n\t\tself.G_loss = L_fake\n\t\t\n\t\tself.k_tn = self.k_t + lambda_k * (gamma*L_real - L_fake)\n\t\tself.M_global = L_real + tf.abs(gamma*L_real - L_fake)\t\t\n\t\n\t\t# solver\n\t\tself.learning_rate = tf.placeholder(tf.float32, shape=[])\n\t\tself.D_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.D_loss, var_list=self.discriminator.vars)\n\t\tself.G_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.G_loss, var_list=self.generator.vars)\n\t\t\n\t\tself.saver = tf.train.Saver()\n\t\tgpu_options = tf.GPUOptions(allow_growth=True)\n\t\tself.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))\n\t\tself.model_name = 'Models/began.ckpt'\n\n\tdef train(self, sample_dir, training_epoches = 500000, batch_size = 16):\n\t\tfig_count = 0\n\t\tself.sess.run(tf.global_variables_initializer())\n\t\t#self.saver.restore(self.sess, self.model_name)\t\t\n\n\t\tk_tn = 0\n\t\tlearning_rate_initial = 1e-4\n\t\tfor epoch in range(training_epoches):\n\t\t\tlearning_rate =  learning_rate_initial * pow(0.5, epoch // 50000)\n\t\t\t# update D and G\n\t\t\tX_b = self.data(batch_size)\n\t\t\t_, _, k_tn = self.sess.run(\n\t\t\t\t[self.D_solver, self.G_solver, self.k_tn],\n\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.k_t: min(max(k_tn, 0.), 1.), self.learning_rate: learning_rate}\n\t\t\t\t)\n\t\t\t# save img, model. 
print loss\n\t\t\tif epoch % 100 == 0 or epoch < 100:\n\t\t\t\tD_loss_curr, G_loss_curr, M_global_curr = self.sess.run(\n\t\t\t\t\t\t[self.D_loss, self.G_loss, self.M_global],\n            \t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.k_t: min(max(k_tn, 0.), 1.)})\n\t\t\t\tprint('Iter: {}; D loss: {:.4}; G_loss: {:.4}; M_global: {:.4}; k_t: {:.6}; learning_rate:{:.8}'.format(epoch, D_loss_curr, G_loss_curr, M_global_curr, min(max(k_tn, 0.), 1.), learning_rate))\n\n\t\t\t\tif epoch % 1000 == 0:\n\t\t\t\t\tX_s, real, samples = self.sess.run([self.X, self.D_real, self.G_sample], feed_dict={self.X: X_b[:16,:,:,:], self.z: sample_z(16, self.z_dim)})\n\n\t\t\t\t\tfig = self.data.data2fig(X_s)\n\t\t\t\t\tplt.savefig('{}/{}.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig = self.data.data2fig(real)\n\t\t\t\t\tplt.savefig('{}/{}_d.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig = self.data.data2fig(samples)\n\t\t\t\t\tplt.savefig('{}/{}_r.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig_count += 1\n\n\t\t\t\tif epoch % 5000 == 0:\n\t\t\t\t\tself.saver.save(self.sess, self.model_name)\n\nif __name__ == '__main__':\n\n\t# constraint GPU\n\tos.environ['CUDA_VISIBLE_DEVICES'] = '1'\n\n\t# save generated images\n\tsample_dir = 'Samples/began'\n\tif not os.path.exists(sample_dir):\n\t\tos.makedirs(sample_dir)\n\n\t# param\n\tgenerator = G_conv()\n\tdiscriminator = D_autoencoder()\n\n\tdata = cifar()\n\n\t# run\n\tbegan = BEGAN(generator, discriminator, data)\n\tbegan.train(sample_dir)\n\n"
  },
  {
    "path": "dcgan.py",
    "content": "import tensorflow as tf\nfrom tensorflow.examples.tutorials.mnist import input_data\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\nimport os,sys\n\nsys.path.append('utils')\nfrom nets import *\nfrom datas import *\n\ndef sample_z(m, n):\n\treturn np.random.uniform(-1., 1., size=[m, n])\n\nclass DCGAN():\n\tdef __init__(self, generator, discriminator, data):\n\t\tself.generator = generator\n\t\tself.discriminator = discriminator\n\t\tself.data = data\n\n\t\t# data\n\t\tself.z_dim = self.data.z_dim\n\t\tself.size = self.data.size\n\t\tself.channel = self.data.channel\n\n\t\tself.X = tf.placeholder(tf.float32, shape=[None, self.size, self.size, self.channel])\n\t\tself.z = tf.placeholder(tf.float32, shape=[None, self.z_dim])\n\n\t\t# nets\n\t\tself.G_sample = self.generator(self.z)\n\n\t\tself.D_real = self.discriminator(self.X)\n\t\tself.D_fake = self.discriminator(self.G_sample, reuse = True)\n\t\t\n\t\t# loss\n\t\tself.D_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=self.D_real, labels=tf.ones_like(self.D_real))) + tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=self.D_fake, labels=tf.zeros_like(self.D_fake)))\n\t\tself.G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=self.D_fake, labels=tf.ones_like(self.D_fake)))\n\n\t\t# solver\n\t\tself.learning_rate = tf.placeholder(tf.float32, shape=[])\n\t\tself.D_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.D_loss, var_list=self.discriminator.vars)\n\t\tself.G_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.G_loss, var_list=self.generator.vars)\n\t\t\n\t\tself.saver = tf.train.Saver()\n\t\tgpu_options = tf.GPUOptions(allow_growth=True)\n\t\tself.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))\n\t\tself.model_name = 'Models/dcgan.ckpt'\n\n\tdef train(self, sample_dir, 
training_epoches = 500000, batch_size = 32):\n\t\tfig_count = 0\n\t\tself.sess.run(tf.global_variables_initializer())\n\t\t#self.saver.restore(self.sess, self.model_name)\t\t\n\t\tlearning_rate_initial = 1e-4\n\t\tfor epoch in range(training_epoches):\n\t\t\tlearning_rate = learning_rate_initial * pow(0.5, epoch // 50000)\n\t\t\t# update D\n\t\t\tX_b = self.data(batch_size)\n\t\t\tself.sess.run(\n\t\t\t\tself.D_solver,\n\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t)\n\t\t\t# update G\n\t\t\tfor _ in range(1):\n\t\t\t\tself.sess.run(\n\t\t\t\t\tself.G_solver,\n\t\t\t\t\tfeed_dict={self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t)\n\n\t\t\t# save img, model. print loss\n\t\t\tif epoch % 100 == 0 or epoch < 100:\n\t\t\t\tD_loss_curr, G_loss_curr = self.sess.run(\n\t\t\t\t\t\t[self.D_loss, self.G_loss],\n            \t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim)})\n\t\t\t\tprint('Iter: {}; D loss: {:.4}; G_loss: {:.4}'.format(epoch, D_loss_curr, G_loss_curr))\n\n\t\t\t\tif epoch % 1000 == 0:\n\t\t\t\t\tsamples = self.sess.run(self.G_sample, feed_dict={self.z: sample_z(16, self.z_dim)})\n\n\t\t\t\t\tfig = self.data.data2fig(samples)\n\t\t\t\t\tplt.savefig('{}/{}.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tfig_count += 1\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\tif epoch % 5000 == 0:\n\t\t\t\t\tself.saver.save(self.sess, self.model_name)\n\n\nif __name__ == '__main__':\n\n\t# constraint GPU\n\tos.environ['CUDA_VISIBLE_DEVICES'] = '2'\n\n\t# save generated images\n\tsample_dir = 'Samples/dcgan'\n\tif not os.path.exists(sample_dir):\n\t\tos.makedirs(sample_dir)\n\n\t# param\n\tgenerator = G_conv()\n\tdiscriminator = D_conv()\n\n\tdata = celebA()\n\n\t# run\n\tdcgan = DCGAN(generator, discriminator, data)\n\tdcgan.train(sample_dir)\n\n"
  },
  {
    "path": "ebgan.py",
    "content": "import tensorflow as tf\nfrom tensorflow.examples.tutorials.mnist import input_data\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\nimport os,sys\n\nsys.path.append('utils')\nfrom nets import *\nfrom datas import *\n\ndef sample_z(m, n):\n\treturn np.random.uniform(-1., 1., size=[m, n])\n\nclass EBGAN():\n\tdef __init__(self, generator, discriminator, data):\n\t\tself.generator = generator\n\t\tself.discriminator = discriminator\n\t\tself.data = data\n\n\t\t# data\n\t\tself.z_dim = self.data.z_dim\n\t\tself.size = self.data.size\n\t\tself.channel = self.data.channel\n\n\t\tself.X = tf.placeholder(tf.float32, shape=[None, self.size, self.size, self.channel])\n\t\tself.z = tf.placeholder(tf.float32, shape=[None, self.z_dim])\n\n\t\t# ebgan parameters\n\t\tmargin = 50. #\n\n\t\t# nets\n\t\tself.G_sample = self.generator(self.z)\n\n\t\tself.D_real = self.discriminator(self.X)\n\t\tself.D_fake = self.discriminator(self.G_sample, reuse = True)\n\t\t\n\t\t# loss\n\t\t#L_real = tf.reduce_mean((self.X - self.D_real)**2, [1,2,3])\n\t\t#L_fake = tf.reduce_mean((self.G_sample - self.D_fake)**2, [1,2,3])\n\t\tL_real = tf.nn.l2_loss(self.X - self.D_real)\n\t\tL_fake = tf.nn.l2_loss(self.G_sample - self.D_fake)\n\n\t\tself.D_loss = L_real + tf.maximum(0., margin - L_fake)\n\t\tself.G_loss = L_fake\n\t\t\n\t\t# solver\n\t\tself.learning_rate = tf.placeholder(tf.float32, shape=[])\n\t\tself.D_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.D_loss, var_list=self.discriminator.vars)\n\t\tself.G_solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.G_loss, var_list=self.generator.vars)\n\t\t\n\t\tself.saver = tf.train.Saver()\n\t\tgpu_options = tf.GPUOptions(allow_growth=True)\n\t\tself.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))\n\t\tself.model_name = 'Models/ebgan.ckpt'\n\n\tdef train(self, sample_dir, 
training_epoches = 500000, batch_size = 32):\n\t\tfig_count = 0\n\t\tself.sess.run(tf.global_variables_initializer())\n\t\t#self.saver.restore(self.sess, self.model_name)\t\t\n\n\t\tlearning_rate_initial = 1e-4\n\t\tfor epoch in range(training_epoches):\n\t\t\tlearning_rate =  learning_rate_initial * pow(0.5, epoch // 50000)\n\t\t\t# update D and G\n\t\t\tX_b = self.data(batch_size)\n\t\t\tself.sess.run(\n\t\t\t\t[self.D_solver, self.G_solver],\n\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t)\n\t\t\t# save img, model. print loss\n\t\t\tif epoch % 100 == 0 or epoch < 100:\n\t\t\t\tD_loss_curr, G_loss_curr = self.sess.run(\n\t\t\t\t\t\t[self.D_loss, self.G_loss],\n            \t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim)})\n\t\t\t\tprint('Iter: {}; D loss: {:.4}; G_loss: {:.4};'.format(epoch, D_loss_curr, G_loss_curr))\n\n\t\t\t\tif epoch % 1000 == 0:\n\t\t\t\t\tX_s, real, samples = self.sess.run([self.X, self.D_real, self.G_sample], feed_dict={self.X: X_b[:16,:,:,:], self.z: sample_z(16, self.z_dim)})\n\n\t\t\t\t\tfig = self.data.data2fig(X_s)\n\t\t\t\t\tplt.savefig('{}/{}.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig = self.data.data2fig(real)\n\t\t\t\t\tplt.savefig('{}/{}_d.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig = self.data.data2fig(samples)\n\t\t\t\t\tplt.savefig('{}/{}_r.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig_count += 1\n\n\t\t\t\tif epoch % 5000 == 0:\n\t\t\t\t\tself.saver.save(self.sess, self.model_name)\n\nif __name__ == '__main__':\n\n\t# constraint GPU\n\tos.environ['CUDA_VISIBLE_DEVICES'] = '1'\n\n\t# save generated images\n\tsample_dir = 'Samples/ebgan'\n\tif not os.path.exists(sample_dir):\n\t\tos.makedirs(sample_dir)\n\n\t# param\n\tgenerator = 
G_conv()\n\tdiscriminator = D_autoencoder()\n\n\tdata = celebA()\n\n\t# run\n\tebgan = EBGAN(generator, discriminator, data)\n\tebgan.train(sample_dir)\n\n"
  },
  {
    "path": "utils/datas.py",
    "content": "import os,sys\nfrom PIL import Image\nimport scipy.misc\nfrom glob import glob\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\n\nfrom tensorflow.examples.tutorials.mnist import input_data\n\nprefix = './Datas/'\n\ndef get_img(img_path, is_crop=True, crop_h=256, resize_h=64):\n\timg=scipy.misc.imread(img_path).astype(np.float)\n\tresize_w = resize_h\n\tif is_crop:\n\t\tcrop_w = crop_h\n\t\th, w = img.shape[:2]\n\t\tj = int(round((h - crop_h)/2.))\n\t\ti = int(round((w - crop_w)/2.))\n\t\tcropped_image = scipy.misc.imresize(img[j:j+crop_h, i:i+crop_w],[resize_h, resize_w])\n\telse:\n\t\tcropped_image = scipy.misc.imresize(img,[resize_h, resize_w])\n\treturn np.array(cropped_image)/255.0\n\n\nclass celebA():\n\tdef __init__(self):\n\t\tdatapath = prefix + 'celebA'\n\t\tself.z_dim = 100\n\t\tself.size = 64\n\t\tself.channel = 3\n\t\tself.data = glob(os.path.join(datapath, '*.jpg'))\n\n\t\tself.batch_count = 0\n\n\tdef __call__(self,batch_size):\n\t\tbatch_number = len(self.data)/batch_size\n\t\tif self.batch_count < batch_number-2:\n\t\t\tself.batch_count += 1\n\t\telse:\n\t\t\tself.batch_count = 0\n\n\t\tpath_list = self.data[self.batch_count*batch_size:(self.batch_count+1)*batch_size]\n\n\t\tbatch = [get_img(img_path, True, 128, self.size) for img_path in path_list]\n\t\tbatch_imgs = np.array(batch).astype(np.float32)\n\t\t\n\t\treturn batch_imgs\n\n\tdef data2fig(self, samples):\n\t\tfig = plt.figure(figsize=(4, 4))\n\t\tgs = gridspec.GridSpec(4, 4)\n\t\tgs.update(wspace=0.05, hspace=0.05)\n\n\t\tfor i, sample in enumerate(samples):\n\t\t\tax = plt.subplot(gs[i])\n\t\t\tplt.axis('off')\n\t\t\tax.set_xticklabels([])\n\t\t\tax.set_yticklabels([])\n\t\t\tax.set_aspect('equal')\n\t\t\tplt.imshow(sample)\n\t\treturn fig\n\nclass cifar():\n\tdef __init__(self):\n\t\tdatapath = prefix + 'cifar10'\n\t\tself.z_dim = 100\n\t\tself.size = 64\n\t\tself.channel = 
3\n\t\tself.data = glob(os.path.join(datapath, '*'))\n\n\t\tself.batch_count = 0\n\n\tdef __call__(self,batch_size):\n\t\tbatch_number = len(self.data)/batch_size\n\t\tif self.batch_count < batch_number-2:\n\t\t\tself.batch_count += 1\n\t\telse:\n\t\t\tself.batch_count = 0\n\n\t\tpath_list = self.data[self.batch_count*batch_size:(self.batch_count+1)*batch_size]\n\n\t\tbatch = [get_img(img_path, False, 128, self.size) for img_path in path_list]\n\t\tbatch_imgs = np.array(batch).astype(np.float32)\n\t\n\t\treturn batch_imgs\n\n\tdef data2fig(self, samples):\n\t\tfig = plt.figure(figsize=(4, 4))\n\t\tgs = gridspec.GridSpec(4, 4)\n\t\tgs.update(wspace=0.05, hspace=0.05)\n\n\t\tfor i, sample in enumerate(samples):\n\t\t\tax = plt.subplot(gs[i])\n\t\t\tplt.axis('off')\n\t\t\tax.set_xticklabels([])\n\t\t\tax.set_yticklabels([])\n\t\t\tax.set_aspect('equal')\n\t\t\tplt.imshow(sample)\n\t\treturn fig\n\n\nclass mnist():\n\tdef __init__(self):\n\t\tdatapath = prefix + 'mnist'\n\t\tself.z_dim = 100\n\t\tself.size = 64\n\t\tself.channel = 1\n\t\tself.data = input_data.read_data_sets(datapath, one_hot=True)\n\n\tdef __call__(self,batch_size):\n\t\tbatch_imgs = np.zeros([batch_size, self.size, self.size, self.channel])\n\n\t\tbatch_x,y = self.data.train.next_batch(batch_size)\n\t\tbatch_x = np.reshape(batch_x, (batch_size, 28, 28, self.channel))\n\t\tfor i in range(batch_size):\n\t\t\timg = batch_x[i,:,:,0]\n\t\t\tbatch_imgs[i,:,:,0] = scipy.misc.imresize(img, [self.size, self.size])\n\t\tbatch_imgs /= 255.\n\t\treturn batch_imgs, y\n\n\tdef data2fig(self, samples):\n\t\tfig = plt.figure(figsize=(4, 4))\n\t\tgs = gridspec.GridSpec(4, 4)\n\t\tgs.update(wspace=0.05, hspace=0.05)\n\n\t\tfor i, sample in enumerate(samples):\n\t\t\tax = plt.subplot(gs[i])\n\t\t\tplt.axis('off')\n\t\t\tax.set_xticklabels([])\n\t\t\tax.set_yticklabels([])\n\t\t\tax.set_aspect('equal')\n\t\t\tplt.imshow(sample.reshape(self.size,self.size), cmap='Greys_r')\n\t\treturn fig\t\n\n\nif __name__ == 
'__main__':\n\tdata = mnist()\n\timgs,_ = data(20)\n\n\tfig = data.data2fig(imgs[:16])\n\tplt.savefig('Samples/test.png', bbox_inches='tight')\n\tplt.close(fig)\n"
  },
  {
    "path": "utils/nets.py",
    "content": "import tensorflow as tf\nimport tensorflow.contrib as tc\nimport tensorflow.contrib.layers as tcl\n\ndef lrelu(x, leak=0.2, name=\"lrelu\"):\n\twith tf.variable_scope(name):\n\t\tf1 = 0.5 * (1 + leak)\n\t\tf2 = 0.5 * (1 - leak)\n\t\treturn f1 * x + f2 * abs(x)\n\nclass G_conv(object):\n\tdef __init__(self, channel=3, name='G_conv'):\n\t\tself.name = name\n\t\tself.size = 64/16\n\t\tself.channel = channel\n\n\tdef __call__(self, z):\n\t\twith tf.variable_scope(self.name) as scope:\n\t\t\tg = tcl.fully_connected(z, self.size * self.size * 512, activation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm)\n\t\t\tg = tf.reshape(g, (-1, self.size, self.size, 512))  # size\n\t\t\tg = tcl.conv2d_transpose(g, 256, 3, stride=2, # size*2\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\tg = tcl.conv2d_transpose(g, 128, 3, stride=2, # size*4\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\tg = tcl.conv2d_transpose(g, 64, 3, stride=2, # size*8 32x32x64\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\t\n\t\t\tg = tcl.conv2d_transpose(g, self.channel, 3, stride=2, # size*16 \n\t\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.sigmoid, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\treturn g\n\t@property\n\tdef vars(self):\n\t\treturn tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name)\n\n\nclass D_conv(object):\n\tdef __init__(self, name='D_conv'):\n\t\tself.name = name\n\n\tdef __call__(self, x, reuse=False):\n\t\twith tf.variable_scope(self.name) as scope:\n\t\t\tif reuse:\n\t\t\t\tscope.reuse_variables()\n\t\t\tsize = 64\n\t\t\td = tcl.conv2d(x, num_outputs=size, kernel_size=3, # bzx64x64x3 -> 
bzx32x32x64\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 2, kernel_size=3, # 16x16x128\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 4, kernel_size=3, # 8x8x256\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 8, kernel_size=3, # 4x4x512\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\n\t\t\td = tcl.fully_connected(tcl.flatten(d), 256, activation_fn=lrelu, weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.fully_connected(d, 1, activation_fn=None, weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\t\n\t\t\treturn d\n\t\t\t\n\t@property\n\tdef vars(self):\n\t\treturn tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name)\n\n# for ebgan and began\nclass D_autoencoder(object):\n\tdef __init__(self, n_hidden=256, name='D_autoencoder'):\n\t\tself.name = name\n\t\tself.n_hidden = n_hidden\n\n\tdef __call__(self, x, reuse=False):\n\t\twith tf.variable_scope(self.name) as scope:\n\t\t\tif reuse:\n\t\t\t\tscope.reuse_variables()\n\t\t\t# --- conv\n\t\t\tsize = 64\n\t\t\td = tcl.conv2d(x, num_outputs=size, kernel_size=3, # bzx64x64x3 -> bzx32x32x64\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 2, kernel_size=3, # 16x16x128\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', 
weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 4, kernel_size=3, # 8x8x256\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 8, kernel_size=3, # 4x4x512\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\n\t\t\th = tcl.fully_connected(tcl.flatten(d), self.n_hidden, activation_fn=lrelu, weights_initializer=tf.random_normal_initializer(0, 0.02))\n\n\t\t\t# -- deconv\n\t\t\td = tcl.fully_connected(h, 4 * 4 * 512, activation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm)\n\t\t\td = tf.reshape(d, (-1, 4, 4, 512))  # size\n\t\t\td = tcl.conv2d_transpose(d, 256, 3, stride=2, # size*2\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d_transpose(d, 128, 3, stride=2, # size*4\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d_transpose(d, 64, 3, stride=2, # size*8\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.relu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\t\n\t\t\td = tcl.conv2d_transpose(d, 3, 3, stride=2, # size*16\n\t\t\t\t\t\t\t\t\tactivation_fn=tf.nn.sigmoid, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\treturn d\n\n\t@property\n\tdef vars(self):\n\t\treturn tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name)\n\n# for vae\nclass D_vae(object):\n\tdef __init__(self, name='D_vae'):\n\t\tself.name = name\n\n\tdef __call__(self, x, reuse=False):\n\t\twith tf.variable_scope(self.name) as scope:\n\t\t\tif 
reuse:\n\t\t\t\tscope.reuse_variables()\n\t\t\tsize = 64\n\t\t\td = tcl.conv2d(x, num_outputs=size, kernel_size=3, # bzx64x64x3 -> bzx32x32x64\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 2, kernel_size=3, # 16x16x128\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 4, kernel_size=3, # 8x8x256\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\td = tcl.conv2d(d, num_outputs=size * 8, kernel_size=3, # 4x4x512\n\t\t\t\t\t\tstride=2, activation_fn=lrelu, normalizer_fn=tcl.batch_norm, padding='SAME', weights_initializer=tf.random_normal_initializer(0, 0.02))\n\n\t\t\td = tcl.fully_connected(tcl.flatten(d), 256, activation_fn=lrelu, weights_initializer=tf.random_normal_initializer(0, 0.02))\n\t\t\tmu = tcl.fully_connected(d, 100, activation_fn=None, weights_initializer=tf.random_normal_initializer(0, 0.02)) # mean of q(z|x)\n\t\t\tsigma = tcl.fully_connected(d, 100, activation_fn=None, weights_initializer=tf.random_normal_initializer(0, 0.02)) # log-variance of q(z|x)\n\t\t\t\n\t\t\treturn mu, sigma\n\t\t\t\n\t@property\n\tdef vars(self):\n\t\treturn tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name)\n\n\n\n"
  },
  {
    "path": "vae.py",
    "content": "import tensorflow as tf\nfrom tensorflow.examples.tutorials.mnist import input_data\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\nimport os,sys\n\nsys.path.append('utils')\nfrom nets import *\nfrom datas import *\n\ndef sample_z(m, n):\n\t# standard normal noise: used for the reparameterization trick and for sampling from the N(0, I) prior\n\treturn np.random.normal(0., 1., size=[m, n])\n\nclass VAE():\n\tdef __init__(self, generator, discriminator, data):\n\t\tself.generator = generator\n\t\tself.discriminator = discriminator\n\t\tself.data = data\n\n\t\t# data\n\t\tself.z_dim = self.data.z_dim\n\t\tself.size = self.data.size\n\t\tself.channel = self.data.channel\n\n\t\tself.X = tf.placeholder(tf.float32, shape=[None, self.size, self.size, self.channel])\n\t\tself.z = tf.placeholder(tf.float32, shape=[None, self.z_dim])\n\n\t\t# nets\n\t\tmu, sigma = self.discriminator(self.X) # sigma is the log-variance of q(z|x)\n\t\tlatent_code = mu + tf.exp(sigma/2)*self.z\n\t\t\n\t\tself.G_real = self.generator(latent_code)\n\t\tself.G_sample = self.generator(self.z, reuse=True)\n\t\t\n\t\t# loss\n\t\t# reconstruction loss: -E[log P(X|z)], i.e. pixel-wise binary cross-entropy\n\t\tepsilon = 1e-8\n\t\tself.recon = tf.reduce_sum(-self.X * tf.log(self.G_real + epsilon) -(1.0 - self.X) * tf.log(1.0 - self.G_real + epsilon))\n\t\t\n\t\t# D_KL(Q(z|X) || P(z)); calculated in closed form as both dist. are Gaussian\n\t\tself.kl = 0.5 * tf.reduce_sum(tf.exp(sigma) + tf.square(mu) - 1. 
- sigma)\n\n\t\tself.loss = self.recon + self.kl\n\n\t\t# solver\n\t\tself.learning_rate = tf.placeholder(tf.float32, shape=[])\n\t\tself.solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.loss, var_list=self.generator.vars + self.discriminator.vars)\n\t\t\n\t\tself.saver = tf.train.Saver()\n\t\tgpu_options = tf.GPUOptions(allow_growth=True)\n\t\tself.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))\n\t\tself.model_name = 'Models/vae.ckpt'\n\n\tdef train(self, sample_dir, training_epoches = 500000, batch_size = 32):\n\t\tfig_count = 0\n\t\tself.sess.run(tf.global_variables_initializer())\n\t\t#self.saver.restore(self.sess, self.model_name)\n\n\t\tlearning_rate_initial = 1e-4\n\t\tfor epoch in range(training_epoches):\n\t\t\tlearning_rate = learning_rate_initial * pow(0.5, epoch // 50000)\n\t\t\tX_b = self.data(batch_size)\n\t\t\tself.sess.run(\n\t\t\t\tself.solver,\n\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t)\n\t\t\t# save img, model. 
print loss\n\t\t\tif epoch % 100 == 0 or epoch < 100:\n\t\t\t\tloss_curr = self.sess.run(\n\t\t\t\t\t\tself.loss,\n\t\t\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim)})\n\t\t\t\tprint('Iter: {}; loss: {:.4}'.format(epoch, loss_curr))\n\n\t\t\t\tif epoch % 1000 == 0:\n\t\t\t\t\treal, samples = self.sess.run([self.G_real, self.G_sample], feed_dict={self.X: X_b[:16,:,:,:], self.z: sample_z(16, self.z_dim)})\n\n\t\t\t\t\tfig = self.data.data2fig(real)\n\t\t\t\t\tplt.savefig('{}/{}.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig = self.data.data2fig(samples)\n\t\t\t\t\tplt.savefig('{}/{}_s.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\t\tfig_count += 1\n\n\t\t\t\tif epoch % 5000 == 0:\n\t\t\t\t\tself.saver.save(self.sess, self.model_name)\n\n\nif __name__ == '__main__':\n\n\t# constrain GPU\n\tos.environ['CUDA_VISIBLE_DEVICES'] = '2'\n\n\t# save generated images\n\tsample_dir = 'Samples/vae'\n\tif not os.path.exists(sample_dir):\n\t\tos.makedirs(sample_dir)\n\n\t# param\n\tgenerator = G_conv()\n\tdiscriminator = D_vae()\n\n\tdata = celebA()\n\n\t# run\n\tvae = VAE(generator, discriminator, data)\n\tvae.train(sample_dir)\n\n"
  },
  {
    "path": "wgan.py",
    "content": "import tensorflow as tf\nfrom tensorflow.examples.tutorials.mnist import input_data\nimport numpy as np\nimport matplotlib as mpl\nmpl.use('Agg')\nimport matplotlib.pyplot as plt\nimport matplotlib.gridspec as gridspec\nimport os,sys\n\nsys.path.append('utils')\nfrom nets import *\nfrom datas import *\n\ndef sample_z(m, n):\n\treturn np.random.uniform(-1., 1., size=[m, n])\n\nclass WGAN():\n\tdef __init__(self, generator, discriminator, data):\n\t\tself.generator = generator\n\t\tself.discriminator = discriminator\n\t\tself.data = data\n\n\t\t# data\n\t\tself.z_dim = self.data.z_dim\n\t\tself.size = self.data.size\n\t\tself.channel = self.data.channel\n\n\t\tself.X = tf.placeholder(tf.float32, shape=[None, self.size, self.size, self.channel])\n\t\tself.z = tf.placeholder(tf.float32, shape=[None, self.z_dim])\n\n\t\t# nets\n\t\tself.G_sample = self.generator(self.z)\n\n\t\tself.D_real = self.discriminator(self.X)\n\t\tself.D_fake = self.discriminator(self.G_sample, reuse=True)\n\t\t\n\t\t# Wasserstein loss: D maximizes E[D(x)] - E[D(G(z))], G minimizes -E[D(G(z))]\n\t\tself.D_loss = - tf.reduce_mean(self.D_real) + tf.reduce_mean(self.D_fake)\n\t\tself.G_loss = - tf.reduce_mean(self.D_fake)\n\n\t\t# weight clipping: crude way to enforce the Lipschitz constraint on D, as in the original WGAN\n\t\tself.clip_D = [var.assign(tf.clip_by_value(var, -0.01, 0.01)) for var in self.discriminator.vars]\n\n\t\t# solver\n\t\tself.learning_rate = tf.placeholder(tf.float32, shape=[])\n\t\tself.D_solver = tf.train.RMSPropOptimizer(learning_rate=self.learning_rate).minimize(self.D_loss, var_list=self.discriminator.vars)\n\t\tself.G_solver = tf.train.RMSPropOptimizer(learning_rate=self.learning_rate).minimize(self.G_loss, var_list=self.generator.vars)\n\n\t\tgpu_options = tf.GPUOptions(allow_growth=True)\n\t\tself.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))\n\t\tself.saver = tf.train.Saver()\n\t\tself.model_name = 'Models/wgan.ckpt'\n\n\tdef train(self, sample_dir, training_epoches = 500000, batch_size = 32):\n\t\tfig_count = 
0\n\t\tself.sess.run(tf.global_variables_initializer())\n\t\t#self.saver.restore(self.sess, self.model_name)\n\n\t\tlearning_rate_initial = 1e-4\n\t\tfor epoch in range(training_epoches):\n\t\t\tlearning_rate = learning_rate_initial * pow(0.5, epoch // 50000)\n\t\t\t# update D; train D more heavily at the start and periodically, as in the WGAN paper\n\t\t\tn_d = 100 if epoch < 25 or (epoch+1) % 500 == 0 else 5\n\t\t\tfor _ in range(n_d):\n\t\t\t\tX_b = self.data(batch_size)\n\t\t\t\tself.sess.run(\n\t\t\t\t\t[self.clip_D, self.D_solver],\n\t\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t\t)\n\t\t\t# update G\n\t\t\tfor _ in range(1):\n\t\t\t\tself.sess.run(\n\t\t\t\t\tself.G_solver,\n\t\t\t\t\tfeed_dict={self.z: sample_z(batch_size, self.z_dim), self.learning_rate: learning_rate}\n\t\t\t\t)\n\n\t\t\t# save img, model. print loss\n\t\t\tif epoch % 100 == 0 or epoch < 100:\n\t\t\t\tD_loss_curr, G_loss_curr = self.sess.run(\n\t\t\t\t\t\t[self.D_loss, self.G_loss],\n\t\t\t\t\t\tfeed_dict={self.X: X_b, self.z: sample_z(batch_size, self.z_dim)})\n\t\t\t\tprint('Iter: {}; D loss: {:.4}; G_loss: {:.4}'.format(epoch, D_loss_curr, G_loss_curr))\n\n\t\t\t\tif epoch % 1000 == 0:\n\t\t\t\t\tsamples = self.sess.run(self.G_sample, feed_dict={self.z: sample_z(16, self.z_dim)})\n\n\t\t\t\t\tfig = self.data.data2fig(samples)\n\t\t\t\t\tplt.savefig('{}/{}.png'.format(sample_dir, str(fig_count).zfill(3)), bbox_inches='tight')\n\t\t\t\t\tfig_count += 1\n\t\t\t\t\tplt.close(fig)\n\n\t\t\t\tif epoch % 5000 == 0:\n\t\t\t\t\tself.saver.save(self.sess, self.model_name)\n\n\nif __name__ == '__main__':\n\n\t# constrain GPU\n\tos.environ['CUDA_VISIBLE_DEVICES'] = '1'\n\n\t# save generated images\n\tsample_dir = 'Samples/wgan'\n\tif not os.path.exists(sample_dir):\n\t\tos.makedirs(sample_dir)\n\n\t# param\n\tgenerator = G_conv()\n\tdiscriminator = D_conv()\n\n\tdata = celebA()\n\n\t# run\n\twgan = WGAN(generator, discriminator, data)\n\twgan.train(sample_dir)\n\n"
  }
]