Repository: mkocaoglu/CausalGAN Branch: master Commit: 9d52b520b5ef Files: 51 Total size: 288.4 KB Directory structure: gitextract_fvn7v_0h/ ├── .gitignore ├── LICENSE ├── README.md ├── assets/ │ ├── 0808_112404_cbcg.csv │ ├── 0810_191625_bcg.csv │ ├── 0821_213901_rcbcg.csv │ ├── guide_to_gifs.txt │ └── tvdplot.ipynb ├── causal_began/ │ ├── CausalBEGAN.py │ ├── __init__.py │ ├── config.py │ ├── models.py │ └── utils.py ├── causal_controller/ │ ├── ArrayDict.py │ ├── CausalController.py │ ├── __init__.py │ ├── config.py │ ├── models.py │ └── utils.py ├── causal_dcgan/ │ ├── CausalGAN.py │ ├── __init__.py │ ├── config.py │ ├── models.py │ ├── ops.py │ └── utils.py ├── causal_graph.py ├── config.py ├── data_loader.py ├── download.py ├── figure_scripts/ │ ├── __init__.py │ ├── distributions.py │ ├── encode.py │ ├── high_level.py │ ├── pairwise.py │ ├── probability_table.txt │ ├── sample.py │ └── utils.py ├── main.py ├── synthetic/ │ ├── README.md │ ├── collect_stats.py │ ├── config.py │ ├── figure_generation.ipynb │ ├── main.py │ ├── models.py │ ├── run_datasets.sh │ ├── tboard.py │ ├── trainer.py │ └── utils.py ├── tboard.py ├── trainer.py └── utils.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ data/ data .*.swp logs old final_checkpoints checkpoint/ figures/ *.pyc .DS_Store .ipynb_checkpoints [._]*.s[a-v][a-z] [._]*.sw[a-p] [._]s[a-v][a-z] [._]sw[a-p] samples outputs ================================================ FILE: LICENSE ================================================ MIT License Copyright (c) 2017 Murat Kocaoglu, Christopher Snyder Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, 
modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================ # CausalGAN/CausalBEGAN in Tensorflow Tensorflow implementation of [CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training](https://arxiv.org/abs/1709.02023) ### Top: Random samples from do(Bald=1); Bottom: Random samples from cond(Bald=1) ![alt text](./assets/314393_began_Bald_topdo1_botcond1.png) ### Top: Random samples from do(Mustache=1); Bottom: Random samples from cond(Mustache=1) ![alt text](./assets/314393_began_Mustache_topdo1_botcond1.png) ## Requirements - Python 2.7 - [Pillow](https://pillow.readthedocs.io/en/4.0.x/) - [tqdm](https://github.com/tqdm/tqdm) - [requests](https://github.com/kennethreitz/requests) (Only used for downloading CelebA dataset) - [TensorFlow 1.1.0](https://github.com/tensorflow/tensorflow) ## Getting Started First download [CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) datasets with: $ apt-get install p7zip-full # ubuntu $ brew install p7zip # Mac $ pip install tqdm $ python download.py ## Usage The CausalGAN/CausalBEGAN code factorizes into two components, which can be trained or loaded independently: the 
causal_controller module specifies the model which learns a causal generative model over labels, and the causal_dcgan or causal_began modules learn a GAN over images given those labels. We denote training the causal controller over labels as "pretraining" (--is_pretrain=True), and training a GAN over images given labels as "training" (--is_train=True). To train a causal implicit model over labels and then over the images given the labels, use: $ python main.py --causal_model big_causal_graph --is_pretrain True --model_type began --is_train True where "big_causal_graph" is one of the causal graphs specified by the keys in the causal_graphs dictionary in causal_graph.py. Alternatively, one can first train a causal implicit model over labels only with the following command: $ python main.py --causal_model big_causal_graph --is_pretrain True One can then train a conditional generative model for the images given the trained causal generative model for the labels (causal controller), which yields a causal implicit generative model for the image and the labels, as suggested in [the paper](https://arxiv.org/abs/1709.02023): $ CC_MODEL_PATH='./logs/celebA_0810_191625_0.145tvd_bcg/controller/checkpoints/CC-Model-20000' $ python main.py --causal_model big_causal_graph --pt_load_path $CC_MODEL_PATH --model_type began --is_train True Instead of loading the model piecewise, after image training has been run once, the entire joint model can be loaded more simply by specifying the model directory: $ python main.py --causal_model big_causal_graph --load_path ./logs/celebA_0815_170635 --model_type began --is_train True To visualize the most recently created model in Tensorboard (as long as port 6006 is free), run: $ python tboard.py To interact with an already trained model I recommend the following procedure: ipython In [1]: %run main --causal_model big_causal_graph --load_path './logs/celebA_0815_170635' --model_type 'began' For example, to sample N=22 interventional images from do(Smiling=1) 
(as long as your causal graph includes a "Smiling" node): In [2]: sess.run(model.G,{cc.Smiling.label:np.ones((22,1)), trainer.batch_size:22}) Conditional sampling is most efficiently done through 2 session calls: the first calls cc.sample_label to get a label sample, and the second feeds that sampled label to the generator to get an image. See trainer.causal_sampling for a more extensive example. Note that it is also possible to combine conditioning and intervention during sampling. In [3]: lab_samples=cc.sample_label(sess,do_dict={'Bald':1}, cond_dict={'Mustache':1},N=22) will sample all labels from the joint distribution conditioned on Mustache=1 and do(Bald=1). These label samples can be turned into image samples as follows: In [4]: feed_dict={cc.label_dict[k]:v for k,v in lab_samples.iteritems()} In [5]: feed_dict[trainer.batch_size]=22 In [6]: images=sess.run(trainer.G,feed_dict) ### Configuration Since this code controls training of 3 different models (CausalController, CausalGAN, and CausalBEGAN), many configuration options are available. To make things manageable, there are 4 files corresponding to configurations specific to different parts of the model. Not all configuration combinations are tested. Default parameters are guaranteed to work. configurations: ./config.py : generic data and scheduling ./causal_controller/config.py : specific to CausalController ./causal_dcgan/config.py : specific to CausalGAN ./causal_began/config.py : specific to CausalBEGAN For convenience, the configurations used are saved in 4 .json files in the model directory for future reference. ## Results ### Causal Controller convergence We show convergence in total variation distance (TVD) for Causal Graph 1 (big_causal_graph in causal_graph.py), a completed version of Causal Graph 1 (complete_big_causal_graph in causal_graph.py), and an edge-reversed version of the complete Causal Graph 1 (reverse_big_causal_graph in causal_graph.py). We could get reasonable marginals with a complete DAG containing all 40 nodes, but TVD becomes very difficult to measure. 
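
To make the TVD measurement concrete, here is a minimal sketch of how the total variation distance between two sets of binary label samples could be estimated from empirical joint frequencies. The helper names `empirical_joint` and `total_variation_distance` and the toy data are assumptions made for illustration; this is not the measurement code used in this repository.

```python
from collections import Counter


def empirical_joint(samples):
    """Map each label tuple to its relative frequency in `samples`."""
    counts = Counter(tuple(s) for s in samples)
    total = float(len(samples))
    return {labels: count / total for labels, count in counts.items()}


def total_variation_distance(p, q):
    """TVD between two discrete distributions given as dicts: 0.5 * sum |p - q|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in support)


# Toy data (hypothetical): two binary labels per sample.
real = [(0, 0), (0, 1), (1, 1), (1, 1)]
fake = [(0, 0), (1, 1), (1, 1), (1, 1)]

tvd = total_variation_distance(empirical_joint(real), empirical_joint(fake))
print(tvd)
```

With n binary labels the joint distribution has 2**n states, so 9 nodes give 512 states while 40 nodes give roughly 1.1e12. An estimate of this kind needs far more samples than are available in the second case, which is one way to read the remark about TVD over all 40 nodes.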
We show TVD convergence for 9 nodes for two complete graphs. When the graph is incomplete, there is a "TVD gap" but reasonable convergence. ![alt text](./assets/tvd_vs_step.png) ### Conditional vs Interventional Sampling: We trained a causal implicit generative model assuming we are given the following causal graph over labels: For the images below, the effect of conditioning or intervening can be reasoned about from the graph structure. For example, conditioning on Mustache=1 should yield more male faces, whereas intervening should not (since the edges from the parents are disconnected in an intervention). ### CausalGAN Conditioning vs Intervening For each label, images were randomly sampled by either _intervening_ (top row) or _conditioning_ (bottom row) on label=1. ![alt text](./assets/causalgan_pictures/45507_intvcond_Bald=1_2x10.png) Bald ![alt text](./assets/causalgan_pictures/45507_intvcond_Mouth_Slightly_Open=1_2x10.png) Mouth Slightly Open ![alt text](./assets/causalgan_pictures/45507_intvcond_Mustache=1_2x10.png) Mustache ![alt text](./assets/causalgan_pictures/45507_intvcond_Narrow_Eyes=1_2x10.png) Narrow Eyes ![alt text](./assets/causalgan_pictures/45507_intvcond_Smiling=1_2x10.png) Smiling ![alt text](./assets/causalgan_pictures/45507_intvcond_Eyeglasses=1_2x10.png) Eyeglasses ![alt text](./assets/causalgan_pictures/45507_intvcond_Wearing_Lipstick=1_2x10.png) Wearing Lipstick ### CausalBEGAN Conditioning vs Intervening For each label, images were randomly sampled by either _intervening_ (top row) or _conditioning_ (bottom row) on label=1. 
![alt text](./assets/causalbegan_pictures/190001_intvcond_Bald=1_2x10.png) Bald ![alt text](./assets/causalbegan_pictures/190001_intvcond_Mouth_Slightly_Open=1_2x10.png) Mouth Slightly Open ![alt text](./assets/causalbegan_pictures/190001_intvcond_Mustache=1_2x10.png) Mustache ![alt text](./assets/causalbegan_pictures/190001_intvcond_Narrow_Eyes=1_2x10.png) Narrow Eyes ![alt text](./assets/causalbegan_pictures/190001_intvcond_Smiling=1_2x10.png) Smiling ![alt text](./assets/causalbegan_pictures/190001_intvcond_Eyeglasses=1_2x10.png) Eyeglasses ![alt text](./assets/causalbegan_pictures/190001_intvcond_Wearing_Lipstick=1_2x10.png) Wearing Lipstick ### CausalGAN Generator output (10x10) (randomly sampled label) ![alt text](https://user-images.githubusercontent.com/10726729/30076306-09743002-923e-11e7-8011-8523cd914f25.gif) ### CausalBEGAN Generator output (10x10) (randomly sampled label) ![alt text](https://user-images.githubusercontent.com/10726729/30076379-38b407fc-923e-11e7-81aa-4310c76a2e39.gif) <--- Repo originally forked from these two - [BEGAN-tensorflow](https://github.com/carpedm20/BEGAN-tensorflow) - [DCGAN-tensorflow](https://github.com/carpedm20/DCGAN-tensorflow) --> ## Related works - [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661) - [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434) - [Wasserstein GAN](https://arxiv.org/abs/1701.07875) - [BEGAN: Boundary Equilibrium Generative Adversarial Networks](https://arxiv.org/abs/1703.10717) ## Authors Christopher Snyder / [@22csnyder](http://22csnyder.github.io) Murat Kocaoglu / [@mkocaoglu](http://mkocaoglu.github.io) ================================================ FILE: assets/0808_112404_cbcg.csv ================================================ Wall time,Step,Value 1502209477.065396,1,0.9871935844421387 1502210175.629644,1001,0.5611526370048523 1502210858.027971,2001,0.48091334104537964 
1502211539.450148,3001,0.3693326711654663 1502212228.305266,4001,0.2690610885620117 1502212916.163691,5001,0.1852252036333084 1502213605.455342,6001,0.11786147207021713 1502214290.655429,7001,0.10585799068212509 1502214974.834744,8001,0.11575613915920258 1502215664.377923,9001,0.09277261048555374 1502216342.813149,10001,0.08084549009799957 1502217004.542623,11001,0.07447165995836258 1502217677.840079,12001,0.07388914376497269 1502218338.794636,13001,0.06354445964097977 1502219000.20777,14001,0.058855485171079636 1502219659.079145,15001,0.06558254361152649 1502220348.8056,16001,0.051907140761613846 1502221033.399544,17001,0.04890892282128334 1502221718.709654,18001,0.04604059085249901 1502222403.268966,19001,0.04389917105436325 1502223087.183902,20001,0.04280887916684151 1502223772.410776,21001,0.04196497052907944 1502224457.815937,22001,0.038901761174201965 1502225141.198389,23001,0.04273799806833267 1502225826.618027,24001,0.041886329650878906 1502226518.698883,25001,0.04319506511092186 1502227208.700241,26001,0.042861778289079666 1502227899.513253,27001,0.04321207478642464 1502228588.126751,28001,0.035417430102825165 1502229277.24218,29001,0.03713845834136009 1502229964.6007,30001,0.03938867151737213 ================================================ FILE: assets/0810_191625_bcg.csv ================================================ Wall time,Step,Value 1502410626.387592,1,0.9544087648391724 1502411081.292726,1001,0.5290326476097107 1502411533.622933,2001,0.44044023752212524 1502411981.535893,3001,0.35751280188560486 1502412434.074014,4001,0.2676760256290436 1502412884.345166,5001,0.20682139694690704 1502413336.727762,6001,0.1853639930486679 1502413786.845507,7001,0.19252602756023407 1502414239.265506,8001,0.19284175336360931 1502414689.356373,9001,0.16991157829761505 1502415145.18223,10001,0.15723274648189545 1502415595.021095,11001,0.15078511834144592 1502416037.124821,12001,0.14841803908348083 1502416478.158467,13001,0.1522006243467331 
1502416920.270544,14001,0.15191766619682312 1502417364.060506,15001,0.14936088025569916 1502417803.97219,16001,0.14549562335014343 1502418242.907475,17001,0.14224907755851746 1502418684.820146,18001,0.13779735565185547 1502419124.551228,19001,0.14404024183750153 ================================================ FILE: assets/0821_213901_rcbcg.csv ================================================ Wall time,Step,Value 1503369574.677247,1,0.8920440077781677 1503370041.447478,1001,0.512530505657196 1503370517.215026,2001,0.44317319989204407 1503370985.171754,3001,0.35666027665138245 1503371450.274446,4001,0.2928802967071533 1503371929.346399,5001,0.19688302278518677 1503372408.39261,6001,0.13801704347133636 1503372886.733545,7001,0.1106921136379242 1503373363.362404,8001,0.08717407286167145 1503373839.834317,9001,0.0857364684343338 1503374318.503915,10001,0.07331433147192001 1503374802.444324,11001,0.07706638425588608 1503375279.389205,12001,0.06169278547167778 1503375752.728541,13001,0.059477031230926514 1503376226.577342,14001,0.061632610857486725 1503376699.448754,15001,0.06138858571648598 1503377174.465165,16001,0.05955960601568222 1503377653.261056,17001,0.04774799197912216 1503378126.625743,18001,0.05300581455230713 1503378604.128631,19001,0.047743991017341614 1503379079.647434,20001,0.05426724627614021 1503379555.901424,21001,0.04658582806587219 1503380028.219916,22001,0.04909271374344826 1503380498.204313,23001,0.05326574668288231 1503380962.853232,24001,0.05447468161582947 1503381428.927937,25001,0.05708151310682297 1503381893.354328,26001,0.051777616143226624 1503382360.002207,27001,0.046131476759910583 1503382825.077767,28001,0.04513547569513321 1503383290.90524,29001,0.044165026396512985 ================================================ FILE: assets/guide_to_gifs.txt ================================================ #Approach uses imagemagick #Take the first 20 images in a folder and convert to gif ls -v | head -20 | xargs cp -t newfolder cd newfolder mogrify 
-format png *.pdf mogrify -crop 62.5%x62.5%+0+0 +repage *.png rm *.pdf convert -delay 20 $(ls -v) -loop 0 -layers optimize mygifname.gif ================================================ FILE: assets/tvdplot.ipynb ================================================ { "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using matplotlib backend: TkAgg\n" ] } ], "source": [ "import matplotlib.pyplot as plt\n", "import tensorflow as tf\n", "import pandas as pd\n", "%matplotlib" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "\n", "raw_data={'cG1': pd.read_csv('0808_112404_cbcg.csv'),\n", " 'G1' : pd.read_csv('0810_191625_bcg.csv'),\n", " 'rcG1': pd.read_csv('0821_213901_rcbcg.csv')}\n", "xlabel='Training Step'\n", "dfs=[pd.DataFrame(data={k:v['Value'].values,xlabel:v['Step'].values}) for k,v in raw_data.items()]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "\n", "raw_data={'Causal Graph 1' : pd.read_csv('0810_191625_bcg.csv'),\n", " 'complete Causal Graph 1': pd.read_csv('0808_112404_cbcg.csv'), \n", " 'edge-reversed complete Causal Graph 1': pd.read_csv('0821_213901_rcbcg.csv')}\n", "xlabel='Training Step'\n", "dfs=[pd.DataFrame(data={k:v['Value'].values,xlabel:v['Step'].values}) for k,v in raw_data.items()]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def my_merge(df1,df2):\n", " return pd.merge(df1,df2,how='outer',on=xlabel)\n", " \n", "\n", "plot_data=reduce(my_merge,dfs)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ax=plot_data.plot.line(x=xlabel,xlim=[0,18000],ylim=[0,1],style = 
['bs-','ro-','y^-'])\n", "ax.set_ylabel('Total Variation Distance',fontsize=18)\n", "ax.set_title('TVD of Label Generation',fontsize=18)\n", "ax.set_xlabel(xlabel,fontsize=18)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.savefig('tvd_vs_step.pdf')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 1 } ================================================ FILE: causal_began/CausalBEGAN.py ================================================ from __future__ import print_function from utils import save_image,distribute_input_data,summary_stats,make_summary import pandas as pd import os import StringIO import scipy.misc import numpy as np from glob import glob from tqdm import trange from itertools import chain from collections import deque from figure_scripts.pairwise import crosstab from figure_scripts.sample import intervention2d,condition2d from utils import summary_stats from models import * class CausalBEGAN(object): ''' A quick quirk about this class. 
If the model is built with a gpu, it must later be loaded with a gpu in order to preserve tensor structure: NCHW/NHWC (number-channel-height-width/number-height-width-channel) in paper <-> in code b1,c1 <-> b_k, k_t b2,c2 <-> b_l, l_t b3,c3 <-> b_z, z_t ''' def __init__(self,batch_size,config): ''' batch_size: again a tensorflow placeholder config : see causal_began/config.py ''' self.batch_size=batch_size #a tensor self.config=config self.use_gpu = config.use_gpu self.data_format=self.config.data_format#NHWC or NCHW self.TINY = 10**-6 #number of calls to self.g_optim self.step = tf.Variable(0, name='step', trainable=False) #optimizers self.g_lr = tf.Variable(config.g_lr, name='g_lr') self.d_lr = tf.Variable(config.d_lr, name='d_lr') self.g_lr_update = tf.assign(self.g_lr, self.g_lr * 0.5, name='g_lr_update') self.d_lr_update = tf.assign(self.d_lr, self.d_lr * 0.5, name='d_lr_update') optimizer = tf.train.AdamOptimizer self.g_optimizer, self.d_optimizer = optimizer(self.g_lr), optimizer(self.d_lr) self.lambda_k = config.lambda_k self.lambda_l = config.lambda_l self.lambda_z = config.lambda_z self.gamma = config.gamma self.gamma_label = config.gamma_label self.zeta=config.zeta self.z_dim = config.z_dim self.conv_hidden_num = config.conv_hidden_num self.model_dir = config.model_dir self.start_step = 0 self.log_step = config.log_step self.max_step = config.max_step self.lr_update_step = config.lr_update_step self.is_train = config.is_train #Keeps track of params from different devices self.tower_dict=dict( c_tower_grads=[], dcc_tower_grads=[], g_tower_grads=[], d_tower_grads=[], tower_g_loss_image=[], tower_d_loss_real=[], tower_g_loss_label=[], tower_d_loss_real_label=[], tower_d_loss_fake_label=[], ) self.k_t = tf.get_variable(name='k_t',initializer=0.,trainable=False) self.l_t = tf.get_variable(name='l_t',initializer=0.,trainable=False) self.z_t = tf.get_variable(name='z_t',initializer=0.,trainable=False) def __call__(self, real_inputs, fake_inputs): ''' in a multi 
gpu setting, self.__call__ is done once for every device with variables shared so that a copy of the tensorflow variables created in self.__call__ resides on each device. This would be run multiple times in a loop over devices. Parameters: fake_inputs : a dictionary of labels from cc real_inputs : also a dictionary of labels with an additional key 'x' for the real image ''' config=self.config #The keys are all the labels union 'x' self.real_inputs=real_inputs self.fake_inputs=fake_inputs n_labels=len(fake_inputs)#number of labels in graph, not dataset #[0,255] NHWC self.x = self.real_inputs.pop('x') #used to change dataformat in data queue if self.data_format == 'NCHW': #self.x = tf.transpose(self.x, [2, 0, 1])#3D self.x = tf.transpose(self.x, [0, 3, 1, 2])#4D elif self.data_format == 'NHWC': pass else: raise Exception("[!] Unknown data_format: {}".format(self.data_format)) _, height, width, self.channel = \ get_conv_shape(self.x, self.data_format) self.config.repeat_num= int(np.log2(height)) - 2 self.config.channel=self.channel #There are two versions: "x" and "self.x". 
# "x" is normalized for computation # "self.x" is unnormalized for saving and summaries # likewise for "G" and "self.G" #x in [-1,1] x = norm_img(self.x) self.real_labels=tf.concat(self.real_inputs.values(),-1) self.fake_labels=tf.concat(self.fake_inputs.values(),-1) #noise given to generate image in addition to labels self.z_gen = tf.random_uniform( (self.batch_size, self.z_dim), minval=-1.0, maxval=1.0) if self.config.round_fake_labels:#default self.z= tf.concat( [tf.round(self.fake_labels), self.z_gen],axis=-1,name='z') else: self.z= tf.concat( [self.fake_labels, self.z_gen],axis=-1,name='z') G, self.G_var = GeneratorCNN(self.z,config) d_out, self.D_z, self.D_var = DiscriminatorCNN(tf.concat([G, x],0),config) AE_G, AE_x = tf.split(d_out, 2) self.D_encode_G, self.D_encode_x=tf.split(self.D_z, 2)#axis=0 by default if not self.config.separate_labeler: self.D_fake_labels_logits=tf.slice(self.D_encode_G,[0,0],[-1,n_labels]) self.D_real_labels_logits=tf.slice(self.D_encode_x,[0,0],[-1,n_labels]) else:#default self.D_fake_labels_logits,self.DL_var=Discriminator_labeler(G,n_labels,config) self.D_real_labels_logits,_=Discriminator_labeler(x,n_labels,config,reuse=True) self.D_var += self.DL_var self.D_real_labels=tf.sigmoid(self.D_real_labels_logits) self.D_fake_labels=tf.sigmoid(self.D_fake_labels_logits) self.D_real_labels_list=tf.split(self.D_real_labels,n_labels,axis=1) self.D_fake_labels_list=tf.split(self.D_fake_labels,n_labels,axis=1) # sigmoid_cross_entropy_with_logits def sxe(logits,labels): #use zeros or ones if pass in scalar if not isinstance(labels,tf.Tensor): labels=labels*tf.ones_like(logits) return tf.nn.sigmoid_cross_entropy_with_logits( logits=logits,labels=labels) #Round fake labels before calc loss if self.config.round_fake_labels:#default fake_labels=tf.round(self.fake_labels) else: fake_labels=self.fake_labels #This is here because it's used in cross_entropy calc, but it's not used by default self.fake_labels_logits= 
-tf.log(1/(self.fake_labels+self.TINY)-1) #One of three label losses available # Default is squared loss, "squarediff" self.d_xe_real_label=sxe(self.D_real_labels_logits,self.real_labels) self.d_xe_fake_label=sxe(self.D_fake_labels_logits,fake_labels) self.g_xe_label=sxe(self.fake_labels_logits, self.D_fake_labels) self.d_absdiff_real_label=tf.abs(self.D_real_labels - self.real_labels) self.d_absdiff_fake_label=tf.abs(self.D_fake_labels - fake_labels) self.g_absdiff_label =tf.abs(fake_labels - self.D_fake_labels) self.d_squarediff_real_label=tf.square(self.D_real_labels - self.real_labels) self.d_squarediff_fake_label=tf.square(self.D_fake_labels - fake_labels) self.g_squarediff_label =tf.square(fake_labels - self.D_fake_labels) if self.config.label_loss=='xe': self.d_loss_real_label = tf.reduce_mean(self.d_xe_real_label) self.d_loss_fake_label = tf.reduce_mean(self.d_xe_fake_label) self.g_loss_label = tf.reduce_mean(self.g_xe_label) elif self.config.label_loss=='absdiff': self.d_loss_real_label = tf.reduce_mean(self.d_absdiff_real_label) self.d_loss_fake_label = tf.reduce_mean(self.d_absdiff_fake_label) self.g_loss_label = tf.reduce_mean(self.g_absdiff_label) elif self.config.label_loss=='squarediff': self.d_loss_real_label = tf.reduce_mean(self.d_squarediff_real_label) self.d_loss_fake_label = tf.reduce_mean(self.d_squarediff_fake_label) self.g_loss_label = tf.reduce_mean(self.g_squarediff_label) #"self.G" is [0,255], "G" is [-1,1] self.G = denorm_img(G, self.data_format) self.AE_G, self.AE_x = denorm_img(AE_G, self.data_format), denorm_img(AE_x, self.data_format) u1=tf.abs(AE_x - x) u2=tf.abs(AE_G - G) m1=tf.reduce_mean(u1) m2=tf.reduce_mean(u2) c1=tf.reduce_mean(tf.square(u1-m1)) c2=tf.reduce_mean(tf.square(u2-m2)) self.eqn2 = tf.square(m1-m2)#from orig began paper self.eqn1 = (c1+c2-2*tf.sqrt(c1*c2))/self.eqn2#from orig began paper self.d_loss_real = tf.reduce_mean(u1) self.d_loss_fake = tf.reduce_mean(u2) self.g_loss_image = tf.reduce_mean(tf.abs(AE_G - G)) 
self.d_loss_image=self.d_loss_real - self.k_t*self.d_loss_fake self.d_loss_label=self.d_loss_real_label - self.l_t*self.d_loss_fake_label self.d_loss=self.d_loss_image+self.d_loss_label if not self.config.no_third_margin:#normal mode #Careful on z_t sign!#(z_t <==> c_3 from paper) self.g_loss = self.g_loss_image + self.z_t*self.g_loss_label else: print('Warning: not using third margin') self.g_loss = self.g_loss_image + 1.*self.g_loss_label # Calculate the gradients for the batch of data, # on this particular gpu tower. g_grad=self.g_optimizer.compute_gradients(self.g_loss,var_list=self.G_var) d_grad=self.d_optimizer.compute_gradients(self.d_loss,var_list=self.D_var) self.tower_dict['g_tower_grads'].append(g_grad) self.tower_dict['d_tower_grads'].append(d_grad) self.tower_dict['tower_g_loss_image'].append(self.g_loss_image) self.tower_dict['tower_d_loss_real'].append(self.d_loss_real) self.tower_dict['tower_g_loss_label'].append(self.g_loss_label) self.tower_dict['tower_d_loss_real_label'].append(self.d_loss_real_label) self.tower_dict['tower_d_loss_fake_label'].append(self.d_loss_fake_label) self.var=self.G_var+self.D_var+[self.step] def build_train_op(self): #Now outside gpu loop #attributes starting with ave_ are averaged over devices self.ave_d_loss_real =tf.reduce_mean(self.tower_dict['tower_d_loss_real']) self.ave_g_loss_image =tf.reduce_mean(self.tower_dict['tower_g_loss_image']) self.ave_d_loss_real_label =tf.reduce_mean(self.tower_dict['tower_d_loss_real_label']) self.ave_d_loss_fake_label =tf.reduce_mean(self.tower_dict['tower_d_loss_fake_label']) self.ave_g_loss_label =tf.reduce_mean(self.tower_dict['tower_g_loss_label']) #recalculate balance equations (b1,b2,b3 in paper) self.balance_k = self.gamma * self.ave_d_loss_real - self.ave_g_loss_image self.balance_l = self.gamma_label * self.ave_d_loss_real_label - self.ave_d_loss_fake_label self.balance_z = self.zeta*tf.nn.relu(self.balance_k) - tf.nn.relu(self.balance_l) self.measure = self.ave_d_loss_real + 
tf.abs(self.balance_k) self.measure_complete = self.ave_d_loss_real + self.ave_d_loss_real_label + \ tf.abs(self.balance_k)+tf.abs(self.balance_l)+tf.abs(self.balance_z) #update margins coefficients (c1,c2,c3 in paper) k_update = tf.assign( self.k_t, tf.clip_by_value(self.k_t + self.lambda_k*self.balance_k, 0, 1)) l_update = tf.assign( self.l_t, tf.clip_by_value(self.l_t + self.lambda_l*self.balance_l, 0, 1)) z_update = tf.assign( self.z_t, tf.clip_by_value(self.z_t + self.lambda_z*self.balance_z, 0, 1)) g_grads=average_gradients(self.tower_dict['g_tower_grads']) d_grads=average_gradients(self.tower_dict['d_tower_grads']) g_optim = self.g_optimizer.apply_gradients(g_grads, global_step=self.step) d_optim = self.d_optimizer.apply_gradients(d_grads) #every time train_op is run, run k_update, l_update, z_update with tf.control_dependencies([k_update,l_update,z_update]): #when train_op is run, run [g_optim,d_optim] self.train_op=tf.group(g_optim, d_optim) def train_step(self,sess,counter): sess.run(self.train_op) if counter % self.config.lr_update_step == self.lr_update_step - 1: sess.run([self.g_lr_update, self.d_lr_update]) def build_summary_op(self): names,real_labels_list=zip(*self.real_inputs.items()) _ ,fake_labels_list=zip(*self.fake_inputs.items()) LabelList=[names,real_labels_list,fake_labels_list, self.D_fake_labels_list,self.D_real_labels_list] for name,rlabel,flabel,d_fake_label,d_real_label in zip(*LabelList): with tf.name_scope(name): d_flabel=tf.cast(tf.round(d_fake_label),tf.int32) d_rlabel=tf.cast(tf.round(d_real_label),tf.int32) f_acc=tf.contrib.metrics.accuracy(tf.cast(tf.round(flabel),tf.int32),d_flabel) r_acc=tf.contrib.metrics.accuracy(tf.cast(tf.round(rlabel),tf.int32),d_rlabel) summary_stats('d_fake_label',d_fake_label,hist=True) summary_stats('d_real_label',d_real_label,hist=True) tf.summary.scalar('ave_d_fake_abs_diff',tf.reduce_mean(tf.abs(flabel-d_fake_label))) 
tf.summary.scalar('ave_d_real_abs_diff',tf.reduce_mean(tf.abs(rlabel-d_real_label))) tf.summary.scalar('real_label_ave',tf.reduce_mean(rlabel)) tf.summary.scalar('real_label_accuracy',r_acc) tf.summary.scalar('fake_label_accuracy',f_acc) ##Summaries picked from last gpu to run tf.summary.scalar('losslabel/d_loss_real_label',tf.reduce_mean(self.ave_d_loss_real_label)) tf.summary.scalar('losslabel/d_loss_fake_label',tf.reduce_mean(self.ave_d_loss_fake_label)) tf.summary.scalar('losslabel/g_loss_label',self.g_loss_label) tf.summary.image("G", self.G), tf.summary.image("AE_G", self.AE_G), tf.summary.image("AE_x", self.AE_x), tf.summary.scalar("loss/d_loss", self.d_loss), tf.summary.scalar("loss/d_loss_fake", self.d_loss_fake), tf.summary.scalar("loss/g_loss", self.g_loss), tf.summary.scalar("misc/d_lr", self.d_lr), tf.summary.scalar("misc/g_lr", self.g_lr), tf.summary.scalar("misc/eqn1", self.eqn1),#From orig BEGAN paper tf.summary.scalar("misc/eqn2", self.eqn2),#From orig BEGAN paper #summaries of gpu-averaged values tf.summary.scalar("loss/d_loss_real",self.ave_d_loss_real), tf.summary.scalar("loss/g_loss_image", self.ave_g_loss_image), tf.summary.scalar("balance/l", self.balance_l), tf.summary.scalar("balance/k", self.balance_k), tf.summary.scalar("balance/z", self.balance_z), tf.summary.scalar("misc/measure", self.measure), tf.summary.scalar("misc/measure_complete", self.measure_complete), tf.summary.scalar("misc/k_t", self.k_t), tf.summary.scalar("misc/l_t", self.l_t), tf.summary.scalar("misc/z_t", self.z_t), #doesn't include summaries from causal controller #TODO: rework so only 1 copy of summaries if multiple gpu self.summary_op=tf.summary.merge_all() ================================================ FILE: causal_began/__init__.py ================================================ ================================================ FILE: causal_began/config.py ================================================ #-*- coding: utf-8 -*- import argparse def str2bool(v): 
#return (v is True) or (v.lower() in ('true', '1')) return v is True or v.lower() in ('true', '1') arg_lists = [] parser = argparse.ArgumentParser() def add_argument_group(name): arg = parser.add_argument_group(name) arg_lists.append(arg) return arg #Network net_arg = add_argument_group('Network') net_arg.add_argument('--c_dim',type=int, default=3, help='''number of color channels. I wouldn't really change this from 3''') net_arg.add_argument('--conv_hidden_num', type=int, default=128, choices=[64, 128],help='n in the paper') net_arg.add_argument('--separate_labeler', type=str2bool, default=True) net_arg.add_argument('--z_dim', type=int, default=64, choices=[64, 128], help='''dimension of the noise input to the generator along with the labels''') net_arg.add_argument('--z_num', type=int, default=64, help='''dimension of the hidden space of the autoencoder''') # Data data_arg = add_argument_group('Data') data_arg.add_argument('--dataset', type=str, default='celebA') data_arg.add_argument('--split', type=str, default='train') data_arg.add_argument('--batch_size', type=int, default=16) # Training / test parameters train_arg = add_argument_group('Training') train_arg.add_argument('--beta1', type=float, default=0.5) train_arg.add_argument('--beta2', type=float, default=0.999) train_arg.add_argument('--d_lr', type=float, default=0.00008) train_arg.add_argument('--g_lr', type=float, default=0.00008) train_arg.add_argument('--label_loss',type=str,default='squarediff',choices=['xe','absdiff','squarediff'], help='''what comparison should be made between the labeler output and the actual labels''') train_arg.add_argument('--lr_update_step', type=int, default=100000, choices=[100000, 75000]) train_arg.add_argument('--max_step', type=int, default=50000) train_arg.add_argument('--num_iter',type=int,default=250000, help='the number of training iterations to run the model for') train_arg.add_argument('--optimizer', type=str, default='adam') 
train_arg.add_argument('--round_fake_labels',type=str2bool,default=True, help='''Whether the label outputs of the causal controller should be rounded first before calculating the loss of generator or d-labeler''') train_arg.add_argument('--use_gpu', type=str2bool, default=True) train_arg.add_argument('--num_gpu', type=int, default=1, help='specify 0 for cpu. If k specified, will default to\ first k of n gpus detected. If use_gpu=True but num_gpu not\ specified will default to 1') margin_arg = add_argument_group('Margin') margin_arg.add_argument('--gamma', type=float, default=0.5) margin_arg.add_argument('--gamma_label', type=float, default=0.5) margin_arg.add_argument('--lambda_k', type=float, default=0.001) margin_arg.add_argument('--lambda_l', type=float, default=0.00008, help='''As mentioned in the paper this is lower because this margin can be responded to more quickly than the other margins. I'm not sure if it definitely needs to be lower''') margin_arg.add_argument('--lambda_z', type=float, default=0.01) margin_arg.add_argument('--no_third_margin', type=str2bool, default=False, help='''Use True for appendix figure in paper. This is used to neglect the third margin (c3,b3)''') margin_arg.add_argument('--zeta', type=float, default=0.5, help='''This is gamma_3 in the paper''') # Misc misc_arg = add_argument_group('Misc') misc_arg.add_argument('--is_train',type=str2bool,default=False, help='''whether to enter the image training loop''') misc_arg.add_argument('--build_all', type=str2bool, default=False, help='''normally specifying is_pretrain=False will cause the pretraining components not to be built and likewise with is_train=False only the pretrain component will (possibly) be built.
This is here as a debug helper to enable building out the whole model without doing any training''') misc_arg.add_argument('--data_dir', type=str, default='data') misc_arg.add_argument('--dry_run', action='store_true') #misc_arg.add_argument('--dry_run', type=str2bool, default='False') misc_arg.add_argument('--log_step', type=int, default=100, help='''how often to log stuff. Sample images are created every 10*log_step''') misc_arg.add_argument('--num_log_samples', type=int, default=3) misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN']) misc_arg.add_argument('--log_dir', type=str, default='logs') def gpu_logic(config): #consistency between use_gpu and num_gpu if config.num_gpu>0: config.use_gpu=True else: config.use_gpu=False # if config.use_gpu and config.num_gpu==0: # config.num_gpu=1 return config def get_config(): config, unparsed = parser.parse_known_args() config=gpu_logic(config) #this has to respect gpu/cpu #data_format = 'NCHW' if config.use_gpu: data_format = 'NCHW' else: data_format = 'NHWC' setattr(config, 'data_format', data_format) print('Loaded ./causal_began/config.py') return config, unparsed if __name__=='__main__': #for debug of config config, unparsed = get_config() ================================================ FILE: causal_began/models.py ================================================ import numpy as np import tensorflow as tf slim = tf.contrib.slim def lrelu(x,leak=0.2,name='lrelu'): with tf.variable_scope(name): f1=0.5 * (1+leak) f2=0.5 * (1-leak) return f1*x + f2*tf.abs(x) def GeneratorCNN( z, config, reuse=None): hidden_num=config.conv_hidden_num output_num=config.c_dim repeat_num=config.repeat_num data_format=config.data_format with tf.variable_scope("G",reuse=reuse) as vs: x = slim.fully_connected(z, np.prod([8, 8, hidden_num]),activation_fn=None,scope='fc1') x = reshape(x, 8, 8, hidden_num, data_format) for idx in range(repeat_num): x = slim.conv2d(x, hidden_num, 3, 1, 
activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx)+'a') x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx)+'b') if idx < repeat_num - 1: x = upscale(x, 2, data_format) out = slim.conv2d(x, 3, 3, 1, activation_fn=None,data_format=data_format,scope='conv'+str(idx+1)) variables = tf.contrib.framework.get_variables(vs) return out, variables def DiscriminatorCNN(image, config, reuse=None): hidden_num=config.conv_hidden_num data_format=config.data_format input_channel=config.channel with tf.variable_scope("D",reuse=reuse) as vs: # Encoder with tf.variable_scope('encoder'): x = slim.conv2d(image, hidden_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv0') prev_channel_num = hidden_num for idx in range(config.repeat_num): channel_num = hidden_num * (idx + 1) x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'a') x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'b') if idx < config.repeat_num - 1: x = slim.conv2d(x, channel_num, 3, 2, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'c') #x = tf.contrib.layers.max_pool2d(x, [2, 2], [2, 2], padding='VALID') x = tf.reshape(x, [-1, np.prod([8, 8, channel_num])]) z = x = slim.fully_connected(x, config.z_num, activation_fn=None,scope='proj') # Decoder with tf.variable_scope('decoder'): x = slim.fully_connected(x, np.prod([8, 8, hidden_num]), activation_fn=None) x = reshape(x, 8, 8, hidden_num, data_format) for idx in range(config.repeat_num): x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx)+'a') x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx)+'b') if idx < config.repeat_num - 1: x = upscale(x, 2, data_format) out = slim.conv2d(x, input_channel, 3, 1, activation_fn=None, 
data_format=data_format,scope='proj') variables = tf.contrib.framework.get_variables(vs) return out, z, variables def Discriminator_labeler(image, output_size, config, reuse=None): hidden_num=config.conv_hidden_num repeat_num=config.repeat_num data_format=config.data_format with tf.variable_scope("discriminator_labeler",reuse=reuse) as scope: x = slim.conv2d(image, hidden_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv0') prev_channel_num = hidden_num for idx in range(repeat_num): channel_num = hidden_num * (idx + 1) x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'a') x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'b') if idx < repeat_num - 1: x = slim.conv2d(x, channel_num, 3, 2, activation_fn=tf.nn.elu, data_format=data_format,scope='conv'+str(idx+1)+'c') #x = tf.contrib.layers.max_pool2d(x, [2, 2], [2, 2], padding='VALID') x = tf.reshape(x, [-1, np.prod([8, 8, channel_num])]) label_logit = slim.fully_connected(x, output_size, activation_fn=None,scope='proj') variables = tf.contrib.framework.get_variables(scope) return label_logit,variables def next(loader): return loader.next()[0].data.numpy() def to_nhwc(image, data_format): if data_format == 'NCHW': #Isn't this backward? new_image = nchw_to_nhwc(image) else: new_image = image return new_image def to_nchw_numpy(image): if image.shape[3] in [1, 3]: new_image = image.transpose([0, 3, 1, 2]) else: new_image = image return new_image def norm_img(image, data_format=None): image = image/127.5 - 1. 
if data_format: image = to_nhwc(image, data_format) return image def denorm_img(norm, data_format): return tf.clip_by_value(to_nhwc((norm + 1)*127.5, data_format), 0, 255) def slerp(val, low, high): """Code from https://github.com/soumith/dcgan.torch/issues/14""" omega = np.arccos(np.clip(np.dot(low/np.linalg.norm(low), high/np.linalg.norm(high)), -1, 1)) so = np.sin(omega) if so == 0: return (1.0-val) * low + val * high # L'Hopital's rule/LERP return np.sin((1.0-val)*omega) / so * low + np.sin(val*omega) / so * high def int_shape(tensor): shape = tensor.get_shape().as_list() return [num if num is not None else -1 for num in shape] def get_conv_shape(tensor, data_format): shape = int_shape(tensor) # always return [N, H, W, C] if data_format == 'NCHW': return [shape[0], shape[2], shape[3], shape[1]] elif data_format == 'NHWC': return shape def nchw_to_nhwc(x): return tf.transpose(x, [0, 2, 3, 1]) def nhwc_to_nchw(x): return tf.transpose(x, [0, 3, 1, 2]) def reshape(x, h, w, c, data_format): if data_format == 'NCHW': x = tf.reshape(x, [-1, c, h, w]) else: x = tf.reshape(x, [-1, h, w, c]) return x def resize_nearest_neighbor(x, new_size, data_format): if data_format == 'NCHW': x = nchw_to_nhwc(x) x = tf.image.resize_nearest_neighbor(x, new_size) x = nhwc_to_nchw(x) else: x = tf.image.resize_nearest_neighbor(x, new_size) return x def upscale(x, scale, data_format): _, h, w, _ = get_conv_shape(x, data_format) return resize_nearest_neighbor(x, (h*scale, w*scale), data_format) #https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py#L168 def average_gradients(tower_grads): """Calculate the average gradient for each shared variable across all towers. Note that this function provides a synchronization point across all towers. Args: tower_grads: List of lists of (gradient, variable) tuples. The outer list is over individual gradients. The inner list is over the gradient calculation for each tower. 
Returns: List of pairs of (gradient, variable) where the gradient has been averaged across all towers. """ average_grads = [] for grad_and_vars in zip(*tower_grads): # Note that each grad_and_vars looks like the following: # ((grad0_gpu0, var0_gpu0), ... , (grad0_gpuN, var0_gpuN)) grads = [] for g, _ in grad_and_vars: # Add 0 dimension to the gradients to represent the tower. expanded_g = tf.expand_dims(g, 0) # Append on a 'tower' dimension which we will average over below. grads.append(expanded_g) # Average over the 'tower' dimension. grad = tf.concat(axis=0, values=grads) grad = tf.reduce_mean(grad, 0) # Keep in mind that the Variables are redundant because they are shared # across towers. So .. we will just return the first tower's pointer to the Variable. v = grad_and_vars[0][1] grad_and_var = (grad, v) average_grads.append(grad_and_var) return average_grads ================================================ FILE: causal_began/utils.py ================================================ from __future__ import print_function import tensorflow as tf import os from os import listdir from os.path import isfile, join import shutil import sys import math import json import logging import numpy as np from PIL import Image from datetime import datetime from tensorflow.core.framework import summary_pb2 def make_summary(name, val): return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) def summary_stats(name,tensor,collections=None,hist=False): collections=collections or [tf.GraphKeys.SUMMARIES] ave=tf.reduce_mean(tensor) std=tf.sqrt(tf.reduce_mean(tf.square(ave-tensor))) tf.summary.scalar(name+'_ave',ave,collections) tf.summary.scalar(name+'_std',std,collections) if hist: tf.summary.histogram(name+'_hist',tensor,collections) def prepare_dirs_and_logger(config): formatter = logging.Formatter("%(asctime)s:%(levelname)s::%(message)s") logger = logging.getLogger() for hdlr in logger.handlers: logger.removeHandler(hdlr) handler = 
logging.StreamHandler() handler.setFormatter(formatter) logger.addHandler(handler) if config.load_path: if config.load_path.startswith(config.log_dir): config.model_dir = config.load_path else: if config.load_path.startswith(config.dataset): config.model_name = config.load_path else: config.model_name = "{}_{}".format(config.dataset, config.load_path) else: config.model_name = "{}_{}".format(config.dataset, get_time()) if not hasattr(config, 'model_dir'): config.model_dir = os.path.join(config.log_dir, config.model_name) config.data_path = os.path.join(config.data_dir, config.dataset) if not config.load_path: config.log_code_dir=os.path.join(config.model_dir,'code') for path in [config.log_dir, config.data_dir, config.model_dir, config.log_code_dir]: if not os.path.exists(path): os.makedirs(path) #Copy python code in directory into model_dir/code for future reference: code_dir=os.path.dirname(os.path.realpath(sys.argv[0])) model_files = [f for f in listdir(code_dir) if isfile(join(code_dir, f))] for f in model_files: if f.endswith('.py'): shutil.copy2(join(code_dir, f),config.log_code_dir) def get_time(): return datetime.now().strftime("%m%d_%H%M%S") def save_config(config): param_path = os.path.join(config.model_dir, "params.json") print("[*] MODEL dir: %s" % config.model_dir) print("[*] PARAM path: %s" % param_path) with open(param_path, 'w') as fp: json.dump(config.__dict__, fp, indent=4, sort_keys=True) def get_available_gpus(): from tensorflow.python.client import device_lib local_device_protos = device_lib.list_local_devices() return [x.name for x in local_device_protos if x.device_type=='GPU'] def distribute_input_data(data_loader,num_gpu): ''' data_loader is a dictionary of tensors that are fed into our model This function takes that dictionary of n*batch_size dimension tensors and breaks it up into n dictionaries with the same key of tensors with dimension batch_size.
One is given to each gpu ''' if num_gpu==0: return {'/cpu:0':data_loader} gpus=get_available_gpus() if num_gpu > len(gpus): raise ValueError('number of gpus specified={}, more than gpus available={}'.format(num_gpu,len(gpus))) gpus=gpus[:num_gpu] data_by_gpu={g:{} for g in gpus} for key,value in data_loader.items(): spl_vals=tf.split(value,num_gpu) for gpu,val in zip(gpus,spl_vals): data_by_gpu[gpu][key]=val return data_by_gpu def rank(array): return len(array.shape) def make_grid(tensor, nrow=8, padding=2, normalize=False, scale_each=False): """Code based on https://github.com/pytorch/vision/blob/master/torchvision/utils.py""" nmaps = tensor.shape[0] xmaps = min(nrow, nmaps) ymaps = int(math.ceil(float(nmaps) / xmaps)) height, width = int(tensor.shape[1] + padding), int(tensor.shape[2] + padding) grid = np.zeros([height * ymaps + 1 + padding // 2, width * xmaps + 1 + padding // 2, 3], dtype=np.uint8) k = 0 for y in range(ymaps): for x in range(xmaps): if k >= nmaps: break h, h_width = y * height + 1 + padding // 2, height - padding w, w_width = x * width + 1 + padding // 2, width - padding grid[h:h+h_width, w:w+w_width] = tensor[k] k = k + 1 return grid def save_image(tensor, filename, nrow=8, padding=2, normalize=False, scale_each=False): ndarr = make_grid(tensor, nrow=nrow, padding=padding, normalize=normalize, scale_each=scale_each) im = Image.fromarray(ndarr) im.save(filename) ================================================ FILE: causal_controller/ArrayDict.py ================================================ import numpy as np class ArrayDict(object): ''' This is a class for manipulating dictionaries of arrays or dictionaries of scalars. I find this comes up pretty often when dealing with tensorflow, because you can pass dictionaries to feed_dict and get dictionaries back. If you use a smaller batch_size, you then want to "concatenate" these outputs for each key. 
''' def __init__(self): self.dict={} def __len__(self): if len(self.dict)==0: return 0 else: return len(self.dict.values()[0]) def __repr__(self): return repr(self.dict) def keys(self): return self.dict.keys() def items(self): return self.dict.items() def validate_dict(self,a_dict): #Check keys for key,val in self.dict.items(): if not key in a_dict.keys(): raise ValueError('key:',key,'was not in a_dict.keys()') for key,val in a_dict.items(): #Check same keys if not key in self.dict.keys(): raise ValueError('argument key:',key,'was not in self.dict') if isinstance(val,np.ndarray): #print('ndarray') my_val=self.dict[key] if not np.all(val.shape[1:]==my_val.shape[1:]): raise ValueError('key:',key,'value shape',val.shape,'does\ not match existing shape',my_val.shape) else: #scalar a_val=np.array([[val]])#[1,1]shape array my_val=self.dict[key] if not np.all(my_val.shape[1:]==a_val.shape[1:]): raise ValueError('key:',key,'value shape',val.shape,'does\ not match existing shape',my_val.shape) def arr_dict(self,a_dict): if isinstance(a_dict.values()[0],np.ndarray): return a_dict else: return {k:np.array([[v]]) for k,v in a_dict.items()} def concat(self,a_dict): if self.dict=={}: self.dict=self.arr_dict(a_dict)#store interally as array else: self.validate_dict(a_dict) self.dict={k:np.vstack([v,a_dict[k]]) for k,v in self.items()} def __getitem__(self,at): return {k:v[at] for k,v in self.items()} #debug, run tests if __name__=='__main__': out1=ArrayDict() d1={'Male':np.ones((3,1)),'Young':2*np.ones((3,1))} d2={'Male':3,'Young':33} d3={'Male':4*np.ones((4,1)),'Young':4*np.ones((4,1))} out1.concat(d1) out1.concat(d2) out2=ArrayDict() out2.concat(d2) out2.concat(d1) out2.concat(d3) ================================================ FILE: causal_controller/CausalController.py ================================================ from __future__ import print_function from itertools import chain import numpy as np import tensorflow as tf import pandas as pd import os slim = 
tf.contrib.slim from models import lrelu,DiscriminatorW,Grad_Penalty from utils import summary_stats,did_succeed from ArrayDict import ArrayDict#Collector of outputs debug=False class CausalController(object): model_type='controller' summs=['cc_summaries'] def summary_scalar(self,name,ten): tf.summary.scalar(name,ten,collections=self.summs) def summary_stats(self,name,ten,hist=False): summary_stats(name,ten,collections=self.summs,hist=hist) def load(self,sess,path): ''' sess is a tf.Session object path is the path of the file you want to load, (not the directory) Example ./checkpoint/somemodel/saved/model.ckpt-3000 (leave off the extensions) ''' if not hasattr(self,'saver'):#should have one now self.saver=tf.train.Saver(var_list=self.var) print('Attempting to load model:',path) self.saver.restore(sess,path) def __init__(self,batch_size,config): ''' Args: config : This carries all the arguments defined in causal_controller/config.py with it. It also defines config.graph, which is a nested list that specifies the graph batch_size: This is separate from config because it is actually a tf.placeholder so that batch_size can be set during sess.run, but also synchronized between the models. A causal graph (config.graph) is specified as follows: just supply a list of pairs (node, node_parents) Example: A->B<-C; D->E [ ['A',[]], ['B',['A','C']], ['C',[]], ['D',[]], ['E',['D']] ] I use a list right now instead of a dict because I don't think dict.keys() are guaranteed to be returned in a particular order. TODO:A good improvement would be to use collections.OrderedDict #old #Pass indep_causal=True to use Unif[0,1] labels #input_dict allows the model to take in some arbitrary input instead #of using tf_random_uniform nodes #pass reuse if constructing for a second time Access nodes either with: model.cc.node_dict['Male'] or with: model.cc.Male Other models such as began/dcgan are intended to be built more than once (for example on 2 gpus), but causal_controller is just built once.
''' self.config=config self.batch_size=batch_size #tf.placeholder_with_default self.graph=config.graph print('causal graph size:',len(self.graph)) self.node_names, self.parent_names=zip(*self.graph) self.node_names=list(self.node_names) self.label_names=self.node_names #set nodeclass attributes if debug: print('Using ',self.config.cc_n_layers,'between each causal node') CausalNode.n_layers=self.config.cc_n_layers CausalNode.n_hidden=self.config.cc_n_hidden CausalNode.batch_size=self.batch_size with tf.variable_scope('causal_controller') as vs: self.step=tf.Variable(0, name='step', trainable=False) self.inc_step=tf.assign(self.step,self.step+1) self.nodes=[CausalNode(name=n,config=config) for n in self.node_names] for node,rents in zip(self.nodes,self.parent_names): node.parents=[n for n in self.nodes if n.name in rents] ##construct graph## #Lazy construction avoids the pain of traversing the causal graph explicitly #python recursion error if the graph is not a DAG for node in self.nodes: node.setup_tensor() self.labels=tf.concat(self.list_labels(),-1) self.fake_labels=self.labels self.fake_labels_logits= tf.concat( self.list_label_logits(),-1 ) self.label_dict={n.name:n.label for n in self.nodes} self.node_dict={n.name:n for n in self.nodes} self.z_dict={n.name:n.z for n in self.nodes} #enable access directly. Little dangerous #Please don't have any nodes named "batch_size" for example self.__dict__.update(self.node_dict) #dcc variables are not saved, so if you reload in the middle of a #pretrain, that might be a quirk. 
I don't find it makes much of a #difference though self.var = tf.contrib.framework.get_variables(vs) trainable=tf.get_collection('trainable_variables') self.train_var=[v for v in self.var if v in trainable] #wont save dcc var self.saver=tf.train.Saver(var_list=self.var) self.model_dir=os.path.join(self.config.model_dir,self.model_type) self.save_model_dir=os.path.join(self.model_dir,'checkpoints') self.save_model_name=os.path.join(self.save_model_dir,'CC-Model') if not os.path.exists(self.model_dir): os.mkdir(self.model_dir) if not os.path.exists(self.save_model_dir): os.mkdir(self.save_model_dir) def build_pretrain(self,label_loader): ''' This is not called if for example using an existing model label_loader is a queue of only labels that moves quickly because no images ''' config=self.config #Pretraining setup self.DCC=DiscriminatorW #if self.config.pt_factorized: #self.DCC=FactorizedNetwork(self.graph,self.DCC,self.config) #reasonable alternative with equal performance if self.config.pt_factorized:#Each node owns a dcc print('CC is factorized!') for node in self.nodes: node.setup_pretrain(config,label_loader,self.DCC) with tf.control_dependencies([self.inc_step]): self.c_optim=tf.group(*[n.c_optim for n in self.nodes]) self.dcc_optim=tf.group(*[n.dcc_optim for n in self.nodes]) self.train_op=tf.group(self.c_optim,self.dcc_optim) self.c_loss=tf.reduce_sum([n.c_loss for n in self.nodes]) self.dcc_loss=tf.reduce_sum([n.dcc_loss for n in self.nodes]) self.summary_stats('total_c_loss',self.c_loss) self.summary_stats('total_dcc_loss',self.dcc_loss) #default. else:#Not factorized. 
CC owns dcc print('setting up pretrain:','CausalController') real_inputs=tf.concat([label_loader[n] for n in self.node_names],axis=1) fake_inputs=self.labels n_hidden=self.config.critic_hidden_size real_prob,self.dcc_real_logit,self._dcc_var=self.DCC(real_inputs,self.batch_size,n_hidden,self.config) fake_prob,self.dcc_fake_logit,_=self.DCC(fake_inputs,self.batch_size,n_hidden,self.config,reuse=True) grad_cost,self.dcc_slopes=Grad_Penalty(real_inputs,fake_inputs,self.DCC,self.config) self.dcc_diff = self.dcc_fake_logit - self.dcc_real_logit self.dcc_gan_loss=tf.reduce_mean(self.dcc_diff) self.dcc_grad_loss=grad_cost self.dcc_loss=self.dcc_gan_loss+self.dcc_grad_loss# self.c_loss=-tf.reduce_mean(self.dcc_fake_logit)# optimizer = tf.train.AdamOptimizer self.c_optimizer, self.dcc_optimizer = optimizer(config.pt_cc_lr),optimizer(config.pt_dcc_lr) with tf.control_dependencies([self.inc_step]): self.c_optim=self.c_optimizer.minimize(self.c_loss,var_list=self.train_var) self.dcc_optim=self.dcc_optimizer.minimize(self.dcc_loss,var_list=self.dcc_var) self.train_op=tf.group(self.c_optim,self.dcc_optim) self.summary_stats('total_c_loss',self.c_loss) self.summary_stats('total_dcc_loss',self.dcc_loss) for node in self.nodes: with tf.name_scope(node.name): #TODO:replace with summary_stats self.summary_stats(node.name+'_fake',node.label,hist=True) self.summary_stats(node.name+'_real',label_loader[node.name],hist=True) self.summaries=tf.get_collection(self.summs[0]) print('causalcontroller has',len(self.summaries),'summaries') self.summary_op=tf.summary.merge(self.summaries) @property def dcc_var(self): if self.config.is_pretrain: if self.config.pt_factorized: return list(chain.from_iterable([n.dcc_var for n in self.nodes])) else: return self._dcc_var else: return [] def critic_update(self,sess): fetch_dict = {"critic_op":self.dcc_optim } for i in range(self.config.n_critic): result = sess.run(fetch_dict) def __len__(self): return len(self.node_dict) def list_placeholders(self): 
return [n.z for n in self.nodes] def list_labels(self): return [n.label for n in self.nodes] def list_label_logits(self): return [n.label_logit for n in self.nodes] def do2feed(self,do_dict): ''' used internally to convert a dictionary to a feed_dict ''' feed_dict={} for key,value in do_dict.items(): feed_dict[self.label_dict[key]]=value return feed_dict def sample_label(self, sess, cond_dict=None,do_dict=None,N=None,verbose=False): ''' This is a method to sample conditional and interventional distributions over labels. This is disconnected from interventions/conditioning that include the image because it is potentially faster. (images are not generated for rejected samples). The intent is to pass these labels to the image generator. This is low level. One experiment type(N times) per function call. The values of the dictionaries should be scalars. It is assumed that label_dict is always the fetch. Conditioning and intervening may be combined. ''' do_dict= do_dict or {} cond_dict= cond_dict or {} fetch_dict=self.label_dict #boolean scalars are all that is allowed for v in cond_dict.values(): assert(v==0 or v==1) for v in do_dict.values(): assert(v==0 or v==1) arr_do_dict={k:v*np.ones([N,1]) for k,v in do_dict.items()} feed_dict = self.do2feed(arr_do_dict)#{tensor:array} feed_dict.update({self.batch_size:N}) if verbose: print('feed_dict',feed_dict) print('fetch_dict',fetch_dict) #No conditioning loop needed if not cond_dict: return sess.run(fetch_dict, feed_dict) else:#cond_dict not None rows=np.arange(N)#what idx do we need #init max_fail=4000 n_fails=0 outputs=ArrayDict() iter_rows=np.arange(N) n_remaining=N ii=0 while( n_remaining > 0 ): ii+=1 #Run N samples out=sess.run(fetch_dict, feed_dict) bool_pass = did_succeed(out,cond_dict) pass_idx=iter_rows[bool_pass] pass_idx=pass_idx[:n_remaining] pass_dict={k:v[pass_idx] for k,v in out.items()} outputs.concat(pass_dict) n_remaining=N-len(outputs) # :( if ii>max_fail: print('WARNING: for cond_dict:',cond_dict,) print('could not condition
in ',max_fail*N, 'samples') break else: if verbose: print('for cond_dict:',cond_dict,) print('conditioning finished normally with ',ii,'tries') return outputs.dict class CausalNode(object): ''' A CausalNode sets up a small neural network: z_noise+[,other causes] -> label_logit Everything is defined in terms of @property to allow tensorflow graph to be lazily generated as called because I don't enforce that a node's parent tf graph is constructed already during class.setup_tensor Uniform[-1,1] + other causes passes through n_layers fully connected layers. ''' train = True name=None #logit is going to be 1 dim with sigmoid #as opposed to 2 dim with softmax _label_logit=None _label=None parents=[]#list of CausalNodes n_layers=3 n_hidden=10 batch_size=-1#Must be set by cc summs=['cc_summaries'] def summary_scalar(self,name,ten): tf.summary.scalar(name,ten,collections=self.summs) def summary_stats(self,name,ten,hist=False): summary_stats(name,ten,collections=self.summs,hist=hist) def __init__(self,name,config): self.name=name self.config=config if self.batch_size==-1: raise Exception('class attribute CausalNode.batch_size must be set') with tf.variable_scope(self.name) as vs: #I think config.seed would have to be passed explicitly here self.z=tf.random_uniform((self.batch_size,self.n_hidden),minval=-1.0,maxval=1.0) self.init_var = tf.contrib.framework.get_variables(vs) self.setup_var=[]#empty until setup_tensor runs def setup_tensor(self): if self._label is not None:#already setup if debug: #Notify that already setup (normal behavior) print('self.',self.name,' has refuted setting up tensor') return tf_parents=[self.z]+[node.label for node in self.parents] with tf.variable_scope(self.name) as vs: h=tf.concat(tf_parents,-1)#tensor of parent values for l in range(self.n_layers-1): h=slim.fully_connected(h,self.n_hidden,activation_fn=lrelu,scope='layer'+str(l)) self._label_logit = slim.fully_connected(h,1,activation_fn=None,scope='proj') self._label=tf.nn.sigmoid(
self._label_logit ) if debug: print('self.',self.name,' has setup _label=',self._label) #There could actually be some (quiet) error here I think if one of the #names in the causal graph is a substring of some other name. #e.g. 'hair' and 'black_hair' #Sorry, not coded to anticipate corner case self.setup_var=tf.contrib.framework.get_variables(vs) @property def var(self): if len(self.setup_var)==0: print('WARN: node var was accessed before it was constructed') return self.init_var+self.setup_var @property def train_var(self): trainable=tf.get_collection('trainable_variables') return [v for v in self.var if v in trainable] @property def label_logit(self): #Less stable. Better to access labels #for input to another model if self._label_logit is not None: return self._label_logit else: self.setup_tensor() return self._label_logit @property def label(self): if self._label is not None: return self._label else: self.setup_tensor() return self._label def setup_pretrain(self,config,label_loader,DCC): ''' This function is not functional because this only happens if cc_config.pt_factorized=True. In this case convergence of each node is treated like its own gan conditioned on the parent nodes labels. I couldn't bring myself to delete it, but it's not needed to get good convergence for the models we tested. 
''' print('setting up pretrain:',self.name) with tf.variable_scope(self.name,reuse=self.reuse) as vs: self.config=config n_hidden=self.config.critic_hidden_size parent_names=[p.name for p in self.parents] real_inputs=tf.concat([label_loader[n] for n in parent_names]+[label_loader[self.name]],axis=1) fake_inputs=tf.concat([p.label for p in self.parents]+[self.label],axis=1) real_prob,self.dcc_real_logit,self.dcc_var=DCC(real_inputs,self.batch_size,n_hidden,self.config) fake_prob,self.dcc_fake_logit,_=DCC(fake_inputs,self.batch_size,n_hidden,self.config,reuse=True) grad_cost,self.dcc_slopes=Grad_Penalty(real_inputs,fake_inputs,DCC,self.config) self.dcc_diff = self.dcc_fake_logit - self.dcc_real_logit self.dcc_gan_loss=tf.reduce_mean(self.dcc_diff) self.dcc_grad_loss=grad_cost self.dcc_loss=self.dcc_gan_loss+self.dcc_grad_loss# self.c_loss=-tf.reduce_mean(self.dcc_fake_logit)# self.summary_scalar('dcc_gan_loss',self.dcc_gan_loss) self.summary_scalar('dcc_grad_loss',self.dcc_grad_loss) self.summary_stats('dcc_slopes',self.dcc_slopes,hist=True) if config.optimizer == 'adam': optimizer = tf.train.AdamOptimizer else: raise Exception("[!] Caution! Optimizer untested {}. 
Only tested Adam".format(config.optimizer)) self.c_optimizer, self.dcc_optimizer = optimizer(config.pt_cc_lr),optimizer(config.pt_dcc_lr) self.c_optim=self.c_optimizer.minimize(self.c_loss,var_list=self.train_var) self.dcc_optim=self.dcc_optimizer.minimize(self.dcc_loss,var_list=self.dcc_var) self.summary_stats('c_loss',self.c_loss) self.summary_stats('dcc_loss',self.dcc_loss) self.summary_stats('dcc_real_logit',self.dcc_real_logit,hist=True) self.summary_stats('dcc_fake_logit',self.dcc_fake_logit,hist=True) ================================================ FILE: causal_controller/__init__.py ================================================ ================================================ FILE: causal_controller/config.py ================================================ ''' These are the command line parameters that pertain exclusively to the CausalController. ''' from __future__ import print_function import argparse def str2bool(v): #return (v is True) or (v.lower() in ('true', '1')) return v is True or v.lower() in ('true', '1') arg_lists = [] parser = argparse.ArgumentParser() def add_argument_group(name): arg = parser.add_argument_group(name) arg_lists.append(arg) return arg #Pretrain network pretrain_arg=add_argument_group('Pretrain') pretrain_arg.add_argument('--pt_load_path', type=str, default='') pretrain_arg.add_argument('--is_pretrain',type=str2bool,default=False, help='to do pretraining') #pretrain_arg.add_argument('--only_pretrain', action='store_true', # help='simply complete pretrain and exit') #Used to be an option, but now is solved #pretrain_arg.add_argument('--pretrain_type',type=str,default='wasserstein',choices=['wasserstein','gan']) pretrain_arg.add_argument('--pt_cc_lr',type=float,default=0.00008,# help='learning rate for causal controller') pretrain_arg.add_argument('--pt_dcc_lr',type=float,default=0.00008,# help='learning rate for the critic (dcc) of the causal controller') pretrain_arg.add_argument('--lambda_W',type=float,default=0.1,# help='penalty for gradient of W
critic')
pretrain_arg.add_argument('--n_critic',type=int,default=20,#5 for speed
                          help='number of critic iterations between gen update')
pretrain_arg.add_argument('--critic_layers',type=int,default=6,#4 usual.8 might help
                          help='number of layers in the Wasserstein discriminator')
pretrain_arg.add_argument('--critic_hidden_size',type=int,default=15,#10,15
                          help='hidden_size for critic of discriminator')
pretrain_arg.add_argument('--min_tvd',type=float,default=0.02,
                          help='if tvdN to do stuff for example:
        if self.config.pretrain_LabelerR and counter < self.config.pretrain_LabelerR_no_of_iters:
            sess.run(self.d_label_optim)

        else:
            if np.mod(counter, 3) == 0:
                sess.run(self.g_optim)
                sess.run([self.train_op,self.k_t_update,self.inc_step])#all ops
            else:
                sess.run([self.g_optim, self.k_t_update ,self.inc_step])
                sess.run(self.g_optim)

================================================
FILE: causal_dcgan/__init__.py
================================================

================================================
FILE: causal_dcgan/config.py
================================================
from __future__ import print_function
import argparse

def str2bool(v):
    return v is True or v.lower() in ('true', '1')

arg_lists = []
parser = argparse.ArgumentParser()

def add_argument_group(name):
    arg = parser.add_argument_group(name)
    arg_lists.append(arg)
    return arg

# Data
data_arg = add_argument_group('Data')
data_arg.add_argument('--batch_size', type=int, default=64,
                      help='''default batch_size when using this model and not
                      specifying the batch_size elsewhere''')
data_arg.add_argument('--label_specific_noise',type=str2bool,default=False,
                      help='whether to add noise dependent on the data mean')

#This flag doesn't function.
#Model is designed to take in CC.labels
data_arg.add_argument('--fakeLabels_distribution',type=str,choices=['real_joint','iid_uniform'],default='real_joint')
data_arg.add_argument('--label_type',type=str,choices=['discrete','continuous'],default='continuous')
data_arg.add_argument('--round_fake_labels',type=str2bool,default=True,
                      help='''whether to round the outputs of causal controller
                      before (possibly) adding noise to them or using them
                      as input to the image generator. I highly recommend
                      this as a small improvement.''')
data_arg.add_argument('--type_input_to_generator',type=str,choices=['labels','logits'],
                      default='logits',help='''Whether to send labels or logits to the generator
                      to form images. Chris recommends labels''')

#Network
net_arg = add_argument_group('Network')
#TODO need help strings
net_arg.add_argument('--df_dim',type=int, default=64 )
net_arg.add_argument('--gf_dim',type=int, default=64,
                     help='''output dimensions [gf_dim,gf_dim] for generator''')
net_arg.add_argument('--c_dim',type=int, default=3,
                     help='''number of color channels. I wouldn't really change
                     this from 3''')
net_arg.add_argument('--z_dim',type=int,default=100,
                     help='''the number of dimensions for the noise input that
                     will be concatenated with labels and fed to the image
                     generator''')
net_arg.add_argument('--loss_function',type=int,default=1,
                     help='''which loss function to choose. See CausalGAN.py''')
net_arg.add_argument('--critic_hidden_size',type=int,default=10,
                     help='''number of neurons per fc layer in discriminator''')
net_arg.add_argument('--reconstr_loss',type=str2bool,default=False,
                     help='''whether to include g_loss_on_z in the generator loss.
                     This was True by default until recently, which is why
                     there are a lot of unnecessary networks''')
net_arg.add_argument('--stab_proj',type=str2bool,default=False,
                     help='''stabilizing projection method used for discriminator.
                     Stabilizing GAN Training with Multiple Random Projections
                     https://arxiv.org/abs/1705.07831''')
net_arg.add_argument('--n_stab_proj',type=int,default=256,
                     help='''number of stabilizing projections. Need
                     stab_proj=True for this to have effect''')

# Training / test parameters
train_arg = add_argument_group('Training')
train_arg.add_argument('--num_iter',type=int,default=100000,
                       help='the number of training iterations to run the model for')
train_arg.add_argument('--learning_rate',type=float,default=0.0002,
                       help='Learning rate for adam [0.0002]')
train_arg.add_argument('--beta1',type=float,default=0.5,
                       help='Momentum term of adam [0.5]')
train_arg.add_argument('--off_label_losses',type=str2bool,default=False)

#TODO unclear on default for these two arguments
#Not yet set up. Use False
train_arg.add_argument('--pretrain_LabelerR',type=str2bool,default=False)

#counters over epochs preferred
#train_arg.add_argument('--pretrain_LabelerR_no_of_epochs',type=int,default=5)
train_arg.add_argument('--pretrain_LabelerR_no_of_iters',type=int,default=15000)

#TODO: add help strings describing params
train_arg.add_argument('--lambda_m',type=float,default=0.05,)#0.05
train_arg.add_argument('--lambda_k',type=float,default=0.05,)#0.05
train_arg.add_argument('--lambda_l',type=float,default=0.001,)#0.005
train_arg.add_argument('--gamma_m',type=float,default=-1.0,)# NOT USED!
train_arg.add_argument('--gamma_k',type=float,default=-1.0,#0.8#FLAGS.gamma_k not used
                       help='''default initial value''')
train_arg.add_argument('--gamma_l',type=float,default=-1.0, )

train_arg.add_argument('--tau',type=float,default=3000,
                       help='''time constant.
                       Every tau calls of k_t_update will
                       reduce k_t by a factor of 1/e.''')

#old config file differed from implementation:
#    FLAGS.gamma_k = -1.0
#    FLAGS.gamma_m = -1.0 # set to 1/gamma_k in the code
#    FLAGS.gamma_l = -1.0 # made more extreme
#    FLAGS.lambda_k = 0.05
#    FLAGS.lambda_m = 0.05
#    FLAGS.lambda_l = 0.001

# Misc
misc_arg = add_argument_group('Misc')
misc_arg.add_argument('--is_train',type=str2bool,default=False,
                      help='''whether to enter the image training loop''')
misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN'])
misc_arg.add_argument('--log_dir', type=str, default='logs')
misc_arg.add_argument('--log_step', type=int, default=100,
                      help='''how often to log stuff. Sample images are created
                      every 10*log_step''')

##REFERENCE
#    elif model_ID == 44:
#        FLAGS.is_train = True
#        #FLAGS.graph = "big_causal_graph"
#        FLAGS.graph = "complete_big_causal_graph"
#        FLAGS.loss_function = 1
#        FLAGS.pretrain_LabelerR = False
#        FLAGS.pretrain_LabelerR_no_of_epochs = 3
#        FLAGS.fakeLabels_distribution = "real_joint"
#        FLAGS.gamma_k = -1.0
#        FLAGS.gamma_m = -1.0 # set to 1/gamma_k in the code
#        FLAGS.gamma_l = -1.0 # made more extreme
#        FLAGS.lambda_k = 0.05
#        FLAGS.lambda_m = 0.05
#        FLAGS.lambda_l = 0.001
#        FLAGS.label_type = 'continuous'
#        return FLAGS

def get_config():
    config, unparsed = parser.parse_known_args()
    print('Loaded ./causal_dcgan/config.py')
    return config, unparsed

if __name__=='__main__':
    #for debug of config
    config, unparsed = get_config()

================================================
FILE: causal_dcgan/models.py
================================================
import tensorflow as tf
import numpy as np
slim = tf.contrib.slim
import math

from ops import lrelu,linear,conv_cond_concat,batch_norm,add_minibatch_features
from ops import conv2d,deconv2d

def conv_out_size_same(size, stride):
    return int(math.ceil(float(size) / float(stride)))

def GeneratorCNN( z, config, reuse=None):
    '''
    maps z to 64x64 images with values in
    [-1,1]
    uses batch normalization internally
    '''
    #trying to get around batch_size like this:
    batch_size=tf.shape(z)[0]
    #batch_size=tf.placeholder_with_default(64,[],'bs')

    with tf.variable_scope("generator",reuse=reuse) as vs:
        g_bn0 = batch_norm(name='g_bn0')
        g_bn1 = batch_norm(name='g_bn1')
        g_bn2 = batch_norm(name='g_bn2')
        g_bn3 = batch_norm(name='g_bn3')

        s_h, s_w = config.gf_dim, config.gf_dim#64,64
        s_h2, s_w2 = conv_out_size_same(s_h, 2), conv_out_size_same(s_w, 2)
        s_h4, s_w4 = conv_out_size_same(s_h2, 2), conv_out_size_same(s_w2, 2)
        s_h8, s_w8 = conv_out_size_same(s_h4, 2), conv_out_size_same(s_w4, 2)
        s_h16, s_w16 = conv_out_size_same(s_h8, 2), conv_out_size_same(s_w8, 2)

        # project `z` and reshape
        z_, self_h0_w, self_h0_b = linear(
            z, config.gf_dim*8*s_h16*s_w16, 'g_h0_lin', with_w=True)

        self_h0 = tf.reshape(
            z_, [-1, s_h16, s_w16, config.gf_dim * 8])
        h0 = tf.nn.relu(g_bn0(self_h0))

        h1, h1_w, h1_b = deconv2d(
            h0, [batch_size, s_h8, s_w8, config.gf_dim*4], name='g_h1', with_w=True)
        h1 = tf.nn.relu(g_bn1(h1))

        h2, h2_w, h2_b = deconv2d(
            h1, [batch_size, s_h4, s_w4, config.gf_dim*2], name='g_h2', with_w=True)
        h2 = tf.nn.relu(g_bn2(h2))

        h3, h3_w, h3_b = deconv2d(
            h2, [batch_size, s_h2, s_w2, config.gf_dim*1], name='g_h3', with_w=True)
        h3 = tf.nn.relu(g_bn3(h3))

        h4, h4_w, h4_b = deconv2d(
            h3, [batch_size, s_h, s_w, config.c_dim], name='g_h4', with_w=True)
        out=tf.nn.tanh(h4)

    variables = tf.contrib.framework.get_variables(vs)
    return out, variables

def DiscriminatorCNN(image, config, reuse=None):
    '''
    Discriminator for GAN model.
    image      : batch_size x 64x64x3 image
    config     : see causal_dcgan/config.py
    reuse      : pass True if not calling for first time

    returns: probabilities(real)
           : logits(real)
           : first layer activation used to estimate z from
           : variables list
    '''
    with tf.variable_scope("discriminator",reuse=reuse) as vs:
        d_bn1 = batch_norm(name='d_bn1')
        d_bn2 = batch_norm(name='d_bn2')
        d_bn3 = batch_norm(name='d_bn3')

        if not config.stab_proj:
            h0 = lrelu(conv2d(image, config.df_dim, name='d_h0_conv'))#16,32,32,64

        else:#method to restrict disc from winning
            #I think this is equivalent to just not letting disc optimize first layer
            #and also removing nonlinearity

            #k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,

            #paper used 8x8 kernel, but I'm using 5x5 because it is more similar to my architecture
            #n_projs=config.df_dim#64 instead of 32 in paper
            n_projs=config.n_stab_proj#config default is 256; paper used 32

            print("WARNING:STAB_PROJ active, using ",n_projs," projections")

            w_proj = tf.get_variable('w_proj', [5, 5, image.get_shape()[-1],n_projs],
                initializer=tf.truncated_normal_initializer(stddev=0.02),trainable=False)
            conv = tf.nn.conv2d(image, w_proj, strides=[1, 2, 2, 1], padding='SAME')

            b_proj = tf.get_variable('b_proj', [n_projs],#does nothing
                initializer=tf.constant_initializer(0.0),trainable=False)
            h0=tf.nn.bias_add(conv,b_proj)

        h1_ = lrelu(d_bn1(conv2d(h0, config.df_dim*2, name='d_h1_conv')))#16,16,16,128

        h1 = add_minibatch_features(h1_, config.df_dim)
        h2 = lrelu(d_bn2(conv2d(h1, config.df_dim*4, name='d_h2_conv')))#16,8,8,256
        h3 = lrelu(d_bn3(conv2d(h2, config.df_dim*8, name='d_h3_conv')))

        #print('h3shape: ',h3.get_shape().as_list())
        #print('8df_dim:',config.df_dim*8)
        #dim3=tf.reduce_prod(tf.shape(h3)[1:])
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        h4 = linear(h3_flat, 1, 'd_h3_lin')

        prob=tf.nn.sigmoid(h4)

    variables = tf.contrib.framework.get_variables(vs,collection=tf.GraphKeys.TRAINABLE_VARIABLES)

    return prob, h4, h1_, variables

def discriminator_labeler(image, output_dim,
                          config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2, name='dl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, output_dim, 'dl_h3_Label')
        D_labels = tf.nn.sigmoid(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels, D_labels_logits, variables

def discriminator_gen_labeler(image, output_dim, config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_gen_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dgl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2, name='dgl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dgl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dgl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, output_dim, 'dgl_h3_Label')
        D_labels = tf.nn.sigmoid(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels, D_labels_logits,variables

def discriminator_on_z(image, config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_z_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dzl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2,
                                      name='dzl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dzl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dzl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, config.z_dim, 'dzl_h3_Label')
        D_labels = tf.nn.tanh(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels,variables

================================================
FILE: causal_dcgan/ops.py
================================================
import math
import numpy as np
import tensorflow as tf

from tensorflow.python.framework import ops

from utils import *

class batch_norm(object):
    def __init__(self, epsilon=1e-5, momentum = 0.9, name="batch_norm"):
        with tf.variable_scope(name):
            self.epsilon = epsilon
            self.momentum = momentum
            self.name = name

    def __call__(self, x, train=True):
        return tf.contrib.layers.batch_norm(x,
                                            decay=self.momentum,
                                            updates_collections=None,
                                            epsilon=self.epsilon,
                                            scale=True,
                                            is_training=train,
                                            scope=self.name)

def conv_cond_concat(x, y):
    """Concatenate conditioning vector on feature map axis."""
    #print('input x:',x.get_shape().as_list())
    #print('input y:',y.get_shape().as_list())
    xshape=x.get_shape()
    #tile by [1,64,64,1]

    tile_shape=tf.stack([1,xshape[1],xshape[2],1])
    tile_y=tf.tile(y,tile_shape)

    #print('tile y:',tile_y.get_shape().as_list())

    return tf.concat([x,tile_y],axis=3)

    #x_shapes = x.get_shape()
    #y_shapes = y.get_shape()
    #return tf.concat([
    #x, y*tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3)

def conv2d(input_, output_dim,
           k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
           name="conv2d"):
    with tf.variable_scope(name):
        w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                            initializer=tf.truncated_normal_initializer(stddev=stddev))
        conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME')

        biases = tf.get_variable('biases', [output_dim], initializer=tf.constant_initializer(0.0))
        #conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape())
        conv=tf.nn.bias_add(conv,biases)

        return conv

def deconv2d(input_, output_shape,
             k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
             name="deconv2d", with_w=False):
    with tf.variable_scope(name):
        # filter : [height, width, output_channels, in_channels]
        w = tf.get_variable('w', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]],
                            initializer=tf.random_normal_initializer(stddev=stddev))

        tf_output_shape=tf.stack(output_shape)
        deconv = tf.nn.conv2d_transpose(input_, w, output_shape=tf_output_shape,
                                        strides=[1, d_h, d_w, 1])

        biases = tf.get_variable('biases', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        #deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape())
        deconv = tf.reshape(tf.nn.bias_add(deconv, biases), tf_output_shape)

        if with_w:
            return deconv, w, biases
        else:
            return deconv

def lrelu(x,leak=0.2,name='lrelu'):
    with tf.variable_scope(name):
        f1=0.5 * (1+leak)
        f2=0.5 * (1-leak)
        return f1*x + f2*tf.abs(x)

#This takes more memory than above
#def lrelu(x, leak=0.2, name="lrelu"):
#    return tf.maximum(x, leak*x)

def linear(input_, output_size, scope=None, stddev=0.02, bias_start=0.0, with_w=False):
    shape = input_.get_shape().as_list()

    #mat_shape=tf.stack([tf.shape(input_)[1],output_size])
    mat_shape=[shape[1],output_size]

    with tf.variable_scope(scope or "Linear"):
        #matrix = tf.get_variable("Matrix", [shape[1], output_size], tf.float32,
        matrix = tf.get_variable("Matrix", mat_shape, tf.float32,
                                 tf.random_normal_initializer(stddev=stddev))
        bias = tf.get_variable("bias", [output_size],
                               initializer=tf.constant_initializer(bias_start))
        if with_w:
            return tf.matmul(input_, matrix) + bias, matrix, bias
        else:
            return tf.matmul(input_, matrix) + bias

#minibatch method that improves on openai
#because it doesn't fix batchsize:
#TODO: recheck when not sleepy
def add_minibatch_features(image,df_dim):
    shape = image.get_shape().as_list()
    dim = np.prod(shape[1:])            # dim = prod(9,2) = 18
    h_mb0 =
lrelu(conv2d(image, df_dim, name='d_mb0_conv'))
    h_mb1 = conv2d(h_mb0, df_dim, name='d_mbh1_conv')

    dims=h_mb1.get_shape().as_list()
    conv_dims=np.prod(dims[1:])

    image_ = tf.reshape(h_mb1, tf.stack([-1, conv_dims]))
    #image_ = tf.reshape(h_mb1, tf.stack([batch_size, -1]))

    n_kernels = 300
    dim_per_kernel = 50
    x = linear(image_, n_kernels * dim_per_kernel,'d_mbLinear')
    act = tf.reshape(x, (-1, n_kernels, dim_per_kernel))

    act_tp=tf.transpose(act, [1,2,0])
    #bs x n_ker x dim_ker x bs -> bs x n_ker x bs :
    abs_dif = tf.reduce_sum(tf.abs(tf.expand_dims(act, 3) - tf.expand_dims(act_tp, 0)), 2)
    eye=tf.expand_dims( tf.eye( tf.shape(abs_dif)[0] ), 1)#bs x 1 x bs
    masked=tf.exp(-abs_dif) - eye
    f1=tf.reduce_mean( masked, 2)
    mb_features = tf.reshape(f1, [-1, 1, 1, n_kernels])
    return conv_cond_concat(image, mb_features)

## following is from https://github.com/openai/improved-gan/blob/master/imagenet/discriminator.py#L88
#def add_minibatch_features(image,df_dim,batch_size):
#    shape = image.get_shape().as_list()
#    dim = np.prod(shape[1:])            # dim = prod(9,2) = 18
#    h_mb0 = lrelu(conv2d(image, df_dim, name='d_mb0_conv'))
#    h_mb1 = conv2d(h_mb0, df_dim, name='d_mbh1_conv')
#
#    dims=h_mb1.get_shape().as_list()
#    conv_dims=np.prod(dims[1:])
#
#    image_ = tf.reshape(h_mb1, tf.stack([-1, conv_dims]))
#    #image_ = tf.reshape(h_mb1, tf.stack([batch_size, -1]))
#
#    n_kernels = 300
#    dim_per_kernel = 50
#    x = linear(image_, n_kernels * dim_per_kernel,'d_mbLinear')
#    activation = tf.reshape(x, (batch_size, n_kernels, dim_per_kernel))
#    big = np.zeros((batch_size, batch_size), dtype='float32')
#    big += np.eye(batch_size)
#    big = tf.expand_dims(big, 1)
#    abs_dif = tf.reduce_sum(tf.abs(tf.expand_dims(activation, 3) - tf.expand_dims(tf.transpose(activation, [1, 2, 0]), 0)), 2)
#    mask = 1.
- big
#    masked = tf.exp(-abs_dif) * mask
#    f1 = tf.reduce_sum(masked, 2) / tf.reduce_sum(mask)
#    mb_features = tf.reshape(f1, [batch_size, 1, 1, n_kernels])
#    return conv_cond_concat(image, mb_features)

================================================
FILE: causal_dcgan/utils.py
================================================
"""
Some code from https://github.com/Newmu/dcgan_code
"""
from __future__ import division
import math
import json
import random
import pprint
import scipy.misc
import numpy as np
from time import gmtime, strftime
from six.moves import xrange
import os

pp = pprint.PrettyPrinter()

get_stddev = lambda x, k_h, k_w: 1/math.sqrt(k_w*k_h*x.get_shape()[-1])

def get_image(image_path, input_height, input_width,
              resize_height=64, resize_width=64,
              is_crop=True, is_grayscale=False):
    image = imread(image_path, is_grayscale)
    return transform(image, input_height, input_width,
                     resize_height, resize_width, is_crop)

def save_images(images, size, image_path):
    return imsave(inverse_transform(images), size, image_path)

def imread(path, is_grayscale = False):
    if (is_grayscale):
        return scipy.misc.imread(path, flatten = True).astype(np.float)
    else:
        return scipy.misc.imread(path).astype(np.float)

def merge_images(images, size):
    return inverse_transform(images)

def merge(images, size):
    h, w = images.shape[1], images.shape[2]
    img = np.zeros((h * size[0], w * size[1], 3))
    for idx, image in enumerate(images):
        i = idx % size[1]
        j = idx // size[1]
        img[j*h:j*h+h, i*w:i*w+w, :] = image
    return img

def imsave(images, size, path):
    return scipy.misc.imsave(path, merge(images, size))

def center_crop(x, crop_h, crop_w,
                resize_h=64, resize_w=64):
    if crop_w is None:
        crop_w = crop_h
    h, w = x.shape[:2]
    j = int(round((h - crop_h)/2.))
    i = int(round((w - crop_w)/2.))
    return scipy.misc.imresize(
        x[j:j+crop_h, i:i+crop_w], [resize_h, resize_w])

def transform(image, input_height, input_width,
              resize_height=64, resize_width=64, is_crop=True):
    if is_crop:
        cropped_image = center_crop(
            image,
            input_height, input_width,
            resize_height, resize_width)
    else:
        cropped_image = scipy.misc.imresize(image, [resize_height, resize_width])
    return np.array(cropped_image)/127.5 - 1.

def inverse_transform(images):
    return (images+1.)/2.

def to_json(output_path, *layers):
    with open(output_path, "w") as layer_f:
        lines = ""
        for w, b, bn in layers:
            layer_idx = w.name.split('/')[0].split('h')[1]

            B = b.eval()

            if "lin/" in w.name:
                W = w.eval()
                depth = W.shape[1]
            else:
                W = np.rollaxis(w.eval(), 2, 0)
                depth = W.shape[0]

            biases = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(B)]}
            if bn is not None:
                gamma = bn.gamma.eval()
                beta = bn.beta.eval()

                gamma = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(gamma)]}
                beta = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(beta)]}
            else:
                gamma = {"sy": 1, "sx": 1, "depth": 0, "w": []}
                beta = {"sy": 1, "sx": 1, "depth": 0, "w": []}

            if "lin/" in w.name:
                fs = []
                for w in W.T:
                    fs.append({"sy": 1, "sx": 1, "depth": W.shape[0], "w": ['%.2f' % elem for elem in list(w)]})

                lines += """
                    var layer_%s = {
                        "layer_type": "fc",
                        "sy": 1, "sx": 1,
                        "out_sx": 1, "out_sy": 1,
                        "stride": 1, "pad": 0,
                        "out_depth": %s, "in_depth": %s,
                        "biases": %s,
                        "gamma": %s,
                        "beta": %s,
                        "filters": %s
                    };""" % (layer_idx.split('_')[0], W.shape[1], W.shape[0], biases, gamma, beta, fs)
            else:
                fs = []
                for w_ in W:
                    fs.append({"sy": 5, "sx": 5, "depth": W.shape[3], "w": ['%.2f' % elem for elem in list(w_.flatten())]})

                lines += """
                    var layer_%s = {
                        "layer_type": "deconv",
                        "sy": 5, "sx": 5,
                        "out_sx": %s, "out_sy": %s,
                        "stride": 2, "pad": 1,
                        "out_depth": %s, "in_depth": %s,
                        "biases": %s,
                        "gamma": %s,
                        "beta": %s,
                        "filters": %s
                    };""" % (layer_idx, 2**(int(layer_idx)+2), 2**(int(layer_idx)+2),
                             W.shape[0], W.shape[3], biases, gamma, beta, fs)
        layer_f.write(" ".join(lines.replace("'","").split()))

def make_gif(images, fname, duration=2, true_image=False):
    import moviepy.editor as mpy

    def make_frame(t):
        try:
            x =
images[int(len(images)/duration*t)]
        except IndexError:
            #t at the end of the clip indexes one past the last frame
            x = images[-1]

        if true_image:
            return x.astype(np.uint8)
        else:
            return ((x+1)/2*255).astype(np.uint8)

    clip = mpy.VideoClip(make_frame, duration=duration)
    clip.write_gif(fname, fps = len(images) / duration)

================================================
FILE: causal_graph.py
================================================
'''
To use a particular causal graph, just specify it here

Strings specified have to match *exactly* to keys in attribute text file

A graph lists each node and its parents in pairs

A->B, C->D, D->B:
    [['A',[]],
     ['B',['A','D']],
     ['C',[]],
     ['D',['C']]]

'''

#A reminder of what labels are available
#Make sure to use case-sensitive correct spelling
all_nodes=[
    ['5_o_Clock_Shadow',[]],
    ['Arched_Eyebrows',[]],
    ['Attractive',[]],
    ['Bags_Under_Eyes',[]],
    ['Bald',[]],
    ['Bangs',[]],
    ['Big_Lips',[]],
    ['Big_Nose',[]],
    ['Black_Hair',[]],
    ['Blond_Hair',[]],
    ['Blurry',[]],
    ['Brown_Hair',[]],
    ['Bushy_Eyebrows',[]],
    ['Chubby',[]],
    ['Double_Chin',[]],
    ['Eyeglasses',[]],
    ['Goatee',[]],
    ['Gray_Hair',[]],
    ['Heavy_Makeup',[]],
    ['High_Cheekbones',[]],
    ['Male',[]],
    ['Mouth_Slightly_Open',[]],
    ['Mustache',[]],
    ['Narrow_Eyes',[]],
    ['No_Beard',[]],
    ['Oval_Face',[]],
    ['Pale_Skin',[]],
    ['Pointy_Nose',[]],
    ['Receding_Hairline',[]],
    ['Rosy_Cheeks',[]],
    ['Sideburns',[]],
    ['Smiling',[]],
    ['Straight_Hair',[]],
    ['Wavy_Hair',[]],
    ['Wearing_Earrings',[]],
    ['Wearing_Hat',[]],
    ['Wearing_Lipstick',[]],
    ['Wearing_Necklace',[]],
    ['Wearing_Necktie',[]],
    ['Young',[]]
    ]

causal_graphs={
#'complete_all':[
#    ['Young',[]],
#    ['Male',['Young']],
#    ['Eyeglasses',['Male','Young']],
#    ['Bald', ['Male','Young','Eyeglasses']],
#    ['Mustache', ['Male','Young','Eyeglasses','Bald']],
#    ['Smiling', ['Male','Young','Eyeglasses','Bald','Mustache']],
#    ['Wearing_Lipstick',['Male','Young','Eyeglasses','Bald','Mustache','Smiling']],
#    ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick']],
#
['Narrow_Eyes',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['5_o_Clock_Shadow',['Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Arched_Eyebrows',['5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Attractive',['Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Bags_Under_Eyes',['Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Bangs',['Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Big_Lips',['Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Big_Nose',['Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Black_Hair',['Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Blond_Hair',['Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # 
['Blurry',['Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Brown_Hair',['Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Bushy_Eyebrows',['Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Chubby',['Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ['Double_Chin',['Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Goatee',['Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Gray_Hair',['Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Heavy_Makeup',['Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['High_Cheekbones',['Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Mouth_Slightly_Open',['High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Mustache',['Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Narrow_Eyes',['Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['No_Beard',['Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Oval_Face',['No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Pale_Skin',['Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Pointy_Nose',['Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Receding_Hairline',['Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Rosy_Cheeks',['Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Sideburns',['Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Smiling',['Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Straight_Hair',['Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Wavy_Hair',['Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Wearing_Earrings',['Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Wearing_Hat',['Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Wearing_Lipstick',['Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], 
#['Wearing_Necklace',['Wearing_Lipstick','Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], #['Wearing_Necktie',['Wearing_Necklace','Wearing_Lipstick','Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], # ], 'subset1_nodes':[ ['Bald',[]], # ['Blurry',[]], # ['Brown_Hair',[]], # ['Bushy_Eyebrows',[]], # ['Chubby',[]], ['Double_Chin',[]], # ['Eyeglasses',[]], # ['Goatee',[]], # ['Gray_Hair',[]], ['Male',[]], ['Mustache',[]], ['No_Beard',[]], ['Smiling',[]], # ['Straight_Hair',[]], # ['Wavy_Hair',[]], ['Wearing_Earrings',[]], # ['Wearing_Hat',[]], ['Wearing_Lipstick',[]], ['Young',[]] ], 'standard_graph':[ ['Male' , [] ], ['Young' , [] ], ['Smiling', ['Male','Young']] ], 'male_causes_beard':[ ['Male',[]], ['No_Beard',['Male']], ], 'male_causes_mustache':[ ['Male',[]], ['Mustache',['Male']], ], 'mustache_causes_male':[ ['Male',['Mustache']], ['Mustache',[]], ], 'young_causes_gray':[ ['Young',[]], ['Gray_Hair',['Young']], ], 'gray_causes_young':[ 
['Young',['Gray_Hair']], ['Gray_Hair',[]], ], 'young_ind_gray':[ ['Young',[]], ['Gray_Hair',[]], ], 'small_causal_graph':[ ['Young',[]], ['Male',[]], ['Mustache', ['Male','Young']], ['Smiling', ['Male','Young']], ['Wearing_Lipstick',['Male','Young']], ['Mouth_Slightly_Open',['Male','Young','Smiling']], ['Narrow_Eyes', ['Male','Young','Smiling']], ], 'big_causal_graph':[ ['Young',[]], ['Male',[]], ['Eyeglasses',['Young']], ['Bald', ['Male','Young']], ['Mustache', ['Male','Young']], ['Smiling', ['Male','Young']], ['Wearing_Lipstick',['Male','Young']], ['Mouth_Slightly_Open',['Young','Smiling']], ['Narrow_Eyes', ['Male','Young','Smiling']], ], 'complete_big_causal_graph':[ ['Young',[]], ['Male',['Young']], ['Eyeglasses',['Male','Young']], ['Bald', ['Male','Young','Eyeglasses']], ['Mustache', ['Male','Young','Eyeglasses','Bald']], ['Smiling', ['Male','Young','Eyeglasses','Bald','Mustache']], ['Wearing_Lipstick',['Male','Young','Eyeglasses','Bald','Mustache','Smiling']], ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick']], ['Narrow_Eyes',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']], ], 'reverse_complete_big_causal_graph':[ ['Narrow_Eyes', []], ['Mouth_Slightly_Open',['Narrow_Eyes']], ['Wearing_Lipstick', ['Narrow_Eyes','Mouth_Slightly_Open']], ['Smiling', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick']], ['Mustache', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling']], ['Bald', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache']], ['Eyeglasses', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald']], ['Male', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald','Eyeglasses']], ['Young', ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald','Eyeglasses','Male']], ], 'indep_big_causal_graph':[ ['Young',[]], ['Male',[]], 
['Eyeglasses',[]], ['Bald', []], ['Mustache', []], ['Smiling', []], ['Wearing_Lipstick',[]], ['Mouth_Slightly_Open',[]], ['Narrow_Eyes', []], ], 'complete_minimal_graph':[ ['Young',[]], ['Male',['Young']], ['Mustache', ['Male','Young']], ['Wearing_Lipstick',['Male','Young','Mustache']], ['Smiling', ['Male','Young','Mustache','Wearing_Lipstick']], ], 'male_ind_mustache ': [ ['Male',[]], ['Mustache',[]] ], 'Smiling_MSO ': [ ['Smiling',[]], ['Mouth_Slightly_Open',['Smiling']] ], 'Male_Young_Eyeglasses':[ ['Male',[]], ['Young',[]], ['Eyeglasses',['Male','Young']] ], 'MYESO':[ ['Male',[]], ['Young',['Male']], ['Eyeglasses',['Male','Young']], ['Smiling',['Male','Young','Eyeglasses']], ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Smiling']], ], 'mustache':[ ['Mustache',[]] ], 'male_ind_mustache ': [ ['Male',[]], ['Mustache',[]] ], 'male_smiling_lipstick':[ ['Male' , []], ['Wearing_Lipstick' , ['Male']], ['Smiling', ['Male']] ], 'SLM':[ ['Smiling' , []], ['Wearing_Lipstick' , ['Smiling']], ['Male', ['Smiling','Wearing_Lipstick']] ], 'MLS':[ ['Male' , []], ['Wearing_Lipstick' , ['Male']], ['Smiling', ['Male','Wearing_Lipstick']] ], 'M':[ ['Male',[]] ], 'Smiling_MSO ': [ ['Smiling',[]], ['Mouth_Slightly_Open',['Smiling']] ], 'MYESO':[ ['Male',[]], ['Young',['Male']], ['Eyeglasses',['Male','Young']], ['Smiling',['Male','Young','Eyeglasses']], ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Smiling']], ], 'MSO_smiling ': [ ['Smiling',['Mouth_Slightly_Open']], ['Mouth_Slightly_Open',[]] ], 'Male_Young_Eyeglasses ': [ ['Male',[]], ['Young',[]], ['Eyeglasses',['Male','Young']] ], 'Male_Young_Eyeglasses_complete ': [ ['Male',[]], ['Young',['Male']], ['Eyeglasses',['Male','Young']] ], 'male_mustache_lipstick':[ ['Male' , []], ['Mustache', ['Male']], ['Wearing_Lipstick' , ['Male','Mustache']] ] } def get_causal_graph(causal_model=None,*args,**kwargs): #define complete_all list_nodes,_=zip(*all_nodes) complete_all=[] so_far=[] for node in list_nodes: 
complete_all.append([node,so_far[:]]) so_far.append(node) causal_graphs['complete_all']=complete_all if not causal_model in causal_graphs.keys(): raise ValueError('the specified graph:',causal_model,' was not one of\ those listed in ',__file__) else: return causal_graphs[causal_model] ================================================ FILE: config.py ================================================ from __future__ import print_function import argparse def str2bool(v): #return (v is True) or (v.lower() in ('true', '1')) return v is True or v.lower() in ('true', '1') arg_lists = [] parser = argparse.ArgumentParser() def add_argument_group(name): arg = parser.add_argument_group(name) arg_lists.append(arg) return arg # Data data_arg = add_argument_group('Data') #data_arg.add_argument('--batch_size', type=int, default=16)#default set elsewhere data_arg.add_argument('--causal_model', type=str, help='''Matches the argument with a key in ./causal_graph.py and sets the graph attribute of cc_config to be a list of lists defining the causal graph''') data_arg.add_argument('--data_dir', type=str, default='data') data_arg.add_argument('--dataset', type=str, default='celebA') data_arg.add_argument('--do_shuffle', type=str2bool, default=True)#never used data_arg.add_argument('--input_scale_size', type=int, default=64, help='input image will be resized with the given value as width and height') data_arg.add_argument('--is_crop', type=str2bool, default='True') data_arg.add_argument('--grayscale', type=str2bool, default=False)#never used data_arg.add_argument('--split', type=str, default='train')#never used data_arg.add_argument('--num_worker', type=int, default=24, help='number of threads to use for loading and preprocessing data') data_arg.add_argument('--resize_method',type=str,default='AREA',choices=['AREA','BILINEAR','BICUBIC','NEAREST_NEIGHBOR'], help='''methods to resize image to 64x64. AREA seems to work best, possibly some scipy methods could work better. 
It wasn't clear to me why the results should be so different''') # Training / test parameters train_arg = add_argument_group('Training') train_arg.add_argument('--build_train', type=str2bool, default=False, help='''You may want to build all the components for training, without doing any training right away. This is for that. This arg is effectively True when is_train=True''') train_arg.add_argument('--build_pretrain', type=str2bool, default=False, help='''You may want to build all the components for pretraining, without doing any pretraining right away. This is for that. This arg is effectively True when is_pretrain=True''') train_arg.add_argument('--model_type',type=str,default='',choices=['dcgan','began'], help='''Which model to use. If the argument is not passed, only causal_controller is built. This overrides is_train=True, since there is no image model to train''') train_arg.add_argument('--use_gpu', type=str2bool, default=True) train_arg.add_argument('--num_gpu', type=int, default=1, help='specify 0 for cpu. If k specified, will default to\ first k of n detected. If use_gpu=True but num_gpu not\ specified will default to 1') # Misc misc_arg = add_argument_group('Misc') #misc_arg.add_argument('--build_all', type=str2bool, default=False, # help='''normally specifying is_pretrain=False will cause # the pretraining components not to be built and likewise # with is_train=False only the pretrain component will # (possibly) be built. This is here as a debug helper to # enable building out the whole model without doing any # training''') misc_arg.add_argument('--descrip', type=str, default='',help=''' Only use this when creating a new model. New model folder names are generated automatically by using the time-date. Then you can't rename them while the model is running. If provided, this is a short string that appends to the end of a model folder name to help keep track of what the contents of that folder were without getting into the content of that folder. 
No weird characters''') misc_arg.add_argument('--dry_run', action='store_true',help='''Build and load the model and all the specified components, but don't actually do any pretraining/training etc. This overrides --is_pretrain, --is_train. This is mostly used for just bringing the model into the workspace if, say, you wanted to manipulate it in ipython''') misc_arg.add_argument('--load_path', type=str, default='', help='''This is a "global" load path. You can simply pass the model_dir of any previous run, and all the variables (both dcgan/began and causal_controller) will be loaded. If you want to just load one component: for example, the pretrained part of a previous model, use pt_load_path from the causal_controller.config section''') misc_arg.add_argument('--log_step', type=int, default=100, help='''This is used for generic summaries that are common to both models. Use the model-specific config files for logging done within train_step''') #misc_arg.add_argument('--save_step', type=int, default=5000) misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN']) misc_arg.add_argument('--log_dir', type=str, default='logs', help='''Where to store the model and model results. 
Do not put a leading "./" out front''') #misc_arg.add_argument('--sample_per_image', type=int, default=64, # help='# of sample per image during test sample generation') misc_arg.add_argument('--seed', type=int, default=22,help= '''Not working right now: TF seed should be fixed to make sure exogenous noise for each causal node is fixed also''') #Doesn't do anything atm #misc_arg.add_argument('--visualize', action='store_true') def gpu_logic(config): #consistency between use_gpu and num_gpu if config.num_gpu>0: config.use_gpu=True else: config.use_gpu=False # if config.use_gpu and config.num_gpu==0: # config.num_gpu=1 return config def get_config(): config, unparsed = parser.parse_known_args() config=gpu_logic(config) config.num_devices=max(1,config.num_gpu)#that are used in backprop #Just for BEGAN: ##this has to respect gpu/cpu ##data_format = 'NCHW' #if config.use_gpu: # data_format = 'NCHW' #else: # data_format = 'NHWC' #setattr(config, 'data_format', data_format) print('Loaded ./config.py') return config, unparsed if __name__=='__main__': #for debug of config config, unparsed = get_config() ================================================ FILE: data_loader.py ================================================ import os import numpy as np import pandas as pd from PIL import Image from glob import glob import tensorflow as tf from IPython.core import debugger debug = debugger.Pdb().set_trace def logodds(p): return np.log(p/(1.-p)) class DataLoader(object): '''This loads the image and the labels through a tensorflow queue. 
All of the labels are loaded regardless of what is specified in graph, because this model is gpu throttled anyway so there shouldn't be any overhead For multiple gpu, the strategy here is to have 1 queue with 2xbatch_size then use tf.split within trainer.train() ''' def __init__(self,label_names,config): self.label_names=label_names self.config=config self.scale_size=config.input_scale_size #self.data_format=config.data_format self.split=config.split self.do_shuffle=config.do_shuffle self.num_worker=config.num_worker self.is_crop=config.is_crop self.is_grayscale=config.grayscale attr_file= glob("{}/*.{}".format(config.data_path, 'txt'))[0] setattr(config,'attr_file',attr_file) attributes = pd.read_csv(config.attr_file,delim_whitespace=True) #+-1 #Store all labels for reference self.all_attr= 0.5*(attributes+1)# attributes is {0,1} self.all_label_means=self.all_attr.mean() #but only return desired labels in queues self.attr=self.all_attr[label_names] self.label_means=self.attr.mean()# attributes is 0,1 self.image_dir=os.path.join(config.data_path,'images') self.filenames=[os.path.join(self.image_dir,j) for j in self.attr.index] self.num_examples_per_epoch=len(self.filenames) self.min_fraction_of_examples_in_queue=0.001#go faster during debug #self.min_fraction_of_examples_in_queue=0.01 self.min_queue_examples=int(self.num_examples_per_epoch*self.min_fraction_of_examples_in_queue) def get_label_queue(self,batch_size): tf_labels = tf.convert_to_tensor(self.attr.values, dtype=tf.uint8)#0,1 with tf.name_scope('label_queue'): uint_label=tf.train.slice_input_producer([tf_labels])[0] label=tf.to_float(uint_label) #All labels, not just those in causal_model dict_data={sl:tl for sl,tl in zip(self.label_names,tf.split(label,len(self.label_names)))} num_preprocess_threads = max(self.num_worker-3,1) data_batch = tf.train.shuffle_batch( dict_data, batch_size=batch_size, num_threads=num_preprocess_threads, capacity=self.min_queue_examples + 3 * batch_size, 
min_after_dequeue=self.min_queue_examples, ) return data_batch def get_data_queue(self,batch_size): image_files = tf.convert_to_tensor(self.filenames, dtype=tf.string) tf_labels = tf.convert_to_tensor(self.attr.values, dtype=tf.uint8) with tf.name_scope('filename_queue'): #must be list str_queue=tf.train.slice_input_producer([image_files,tf_labels]) img_filename, uint_label= str_queue img_contents=tf.read_file(img_filename) image = tf.image.decode_jpeg(img_contents, channels=3) image=tf.cast(image,dtype=tf.float32) if self.config.is_crop:#use dcgan cropping #dcgan center-crops input to 108x108, outputs 64x64 #centrally crops it #We emulate that here image=tf.image.resize_image_with_crop_or_pad(image,108,108) #image=tf.image.resize_bilinear(image,[scale_size,scale_size])#must be 4D resize_method=getattr(tf.image.ResizeMethod,self.config.resize_method) image=tf.image.resize_images(image,[self.scale_size,self.scale_size], method=resize_method) #Some dataset enlargement. Might as well. image=tf.image.random_flip_left_right(image) ##carpedm-began crops to 128x128 starting at (50,25), then resizes to 64x64 #image=tf.image.crop_to_bounding_box(image, 50, 25, 128, 128) #image=tf.image.resize_nearest_neighbor(image, [scale_size, scale_size]) tf.summary.image('real_image',tf.expand_dims(image,0)) label=tf.to_float(uint_label) #Creates a dictionary {'Male',male_tensor, 'Young',young_tensor} etc.. dict_data={sl:tl for sl,tl in zip(self.label_names,tf.split(label,len(self.label_names)))} assert not 'x' in dict_data.keys()#don't have a label named "x" dict_data['x']=image print ('Filling queue with %d Celeb images before starting to train. 
' 'I don\'t know how long this will take' % self.min_queue_examples) num_preprocess_threads = max(self.num_worker,1) data_batch = tf.train.shuffle_batch( dict_data, batch_size=batch_size, num_threads=num_preprocess_threads, capacity=self.min_queue_examples + 3 * batch_size, min_after_dequeue=self.min_queue_examples, ) return data_batch ================================================ FILE: download.py ================================================ """ Modification of https://github.com/carpedm20/BEGAN-tensorflow/blob/master/download.py """ from __future__ import print_function import os import zipfile import requests import subprocess from tqdm import tqdm from collections import OrderedDict def download_file_from_google_drive(id, destination): URL = "https://docs.google.com/uc?export=download" session = requests.Session() response = session.get(URL, params={ 'id': id }, stream=True) token = get_confirm_token(response) if token: params = { 'id' : id, 'confirm' : token } response = session.get(URL, params=params, stream=True) save_response_content(response, destination) def get_confirm_token(response): for key, value in response.cookies.items(): if key.startswith('download_warning'): return value return None def save_response_content(response, destination, chunk_size=32*1024): total_size = int(response.headers.get('content-length', 0)) with open(destination, "wb") as f: for chunk in tqdm(response.iter_content(chunk_size), total=total_size, unit='B', unit_scale=True, desc=destination): if chunk: # filter out keep-alive new chunks f.write(chunk) def unzip(filepath): print("Extracting: " + filepath) base_path = os.path.dirname(filepath) with zipfile.ZipFile(filepath) as zf: zf.extractall(base_path) os.remove(filepath) def download_celeb_a(base_path): data_path = os.path.join(base_path, 'celebA') images_path = os.path.join(data_path, 'images') if os.path.exists(data_path): print('[!] 
Found celeb-A - skip') return filename, drive_id = "img_align_celeba.zip", "0B7EVK8r0v71pZjFTYXZWM3FlRnM" save_path = os.path.join(base_path, filename) if os.path.exists(save_path): print('[*] {} already exists'.format(save_path)) else: download_file_from_google_drive(drive_id, save_path) zip_dir = '' with zipfile.ZipFile(save_path) as zf: zip_dir = zf.namelist()[0] zf.extractall(base_path) if not os.path.exists(data_path): os.mkdir(data_path) os.rename(os.path.join(base_path, "img_align_celeba"), images_path) os.remove(save_path) download_attr_file(data_path) def download_attr_file(data_path): attr_gdID='0B7EVK8r0v71pblRyaVFSWGxPY0U' attr_fname=os.path.join(data_path,'list_attr_celeba.txt') download_file_from_google_drive(attr_gdID, attr_fname) delete_top_line(attr_fname)#make pandas readable #Top line was just an integer saying how many samples there were def prepare_data_dir(path = './data'): if not os.path.exists(path): os.mkdir(path) # check, if file exists, make link def check_link(in_dir, basename, out_dir): in_file = os.path.join(in_dir, basename) if os.path.exists(in_file): link_file = os.path.join(out_dir, basename) rel_link = os.path.relpath(in_file, out_dir) os.symlink(rel_link, link_file) def add_splits(base_path): data_path = os.path.join(base_path, 'celebA') images_path = os.path.join(data_path, 'images') train_dir = os.path.join(data_path, 'splits', 'train') valid_dir = os.path.join(data_path, 'splits', 'valid') test_dir = os.path.join(data_path, 'splits', 'test') if not os.path.exists(train_dir): os.makedirs(train_dir) if not os.path.exists(valid_dir): os.makedirs(valid_dir) if not os.path.exists(test_dir): os.makedirs(test_dir) # these constants based on the standard celebA splits NUM_EXAMPLES = 202599 TRAIN_STOP = 162770 VALID_STOP = 182637 for i in range(0, TRAIN_STOP): basename = "{:06d}.jpg".format(i+1) check_link(images_path, basename, train_dir) for i in range(TRAIN_STOP, VALID_STOP): basename = "{:06d}.jpg".format(i+1) 
check_link(images_path, basename, valid_dir) for i in range(VALID_STOP, NUM_EXAMPLES): basename = "{:06d}.jpg".format(i+1) check_link(images_path, basename, test_dir) def delete_top_line(txt_fname): lines=open(txt_fname,'r').readlines() open(txt_fname,'w').writelines(lines[1:]) if __name__ == '__main__': base_path = './data' prepare_data_dir() download_celeb_a(base_path) add_splits(base_path) ================================================ FILE: figure_scripts/__init__.py ================================================ ================================================ FILE: figure_scripts/distributions.py ================================================ import tensorflow as tf import numpy as np import os import scipy.misc import numpy as np import pandas as pd from tqdm import trange,tqdm import pandas as pd from itertools import combinations, product import sys from utils import save_figure_images,make_sample_dir,guess_model_step from sample import get_joint,sample def get_pdf(model, do_dict=None,cond_dict=None,name='',N=6400,return_discrete=True,step=''): str_step=str(step) or guess_model_step(model) joint=get_joint(model,int_do_dict=do_dict,int_cond_dict=cond_dict,N=N,return_discrete=return_discrete) sample_dir=make_sample_dir(model) if name: name+='_' f_pdf=os.path.join(sample_dir,str_step+name+'dist'+'.csv') pdf=pd.DataFrame.from_dict({k:val.mean() for k,val in joint.items()}) #print 'get pdf cond_dict:',cond_dict if not do_dict and not cond_dict: data=model.attr.mean() pdf['data']=data if not do_dict and cond_dict: bool_cond=np.logical_and.reduce([model.attr[k]==v for k,v in cond_dict.items()]) attr=model.attr[bool_cond] pdf['data']=attr.mean() print 'Writing to file',f_pdf pdf.to_csv(f_pdf) return pdf TINY=1e-6 def get_interv_table(model,intrv=True): n_batches=25 table_outputs=[] d_vals=np.linspace(TINY,0.6,n_batches) for name in model.cc.node_names: outputs=[] for d_val in d_vals: do_dict={model.cc.node_dict[name].label_logit : 
d_val*np.ones((model.batch_size,1))} outputs.append(model.sess.run(model.fake_labels,do_dict)) out=np.vstack(outputs) table_outputs.append(out) table=np.stack(table_outputs,axis=2) np.mean(np.round(table),axis=0) return table #dT=pd.DataFrame(index=p_names, data=T, columns=do_names) #T=np.mean(np.round(table),axis=0) #table=get_interv_table(model) def record_interventional(model,step=''): ''' designed for truncated exponential noise. For each node that could be intervened on, sample interventions from the continuous distribution that discrete intervention corresponds to. Collect the joint and output to a csv file ''' make_sample_dir(model) str_step=str(step) if str_step=='': if hasattr(model,'step'): str_step=str( model.sess.run(model.step) )+'_' m=20 do =lambda val: np.linspace(0,val*0.8,m) for name in model.cc.node_names: for int_val,intv in enumerate([do(-1), do(+1)]): do_dict={name:intv} joint=get_joint(model, do_dict=None, N=5,return_discrete=True,step='') lab_df=pd.DataFrame(data=joint['g_fake_label']) dfl_df=pd.DataFrame(data=joint['d_fake_label']) lab_fname=str_step+str(name)+str(int_val)+'.csv' dfl_fname=str_step+str(name)+str(int_val)+'.csv' lab_df.to_csv(lab_fname) dfl_df.to_csv(dfl_fname) #with open(dfl_xtab_fn,'w') as dlf_f, open(lab_xtab_fn,'w') as lab_f: ================================================ FILE: figure_scripts/encode.py ================================================ #from __future__ import print_function import tensorflow as tf #import scipy import scipy.misc import numpy as np from tqdm import trange import os import pandas as pd from itertools import combinations import sys from Causal_controller import * from began.models import GeneratorCNN, DiscriminatorCNN from utils import to_nhwc,read_prepared_uint8_image,make_encode_dir from utils import transform, inverse_transform #dcgan img norm from utils import norm_img, denorm_img #began norm image def var_like_z(z_ten,name): z_dim=z_ten.get_shape().as_list()[-1] return 
tf.get_variable(name,shape=(1,z_dim)) def noise_like_z(z_ten,name): z_dim=z_ten.get_shape().as_list()[-1] noise=tf.random_uniform([1,z_dim],minval=-1.,maxval=1.,) return noise class Encoder: ''' This is a class where you pass a model, and an image file and it creates more tensorflow variables, along with surrounding saving and summary functionality for encoding that image back into the hidden space using gradient descent ''' model_name = "Encode.model" model_type= 'encoder' summ_col='encoder_summaries' def __init__(self,model,image,image_name=None,max_tr_steps=50000,load_path=''): ''' image is assumed to be a path to a precropped 64x64x3 uint8 image ''' #Some hardcoded defaults here self.log_step=500 self.lr=0.0005 self.max_tr_steps=max_tr_steps self.model=model self.load_path=load_path self.image_name=image_name or os.path.basename(image).replace('.','_') self.encode_dir=make_encode_dir(model,self.image_name) self.model_dir=self.encode_dir#different from self.model.model_dir self.save_dir=os.path.join(self.model_dir,'save') self.sess=self.model.sess#session should already be in progress if model.model_type =='dcgan': self.data_format='NHWC'#Don't change elif model.model_type == 'began': self.data_format=model.data_format#'NCHW' if gpu else: raise Exception('Should not happen. 
model_type=',model.model_type) #Notation: #self.uint_x/G ; 3D [0,255] #self.x/G ; 4D [-1,1] self.uint_x=read_prepared_uint8_image(image)#x is [0,255] print('Read image shape',self.uint_x.shape) self.x=norm_img(np.expand_dims(self.uint_x,0),self.data_format)#bs=1 #self.x=norm_img(tf.expand_dims(self.uint_x,0),self.data_format)#bs=1 print('Shape after norm:',self.x.get_shape().as_list()) ##All variables created under encoder have uniform init vs=tf.variable_scope('encoder', initializer=tf.random_uniform_initializer(minval=-1.,maxval=1.), dtype=tf.float32) with vs as scope: #avoid creating adams params optimizer = tf.train.GradientDescentOptimizer #optimizer = tf.train.AdamOptimizer self.g_optimizer = optimizer(self.lr) encode_var={n.name:var_like_z(n.z,n.name) for n in model.cc.nodes} encode_var['gen']=var_like_z(model.z_gen,'gen') print 'encode variables created' self.train_var = tf.contrib.framework.get_variables(scope) self.step=tf.Variable(0,name='step') self.var = tf.contrib.framework.get_variables(scope) #all encode vars created by now self.saver = tf.train.Saver(var_list=self.var) print('Summaries will be written to ',self.model_dir) self.summary_writer = tf.summary.FileWriter(self.model_dir) #load or initialize enmodel variables self.init() if model.model_type =='dcgan': self.cc=CausalController(graph=model.graph, input_dict=encode_var, reuse=True) self.fake_labels_logits= tf.concat( self.cc.list_label_logits(),-1 ) self.z_fake_labels=self.fake_labels_logits #self.z_gen = noise_like_z( self.model.z_gen,'en_z_gen') self.z_gen=encode_var['gen'] self.z= tf.concat( [self.z_gen, self.z_fake_labels], axis=1 , name='z') self.G=model.generator( self.z , bs=1, reuse=True) elif model.model_type == 'began': with tf.variable_scope('tower'):#reproduce variable scope self.cc=CausalController(graph=model.graph, input_dict=encode_var, reuse=True) self.fake_labels= tf.concat( self.cc.list_labels(),-1 ) self.fake_labels_logits= tf.concat( self.cc.list_label_logits(),-1 ) 
#self.z_gen = noise_like_z( self.model.z_gen,'en_z_gen') self.z_gen=encode_var['gen'] self.z= tf.concat( [self.fake_labels, self.z_gen],axis=-1,name='z') self.G,_ = GeneratorCNN( self.z, model.conv_hidden_num, model.channel, model.repeat_num, model.data_format,reuse=True) d_out, self.D_zG, self.D_var = DiscriminatorCNN( self.G, model.channel, model.z_num, model.repeat_num, model.conv_hidden_num, model.data_format,reuse=True) _ , self.D_zX, _ = DiscriminatorCNN( self.x, model.channel, model.z_num, model.repeat_num, model.conv_hidden_num, model.data_format,reuse=True) self.norm_AE_G=d_out #AE_G, AE_x = tf.split(d_out, 2) self.AE_G=denorm_img(self.norm_AE_G, model.data_format) self.aeg_sum=tf.summary.image('encoder/AE_G',self.AE_G) node_summaries=[] for node in self.cc.nodes: with tf.name_scope(node.name): ave_label=tf.reduce_mean(node.label) node_summaries.append(tf.summary.scalar('ave',ave_label)) #unclear how scope with adam param works #with tf.variable_scope('encoderGD') as scope: #use L1 loss #self.g_loss_image = tf.reduce_mean(tf.abs(self.x - self.G)) #use L2 loss #self.g_loss_image = tf.reduce_mean(tf.square(self.x - self.G)) #use autoencoder reconstruction loss #3.1.1 series #self.g_loss_image = tf.reduce_mean(tf.abs(self.x - self.norm_AE_G)) #use L1 in autoencoded space# 3.2 self.g_loss_image = tf.reduce_mean(tf.abs(self.D_zX - self.D_zG)) g_loss_sum=tf.summary.scalar( 'encoder/g_loss_image',\ self.g_loss_image,self.summ_col) self.g_loss= self.g_loss_image self.train_op=self.g_optimizer.minimize(self.g_loss, var_list=self.train_var,global_step=self.step) self.uint_G=tf.squeeze(denorm_img( self.G ,self.data_format))#3D[0,255] gimg_sum=tf.summary.image( 'encoder/Reconstruct',tf.stack([self.uint_x,self.uint_G]),\ max_outputs=2,collections=self.summ_col) #self.summary_op=tf.summary.merge_all(self.summ_col) #self.summary_op=tf.summary.merge_all(self.summ_col) if model.model_type=='dcgan': self.summary_op=tf.summary.merge([g_loss_sum,gimg_sum]+node_summaries) elif 
model.model_type=='began': self.summary_op=tf.summary.merge([g_loss_sum,gimg_sum,self.aeg_sum]+node_summaries) #print 'encoder summaries:',self.summ_col #print 'encoder summaries:',tf.get_collection(self.summ_col) def init(self): if self.load_path: print 'Attempting to load directly from path:', print self.load_path self.saver.restore(self.sess,self.load_path) else: print 'New ENCODE Model..init new Z parameters' init=tf.variables_initializer(var_list=self.var) print 'Initializing following variables:' for v in self.var: print v.name, v.get_shape().as_list() self.model.sess.run(init) def save(self, step=None): if step is None: step=self.sess.run(self.step) if not os.path.exists(self.save_dir): print 'Creating Directory:',self.save_dir os.makedirs(self.save_dir) savefile=os.path.join(self.save_dir,self.model_name) print 'Saving file:',savefile self.saver.save(self.model.sess,savefile,global_step=step) def train(self, n_step=None): max_step=n_step or self.max_tr_steps if False:#debug print 'a' self.sess.run(self.train_op) print 'b' self.sess.run(self.summary_op) print 'c' self.sess.run(self.g_loss) print 'd' print 'max_step;',max_step for counter in trange(max_step): fetch_dict = { "train_op": self.train_op, } if counter%self.log_step==0: fetch_dict.update({ "summary": self.summary_op, "g_loss": self.g_loss, "global_step":self.step }) result = self.sess.run(fetch_dict) if counter % self.log_step == 0: g_loss=result['g_loss'] step=result['global_step'] self.summary_writer.add_summary(result['summary'],step) self.summary_writer.flush() print("[{}/{}] Reconstr Loss_G: {:.6f}".format(counter,max_step,g_loss)) if counter % (10.*self.log_step) == 0: self.save(step=step) self.save() ##Just for reference## #def load(self, checkpoint_dir): # print(" [*] Reading checkpoints...") # checkpoint_dir = os.path.join(checkpoint_dir, self.model_dir) # ckpt = tf.train.get_checkpoint_state(checkpoint_dir) # if ckpt and ckpt.model_checkpoint_path: # ckpt_name = 
os.path.basename(ckpt.model_checkpoint_path) # self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name)) # print(" [*] Success to read {}".format(ckpt_name)) # return True # else: # print(" [*] Failed to find a checkpoint") # return False #def norm_img(image, data_format=None): # image = image/127.5 - 1. # if data_format: # image = to_nhwc(image, data_format) # return image #def transform: # stuff # return np.array(cropped_image)/127.5 - 1. #def denorm_img(norm, data_format): # return tf.clip_by_value(to_nhwc((norm + 1)*127.5, data_format), 0, 255) #def inverse_transform(images): # return (images+1.)/2. #if model.model_name=='began': # fake_labels=model.fake_labels # D_fake_labels=model.D_fake_labels # #result_dir=os.path.join('began',model.model_dir) # result_dir=model.model_dir # if str_step=='': # str_step=str( model.sess.run(model.step) )+'_' # attr=model.attr[list(model.cc.node_names)] #elif model.model_name=='dcgan': # fake_labels=model.fake_labels # D_fake_labels=model.D_labels_for_fake # result_dir=model.checkpoint_dir # attr=0.5*(model.attributes+1) # attr=attr[list(model.cc.names)] ================================================ FILE: figure_scripts/high_level.py ================================================ import tensorflow as tf import numpy as np import os import scipy.misc import numpy as np import pandas as pd from tqdm import trange,tqdm import pandas as pd from itertools import combinations, product import sys from utils import save_figure_images,make_sample_dir,guess_model_step from sample import get_joint,sample,find_logit_percentile ''' This is a file where each function creates a particular figure. No real need for this to be configurable. Just make a new function for each figure This uses functions in sample.py and distribution.py, which are intended to be lower level functions that can be used more generally. 
'''

def fig1(model, output_folder):
    '''
    This function makes two 2x10 images
    showing the difference between conditioning
    and intervening
    '''

    str_step=guess_model_step(model)
    fname=os.path.join(output_folder,str_step+model.model_type)

    for key in ['Young','Smiling','Wearing_Lipstick','Male','Mouth_Slightly_Open','Narrow_Eyes']:
    #for key in ['Mustache','Bald']:
    #for key in ['Mustache']:
        print 'Starting ',key,
    #for key in ['Bald']:

        p50,n50=find_logit_percentile(model,key,50)
        do_dict={key:np.repeat([p50],10)}
        eps=3
        cond_dict={key:np.repeat([+eps],10)}

        out,_=sample(model,do_dict=do_dict)
        intv_images=out['G']
        out,_=sample(model,cond_dict=cond_dict)
        cond_images=out['G']

        images=np.vstack([intv_images,cond_images])
        dc_file=fname+'_'+key+'_topdo1_botcond1.pdf'
        save_figure_images(model.model_type,images,dc_file,size=[2,10])

        do_dict={key:np.repeat([p50,n50],10)}
        cond_dict={key:np.repeat([+eps,-eps],10)}

        dout,_=sample(model,do_dict=do_dict)
        cout,_=sample(model,cond_dict=cond_dict)

        itv_file = fname+'_'+key+'_topdo1_botdo0.pdf'
        cond_file = fname+'_'+key+'_topcond1_botcond0.pdf'
        eps=3
        save_figure_images(model.model_type,dout['G'],itv_file,size=[2,10])
        save_figure_images(model.model_type,cout['G'],cond_file,size=[2,10])
        print '..finished ',key

    #return images,cout['G'],dout['G']
    return key

================================================
FILE: figure_scripts/pairwise.py
================================================
from __future__ import print_function
import time
import tensorflow as tf
import os
import scipy.misc
import numpy as np
from tqdm import trange

import pandas as pd
from itertools import combinations
import sys
from sample import sample


def calc_tvd(label_dict,attr):
    '''
    attr should be a 0,1 pandas dataframe with
    columns corresponding to label names

    for example:
    names=zip(*self.graph)[0]
    calc_tvd(label_dict,attr[names])

    label_dict should be a dictionary key:1d-array of samples
    '''
    ####Calculate Total Variation####
    if np.min(attr.values)<0:
        raise ValueError('calc_tvd received attr that may not have been in {0,1}')

    label_names=label_dict.keys()
    attr=attr[label_names]

    #Give every distinct label combination in the real data an integer ID
    df2=attr.drop_duplicates()
    df2 = df2.reset_index(drop = True).reset_index()
    df2=df2.rename(columns = {'index':'ID'})
    real_data_id=pd.merge(attr,df2)
    real_counts = pd.value_counts(real_data_id['ID'])
    real_pdf=real_counts/len(attr)

    #Map the rounded generated labels onto the same IDs
    label_list_dict={k:np.round(v.ravel()) for k,v in label_dict.items()}
    df_dat=pd.DataFrame.from_dict(label_list_dict)
    dat_id=pd.merge(df_dat,df2,on=label_names,how='left')
    dat_counts=pd.value_counts(dat_id['ID'])
    dat_pdf = dat_counts / dat_counts.sum()

    #Total variation distance is half the L1 distance between the two pmfs
    diff=real_pdf.subtract(dat_pdf, fill_value=0)
    tvd=0.5*diff.abs().sum()
    return tvd


def crosstab(model,result_dir=None,report_tvd=True,no_save=False,N=500000):
    '''
    This is a script for outputting [0,1/2], [1/2,1] binned pdfs
    including the marginals and the pairwise comparisons

    report_tvd is optional because computing it is somewhat time consuming

    result_dir is where to save the distribution text files.
    defaults to model.cc.model_dir
    '''
    result_dir=result_dir or model.cc.model_dir
    result={}

    n_labels=len(model.cc.nodes)

    #Not really sure how this should scale
    #N=1000*n_labels
    #N=500*n_labels**2#open to ideas that avoid a while loop
    #N=12000

    #tvd will not be reported as low unless N is large
    #N=500000 #default

    print('Calculating joint distribution with',)
    t0=time.time()
    label_dict=sample(model,fetch_dict=model.cc.label_dict,N=N)
    print('sampling model N=',N,' times took ',time.time()-t0,'sec')

    #fake_labels=model.cc.fake_labels

    str_step=str( model.sess.run(model.cc.step) )+'_'

    attr=model.data.attr
    attr=attr[model.cc.node_names]

    lab_xtab_fn = os.path.join(result_dir,str_step+'glabel_crosstab.txt')
    print('Writing to files:',lab_xtab_fn)

    if report_tvd:
        t0=time.time()
        tvd=calc_tvd(label_dict,attr)
        result['tvd']=tvd
        print('calculating tvd from samples took ',time.time()-t0,'sec')

    if no_save:
        return result

    t0=time.time()

    joint={}
    label_joint={}
    #for name, lab in zip(model.cc.node_names,list_labels):
    for name, lab in label_dict.items():
        joint[name]={ 'g_fake_label':lab }

    #with open(dfl_xtab_fn,'w') as dlf_f, open(lab_xtab_fn,'w') as lab_f, open(gvsd_xtab_fn,'w') as gldf_f:
    with open(lab_xtab_fn,'w') as lab_f:
        if report_tvd:
            lab_f.write('TVD:'+str(tvd)+'\n\n')
        lab_f.write('Marginals:\n')

        #Marginals
        for name in joint.keys():
            lab_f.write('Node: '+name+'\n')

            true_marg=np.mean((attr[name]>0.5).values)
            lab_marg=(joint[name]['g_fake_label'] > 0.5).astype('int')

            lab_f.write(' mean='+str(np.mean(lab_marg))+'\t'+\
                        'true mean='+str(true_marg)+'\n')

            lab_f.write('\n')

        #Pairs of labels
        lab_f.write('\nPairwise:\n')

        for node1,node2 in combinations(joint.keys(),r=2):
            lab_node1=(joint[node1]['g_fake_label']>0.5).astype('int')
            lab_node2=(joint[node2]['g_fake_label']>0.5).astype('int')
            lab_df=pd.DataFrame(data=np.hstack([lab_node1,lab_node2]),columns=[node1,node2])
            lab_ct=pd.crosstab(index=lab_df[node1],columns=lab_df[node2],margins=True,normalize=True)

            true_ct=pd.crosstab(index=attr[node1],columns=attr[node2],margins=True,normalize=True)

            lab_f.write('\n\tFake:\n')
            lab_ct.to_csv(lab_xtab_fn,mode='a')
            lab_f.write( lab_ct.__repr__() )
            lab_f.write('\n\tReal:\n')
            lab_f.write( true_ct.__repr__() )
            lab_f.write('\n\n')

    print('calculating pairwise crosstabs and saving results took ',time.time()-t0,'sec')
    return result

================================================
FILE: figure_scripts/probability_table.txt
================================================
model: celebA_0627_200239
graph:MLS
[img,cc,d_fake_labels,true]
P(M=1|S=1) = [0.28,

================================================
FILE: figure_scripts/sample.py
================================================
from __future__ import print_function
import tensorflow as tf
import numpy as np
import os
import scipy.misc
import numpy as np
from tqdm import trange,tqdm
import pandas as pd
from itertools import combinations, product
import sys

from utils import save_figure_images#makes grid image plots

#convenience functions
from utils import
make_sample_dir,guess_model_step,infer_grid_image_shape

from IPython.core import debugger
debug = debugger.Pdb().set_trace


def find_logit_percentile(model, key, per):
    #Sample the node's logits 30 times, then take the requested percentile
    #separately within the positive class and the negative class
    data=[]
    for _ in range(30):
        data.append(model.sess.run(model.cc.node_dict[key].label_logit))
    D=np.vstack(data)
    pos_logits,neg_logits=D[D>0], D[D<0]
    pos_tile = np.percentile(pos_logits,per)
    neg_tile = np.percentile(neg_logits,100-per)
    return pos_tile,neg_tile


def fixed_label_diversity(model, config,step=''):
    sample_dir=make_sample_dir(model)
    str_step=str(step) or guess_model_step(model)

    N=64#per image
    n_combo=5#n label combinations

    #0,1 label combinations
    fixed_labels=model.attr.sample(n_combo)[model.cc.node_names]
    size=infer_grid_image_shape(N)

    for j, fx_label in enumerate(fixed_labels.values):
        fx_label=np.reshape(fx_label,[1,-1])
        fx_label=np.tile(fx_label,[N,1])
        do_dict={model.cc.labels: fx_label}

        images, feed_dict= sample(model, do_dict=do_dict)
        fx_file=os.path.join(sample_dir, str_step+'fxlab'+str(j)+'.pdf')
        save_figure_images(model.model_type,images['G'],fx_file,size=size)

    #which image is what label
    fixed_labels=fixed_labels.reset_index(drop=True)
    fixed_labels.to_csv(os.path.join(sample_dir,str_step+'fxlab'+'.csv'))


def get_joint(model, int_do_dict=None,int_cond_dict=None, N=6400,return_discrete=True):
    '''
    Returns a dictionary of dataframes of samples.
    Each dataframe corresponds to a different tensor, i.e. cc labels, d_labeler labels, etc.
    int_do_dict and int_cond_dict indicate that just a simple +1 or 0 should be
    passed in
    ex: int_do_dict={'Wearing_Lipstick':+1}

    Ex:
    if intervention=+1 corresponds to logits uniform in [0,0.6], pass np.linspace(0,0.6,n)

    N is number of batches to sample at each location in logitspace
    (num_labels dimensional)
    '''

    #values are either +1 or 0 in cond and do dict
    do_dict,cond_dict={},{}
    if int_do_dict is not None:
        for key,value in int_do_dict.items():
            #Intervene in the middle of where the model is used to operating
            print('calculating percentile...')
            data=[]
            for _ in range(30):
                data.append(model.sess.run(model.cc.node_dict[key].label_logit))
            D=np.vstack(data)
            pos_logits,neg_logits=D[D>0], D[D<0]

            if value == 1:
                intv = np.percentile(pos_logits,50)
            elif value == 0:
                intv = np.percentile(neg_logits,50)
            else:
                raise ValueError('pass either +1 or 0')
            do_dict[key]=np.repeat([intv],N)

    if int_cond_dict is not None:
        for key,value in int_cond_dict.items():
            eps=3.
            if value == 1:
                cond_dict[key]=np.repeat([+eps],N)
            elif value == 0:
                cond_dict[key]=np.repeat([-eps],N)
            else:
                raise ValueError('pass either +1 or 0')

    #print 'getjoint: cond_dict:',cond_dict
    #print 'getjoint: do_dict:',do_dict

    #Terminology
    if model.model_type=='began':
        fake_labels=model.fake_labels
        D_fake_labels=model.D_fake_labels
        D_real_labels=model.D_real_labels
    elif model.model_type=='dcgan':
        fake_labels=model.fake_labels
        D_fake_labels=model.D_labels_for_fake
        D_real_labels=model.D_labels_for_real

    #fetch_dict={'cc_labels':model.cc.labels}
    fetch_dict={'d_fake_labels':D_fake_labels,
                'cc_labels':model.cc.labels}

    if model.model_type=='began':#dcgan not fully connected
        if not cond_dict and not do_dict:
            #Havent coded conditioning on real data
            fetch_dict.update({'d_real_labels':D_real_labels})

    print('Calculating joint distribution')
    #sample() takes the keyword fetch_dict (it has no "fetch" argument)
    result,_=sample(model, cond_dict=cond_dict, do_dict=do_dict,N=N,
                    fetch_dict=fetch_dict,return_failures=False)
    print('fetd keys:',fetch_dict.keys())
    result={k:result[k] for k in fetch_dict.keys()}

    n_labels=len(model.cc.node_names)
    #list_labels=np.split( result['cfl'],n_labels, axis=1)
    #list_d_fake_labels=np.split(result['dfl'],n_labels, axis=1)
    #list_d_real_labels=np.split(result['drl'],n_labels, axis=1)

    for k in result.keys():
        print('valshape',result[k].shape)
        print('result',result[k])

    list_result={k:np.split(val,n_labels, axis=1) for k,val in result.items()}

    pd_joint={}
    for key,r in list_result.items():
        joint={}
        for name,val in zip(model.cc.node_names,r):
            int_val=(val>0.5).astype('int')
            joint[name]=int_val.ravel()
        pd_joint[key]=pd.DataFrame.from_dict(joint)

    return pd_joint

    #NOTE: everything below is unreachable (it follows the return above) and
    #refers to list_labels / list_d_fake_labels, which are only defined in the
    #commented-out lines. Kept for reference.
    for name, lab, dfl in zip(model.cc.node_names,list_labels,list_d_fake_labels):
        if return_discrete:
            cfl_val=(lab>0.5).astype('int')
            dfl_val=(dfl>0.5).astype('int')

        joint['dfl'][name]=dfl_val
        joint['cfl'][name]=cfl_val

    cfl=pd.DataFrame.from_dict( {k:val.ravel() for k,val in joint['cfl'].items()} )
    dfl=pd.DataFrame.from_dict( {k:val.ravel() for k,val in joint['dfl'].items()} )

    print('get_joint successful')
    return cfl,dfl

#__________

def take_product(do_dict):
    '''
    this function takes some dictionary like:
    {key1:1, key2:[a,b], key3:[c,d]}
    and returns the dictionary:
    {key1:[1,1,1,1], key2:[a,a,b,b], key3:[c,d,c,d]}
    computing the product of values
    '''
    values=[]
    for v in do_dict.values():
        if hasattr(v,'__iter__'):
            values.append(v)
        else:
            values.append([v])#allows scalar to be passed

    prod_values=np.vstack(product(*values))
    return {k:np.array(v) for k,v in zip(do_dict.keys(),zip(*prod_values))}


def chunks(input_dict, chunk_size):
    """
    Yield successive n-sized chunks.
    Takes a dictionary of iterables and makes an
    iterable of dictionaries
    """
    if len(input_dict)==0:
        return [{}]

    n=chunk_size

    batches=[]

    L=len(input_dict.values()[0])
    for i in xrange(0, L, n):
        fd={}
        n=n- max(0, (i+n) - L )#incase doesn't evenly divide
        for key,value in input_dict.items():
            fd[key]=value[i:i+n]

        batches.append(fd)
    return batches


def do2feed( do_dict, model, on_logits=True):
    '''
    this contains logic for parsing "do_dict"
    into a feed dict that can actually be worked with
    '''
    feed_dict={}
    for key,value in do_dict.items():
        if isinstance(key,tf.Tensor):
            feed_dict[key]=value
        elif isinstance(key,str):
            if key in model.cc.node_names:
                node=model.cc.node_dict[key]
                if on_logits:# intervene on logits by default
                    feed_dict[node.label_logit]=value
                else:
                    feed_dict[node.label]=value
            elif hasattr(model,key):
                feed_dict[getattr(model,key)]=value
            else:
                raise ValueError('string keys must be attributes of either model.cc or model. Got string:',key)
        else:
            raise ValueError('keys must be tensors or strings but got',type(key))

    #Make sure [64,] isn't passed to [64,1] for example
    for tensor,value in feed_dict.items():
        #Make last dims line up:
        tf_shape=tensor.get_shape().as_list()
        shape=[len(value)]+tf_shape[1:]
        try:
            feed_dict[tensor]=np.reshape(value,shape)
        except Exception as e:
            print('Unexpected difficulty reshaping inputs:',tensor.name, tf_shape, len(value), np.size(value))
            raise  #re-raise with the original traceback
    return feed_dict


def cond2fetch( cond_dict=None, model=None, on_logits=True):
    '''
    this contains logic for parsing "cond_dict"
    into a fetch dict that can actually be worked with.
    A fetch dict can be passed into the first argument
    of session.run and therefore has values that are all tensors
    '''
    cond_dict=cond_dict or {}

    fetch_dict={}
    for key,value in cond_dict.items():
        if isinstance(value,tf.Tensor):
            fetch_dict[key]=value#Nothing to be done
        elif isinstance(key,tf.Tensor):
            fetch_dict[key]=key#strange scenario, but possible
        elif isinstance(key,str):
            if key in model.cc.node_names:
                node=model.cc.node_dict[key]
                if on_logits:# condition on logits by default
                    fetch_dict[key]=node.label_logit
                else:
                    fetch_dict[key]=node.label
            elif hasattr(model,key):
                fetch_dict[key]=getattr(model,key)
            else:
                raise ValueError('string keys must be attributes of either model.cc or model. Got string:',key)
        else:
            raise ValueError('keys must be tensors or strings but got',type(key))

    return fetch_dict


def interpret_dict( a_dict, model,n_times=1, on_logits=True):
    '''
    pass either a do_dict or a cond_dict.
    The rules for converting arguments to numpy arrays to pass
    to tensorflow are identical
    '''
    if a_dict is None:
        return {}
    elif len(a_dict)==0:
        return {}

    if n_times>1:
        #tf.placeholder_with_default requires an explicit shape; [] is a scalar
        token=tf.placeholder_with_default(2.22, shape=[])
        a_dict[token]=-2.22

    p_a_dict=take_product(a_dict)

    ##Need divisible batch_size for most models
    if len(p_a_dict)>0:
        L=len(p_a_dict.values()[0])
    else:
        L=0
    print("L is " + str(L))
    print(p_a_dict)

    ##Check compatibility of batch_size and L
    if L>=model.batch_size:
        if not L % model.batch_size == 0:
            raise ValueError('a_dict must be divisible by batch_size but instead product of inputs was of length',L)
    elif model.batch_size % L == 0:
        p_a_dict = {key:np.repeat(value,model.batch_size//L,axis=0) for key,value in p_a_dict.items()}
    else:
        raise ValueError('No. of intervened values must divide batch_size.')
    return p_a_dict


def slice_dict(feed_dict, rows):
    '''
    conditional sampling requires doing only certain indices depending
    on the result of the previous iteration.
    This function takes a feed_dict and "slices" it,
    returning a dictionary with the same keys, but with values[rows,:]
    '''
    fd_out={}
    for key,value in feed_dict.iteritems():
        fd_out[key]=value[rows]
    return fd_out


def did_succeed( output_dict, cond_dict ):
    '''
    Used in rejection sampling:
    for each row, determine if cond is satisfied
    for every cond in cond_dict

    success is hardcoded as being more extreme
    than the condition specified
    '''
    test_key=cond_dict.keys()[0]
    #print('output_dict:',np.squeeze(output_dict[test_key]))
    #print('cond_dict:',cond_dict[test_key])

    #definition success:
    def is_win(key):
        cond=np.squeeze(cond_dict[key])
        val=np.squeeze(output_dict[key])
        cond1=np.sign(val)==np.sign(cond)
        cond2=np.abs(val)>np.abs(cond)
        return cond1*cond2

    scoreboard=[is_win(key) for key in cond_dict]
    #print('scoreboard', scoreboard)
    all_victories_bool=np.logical_and.reduce(scoreboard)
    return all_victories_bool.flatten()


def sample(model, cond_dict=None, do_dict=None, fetch_dict=None,N=None,
           on_logits=True,return_failures=True):
    '''
    fetch_dict should be a dict of tensors to do sess.run on
    do_dict is a dict keyed by strings or tensors, of the form:
    {'Male':1, model.z_gen:[0,1], model.cc.Smiling:[0.1,0.9]}
    N is used only if cond_dict and do_dict are None
    '''

    do_dict= do_dict or {}
    cond_dict= cond_dict or {}
    fetch_dict=fetch_dict or {'G':model.G}

    ##Handle the case where len query doesn't divide batch_size
    #nsamples is used by the interventional and conditional branches below,
    #so it has to be defined here
    a_dict=cond_dict or do_dict
    if a_dict:
        nsamples=len(a_dict.values()[0])
    elif N:
        nsamples=N
    else:
        raise ValueError('either pass a dictionary or N')

    ##Pad to be batch_size divisible
    #npad=(64-nsamples)%64
    #if npad>0:
    #    print("Warn.
nsamples doesnt divide batch_size, pad=",npad) ##N+=npad #if npad>0: # if do_dict: # for k in do_dict.keys(): # keypad=np.tile(do_dict[k][0],[npad]) # do_dict[k]=np.concatenate([do_dict[k],keypad]) # if cond_dict: # for k in cond_dict.keys(): # keypad=np.tile(cond_dict[k][0],[npad]) # cond_dict[k]=np.concatenate([cond_dict[k],keypad]) verbose=False #verbose=True feed_dict = do2feed(do_dict, model, on_logits=on_logits)#{tensor:array} cond_fetch_dict= cond2fetch(cond_dict,model,on_logits=on_logits) #{string:tensor} fetch_dict.update(cond_fetch_dict) #print('actual cond_dict', cond_dict )#{} #print('actual do_dict', do_dict )#{} if verbose: print('feed_dict',feed_dict) print('fetch_dict',fetch_dict) if not cond_dict and do_dict: #Simply do intervention w/o loop if verbose: print('sampler mode:Interventional') #fds=chunks(feed_dict,model.batch_size) fds=chunks(feed_dict,model.default_batch_size) outputs={k:[] for k in fetch_dict.keys()} for fd in fds: out=model.sess.run(fetch_dict, fd) #outputs.append(out['G']) for k,val in out.items(): outputs[k].append(val) for k in outputs.keys(): outputs[k]=np.vstack(outputs[k])[:nsamples] return outputs,feed_dict #return np.vstack(outputs), feed_dict elif not cond_dict and not do_dict: #neither passed, but get N samples assert(N>0) if verbose: print('sampling model N=',N,' times') ##Should be variable batch_size allowed outputs=model.sess.run(fetch_dict,{model.batch_size:N}) ##fds=chunks({'idx':range(npad+N)},model.batch_size) #fds=chunks({'idx':range(npad+N)},model.default_batch_size) #outputs={k:[] for k in fetch_dict.keys()} #for fd in fds: # out=model.sess.run(fetch_dict) # for k,val in out.items(): # outputs[k].append(val) #for k in outputs.keys(): # outputs[k]=np.vstack(outputs[k])[:nsamples] #return outputs, feed_dict return outputs #elif cond_dict and not do_dict: elif cond_dict: #Could also pass do_dict here to be interesting ##Implements rejection sampling if verbose: print('sampler mode:Conditional') 
print('conddict',cond_dict) rows=np.arange( len(cond_dict.values()[0]))#what idx do we need assert(len(rows)>=model.batch_size)#should already be true. if verbose: print('nrows:',len(rows)) #init max_fail=4000 #max_fail=10000 n_fails=np.zeros_like(rows) remaining_rows=rows.copy() completed_rows=[] bad_rows=set() #null=lambda :[-1 for r in rows] if verbose: print('cond fetch_dict',fetch_dict) outputs={key:[np.zeros(fetch_dict[key].get_shape().as_list()[1:]) for r in rows] for key in fetch_dict} if verbose: print('n keys in outputs:',len(outputs.keys())) #debug() ii=0 while( len(remaining_rows)>0 ): #debug() ii+=1 #loop if not return_failures: if len(completed_rows)>=nsamples: if verbose: print('Have enough for now; breaking') break iter_rows=remaining_rows[:model.batch_size] n_pad = model.batch_size - len(iter_rows) if verbose: print('Iter:',ii, 'to go:',len(iter_rows)) #print('iter_rows:',len(iter_rows),':',iter_rows) #iter_rows.extend( [iter_rows[-1]]*n_pad )#just duplicate pad_iter_rows=list(iter_rows)+ ( [iter_rows[-1]]*n_pad ) iter_rows=np.array(iter_rows) pad_iter_rows=np.array(pad_iter_rows) fed=slice_dict( feed_dict, pad_iter_rows ) cond=slice_dict( cond_dict, pad_iter_rows ) out=model.sess.run(fetch_dict, fed) bool_pass = did_succeed(out,cond)[:len(iter_rows)] if verbose: print('bool_pass:',len(bool_pass),':',bool_pass) pass_idx=iter_rows[bool_pass] fail_idx=iter_rows[~bool_pass] #yuck for key in out: for i,row_pass in enumerate(bool_pass): idx=iter_rows[i] if row_pass: outputs[key][idx]=out[key][i] else: n_fails[idx]+=1 good_rows=set( iter_rows[bool_pass] ) completed_rows.extend(list(good_rows)) #print('good_rows',good_rows) bad_rows=set( rows[ n_fails>=max_fail ] ) #print('bad_rows',bad_rows) for key in out: for idx_giveup in bad_rows: shape=fetch_dict[key].get_shape().as_list()[1:] outputs[key][idx_giveup]=np.zeros(shape) if verbose: print('key:',key,' shape giveup:',shape) ##Remove rows remaining_rows=list( set(remaining_rows)-good_rows-bad_rows ) 
#debug() if verbose: print('conditioning took',ii,' tries') n_fails.sort() print('10 most fail counts(limit=',max_fail,'):',n_fails[-10:]) if verbose: print('means:') for k in outputs.keys(): for v in outputs[k]: print(np.mean(v)) if not return_failures: #useful for pdf calculations. #not useful for image grids if verbose: print('Not returning failures!..',) for k in outputs.keys(): outputs[k]=[outputs[k][i] for i in completed_rows] if verbose: print('..Returning', len(completed_rows),'/',len(cond_dict.values()[0])) else: for k in outputs.keys(): outputs[k]=outputs[k][:nsamples] for k in outputs.keys(): if verbose: print('tobestacked:',len(outputs[k])) print('tobestacked:',isinstance(outputs[k][0],np.ndarray)) values=outputs[k][:nsamples] if verbose: for v in values: try: print(v.shape) except: print(type(v)) if len(fetch_dict[k].get_shape().as_list())>1: outputs[k]=np.stack(outputs[k]) else: outputs[k]=np.concatenate(outputs[k]) return outputs,cond_dict else: raise Exception('This should not happen') def condition2d( model, cond_dict,cond_dict_name,step='', on_logits=True): ''' Function largely copied from intervention2d with minor changes. This function is a wrapper around the more general function "sample". In this function, the cond_dict is assumed to have only two varying parameters on which a 2d interventions plot can be made. ''' #TODO: Unify function with intervention2d if not on_logits: raise ValueError('on_logits=False not implemented') #Interpret defaults: #n_defaults=len( filter(lambda l:l == 'model_default', cond_dict.values() )) #accept any string for now n_defaults=len( filter(lambda l: isinstance(l,str), cond_dict.values() )) if n_defaults>0: print(n_defaults,' default values given..using 8 for each of them') try: for key,value in cond_dict.items(): if value == 'model_default': print('Warning! 
using 1/2*model.intervention_range\ to specify the conditioning defaults') cond_min,cond_max=model.intervention_range[key] #cond_dict[key]=np.linspace(cond_min,cond_max,8) cond_dict[key]=[0.5*cond_min,0.5*cond_max] print('Condition dict used:',cond_dict) elif value=='int': #for integer pretrained models #eps=0.1 #usually logits are around 4-20 eps=3 #usually logits are around 4-10 #sigmoid(3) ~ 0.95 cond_dict[key]=np.repeat([+eps,-eps],64) #logit on either size of 0 elif value=='percentile': ##I'm changing this to do 50th percentile #of positive or of negative class print('calculating percentile...') data=[] for _ in range(30): data.append(model.sess.run(model.cc.node_dict[key].label_logit)) D=np.vstack(data) pos_logits,neg_logits=D[D>0], D[D<0] print("Conditioning on 5th percentile") pos_intv = np.percentile(pos_logits,5) neg_intv = np.percentile(neg_logits,95) cond_dict[key]=np.repeat([pos_intv,neg_intv],64) print('percentile5 for',key,'is',np.percentile(D,5)) print('percentile25 for',key,'is',np.percentile(D,25)) print('percentile50 for',key,'is',np.percentile(D,50)) print('percentile75 for',key,'is',np.percentile(D,75)) print('percentile95 for',key,'is',np.percentile(D,95)) #OLD: ##fetch=cond2fetch(cond_dict) #print('...calculating percentile') #data=[] #for _ in range(30): # data.append(model.sess.run(model.cc.node_dict[key].label_logit)) #D=np.vstack(data) #print('dat',D.flatten()) #cond_dict[key]=np.repeat([np.percentile(D,95),np.percentile(D,5)],64) #print('percentiles for',key,'are',[np.percentile(D,5),np.percentile(D,95)]) else: #otherwise pass a number, list, or array assert(not isinstance(value,str)) except Exception, e: raise(e,'Difficulty accessing default model interventions') str_step=str(step) lengths = [ len(v) for v in cond_dict.values() if hasattr(v,'__len__') ] #print('lengths',lengths) print('lengths',lengths) gt_one = filter(lambda l:l>1,lengths) if not 0<=len(gt_one)<=2: raise ValueError('for visualizing intervention, must have < 3 
parameters varying') if len(gt_one) == 0: image_dim = np.sqrt(model.batch_size).astype(int) size = [image_dim,image_dim] # if len(gt_one)==1 and lengths[0]>=model.batch_size: # size=[gt_one[0],1] # elif len(gt_one)==1 and lengths[0]0: print(n_defaults,' default values given..using 8 for each of them') try: for key,value in do_dict.items(): if value == 'model_default': itv_min,itv_max=model.intervention_range[key] do_dict[key]=np.linspace(itv_min,itv_max,8) elif value=='int': #for integer pretrained models #eps=0.1 #usually logits are around 4-20 eps=3 #usually logits are around 4-10 #sigmoid(3) ~ 0.95 do_dict[key]=np.repeat([-eps,+eps],64) #logit on either size of 0 elif value=='percentile': ##I'm changing this to do 50th percentile #of positive or of negative class print('calculating percentile...') data=[] for _ in range(30): data.append(model.sess.run(model.cc.node_dict[key].label_logit)) D=np.vstack(data) pos_logits,neg_logits=D[D>0], D[D<0] pos_intv = np.percentile(pos_logits,50) neg_intv = np.percentile(neg_logits,50) do_dict[key]=np.repeat([pos_intv,neg_intv],64) print('percentile5 for',key,'is',np.percentile(D,5)) print('percentile25 for',key,'is',np.percentile(D,25)) print('percentile50 for',key,'is',np.percentile(D,50)) print('percentile75 for',key,'is',np.percentile(D,75)) print('percentile95 for',key,'is',np.percentile(D,95)) else: #otherwise pass a number, list, or array assert(not isinstance(value,str)) except Exception, e: raise(e,'Difficulty accessing default model interventions') str_step=str(step) lengths = [ len(v) for v in do_dict.values() if hasattr(v,'__len__') ] #print('lengths',lengths) print('lengths',lengths) gt_one = filter(lambda l:l>1,lengths) if not 0<=len(gt_one)<=2: raise ValueError('for visualizing intervention, must have < 3 parameters varying') if len(gt_one) == 0: #image_dim = np.sqrt(model.batch_size).astype(int) image_dim = np.sqrt(64).astype(int) size = [image_dim,image_dim] #if len(gt_one)==1 and lengths[0]>=model.batch_size: 
# size=[gt_one[0],1] #elif len(gt_one)==1 and lengths[0]= nmaps: break h, h_width = y * height + 1 + padding // 2, height - padding w, w_width = x * width + 1 + padding // 2, width - padding grid[h:h+h_width, w:w+w_width] = tensor[k] k = k + 1 return grid def began_save_image(tensor, filename, nrow=8, padding=2, normalize=False, scale_each=False): ndarr = make_grid(tensor, nrow=nrow, padding=padding, normalize=normalize, scale_each=scale_each) im = Image.fromarray(ndarr) im.save(filename) #Dcgan originally get_stddev = lambda x, k_h, k_w: 1/math.sqrt(k_w*k_h*x.get_shape()[-1]) def get_image(image_path, input_height, input_width, resize_height=64, resize_width=64, is_crop=True, is_grayscale=False): image = imread(image_path, is_grayscale) return transform(image, input_height, input_width, resize_height, resize_width, is_crop) def dcgan_save_images(images, size, image_path): return imsave(inverse_transform(images), size, image_path) def imread(path, is_grayscale = False): if (is_grayscale): return scipy.misc.imread(path, flatten = True).astype(np.float) else: return scipy.misc.imread(path).astype(np.float) def merge_images(images, size): return inverse_transform(images) def merge(images, size): h, w = images.shape[1], images.shape[2] img = np.zeros((h * size[0], w * size[1], 3)) for idx, image in enumerate(images): i = idx % size[1] j = idx // size[1] img[j*h:j*h+h, i*w:i*w+w, :] = image return img def imsave(images, size, path): return scipy.misc.imsave(path, merge(images, size)) def center_crop(x, crop_h, crop_w, resize_h=64, resize_w=64): if crop_w is None: crop_w = crop_h h, w = x.shape[:2] j = int(round((h - crop_h)/2.)) i = int(round((w - crop_w)/2.)) return scipy.misc.imresize( x[j:j+crop_h, i:i+crop_w], [resize_h, resize_w]) def transform(image, input_height, input_width, resize_height=64, resize_width=64, is_crop=True): if is_crop: cropped_image = center_crop( image, input_height, input_width, resize_height, resize_width) else: cropped_image = 
scipy.misc.imresize(image, [resize_height, resize_width])
    return np.array(cropped_image)/127.5 - 1.

def inverse_transform(images):
    return (images+1.)/2.

================================================
FILE: main.py
================================================
from __future__ import print_function
import numpy as np
import os
import tensorflow as tf

from trainer import Trainer
from causal_graph import get_causal_graph
from utils import prepare_dirs_and_logger, save_configs

#Generic configuration arguments
from config import get_config
#Submodel specific configurations
from causal_controller.config import get_config as get_cc_config
from causal_dcgan.config import get_config as get_dcgan_config
from causal_began.config import get_config as get_began_config

from causal_began import CausalBEGAN
from causal_dcgan import CausalGAN

from IPython.core import debugger
debug = debugger.Pdb().set_trace

def get_trainer():
    print('tf: resetting default graph!')
    tf.reset_default_graph()#for repeated calls in ipython

    ####GET CONFIGURATION####
    #TODO:load configurations from previous model when loading previous model
    ##if load_path:
        #load config files from dir
    #except if pt_load_path, get cc_config from before
    #overwrite is_train, is_pretrain with current args--sort of a mess

    ##else:
    config,_=get_config()
    cc_config,_=get_cc_config()
    dcgan_config,_=get_dcgan_config()
    began_config,_=get_began_config()

    ###SEEDS###
    np.random.seed(config.seed)
    #tf.set_random_seed(config.seed) # Not working right now.
prepare_dirs_and_logger(config) if not config.load_path: print('saving config because load path not given') save_configs(config,cc_config,dcgan_config,began_config) #Resolve model differences and batch_size if config.model_type: if config.model_type=='dcgan': config.batch_size=dcgan_config.batch_size cc_config.batch_size=dcgan_config.batch_size # make sure the batch size of cc is the same as the image model config.Model=CausalGAN.CausalGAN model_config=dcgan_config if config.model_type=='began': config.batch_size=began_config.batch_size cc_config.batch_size=began_config.batch_size # make sure the batch size of cc is the same as the image model config.Model=CausalBEGAN.CausalBEGAN model_config=began_config else:#no image model model_config=None config.batch_size=cc_config.batch_size if began_config.is_train or dcgan_config.is_train: raise ValueError('need to specify model_type for is_train=True') #Interpret causal_model keyword cc_config.graph=get_causal_graph(config.causal_model) #Builds and loads specified models: trainer=Trainer(config,cc_config,model_config) return trainer def main(trainer): #Do pretraining if trainer.cc_config.is_pretrain: trainer.pretrain_loop() if trainer.model_config: if trainer.model_config.is_train: trainer.train_loop() if __name__ == "__main__": trainer=get_trainer() #make ipython easier sess=trainer.sess cc=trainer.cc if hasattr(trainer,'model'): model=trainer.model main(trainer) tf.logging.set_verbosity(tf.logging.ERROR) ================================================ FILE: synthetic/README.md ================================================ # Causal(BE)GAN in Tensorflow # (test comment) Synthetic Data Figures <> (Tensorflow implementation of [BEGAN: Boundary Equilibrium Generative Adversarial Networks](https://arxiv.org/abs/1703.10717).) Authors' Tensorflow implementation Synthetic portion of [CausalGAN: Learning Implicit Causal Models with Adversarial Training] <>some results files ## Setup. 
If it is not already executable, make run_datasets.sh executable by running $ chmod +x run_datasets.sh ## Usage A single run of main.py trains as many GANs as are defined in models.py (presently 6) for a single --data_type. The author was able to fit 3 such runs on a single GPU, and conveniently there are 3 datasets considered. $ CUDA_VISIBLE_DEVICES='0' python main.py --data_type=linear As before, the tboard.py utility is available to view the most recent model summaries. $ python tboard.py Recovering statistics requires averaging over many runs. Bulk usage follows the script run_datasets.sh. This bash script trains all GAN models on each of the 3 datasets, 30 times per dataset. The following will train 2(calls) x 30(loops/call) x 3(datasets/loop) x 6(gan models/dataset)=1080(gan models) $ (open first terminal) $ CUDA_VISIBLE_DEVICES='0' ./run_datasets.sh $ (open second terminal) $ CUDA_VISIBLE_DEVICES='1' ./run_datasets.sh ## Collecting Statistics ## Results ## Authors Christopher Snyder / [@22csnyder](http://22csnyder.github.io) Murat Kocaoglu / [@mkocaoglu](http://mkocaoglu.github.io) ================================================ FILE: synthetic/collect_stats.py ================================================ import pandas as pd import numpy as np import time from scipy import stats import os import matplotlib.pyplot as plt from models import GeneratorTypes,DataTypes import brewer2mpl def makeplots(x_iter,tvd_datastore,show=False,save=False,save_name=None): #Make plots gtypes=GeneratorTypes.keys()#defined locally so the function does not rely on the __main__ global dtypes=tvd_datastore.keys() fig,axes=plt.subplots(1,len(dtypes)) #fig.subplots_adjust(hspace=0.5,wspace=0.025) fig.subplots_adjust(hspace=0.75,wspace=0.05) x_iter=x_iter.astype('float')/1000 for ax,dtype in zip(axes,dtypes): if ax in axes[:-1]: use_legend = False else: use_legend = True if ax==axes[0]: prefix='Synthetic Data Graph: ' posfix=' ' else: prefix='' posfix='' axtitle=prefix+dtype+posfix #df=pd.DataFrame.from_dict(tvd_datastore[dtype]) df_tvd=pd.DataFrame(data={gtype:tvd_datastore[dtype][gtype]['tvd'] for
gtype in gtypes}) df_sem=pd.DataFrame(data={gtype:tvd_datastore[dtype][gtype]['sem'] for gtype in gtypes}) df_tvd.index=x_iter;df_sem.index=x_iter df_tvd.plot.line(ax=ax,sharey=True,use_index=True,yerr=df_sem,legend=use_legend,capsize=5,capthick=3,elinewidth=1,errorevery=100) ax.set_title(axtitle.title(),fontsize=18) ax.set_ylabel('Total Variational Distance',fontsize=18) if ax is axes[1]: ax.set_xlabel('iter(thousands)',fontsize=18) t='Graph Structured Generator tvd Convergence on Synthetic Data with Known Causal Graph' plt.suptitle(t,fontsize=20) fig.set_figwidth(15,forward=True) fig.set_figheight(7,forward=True) if save: save_name=save_name or 'synth_tvd_vs_time.pdf' save_path=os.path.join('assets',save_name) plt.savefig(save_path,bbox_inches='tight') #plt.savefig(save_path) if show: plt.show(block=False) return fig,axes def make_individual_plots(x_iter,tvd_datastore,smooth=True,show=False,save=False,save_name=None): fontsize=17.5 tickfont=15 gtypes=GeneratorTypes.keys() dtypes=tvd_datastore.keys() format_columns={ 'fc3' :'FC3', 'fc5' :'FC5', 'fc10' :'FC10', 'collider':'Collider', 'linear' :'Linear', 'complete':'Complete', } #styles={ # 'FC3' :'bs-', # 'FC5' :'ro-', # 'FC10' :'y^-', # 'Collider':'g+-', # 'Linear' :'m>-', # 'Complete':'kd-', # } styles={ 'FC3' :'s-', 'FC5' :'o-', 'FC10' :'^-', 'Collider':'+-', 'Linear' :'>-', 'Complete':'d-', } #bmap = brewer2mpl.get_map('Set2', 'qualitative', 7) #colors = bmap.mpl_colors colors=['b','r','y','g','m','k'] markers=['s','o','^','+','>','d'] #plt.style.use('seaborn-dark-palette') #plt.style.use('ggplot') plt.style.use('seaborn-deep') #Make plots #fig.subplots_adjust(hspace=0.5,wspace=0.025) #fig.subplots_adjust(hspace=0.75,wspace=0.05) x_iter=x_iter.astype('float')/1000 for dtype in dtypes: use_legend=True #fig=plt.figure() df_tvd=pd.DataFrame(data={format_columns[gtype]:tvd_datastore[dtype][gtype]['tvd'] for gtype in gtypes}) df_sem=pd.DataFrame(data={format_columns[gtype]:tvd_datastore[dtype][gtype]['sem'] for 
gtype in gtypes}) df_tvd.index=x_iter;df_sem.index=x_iter if smooth: df_tvd=df_tvd.rolling(window=5,min_periods=1,center=True).mean() #styles=['bs-','ro-','y^-','g+-','m>-','kd-'] # df_tvd.plot.line(use_index=True,yerr=df_sem,legend=use_legend,capsize=5,capthick=3,elinewidth=1,errorevery=100,figsize=(6,4),style=styles,markevery=10,markersize=100) #df_tvd.plot.line(use_index=True,yerr=df_sem,legend=use_legend,capsize=5,capthick=3,elinewidth=1,errorevery=100,figsize=(6,4),style=styles,markersize=100) fig=plt.figure() ax=fig.add_subplot(111) i=0 for col in df_tvd.columns: #df_tvd[col].plot(ax=ax,use_index=True,yerr=df_sem[col],legend=use_legend,capsize=5,capthick=3,elinewidth=1,errorevery=100,figsize=(6,4),linestyle='-',color=colors[i],marker=markers[i],markevery=50,markersize=7) #print 'col',col#Linear last #df_tvd[col].plot(ax=ax,use_index=True,yerr=df_sem[col],legend=use_legend,capsize=5,capthick=3,elinewidth=1,errorevery=100,figsize=(6,4),linestyle='-',marker=markers[i],markevery=50,markersize=7) df_tvd[col].plot(ax=ax,use_index=True,yerr=df_sem[col],capsize=5,capthick=3,elinewidth=1,errorevery=100,figsize=(6,4),linestyle='-',marker=markers[i],markevery=50,markersize=7) i+=1 ax.set_yscale('log') plt.legend() plt.xticks(fontsize=tickfont) plt.yticks(fontsize=tickfont) plt.ylim([0,1]) plt.ylabel('Total Variation Distance',fontsize=fontsize) plt.xlabel('Iteration (in thousands)',fontsize=fontsize) if save: file_name=save_name or 'synth_tvd_vs_time.pdf' file_name=dtype+'_'+file_name save_path=os.path.join('assets',file_name) plt.savefig(save_path,bbox_inches='tight') #plt.savefig(save_path) if show: plt.show(block=False) if __name__=='__main__': dtypes=DataTypes.keys() gtypes=GeneratorTypes.keys() logdir='logs/figure_logs' #init #Create a dictionary for each dataset, of dictionaries for each gen_type tvd_all_datastore={dt:{gt:[] for gt in gtypes} for dt in dtypes} tvd_datastore={dt:{} for dt in dtypes} runs=os.listdir(logdir) for dtype in dtypes: print '' print 
'Collecting data for datatype ',dtype,'...' typed_runs=filter(lambda x:x.endswith(dtype),runs) for gtype in gtypes: n_runs=0 #Go through all runs for each (dtype,gtype) pair for run in typed_runs: #tvd_csv={gt:os.path.join(logdir,run,gt,'tvd.csv') for gt in gtypes} tvd_csv=os.path.join(logdir,run,gtype,'tvd.csv') #cols=['step','tvd','mvd'] dat=pd.read_csv(tvd_csv,sep=' ') if len(dat)!=1001: print 'WARN: file',tvd_csv,'was of length:',len(dat), print 'it may be in the process of optimizing.. not using' continue #tvd_all_datastore[dtype][gtype]+=dat['tvd'] tvd_all_datastore[dtype][gtype].append(dat['tvd']) n_runs+=1 #after (dtype,gtype) collection if n_runs==0: #remove key since no matching gtype for dtype print 'Warning: for dtype',dtype,' no runs of gtype ',gtype #tvd_all_datastore[dtype].pop(gtype) else: df_concat=pd.concat(tvd_all_datastore[dtype][gtype],axis=1) gb=df_concat.groupby(by=df_concat.columns,axis=1) mean=gb.mean() sem=gb.sem().rename(columns={'tvd':'sem'}) tvd_datastore[dtype][gtype]=pd.concat([mean,sem],axis=1) #tvd_all_datastore[dtype][gtype]/=n_runs #concat #groupby #after dtype collection if len(tvd_datastore[dtype])==0: print 'Warning: no runs of dtype ',dtype tvd_datastore.pop(dtype) print '...There were ',n_runs,' runs of ',dtype x_iter=dat['iter'].values #run in ipython depending on what you want #fig,axes=makeplots(x_iter,tvd_datastore,show=False,save=True) make_individual_plots(x_iter,tvd_datastore,smooth=True,show=True,save=True) time.sleep(10) ================================================ FILE: synthetic/config.py ================================================ import argparse from models import DataTypes def str2bool(v): return v is True or v.lower() in ('true', '1') dtypes=DataTypes.keys() arg_lists = [] parser = argparse.ArgumentParser() def add_argument_group(name): arg = parser.add_argument_group(name) arg_lists.append(arg) return arg #Pretrain network data_arg=add_argument_group('Data') gan_arg=add_argument_group('GAN') 
misc_arg=add_argument_group('misc') model_arg=add_argument_group('Model') data_arg.add_argument('--data_type',type=str,choices=dtypes, default='collider', help='''This is the graph structure that generates the synthetic dataset through polynomials''') gan_arg.add_argument('--gen_z_dim',type=int,default=10, help='''dim of noise input for generator''') gan_arg.add_argument('--gen_hidden_size',type=int,default=10,#3, help='''hidden size used for layers of generator''') gan_arg.add_argument('--disc_hidden_size',type=int,default=10,#6, help='''hidden size used for layers of discriminator''') gan_arg.add_argument('--lr_gen',type=float,default=0.0005,#0.005 help='''generator learning rate''') gan_arg.add_argument('--lr_disc',type=float,default=0.0005,#0.0025 help='''discriminator learning rate''') #broken #misc_arg.add_argument('--save_pdfs',type=str2bool,default=False, # help='''whether to save pdfs of scatterplots of x1x3 along # with tensorboard summaries''') misc_arg.add_argument('--model_dir',type=str,default='logs') #misc_arg.add_argument('--np_random_seed', type=int, default=123) #misc_arg.add_argument('--tf_random_seed', type=int, default=123) model_arg.add_argument('--load_path',type=str,default='', help='''Path to folder containing model to load. This should be actual checkpoint to load. 
Example: --load_path=./logs/0817_153755_collider/checkpoints/Model-50000''') model_arg.add_argument('--is_train',type=str2bool,default=True, help='''whether the model should train''') model_arg.add_argument('--batch_size',type=int,default=64, help='''batch_size for all generators and all discriminators''') def get_config(): #setattr(config, 'data_dir', data_format) config, unparsed = parser.parse_known_args() return config, unparsed ================================================ FILE: synthetic/figure_generation.ipynb ================================================ { "cells": [ { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tf: resetting default graph!\n", "Using data_type linear\n", "Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000\n", "[*] MODEL dir: ./logs/0818_072052_linear/checkpoints/Model-50000\n", "[*] PARAM path: ./logs/0818_072052_linear/checkpoints/Model-50000/params.json\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/fc3\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/collider\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/fc5\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/linear\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/fc10\n", "GAN Model directory is ./logs/0818_072052_linear/checkpoints/Model-50000/complete\n", " [*] Attempting to restore ./logs/0818_072052_linear/checkpoints/Model-50000\n", "INFO:tensorflow:Restoring parameters from ./logs/0818_072052_linear/checkpoints/Model-50000\n", "built trainer successfully\n" ] } ], "source": [ "%run main.py --data_type 'linear' --load_path './logs/0818_072052_linear/checkpoints/Model-50000' --is_train False" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Using matplotlib backend: TkAgg\n" ] } ], "source": [ "%matplotlib\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sess=trainer.sess;gans=trainer.gans\n", "Xgs=[sess.run(g.gen.X,{g.gen.N:5000}) for g in gans]\n", "split_Xgs=[np.split(x,3,axis=1) for x in Xgs]\n", "X13gs=[[x[0],x[-1]] for x in split_Xgs]\n", "Xds=np.split(sess.run(trainer.data.X,{trainer.data.N:5000}),3,axis=1)\n", "X13d=[Xds[0],Xds[-1]]\n", "\n", "data_dict={'data':X13d}\n", "for g,dat in zip(gans,X13gs):\n", " data_dict[g.gan_type]=dat\n", "\n", "gan_plots=['data','linear','collider','fc5']\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "titles={'data':'Data Distribution',\n", " 'linear':'Linear Generator',\n", " 'complete':'Complete Generator',\n", " 'collider':'Collider Generator',\n", " 'fc5':'Fully Connected Generator'}" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#all at once\n", "fig,axes=plt.subplots(1,len(gan_plots),sharey=True)\n", "\n", "for gtype,ax in zip(gan_plots,axes):\n", " data=data_dict[gtype]\n", " ax.scatter(data[0],data[1])\n", " \n", " ax.set_title(titles[gtype])\n", " ax.set_xlabel('X1')\n", " if gtype==gan_plots[0]:\n", " ax.set_ylabel('X3')\n", "\n", " \n", "fig.canvas.draw()\n", "plt.show() \n", "\n", "fig.subplots_adjust(wspace=0.04,left=0.05,hspace=0.04,right=0.98)\n", "\n", "fig.set_figheight(4)\n", "fig.set_figwidth(12)\n", "\n", "plt.savefig('assets/0818_072052_x1x3_all.pdf')" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#one at a time\n", "\n", "for gtype in titles.keys():\n", " data=data_dict[gtype]\n", " fig=plt.figure()\n", " plt.scatter(data[0],data[1])\n", " plt.xlim([0,1])\n", " plt.ylim([0,1])\n", " 
\n", " plt.title(titles[gtype],fontsize=20)\n", "\n", " plt.ylabel('X3',fontsize=16)\n", " plt.xlabel('X1',fontsize=16)\n", " save_path='assets/'+'0818_072052/'+'x1x3_'+gtype+'.pdf'\n", " plt.savefig(save_path)\n", "\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#no titles\n", "\n", "for gtype in titles.keys():\n", " data=data_dict[gtype]\n", " fig=plt.figure()\n", " plt.scatter(data[0],data[1])\n", " plt.xlim([0,1])\n", " plt.ylim([0,1])\n", " \n", " #plt.title(titles[gtype],fontsize=20)\n", "\n", " plt.ylabel('X3',fontsize=16)\n", " plt.xlabel('X1',fontsize=16)\n", " save_path='assets/'+'0818_072052/'+'x1x3_notitle'+gtype+'.pdf'\n", " plt.savefig(save_path)\n", "\n" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#no text\n", "#No titles: leave to latex to add titles/axes\n", "\n", "for gtype in titles.keys():\n", " data=data_dict[gtype]\n", " fig=plt.figure()\n", " plt.scatter(data[0],data[1])\n", " plt.xlim([0,1])\n", " plt.ylim([0,1])\n", " \n", " #plt.title(titles[gtype],fontsize=14)\n", "\n", " #plt.ylabel('X3',fontsize=14)\n", " #plt.xlabel('X1',fontsize=14)\n", " save_path='assets/'+'0818_072052/'+'x1x3_notext'+gtype+'.pdf'\n", " plt.savefig(save_path)\n", "\n" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "collapsed": true }, "outputs": [], "source": [ "fig.subplots_adjust?" 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "ename": "NameError", "evalue": "name 'trainer' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtrainer\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'trainer' is not defined" ] } ], "source": [ "trainer" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from utils import scatter2d" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 1 } ================================================ FILE: synthetic/main.py ================================================ from __future__ import print_function import numpy as np import tensorflow as tf from trainer import Trainer from config import get_config import os from IPython.core import debugger debug = debugger.Pdb().set_trace '''main code for synthetic experiments ''' def get_trainer(config): print('tf: resetting default graph!') tf.reset_default_graph() #tf.set_random_seed(config.random_seed) #np.random.seed(22) print('Using data_type ',config.data_type) trainer=Trainer(config,config.data_type) print('built trainer successfully') tf.logging.set_verbosity(tf.logging.ERROR) return trainer def main(trainer,config): if config.is_train: trainer.train() def get_model(config=None): #only parse the command line when no config was passed in if config is None: config, unparsed = get_config()
return get_trainer(config) if __name__ == "__main__": config, unparsed = get_config() if not os.path.exists(config.model_dir): os.mkdir(config.model_dir) trainer=get_trainer(config) main(trainer,config) ================================================ FILE: synthetic/models.py ================================================ import tensorflow as tf import matplotlib.pyplot as plt from utils import * #class Data3d def sxe(logits,labels): #use zeros or ones if pass in scalar if not isinstance(labels,tf.Tensor): labels=labels*tf.ones_like(logits) return tf.nn.sigmoid_cross_entropy_with_logits( logits=logits,labels=labels) #def linear(input_, output_dim, scope=None, stddev=10.): def linear(input_, output_dim, scope=None, stddev=.7): unif = tf.uniform_unit_scaling_initializer() norm = tf.random_normal_initializer(stddev=stddev) const = tf.constant_initializer(0.0) with tf.variable_scope(scope or 'linear'): #w = tf.get_variable('w', [input_.get_shape()[1], output_dim], initializer=unif) w = tf.get_variable('w', [input_.get_shape()[1], output_dim], initializer=norm) b = tf.get_variable('b', [output_dim], initializer=const) return tf.matmul(input_, w) + b class Arrows: x_dim=3 e_dim=3 bdry_buffer=0.05# output in [bdry_buffer,1-bdry_buffer] def __init__(self,N): with tf.variable_scope('Arrow') as scope: self.N=tf.placeholder_with_default(N,shape=[]) #self.N=tf.constant(N) #how many to sample at a time self.e1=tf.random_uniform([self.N,1],0,1) self.e2=tf.random_uniform([self.N,1],0,1) self.e3=tf.random_uniform([self.N,1],0,1) self.build() #WARN. some of these are not trainable: i.e. poly self.var = tf.contrib.framework.get_variables(scope) def build(self): pass def normalize_output(self,X): ''' I think that data literally in [0,1] was difficult for sigmoid network. 
Therefore, I am normalizing it to [bdry_buffer,1-bdry_buffer] X: assumed to be in [0,1] ''' return (1.-2*self.bdry_buffer)*X + self.bdry_buffer class Generator: x_dim=3 def __init__(self, N, hidden_size=10,z_dim=10): with tf.variable_scope('Gen') as scope: self.N=tf.placeholder_with_default(N,shape=[]) self.hidden_size=hidden_size self.z_dim=z_dim self.build() self.tr_var = tf.contrib.framework.get_variables(scope) self.step=tf.Variable(0,name='step',trainable=False) self.var = tf.contrib.framework.get_variables(scope) def build(self): raise Exception('must override') def smallNN(self,inputs,name='smallNN'): with tf.variable_scope(name): if isinstance(inputs,list): inputs=tf.concat(inputs,axis=1) h01 = tf.tanh(linear(inputs, self.hidden_size, name+'l1')) h11 = tf.tanh(linear(h01, self.hidden_size, name+'l21')) #h21 = output_nonlinearity(linear(h11, 1, name+'l31')) #h21 = linear(h11, 1, name+'l31') h21 = tf.sigmoid(linear(h11, 1, name+'l31')) return h21#rank2 #return tf.sigmoid(h21)#rank2 randunif=tf.random_uniform_initializer(0,1,dtype=tf.float32) def poly(cause,cause2=None,cause3=None,name='poly1d',reuse=None): #assumes input is in [0,1]. Enforces output is in [0,1] #if cause2 is not given, this is a cubic poly in 1 variable #cause, cause2 and cause3 should be given as tensors like (N,1) #Check conditions if isinstance(cause2,str): raise ValueError('cause2 was a string. you probably forgot to include\ the "name=" keyword when specifying only 1 cause') if isinstance(cause3,str): raise ValueError('cause3 was a string. you probably forgot to include\ the "name=" keyword when specifying only 2 causes') if not len(cause.shape)>=2: cshape=cause.get_shape().as_list() raise ValueError('cause must have len(shape)>=2. shape was %r'%(cshape,)) if cause2 is not None: if not len(cause2.get_shape().as_list())>=2: cshape2=cause2.get_shape().as_list() raise ValueError('cause and cause2 must have len(shape)>=2.
shape was %r'%(cshape2)) if cause3 is not None: if not len(cause3.get_shape().as_list())>=2: cshape3=cause3.get_shape().as_list() raise ValueError('cause and cause3 must have len(shape)>=2. shape was %r'%(cshape3)) #Start with tf.variable_scope(name,reuse=reuse): if cause2 is not None and cause3 is not None: inputs=[tf.ones_like(cause),cause,cause2,cause3] elif cause2 is not None and cause3 is None: inputs=[tf.ones_like(cause),cause,cause2] else: inputs=[tf.ones_like(cause),cause] dim=len(inputs)#2 or 3 or 4 C=np.random.rand(1,dim,dim,dim).astype(np.float32)#unif C=2*C-1 #unif[-1,1] n=200 N=n**(dim-1) grids=np.mgrid[[slice(0,1,1./n) for i in inputs[1:]]] y=np.hstack([np.ones((N,1))]+[g.reshape(N,1) for g in grids]) y1=np.reshape(y,[N,-1,1,1]) y2=np.reshape(y,[N,1,-1,1]) y3=np.reshape(y,[N,1,1,-1]) test_poly=np.sum(y1*y2*y3*C,axis=(1,2,3)) Cmin=np.min(test_poly) Cmax=np.max(test_poly) #shift and rescale the coefficients so that, on the test grid, inputs in [0,1] give outputs in [0,1] C[0,0,0,0]-=Cmin C/=(Cmax-Cmin) coeff=tf.Variable(C,name='coef',trainable=False) #M=cause.get_shape.as_list()[0] x=tf.concat(inputs,axis=1) x1=tf.reshape(x,[-1,dim,1,1]) x2=tf.reshape(x,[-1,1,dim,1]) x3=tf.reshape(x,[-1,1,1,dim]) poly=tf.reduce_sum(x1*x2*x3*coeff,axis=[1,2,3]) return tf.reshape(poly,[-1,1]) class CompleteArrows(Arrows): # Data generated from the complete causal graph X1->X2, X1->X3, X2->X3 name='complete' def build(self): with tf.variable_scope(self.name): self.X1=poly(self.e1,name='X1') #self.X2=0.5*poly(self.X1,name='X1cX2')+0.5*self.e2 #self.X3=0.5*poly(self.X1,self.X2,name='X1X2cX3')+0.5*self.e3 self.X2=poly(self.X1,self.e2,name='X1cX2') self.X3=poly(self.X1,self.X2,self.e3,name='X1X2cX3') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) self.X=self.normalize_output(self.X) #print 'completearrowX.shape:',self.X.get_shape().as_list() class CompleteGenerator(Generator): name='complete' def build(self): with tf.variable_scope(self.name): self.z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( self.z ,3,axis=1)#3=x_dim
self.X1=self.smallNN(z1,'X1') self.X2=self.smallNN([self.X1,z2],'X1cX2') self.X3=self.smallNN([self.X1,self.X2,z3],'X1X2cX3') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) #print 'completegenX.shape:',self.X.get_shape().as_list() class ColliderArrows(Arrows): name='collider' def build(self): with tf.variable_scope(self.name): self.X1=poly(self.e1,name='X1') self.X3=poly(self.e3,name='X3') #self.X2=0.5*poly(self.X1,self.X3,name='X1X3cX2')+0.5*self.e2 self.X2=poly(self.X1,self.X3,self.e2,name='X1X3cX2') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) self.X=self.normalize_output(self.X) class ColliderGenerator(Generator): name='collider' def build(self): with tf.variable_scope(self.name): self.z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( self.z ,3,axis=1)#3=x_dim self.X1=self.smallNN(z1,'X1') self.X3=self.smallNN(z3,'X3') self.X2=self.smallNN([self.X1,self.X3,z2],'X1X3cX2') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) class LinearArrows(Arrows): name='linear' def build(self): with tf.variable_scope(self.name): self.X1=poly(self.e1,name='X1') #self.X2=0.5*poly(self.X1,name='X2')+0.5*self.e2 #self.X3=0.5*poly(self.X2,name='X3')+0.5*self.e3 self.X2=poly(self.X1,self.e2,name='X2') self.X3=poly(self.X2,self.e3,name='X3') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) self.X=self.normalize_output(self.X) class LinearGenerator(Generator): name='linear' def build(self): with tf.variable_scope(self.name): self.z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( self.z ,3,axis=1)#3=x_dim self.X1=self.smallNN(z1,'X1') self.X2=self.smallNN([self.X1,z2],'X2') self.X3=self.smallNN([self.X2,z3],'X3') self.X=tf.concat([self.X1,self.X2,self.X3],axis=1) class NetworkArrows(Arrows): name='network' def build(self): with tf.variable_scope(self.name): self.hidden_size=10 h0 = tf.tanh(linear(self.e1, self.hidden_size, 'netarrow0')) h1 = tf.tanh(linear(h0, self.hidden_size, 'netarrow1')) h2 = 
tf.tanh(linear(h1, self.hidden_size, 'netarrow2')) h3 = tf.tanh(linear(h2, self.hidden_size, 'netarrow3')) h4 = tf.sigmoid(linear(h3, self.x_dim, 'netarrow4')) self.X=self.normalize_output(h4) class FC3_Generator(Generator): name='fc3' def build(self): z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( z ,3,axis=1)#3=x_dim h0 = tf.tanh(linear(z1, self.hidden_size, 'fc3gen0')) h1 = tf.tanh(linear(h0, self.hidden_size, 'fc3gen1')) h2 = tf.sigmoid(linear(h1, self.x_dim, 'fc3gen2')) self.X=h2 class FC5_Generator(Generator): name='fc5' def build(self): z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( z ,3,axis=1)#3=x_dim h0 = tf.tanh(linear(z1, self.hidden_size, 'fc5gen0')) h1 = tf.tanh(linear(h0, self.hidden_size, 'fc5gen1')) h2 = tf.tanh(linear(h1, self.hidden_size, 'fc5gen2')) h3 = tf.tanh(linear(h2, self.hidden_size, 'fc5gen3')) h4 = tf.sigmoid(linear(h3, self.x_dim, 'fc5gen4')) self.X=h4 class FC10_Generator(Generator): name='fc10' def build(self): z=tf.random_uniform((self.N,self.x_dim*self.z_dim), 0,1,name='z') z1,z2,z3=tf.split( z ,3,axis=1)#3=x_dim h0 = tf.tanh(linear(z1, self.hidden_size, 'fc10gen0')) h1 = tf.tanh(linear(h0, self.hidden_size, 'fc10gen1')) h2 = tf.tanh(linear(h1, self.hidden_size, 'fc10gen2')) h3 = tf.tanh(linear(h2, self.hidden_size, 'fc10gen3')) h4 = tf.tanh(linear(h3, self.hidden_size, 'fc10gen4')) h5 = tf.tanh(linear(h4, self.hidden_size, 'fc10gen5')) h6 = tf.tanh(linear(h5, self.hidden_size, 'fc10gen6')) h7 = tf.tanh(linear(h6, self.hidden_size, 'fc10gen7')) h8 = tf.tanh(linear(h7, self.hidden_size, 'fc10gen8')) h9 = tf.sigmoid(linear(h8, self.x_dim, 'fc10gen9')) self.X=h9 def minibatch(input_, num_kernels=5, kernel_dim=3): x = linear(input_, num_kernels * kernel_dim, scope='minibatch', stddev=0.02) activation = tf.reshape(x, (-1, num_kernels, kernel_dim)) diffs = tf.expand_dims(activation, 3) - tf.expand_dims(tf.transpose(activation, [1, 2, 0]), 0) abs_diffs = 
tf.reduce_sum(tf.abs(diffs), 2) minibatch_features = tf.reduce_sum(tf.exp(-abs_diffs), 2) return tf.concat([input_, minibatch_features],1) def Discriminator(input_, hidden_size,minibatch_layer=True,alpha=0.5,reuse=None): with tf.variable_scope('discriminator',reuse=reuse): h0_ = tf.nn.relu(linear(input_, hidden_size, 'disc0')) h0 = tf.maximum(alpha*h0_,h0_) h1_ = tf.nn.relu(linear(h0, hidden_size, 'disc1')) h1 = tf.maximum(alpha*h1_,h1_) if minibatch_layer: h2 = minibatch(h1) else: h2_ = tf.nn.relu(linear(h1, hidden_size, 'disc2')) h2 = tf.maximum(alpha*h2_,h2_) h3 = linear(h2, 1, 'disc3') return h3 GeneratorTypes={CompleteGenerator.name:CompleteGenerator, ColliderGenerator.name:ColliderGenerator, LinearGenerator.name:LinearGenerator, FC3_Generator.name:FC3_Generator, FC5_Generator.name:FC5_Generator, FC10_Generator.name:FC10_Generator} DataTypes={CompleteArrows.name:CompleteArrows, ColliderArrows.name:ColliderArrows, LinearArrows.name:LinearArrows, NetworkArrows.name:NetworkArrows} #def poly1d(cause,name='poly1d',reuse=None): # #assumes input is in [0,1]. 
Enforces output is in [0,1] # print 'Warning poly1d not ready yet' # with tf.variable_scope(name,initializer=randunif,reuse=reuse): # #C=np.random.rand(1,2,2).astype(np.float32)#unif # C=np.random.rand(1,2,2,2).astype(np.float32)#unif # # #find min and max # N=2000 # y=np.hstack([np.ones((N,1)),np.linspace(0,1.,N).reshape((N,1))]) # y1=np.reshape(y,[N,2,1,1]) # y2=np.reshape(y,[N,1,2,1]) # y3=np.reshape(y,[N,1,1,2]) # # test_poly=np.sum(y1*y2*y3*C,axis=(1,2,3)) # Cmin=np.min(test_poly) # Cmax=np.max(test_poly) # # #normalize [0,1]->[0,1] # C[0,0,0,0]-=Cmin # C/=(Cmax-Cmin) # # coeff=tf.Variable(C,name='coef',trainable=False) # x2=tf.reshape(tf.stack([tf.ones_like(cause),cause],axis=1),[-1,1,2]) # x1=tf.transpose(x2,[0,2,1]) # poly=tf.reduce_sum(x1*x2*coeff,axis=[1,2]) # out= tf.squeeze(poly) # return poly # # #coeff=tf.Variable(trainable=False,expected_shape=[1,3]) # # X=tf.stack([cause,cause*cause,cause*cause*cause],axis=1) # # return tf.reduce_sum(coeff*X,axis=1)/tf.reduce_max(coeff) # #def poly2d(cause,cause2,name='poly2d',reuse=None): # with tf.variable_scope(name,initializer=randunif,reuse=reuse): # #coeff=tf.Variable(np.random.randn(1,2,2,2).astype(np.float32),trainable=False) # #x3=tf.reshape(tf.stack([cause,cause2],axis=0),[-1,1,1,2]) # #x2=tf.transpose(x3,[0,2,3,1]) # #x1=tf.transpose(x2,[0,2,3,1]) # # C=np.random.rand(1,3,3,3).astype(np.float32) # C[:,0,0,0]=0.#constant # C[:,0,2,0]=1.#x^3,y^3 coeff # C[:,0,0,2]=1. # coeff=tf.Variable(C, trainable=False) # x3=tf.reshape(tf.stack([tf.ones_like(cause),cause,cause2],axis=1),[-1,1,1,3]) # x2=tf.transpose(x3,[0,2,3,1]) # x1=tf.transpose(x2,[0,2,3,1]) # # poly=tf.reduce_sum(x1*x2*x3*coeff,axis=[1,2,3]) # # #out = tf.squeeze(poly)/tf.reduce_max(coeff) # out= tf.squeeze(poly) # return out ================================================ FILE: synthetic/run_datasets.sh ================================================ #!/bin/bash #This script should be called with CUDA_VISIBLE_DEVICES #already set. 
This script runs 1 of each gan model for #1 of each dataset model set -e cvd=${CUDA_VISIBLE_DEVICES:?"Needs to be set"} echo "DEVICES=$cvd" #Sorry, tqdm will produce some messy interleaved output #for i in {1..5} for i in {1..30} do echo "GPU "$CUDA_VISIBLE_DEVICES" Iter $i" python main.py --data_type=linear & sleep 2s python main.py --data_type=collider & sleep 2s python main.py --data_type=complete #python main.py --data_type=linear & #sleep 2s #python main.py --data_type=linear & #sleep 2s #python main.py --data_type=linear #python main.py --data_type=network & #python main.py --data_type=network & #python main.py --data_type=network #Make sure all finished echo "Sleeping" sleep 5m done echo "finished run_datasets.sh" ================================================ FILE: synthetic/tboard.py ================================================ import os import sys from subprocess import call def file2number(fname): nums=[s for s in fname.split('_') if s.isdigit()] if len(nums)==0: nums=['0'] number=int(''.join(nums)) return number if __name__=='__main__': root='./logs' logs=os.listdir(root) logs.sort(key=lambda x:file2number(x)) logdir=os.path.join(root,logs[-1]) print 'running tensorboard on logdir:',logdir call(['tensorboard', '--logdir',logdir]) ================================================ FILE: synthetic/trainer.py ================================================ from __future__ import print_function import tensorflow as tf import logging import numpy as np import pandas as pd import shutil import json import sys import os from datetime import datetime from tqdm import trange import matplotlib.pyplot as plt from os import listdir from os.path import isfile,join from utils import calc_tvd,summary_scatterplots,Timer,summary_losses,make_summary from models import GeneratorTypes,DataTypes,Discriminator,sxe class GAN(object): def __init__(self,config,gan_type,data,parent_dir): self.config=config self.gan_type=gan_type self.data=data self.Xd=data.X self.parent_dir=parent_dir
self.prepare_model_dir() self.prepare_logger() with tf.variable_scope(gan_type): self.step=tf.Variable(0,name='step',trainable=False) self.inc_step=tf.assign(self.step,self.step+1) self.build_model() self.build_summaries()#This can be either in var_scope(name) or out def build_model(self): Gen=GeneratorTypes[self.gan_type] config=self.config self.gen=Gen(config.batch_size,config.gen_hidden_size,config.gen_z_dim) with tf.variable_scope('Disc') as scope: self.D1 = Discriminator(self.data.X, config.disc_hidden_size) scope.reuse_variables() self.D2 = Discriminator(self.gen.X, config.disc_hidden_size) d_var = tf.contrib.framework.get_variables(scope) d_loss_real=tf.reduce_mean( sxe(self.D1,1) ) d_loss_fake=tf.reduce_mean( sxe(self.D2,0) ) self.loss_d = d_loss_real + d_loss_fake self.loss_g = tf.reduce_mean( sxe(self.D2,1) ) optimizer=tf.train.AdamOptimizer g_optimizer=optimizer(self.config.lr_gen) d_optimizer=optimizer(self.config.lr_disc) self.opt_d = d_optimizer.minimize(self.loss_d,var_list= d_var) self.opt_g = g_optimizer.minimize(self.loss_g,var_list= self.gen.tr_var, global_step=self.gen.step) with tf.control_dependencies([self.inc_step]): self.train_op=tf.group(self.opt_d,self.opt_g) def build_summaries(self): d_summ=tf.summary.scalar(self.data.name+'_dloss',self.loss_d) g_summ=tf.summary.scalar(self.data.name+'_gloss',self.loss_g) self.summaries=[d_summ,g_summ] self.summary_op=tf.summary.merge(self.summaries) self.tf_scatter=tf.placeholder(tf.uint8,[3,480,640,3]) scatter_name='scatter_D'+self.data.name+'_G'+self.gen.name self.g_scatter_summary=tf.summary.image(scatter_name,self.tf_scatter,max_outputs=3) self.summary_writer=tf.summary.FileWriter(self.model_dir) def record_losses(self,sess): step, sum_loss_g, sum_loss_d = summary_losses(sess,self) self.summary_writer.add_summary(sum_loss_g,step) self.summary_writer.add_summary(sum_loss_d,step) self.summary_writer.flush() def record_tvd(self,sess): step,tvd,mvd = calc_tvd(sess,self.gen,self.data) self.log_tvd(step,tvd,mvd)
summ_tvd=make_summary(self.data.name+'_tvd',tvd) summ_mvd=make_summary(self.data.name+'_mvd',mvd) self.summary_writer.add_summary(summ_tvd,step) self.summary_writer.add_summary(summ_mvd,step) self.summary_writer.flush() def record_scatter(self,sess): Xg=sess.run(self.gen.X,{self.gen.N:5000}) X1,X2,X3=np.split(Xg,3,axis=1) x1x2,x1x3,x2x3 = summary_scatterplots(X1,X2,X3) step,Pg_summ=sess.run([self.step,self.g_scatter_summary],{self.tf_scatter:np.concatenate([x1x2,x1x3,x2x3])}) self.summary_writer.add_summary(Pg_summ,step) self.summary_writer.flush() # if self.config.save_pdfs: # self.save_np_scatter(step,X1,X3) #Maybe it's the supervisor creating the segfault?? #Try just one model at a time # #will cause segfault ;) # def save_np_scatter(self,step,x,y,save_dir=None,ext='.pdf'): # ''' # This is a convenience that just saves the image as a pdf in addition to putting it on # tensorboard. only does x1x3 because that's what I needed at the moment # # sorry I wrote this really quickly # TODO: make less bad. 
# ''' # plt.scatter(x,y) # plt.title('X1X3') # plt.xlabel('X1') # plt.ylabel('X3') # plt.xlim([0,1]) # plt.ylim([0,1]) # # scatter_dir=os.path.join(self.model_dir,'scatter') # # save_dir=save_dir or scatter_dir # if not os.path.exists(save_dir): # os.mkdir(save_dir) # # save_name=os.path.join(save_dir,'{}_scatter_x1x3_{}_{}'+ext) # save_path=save_name.format(step,self.config.data_type,self.gan_type) # # plt.savefig(save_path) def prepare_model_dir(self): self.model_dir=os.path.join(self.parent_dir,self.gan_type) if not os.path.exists(self.model_dir): os.mkdir(self.model_dir) print('GAN Model directory is ',self.model_dir) def prepare_logger(self): self.logger=logging.getLogger(self.gan_type) pth=os.path.join(self.model_dir,'tvd.csv') file_handler=logging.FileHandler(pth) self.logger.addHandler(file_handler) self.logger.setLevel(logging.INFO) self.logger.info('iter tvd mvd') def log_tvd(self,step,tvd,mvd): log_str=' '.join([str(step),str(tvd),str(mvd)]) self.logger.info(log_str) class Trainer(object): def __init__(self,config,data_type): self.config=config self.data_type=data_type self.prepare_model_dir() #with tf.variable_scope('trainer'):#commented to get summaries on same plot self.step=tf.Variable(0,name='step',trainable=False) self.inc_step=tf.assign(self.step,self.step+1) self.build_model() self.summary_writer=tf.summary.FileWriter(self.model_dir) self.saver=tf.train.Saver() #sv = tf.train.Supervisor( # logdir=self.save_model_dir, # is_chief=True, # saver=self.saver, # summary_op=None, # summary_writer=self.summary_writer, # save_model_secs=300, # global_step=self.step, # ready_for_local_init_op=None # ) gpu_options = tf.GPUOptions(allow_growth=True, per_process_gpu_memory_fraction=0.333) sess_config = tf.ConfigProto(allow_soft_placement=True, gpu_options=gpu_options) #self.sess = sv.prepare_or_wait_for_session(config=sess_config) self.sess = tf.Session(config=sess_config) init=tf.global_variables_initializer() self.sess.run(init) #if load_path, replace initialized values if
self.config.load_path: print(" [*] Attempting to restore {}".format(self.config.load_path)) self.saver.restore(self.sess,self.config.load_path) #print(" [*] Attempting to restore {}".format(ckpt)) #self.saver.restore(self.sess,ckpt) #print(" [*] Success to read {}".format(ckpt)) if not self.config.load_path: #once data scatterplot (doesn't change during training) self.data_scatterplot() def data_scatterplot(self): Xd=self.sess.run(self.data.X,{self.data.N:5000}) X1,X2,X3=np.split(Xd,3,axis=1) x1x2,x1x3,x2x3 = summary_scatterplots(X1,X2,X3) step,Pg_summ=self.sess.run([self.step,self.d_scatter_summary],{self.tf_scatter:np.concatenate([x1x2,x1x3,x2x3])}) self.summary_writer.add_summary(Pg_summ,step) self.summary_writer.flush() def build_model(self): self.data=DataTypes[self.data_type](self.config.batch_size) self.gans=[GAN(self.config,n,self.data,self.model_dir) for n in GeneratorTypes.keys()] with tf.control_dependencies([self.inc_step]): self.train_op=tf.group(*[gan.train_op for gan in self.gans]) #self.train_op=tf.group(gan.train_op for gan in self.gans.values()) #Used for generating image summaries of scatterplots self.tf_scatter=tf.placeholder(tf.uint8,[3,480,640,3]) self.d_scatter_summary=tf.summary.image('scatter_Data_'+self.data_type,self.tf_scatter,max_outputs=3) def train(self): self.train_timer =Timer() self.losses_timer =Timer() self.tvd_timer =Timer() self.scatter_timer =Timer() self.log_step=50 self.max_step=50001 #self.max_step=501 for step in trange(self.max_step): if step % self.log_step == 0: for gan in self.gans: self.losses_timer.on() gan.record_losses(self.sess) self.losses_timer.off() self.tvd_timer.on() gan.record_tvd(self.sess) self.tvd_timer.off() if step % (10*self.log_step) == 0: for gan in self.gans: self.scatter_timer.on() gan.record_scatter(self.sess) #DEBUG: reassure me nothing changes during optimization #self.data_scatterplot() self.scatter_timer.off() if step % (5000) == 0: self.saver.save(self.sess,self.save_model_name,step) 
self.train_timer.on() self.sess.run(self.train_op) self.train_timer.off() print("Timers:") print(self.train_timer) print(self.losses_timer) print(self.tvd_timer) print(self.scatter_timer) def prepare_model_dir(self): if self.config.load_path: self.model_dir=self.config.load_path else: pth=datetime.now().strftime("%m%d_%H%M%S")+'_'+self.data_type self.model_dir=os.path.join(self.config.model_dir,pth) if not os.path.exists(self.model_dir): os.mkdir(self.model_dir) print('Model directory is ',self.model_dir) self.save_model_dir=os.path.join(self.model_dir,'checkpoints') if not os.path.exists(self.save_model_dir): os.mkdir(self.save_model_dir) self.save_model_name=os.path.join(self.save_model_dir,'Model') param_path = os.path.join(self.model_dir, "params.json") print("[*] MODEL dir: %s" % self.model_dir) print("[*] PARAM path: %s" % param_path) with open(param_path, 'w') as fp: json.dump(self.config.__dict__, fp, indent=4, sort_keys=True) config=self.config if config.is_train and not config.load_path: config.log_code_dir=os.path.join(self.model_dir,'code') for path in [self.model_dir, config.log_code_dir]: if not os.path.exists(path): os.makedirs(path) #Copy python code in directory into model_dir/code for future reference: code_dir=os.path.dirname(os.path.realpath(sys.argv[0])) model_files = [f for f in listdir(code_dir) if isfile(join(code_dir, f))] for f in model_files: if f.endswith('.py'): shutil.copy2(f,config.log_code_dir) ================================================ FILE: synthetic/utils.py ================================================ from __future__ import print_function import tensorflow as tf import os from os import listdir from os.path import isfile, join from skimage import io import shutil import sys import math import time import json import logging import numpy as np from PIL import Image from datetime import datetime from tensorflow.core.framework import summary_pb2 import matplotlib.pyplot as plt def make_summary(name, val): return 
summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) def summary_losses(sess,model,N=1000): step,loss_g,loss_d=sess.run([model.step,model.loss_g,model.loss_d],{model.data.N:N,model.gen.N:N}) lgsum=make_summary(model.data.name+'_gloss',loss_g) ldsum=make_summary(model.data.name+'_dloss',loss_d) return step,lgsum, ldsum def calc_tvd(sess,Generator,Data,N=50000,nbins=10): Xd=sess.run(Data.X,{Data.N:N}) step,Xg=sess.run([Generator.step,Generator.X],{Generator.N:N}) p_gen,_ = np.histogramdd(Xg,bins=nbins,range=[[0,1],[0,1],[0,1]],normed=True) p_dat,_ = np.histogramdd(Xd,bins=nbins,range=[[0,1],[0,1],[0,1]],normed=True) p_gen/=nbins**3 p_dat/=nbins**3 tvd=0.5*np.sum(np.abs( p_gen-p_dat )) mvd=np.max(np.abs( p_gen-p_dat )) return step,tvd, mvd s_tvd=make_summary(Data.name+'_tvd',tvd) s_mvd=make_summary(Data.name+'_mvd',mvd) return step,s_tvd,s_mvd #return make_summary('tvd/'+Generator.name,tvd) def summary_stats(name,tensor,hist=False): ave=tf.reduce_mean(tensor) std=tf.sqrt(tf.reduce_mean(tf.square(ave-tensor))) tf.summary.scalar(name+'_ave',ave) tf.summary.scalar(name+'_std',std) if hist: tf.summary.histogram(name+'_hist',tensor) def summary_scatterplots(X1,X2,X3): with tf.name_scope('scatter'): img1=summary_scatter2d(X1,X2,'X1X2',xlabel='X1',ylabel='X2') img2=summary_scatter2d(X1,X3,'X1X3',xlabel='X1',ylabel='X3') img3=summary_scatter2d(X2,X3,'X2X3',xlabel='X2',ylabel='X3') plt.close() return img1,img2,img3 def summary_scatter2d(x,y,title='2dscatterplot',xlabel=None,ylabel=None): fig=scatter2d(x,y,title,xlabel=xlabel,ylabel=ylabel) fig.canvas.draw() rgb=fig.canvas.tostring_rgb() buf=np.fromstring(rgb,dtype=np.uint8) w,h = fig.canvas.get_width_height() img=buf.reshape(1,h,w,3) #summary=tf.summary.image(title,img) plt.close(fig) #fig.clf() return img def scatter2d(x,y,title='2dscatterplot',xlabel=None,ylabel=None): fig=plt.figure() plt.scatter(x,y) plt.title(title) if xlabel: plt.xlabel(xlabel) if ylabel: plt.ylabel(ylabel) if not 
0<=np.min(x)<=np.max(x)<=1: raise ValueError('summary_scatter2d title:',title,' input x exceeded [0,1] range.\ min:',np.min(x),' max:',np.max(x)) if not 0<=np.min(y)<=np.max(y)<=1: raise ValueError('summary_scatter2d title:',title,' input y exceeded [0,1] range.\ min:',np.min(y),' max:',np.max(y)) plt.xlim([0,1]) plt.ylim([0,1]) return fig def prepare_dirs_and_logger(config): formatter = logging.Formatter("%(asctime)s:%(levelname)s::%(message)s") logger = logging.getLogger() for hdlr in logger.handlers: logger.removeHandler(hdlr) handler = logging.StreamHandler() handler.setFormatter(formatter) logger.addHandler(handler) if config.load_path: if config.load_path.startswith(config.log_dir): config.model_dir = config.load_path else: if config.load_path.startswith(config.dataset): config.model_name = config.load_path else: config.model_name = "{}_{}".format(config.dataset, config.load_path) else: config.model_name = "{}_{}".format(config.dataset, get_time()) if not hasattr(config, 'model_dir'): config.model_dir = os.path.join(config.log_dir, config.model_name) config.data_path = os.path.join(config.data_dir, config.dataset) if config.is_train: config.log_code_dir=os.path.join(config.model_dir,'code') for path in [config.log_dir, config.data_dir, config.model_dir, config.log_code_dir]: if not os.path.exists(path): os.makedirs(path) #Copy python code in directory into model_dir/code for future reference: code_dir=os.path.dirname(os.path.realpath(sys.argv[0])) model_files = [f for f in listdir(code_dir) if isfile(join(code_dir, f))] for f in model_files: if f.endswith('.py'): shutil.copy2(f,config.log_code_dir) def get_time(): return datetime.now().strftime("%m%d_%H%M%S") def save_config(config): param_path = os.path.join(config.model_dir, "params.json") print("[*] MODEL dir: %s" % config.model_dir) print("[*] PARAM path: %s" % param_path) with open(param_path, 'w') as fp: json.dump(config.__dict__, fp, indent=4, sort_keys=True) class Timer(object): def __init__(self): 
self.total_section_time=0. self.iter=0 def on(self): self.t0=time.time() def off(self): self.total_section_time+=time.time()-self.t0 self.iter+=1 def __str__(self): n_min=self.total_section_time/60. return '%.2fmin'%n_min ================================================ FILE: tboard.py ================================================ import os import sys from subprocess import call def file2number(fname): nums=[s for s in fname.split('_') if s.isdigit()] if len(nums)==0: nums=['0'] number=int(''.join(nums)) return number if __name__=='__main__': root='./logs' logs=os.listdir(root) logs.sort(key=lambda x:file2number(x)) logdir=os.path.join(root,logs[-1]) print 'running tensorboard on logdir:',logdir call(['tensorboard', '--logdir',logdir]) ================================================ FILE: trainer.py ================================================ from __future__ import print_function import numpy as np import tensorflow as tf from causal_controller.CausalController import CausalController from tqdm import trange import os import pandas as pd from utils import make_summary,distribute_input_data,get_available_gpus from utils import save_image from data_loader import DataLoader from figure_scripts.pairwise import crosstab class Trainer(object): def __init__(self, config, cc_config, model_config=None): self.config=config self.cc_config=cc_config self.model_dir = config.model_dir self.cc_config.model_dir=config.model_dir self.model_config=model_config if self.model_config: self.model_config.model_dir=config.model_dir self.save_model_dir=os.path.join(self.model_dir,'checkpoints') if not os.path.exists(self.save_model_dir): os.mkdir(self.save_model_dir) self.summary_dir=os.path.join(self.model_dir,'summaries') if not os.path.exists(self.summary_dir): os.mkdir(self.summary_dir) self.load_path = config.load_path self.use_gpu = config.use_gpu #This tensor controls batch_size for all models #Not expected to change during training, but during testing it can be #helpful to 
change it self.batch_size=tf.placeholder_with_default(self.config.batch_size,[],name='batch_size') loader_batch_size=config.num_devices*config.batch_size #Always need to build CC print('setting up CausalController') cc_batch_size=config.num_devices*self.batch_size#Tensor/placeholder self.cc=CausalController(cc_batch_size,cc_config) self.step=self.cc.step #Data print('setting up data') self.data=DataLoader(self.cc.label_names,config) if self.cc_config.is_pretrain or self.config.build_pretrain: print('setup pretrain') #queue system to feed labels quickly. This does not queue images label_queue= self.data.get_label_queue(loader_batch_size) self.cc.build_pretrain(label_queue) #Build Model if self.model_config: #Will build both gen and discrim self.model=self.config.Model(self.batch_size,self.model_config) #Trainer step is defined as cc.step+model.step #e.g. 10k iter pretrain and 100k iter image model #will have image summaries at 100k but trainer model saved at Model-110k self.step+=self.model.step # This queue holds (image,label) pairs, and is used for training conditional GANs data_queue=self.data.get_data_queue(loader_batch_size) self.real_data_by_gpu = distribute_input_data(data_queue,config.num_gpu) self.fake_data_by_gpu = distribute_input_data(self.cc.label_dict,config.num_gpu) with tf.variable_scope('tower'): for gpu in get_available_gpus(): print('using device:',gpu) real_data=self.real_data_by_gpu[gpu] fake_data=self.fake_data_by_gpu[gpu] tower=gpu.replace('/','').replace(':','_') with tf.device(gpu),tf.name_scope(tower): #Build num_gpu copies of graph: inputs->gradient #Updates self.tower_dict self.model(real_data,fake_data) #allow future gpu to use same variables tf.get_variable_scope().reuse_variables() if self.model_config.is_train or self.config.build_train: self.model.build_train_op() self.model.build_summary_op() else: print('Image model not built') self.saver = tf.train.Saver(keep_checkpoint_every_n_hours=2) self.summary_writer = 
tf.summary.FileWriter(self.summary_dir) print('trainer.model_dir:',self.model_dir) gpu_options = tf.GPUOptions(allow_growth=True, per_process_gpu_memory_fraction=0.333) sess_config = tf.ConfigProto(allow_soft_placement=True, gpu_options=gpu_options) sv = tf.train.Supervisor( logdir=self.save_model_dir, is_chief=True, saver=self.saver, summary_op=None, summary_writer=self.summary_writer, save_model_secs=300, global_step=self.step, ready_for_local_init_op=None ) self.sess = sv.prepare_or_wait_for_session(config=sess_config) if cc_config.pt_load_path: print('Attempting to load pretrain model:',cc_config.pt_load_path) self.cc.load(self.sess,cc_config.pt_load_path) print('Check tvd after restore') info=crosstab(self,report_tvd=True) print('tvd after load:',info['tvd']) #save copy of cc model in new dir cc_step=self.sess.run(self.cc.step) self.cc.saver.save(self.sess,self.cc.save_model_name,cc_step) if config.load_path:#Declare loading point pnt_str='Loaded variables at ccStep:{}' cc_step=self.sess.run(self.cc.step) pnt_str=pnt_str.format(cc_step) print('pntstr',pnt_str) if self.model_config: pnt_str+=' imagemodelStep:{}' model_step=self.sess.run(self.model.step) pnt_str=pnt_str.format(model_step) print(pnt_str) #PREPARE training: #TODO save as Variables so they are restored to same values when load model fixed_batch_size=256 #get this many fixed z values self.fetch_fixed_z={n.z:n.z for n in self.cc.nodes} if model_config: self.fetch_fixed_z[self.model.z_gen]=self.model.z_gen #feed_dict that ensures constant inputs #add feed_fixed_z[self.cc.Male.label]=1*ones() to intervene self.feed_fixed_z=self.sess.run(self.fetch_fixed_z,{self.batch_size:fixed_batch_size}) def pretrain_loop(self,num_iter=None): ''' num_iter : is the number of *additional* iterations to do barring one of the quit conditions (the model may already be trained for some number of iterations). Defaults to cc_config.pretrain_iter.
''' #TODO: potentially should be moved into CausalController for consistency num_iter = num_iter or self.cc.config.pretrain_iter if hasattr(self,'model'): model_step=self.sess.run(self.model.step) assert model_step==0,'if pretraining, model should not be trained already' cc_step=self.sess.run(self.cc.step) if cc_step>0: print('Resuming training of already optimized CC model at\ step:',cc_step) label_stats=crosstab(self,report_tvd=True) def break_pretrain(label_stats,counter): c1=counter>=self.cc.config.min_pretrain_iter c2= (label_stats['tvd'] len(gpus): raise ValueError('number of gpus specified={}, more than gpus available={}'.format(num_gpu,len(gpus))) gpus=gpus[:num_gpu] data_by_gpu={g:{} for g in gpus} for key,value in data_loader.items(): spl_vals=tf.split(value,num_gpu) for gpu,val in zip(gpus,spl_vals): data_by_gpu[gpu][key]=val return data_by_gpu def rank(array): return len(array.shape) def make_grid(tensor, nrow=8, padding=2, normalize=False, scale_each=False): """Code based on https://github.com/pytorch/vision/blob/master/torchvision/utils.py minor improvement, row/col was reversed""" nmaps = tensor.shape[0] ymaps = min(nrow, nmaps) xmaps = int(math.ceil(float(nmaps) / ymaps)) height, width = int(tensor.shape[1] + padding), int(tensor.shape[2] + padding) grid = np.zeros([height * ymaps + 1 + padding // 2, width * xmaps + 1 + padding // 2, 3], dtype=np.uint8) k = 0 for y in range(ymaps): for x in range(xmaps): if k >= nmaps: break h, h_width = y * height + 1 + padding // 2, height - padding w, w_width = x * width + 1 + padding // 2, width - padding grid[h:h+h_width, w:w+w_width] = tensor[k] k = k + 1 return grid def save_image(tensor, filename, nrow=8, padding=2, normalize=False, scale_each=False): ndarr = make_grid(tensor, nrow=nrow, padding=padding, normalize=normalize, scale_each=scale_each) im = Image.fromarray(ndarr) im.save(filename)
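As a rough illustration of how `make_grid` above lays out a batch, the sketch below repeats the function so that it runs on its own, then tiles ten small constant-valued images. The batch contents and sizes are made up for the example, and the unused `normalize` and `scale_each` arguments are dropped. In this version `nrow` caps the number of grid rows, and images are filled left to right within each row.

```python
import math
import numpy as np

def make_grid(tensor, nrow=8, padding=2):
    """Tile a [N,H,W,3] uint8 batch into one image.
    Same layout logic as make_grid above; repeated here only so the sketch is standalone."""
    nmaps = tensor.shape[0]
    ymaps = min(nrow, nmaps)                      # number of grid rows
    xmaps = int(math.ceil(float(nmaps) / ymaps))  # number of grid columns
    height, width = int(tensor.shape[1] + padding), int(tensor.shape[2] + padding)
    grid = np.zeros([height * ymaps + 1 + padding // 2,
                     width * xmaps + 1 + padding // 2, 3], dtype=np.uint8)
    k = 0
    for y in range(ymaps):
        for x in range(xmaps):
            if k >= nmaps:
                break
            # top-left corner and extent of cell (y, x) inside the padded grid
            h, h_width = y * height + 1 + padding // 2, height - padding
            w, w_width = x * width + 1 + padding // 2, width - padding
            grid[h:h + h_width, w:w + w_width] = tensor[k]
            k = k + 1
    return grid

# Hypothetical input: 10 images of size 4x4, image k filled with the value k+1.
batch = np.stack([np.full((4, 4, 3), k + 1, dtype=np.uint8) for k in range(10)])
grid = make_grid(batch, nrow=8, padding=2)

print(grid.shape)                          # 8 rows x 2 columns of padded cells
print(grid[2, 2], grid[2, 8], grid[8, 2])  # top-left pixels of images 0, 1 and 2
```

With ten images and `nrow=8`, the grid has eight rows and two columns, so only the first five rows contain images and the rest stay black. The result can be passed to `Image.fromarray`, as `save_image` does.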