Full Code of mkocaoglu/CausalGAN for AI

Repository: mkocaoglu/CausalGAN
Branch: master
Commit: 9d52b520b5ef
Files: 51
Total size: 288.4 KB

Directory structure:
gitextract_fvn7v_0h/

├── .gitignore
├── LICENSE
├── README.md
├── assets/
│   ├── 0808_112404_cbcg.csv
│   ├── 0810_191625_bcg.csv
│   ├── 0821_213901_rcbcg.csv
│   ├── guide_to_gifs.txt
│   └── tvdplot.ipynb
├── causal_began/
│   ├── CausalBEGAN.py
│   ├── __init__.py
│   ├── config.py
│   ├── models.py
│   └── utils.py
├── causal_controller/
│   ├── ArrayDict.py
│   ├── CausalController.py
│   ├── __init__.py
│   ├── config.py
│   ├── models.py
│   └── utils.py
├── causal_dcgan/
│   ├── CausalGAN.py
│   ├── __init__.py
│   ├── config.py
│   ├── models.py
│   ├── ops.py
│   └── utils.py
├── causal_graph.py
├── config.py
├── data_loader.py
├── download.py
├── figure_scripts/
│   ├── __init__.py
│   ├── distributions.py
│   ├── encode.py
│   ├── high_level.py
│   ├── pairwise.py
│   ├── probability_table.txt
│   ├── sample.py
│   └── utils.py
├── main.py
├── synthetic/
│   ├── README.md
│   ├── collect_stats.py
│   ├── config.py
│   ├── figure_generation.ipynb
│   ├── main.py
│   ├── models.py
│   ├── run_datasets.sh
│   ├── tboard.py
│   ├── trainer.py
│   └── utils.py
├── tboard.py
├── trainer.py
└── utils.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
data/
data
.*.swp

logs
old

final_checkpoints
checkpoint/
figures/
*.pyc
.DS_Store
.ipynb_checkpoints
[._]*.s[a-v][a-z]
[._]*.sw[a-p]
[._]s[a-v][a-z]
[._]sw[a-p]

samples
outputs



================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2017 Murat Kocaoglu, Christopher Snyder

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# CausalGAN/CausalBEGAN in Tensorflow

Tensorflow implementation of [CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training](https://arxiv.org/abs/1709.02023)

### Top: Random samples from do(Bald=1); Bottom: Random samples from cond(Bald=1)
![alt text](./assets/314393_began_Bald_topdo1_botcond1.png)
### Top: Random samples from do(Mustache=1); Bottom: Random samples from cond(Mustache=1)
![alt text](./assets/314393_began_Mustache_topdo1_botcond1.png)


## Requirements
- Python 2.7
- [Pillow](https://pillow.readthedocs.io/en/4.0.x/)
- [tqdm](https://github.com/tqdm/tqdm)
- [requests](https://github.com/kennethreitz/requests) (Only used for downloading CelebA dataset)
- [TensorFlow 1.1.0](https://github.com/tensorflow/tensorflow)

## Getting Started

First download [CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) datasets with:

    $ apt-get install p7zip-full # ubuntu
    $ brew install p7zip # Mac
    $ pip install tqdm
    $ python download.py

## Usage

The CausalGAN/CausalBEGAN code factorizes into two components, which can be trained or loaded independently. The causal_controller module specifies the model that learns a causal generative model over labels, and the causal_dcgan and causal_began modules each learn a GAN over images given those labels. We call training the causal controller over labels "pretraining" (--is_pretrain=True), and training a GAN over images given labels "training" (--is_train=True).

To train a causal implicit model over labels and then over the images given the labels, use:

    $ python main.py --causal_model big_causal_graph --is_pretrain True --model_type began --is_train True

where "big_causal_graph" is one of the causal graphs specified by the keys in the causal_graphs dictionary in causal_graph.py. 
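
As a rough illustration of what selecting a graph by key could look like, here is a minimal sketch. The dictionary layout (a list of `(node, parents)` pairs), the names `toy_causal_graphs`, `lookup_graph` and `is_ancestral_order`, and the toy graph itself are all assumptions made for this example; the actual contents of causal_graph.py may be organized differently.

```python
# Hypothetical layout: each graph is a list of (node, parents) pairs.
# The real causal_graphs dictionary in causal_graph.py may differ.
toy_causal_graphs = {
    "toy_graph": [
        ("Male", []),
        ("Mustache", ["Male"]),
        ("Smiling", ["Male"]),
    ],
}


def is_ancestral_order(graph):
    """Return True if every parent is listed before its child."""
    seen = set()
    for node, parents in graph:
        if any(parent not in seen for parent in parents):
            return False
        seen.add(node)
    return True


def lookup_graph(name, graphs=toy_causal_graphs):
    """Fetch a graph by key, mirroring how --causal_model names a graph."""
    if name not in graphs:
        raise KeyError("unknown causal model: %s" % name)
    return graphs[name]


graph = lookup_graph("toy_graph")
```

A graph whose nodes can be listed in ancestral order is necessarily acyclic, so a check like `is_ancestral_order` is one simple way to validate a hand-written graph before training.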

Alternatively, one can first train a causal implicit model over labels only with the following command:

    $ python main.py --causal_model big_causal_graph --is_pretrain True

One can then train a conditional generative model for the images given the trained causal generative model for the labels (causal controller), which yields a causal implicit generative model for the image and the labels, as suggested in [the paper](https://arxiv.org/abs/1709.02023):

    $ CC_MODEL_PATH='./logs/celebA_0810_191625_0.145tvd_bcg/controller/checkpoints/CC-Model-20000'
    $ python main.py --causal_model big_causal_graph --pt_load_path $CC_MODEL_PATH --model_type began --is_train True

Once image training has been run at least once, the entire joint model can be loaded more simply by specifying the model directory, instead of loading the model piecewise:

    $ python main.py --causal_model big_causal_graph --load_path ./logs/celebA_0815_170635 --model_type began --is_train True 

To visualize the most recently created model in Tensorboard (as long as port 6006 is free), run:

    $ python tboard.py


To interact with an already trained model, I recommend the following procedure:

    ipython
    In [1]: %run main --causal_model big_causal_graph --load_path './logs/celebA_0815_170635' --model_type 'began'

For example, to sample N=22 interventional images from do(Smiling=1) (as long as your causal graph includes a "Smiling" node):

    In [2]: sess.run(model.G,{cc.Smiling.label:np.ones((22,1)), trainer.batch_size:22})

Conditional sampling is most efficiently done through 2 session calls: the first calls cc.sample_label to get a label sample, and the second feeds that sampled label to the generator to get an image. See trainer.causal_sampling for a more extensive example. Note that it is also possible to combine conditioning and intervention during sampling.

    In [3]: lab_samples=cc.sample_label(sess,do_dict={'Bald':1}, cond_dict={'Mustache':1},N=22)

will sample all labels from the joint distribution conditioned on Mustache=1 and do(Bald=1). These label samples can be turned into image samples as follows:

    In [4]: feed_dict={cc.label_dict[k]:v for k,v in lab_samples.iteritems()}
    In [5]: feed_dict[trainer.batch_size]=22
    In [6]: images=sess.run(trainer.G,feed_dict)
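
To make the difference between `do_dict` and `cond_dict` concrete, here is a minimal, self-contained sketch of label sampling on a toy graph. It is not the repository's `cc.sample_label`: the graph, the probabilities, and the function `sample_labels_toy` are hypothetical, and conditioning is implemented by plain rejection sampling, which may differ from what the causal controller does internally. The return type also differs: this sketch returns a list of per-sample dicts rather than a dict of arrays keyed by label name.

```python
import random

# Hypothetical toy graph in ancestral order:
# (node, parents, function giving P(node=1 | parent values)).
# The probabilities are made up for illustration.
TOY_GRAPH = [
    ("Male", [], lambda pa: 0.5),
    ("Bald", ["Male"], lambda pa: 0.2 if pa["Male"] else 0.01),
    ("Mustache", ["Male"], lambda pa: 0.3 if pa["Male"] else 0.01),
]


def sample_labels_toy(n, do_dict=None, cond_dict=None, seed=0):
    """Draw n label samples under interventions (do_dict) and evidence (cond_dict)."""
    do_dict = do_dict or {}
    cond_dict = cond_dict or {}
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n:
        sample = {}
        for name, parents, prob_one in TOY_GRAPH:
            if name in do_dict:
                # Intervention: overwrite the node and ignore its parents.
                sample[name] = do_dict[name]
            else:
                pa = {p: sample[p] for p in parents}
                sample[name] = int(rng.random() < prob_one(pa))
        # Conditioning: keep only samples that agree with the evidence.
        if all(sample[k] == v for k, v in cond_dict.items()):
            accepted.append(sample)
    return accepted


toy_samples = sample_labels_toy(22, do_dict={"Bald": 1}, cond_dict={"Mustache": 1})
```

In this sketch the intervention on Bald never rejects a sample, while the condition on Mustache discards every draw that disagrees with the evidence. Rejection therefore becomes slow when the conditioning event is rare.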


### Configuration
Since this code controls training of 3 different models (CausalController, CausalGAN, and CausalBEGAN), many configuration options are available. To keep things manageable, there are 4 files, each holding the configuration specific to one part of the model. Not all configuration combinations are tested. Default parameters are guaranteed to work.

Configurations:
- ./config.py  :  generic data and scheduling
- ./causal_controller/config.py  :  specific to CausalController
- ./causal_dcgan/config.py  :  specific to CausalGAN
- ./causal_began/config.py  :  specific to CausalBEGAN

For convenience, the configurations used are saved in 4 .json files in the model directory for future reference.


## Results

### Causal Controller convergence
We show convergence in total variation distance (TVD) for Causal Graph 1 (big_causal_graph in causal_graph.py), a completed version of Causal Graph 1 (complete_big_causal_graph in causal_graph.py), and an edge-reversed version of the complete Causal Graph 1 (reverse_big_causal_graph in causal_graph.py). We could get reasonable marginals with a complete DAG containing all 40 nodes, but TVD becomes very difficult to measure. We show TVD convergence for 9 nodes for two complete graphs. When the graph is incomplete, there is a "TVD gap" but reasonable convergence.

![alt text](./assets/tvd_vs_step.png)
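
For readers unfamiliar with the metric, the following sketch shows how the total variation distance between two discrete label distributions can be computed. The function name and the toy distributions are assumptions for illustration and are not taken from the repository's own TVD code.

```python
def total_variation_distance(p, q):
    """TVD between two discrete distributions given as {outcome: probability} dicts."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


# Toy joint distributions over two binary labels (made-up numbers).
data_dist = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
model_dist = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

tvd = total_variation_distance(data_dist, model_dist)  # 0.25 for these numbers
```

The sum runs over every joint outcome. With k binary labels there are 2^k outcomes, which is presumably why the estimate is practical for 9 nodes (512 outcomes) but not for all 40.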

### Conditional vs Interventional Sampling:
We trained a causal implicit generative model assuming we are given the following causal graph over labels:
For the following images, the effect of conditioning or intervening can be reasoned about from the graph structure. For example, conditioning on Mustache=1 should yield more male faces, whereas intervening on Mustache=1 should not (since an intervention disconnects the edges coming from the node's parents).
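
The mustache example can be worked through numerically. The sketch below assumes a two-node graph Male -> Mustache with made-up probabilities (these are not CelebA statistics), and the function names are hypothetical.

```python
# Toy graph Male -> Mustache with made-up probabilities (not CelebA statistics).
p_male = 0.5
p_mustache_given_male = {1: 0.25, 0: 0.0625}


def p_male_given_mustache():
    """Conditioning: Bayes' rule lets the evidence flow back to the parent."""
    joint_male = p_male * p_mustache_given_male[1]
    joint_not_male = (1.0 - p_male) * p_mustache_given_male[0]
    return joint_male / (joint_male + joint_not_male)


def p_male_do_mustache():
    """Intervening: the edge Male -> Mustache is cut, so Male keeps its marginal."""
    return p_male


cond = p_male_given_mustache()  # P(Male=1 | Mustache=1)
intv = p_male_do_mustache()     # P(Male=1 | do(Mustache=1))
```

With these numbers, conditioning raises the probability of Male=1 from 0.5 to 0.8, while intervening leaves it at 0.5. This matches the qualitative behavior described above.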

### CausalGAN Conditioning vs Intervening
For each label, images were randomly sampled by either _intervening_ (top row) or _conditioning_ (bottom row) on label=1.

![alt text](./assets/causalgan_pictures/45507_intvcond_Bald=1_2x10.png) Bald

![alt text](./assets/causalgan_pictures/45507_intvcond_Mouth_Slightly_Open=1_2x10.png) Mouth Slightly Open

![alt text](./assets/causalgan_pictures/45507_intvcond_Mustache=1_2x10.png) Mustache

![alt text](./assets/causalgan_pictures/45507_intvcond_Narrow_Eyes=1_2x10.png) Narrow Eyes

![alt text](./assets/causalgan_pictures/45507_intvcond_Smiling=1_2x10.png) Smiling

![alt text](./assets/causalgan_pictures/45507_intvcond_Eyeglasses=1_2x10.png) Eyeglasses

![alt text](./assets/causalgan_pictures/45507_intvcond_Wearing_Lipstick=1_2x10.png) Wearing Lipstick

### CausalBEGAN Conditioning vs Intervening
For each label, images were randomly sampled by either _intervening_ (top row) or _conditioning_ (bottom row) on label=1.

![alt text](./assets/causalbegan_pictures/190001_intvcond_Bald=1_2x10.png) Bald

![alt text](./assets/causalbegan_pictures/190001_intvcond_Mouth_Slightly_Open=1_2x10.png) Mouth Slightly Open

![alt text](./assets/causalbegan_pictures/190001_intvcond_Mustache=1_2x10.png) Mustache

![alt text](./assets/causalbegan_pictures/190001_intvcond_Narrow_Eyes=1_2x10.png) Narrow Eyes

![alt text](./assets/causalbegan_pictures/190001_intvcond_Smiling=1_2x10.png) Smiling

![alt text](./assets/causalbegan_pictures/190001_intvcond_Eyeglasses=1_2x10.png)  Eyeglasses

![alt text](./assets/causalbegan_pictures/190001_intvcond_Wearing_Lipstick=1_2x10.png) Wearing Lipstick

### CausalGAN Generator output (10x10) (randomly sampled label)
![alt text](https://user-images.githubusercontent.com/10726729/30076306-09743002-923e-11e7-8011-8523cd914f25.gif)

### CausalBEGAN Generator output (10x10) (randomly sampled label)
![alt text](https://user-images.githubusercontent.com/10726729/30076379-38b407fc-923e-11e7-81aa-4310c76a2e39.gif)

<!--
  Repo originally forked from these two
- [BEGAN-tensorflow](https://github.com/carpedm20/BEGAN-tensorflow)
- [DCGAN-tensorflow](https://github.com/carpedm20/DCGAN-tensorflow)
-->

## Related works
- [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661)
- [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434)
- [Wasserstein GAN](https://arxiv.org/abs/1701.07875)
- [BEGAN: Boundary Equilibrium Generative Adversarial Networks](https://arxiv.org/abs/1703.10717)

## Authors

Christopher Snyder / [@22csnyder](http://22csnyder.github.io)
Murat Kocaoglu / [@mkocaoglu](http://mkocaoglu.github.io)


================================================
FILE: assets/0808_112404_cbcg.csv
================================================
Wall time,Step,Value
1502209477.065396,1,0.9871935844421387
1502210175.629644,1001,0.5611526370048523
1502210858.027971,2001,0.48091334104537964
1502211539.450148,3001,0.3693326711654663
1502212228.305266,4001,0.2690610885620117
1502212916.163691,5001,0.1852252036333084
1502213605.455342,6001,0.11786147207021713
1502214290.655429,7001,0.10585799068212509
1502214974.834744,8001,0.11575613915920258
1502215664.377923,9001,0.09277261048555374
1502216342.813149,10001,0.08084549009799957
1502217004.542623,11001,0.07447165995836258
1502217677.840079,12001,0.07388914376497269
1502218338.794636,13001,0.06354445964097977
1502219000.20777,14001,0.058855485171079636
1502219659.079145,15001,0.06558254361152649
1502220348.8056,16001,0.051907140761613846
1502221033.399544,17001,0.04890892282128334
1502221718.709654,18001,0.04604059085249901
1502222403.268966,19001,0.04389917105436325
1502223087.183902,20001,0.04280887916684151
1502223772.410776,21001,0.04196497052907944
1502224457.815937,22001,0.038901761174201965
1502225141.198389,23001,0.04273799806833267
1502225826.618027,24001,0.041886329650878906
1502226518.698883,25001,0.04319506511092186
1502227208.700241,26001,0.042861778289079666
1502227899.513253,27001,0.04321207478642464
1502228588.126751,28001,0.035417430102825165
1502229277.24218,29001,0.03713845834136009
1502229964.6007,30001,0.03938867151737213


================================================
FILE: assets/0810_191625_bcg.csv
================================================
Wall time,Step,Value
1502410626.387592,1,0.9544087648391724
1502411081.292726,1001,0.5290326476097107
1502411533.622933,2001,0.44044023752212524
1502411981.535893,3001,0.35751280188560486
1502412434.074014,4001,0.2676760256290436
1502412884.345166,5001,0.20682139694690704
1502413336.727762,6001,0.1853639930486679
1502413786.845507,7001,0.19252602756023407
1502414239.265506,8001,0.19284175336360931
1502414689.356373,9001,0.16991157829761505
1502415145.18223,10001,0.15723274648189545
1502415595.021095,11001,0.15078511834144592
1502416037.124821,12001,0.14841803908348083
1502416478.158467,13001,0.1522006243467331
1502416920.270544,14001,0.15191766619682312
1502417364.060506,15001,0.14936088025569916
1502417803.97219,16001,0.14549562335014343
1502418242.907475,17001,0.14224907755851746
1502418684.820146,18001,0.13779735565185547
1502419124.551228,19001,0.14404024183750153


================================================
FILE: assets/0821_213901_rcbcg.csv
================================================
Wall time,Step,Value
1503369574.677247,1,0.8920440077781677
1503370041.447478,1001,0.512530505657196
1503370517.215026,2001,0.44317319989204407
1503370985.171754,3001,0.35666027665138245
1503371450.274446,4001,0.2928802967071533
1503371929.346399,5001,0.19688302278518677
1503372408.39261,6001,0.13801704347133636
1503372886.733545,7001,0.1106921136379242
1503373363.362404,8001,0.08717407286167145
1503373839.834317,9001,0.0857364684343338
1503374318.503915,10001,0.07331433147192001
1503374802.444324,11001,0.07706638425588608
1503375279.389205,12001,0.06169278547167778
1503375752.728541,13001,0.059477031230926514
1503376226.577342,14001,0.061632610857486725
1503376699.448754,15001,0.06138858571648598
1503377174.465165,16001,0.05955960601568222
1503377653.261056,17001,0.04774799197912216
1503378126.625743,18001,0.05300581455230713
1503378604.128631,19001,0.047743991017341614
1503379079.647434,20001,0.05426724627614021
1503379555.901424,21001,0.04658582806587219
1503380028.219916,22001,0.04909271374344826
1503380498.204313,23001,0.05326574668288231
1503380962.853232,24001,0.05447468161582947
1503381428.927937,25001,0.05708151310682297
1503381893.354328,26001,0.051777616143226624
1503382360.002207,27001,0.046131476759910583
1503382825.077767,28001,0.04513547569513321
1503383290.90524,29001,0.044165026396512985


================================================
FILE: assets/guide_to_gifs.txt
================================================
#Approach uses imagemagick
#Take the first 20 images in a folder and convert to gif
ls -v | head -20 | xargs cp -t newfolder
cd newfolder
mogrify -format png *.pdf
mogrify -crop 62.5%x62.5%+0+0 +repage *.png
rm *.pdf
convert -delay 20 $(ls -v) -loop 0 -layers optimize mygifname.gif


================================================
FILE: assets/tvdplot.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using matplotlib backend: TkAgg\n"
     ]
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import tensorflow as tf\n",
    "import pandas as pd\n",
    "%matplotlib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "\n",
    "raw_data={'cG1': pd.read_csv('0808_112404_cbcg.csv'),\n",
    "      'G1' : pd.read_csv('0810_191625_bcg.csv'),\n",
    "      'rcG1': pd.read_csv('0821_213901_rcbcg.csv')}\n",
    "xlabel='Training Step'\n",
    "dfs=[pd.DataFrame(data={k:v['Value'].values,xlabel:v['Step'].values}) for k,v in raw_data.items()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "\n",
    "raw_data={'Causal Graph 1' : pd.read_csv('0810_191625_bcg.csv'),\n",
    "          'complete Causal Graph 1': pd.read_csv('0808_112404_cbcg.csv'),      \n",
    "          'edge-reversed complete Causal Graph 1': pd.read_csv('0821_213901_rcbcg.csv')}\n",
    "xlabel='Training Step'\n",
    "dfs=[pd.DataFrame(data={k:v['Value'].values,xlabel:v['Step'].values}) for k,v in raw_data.items()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def my_merge(df1,df2):\n",
    "    return pd.merge(df1,df2,how='outer',on=xlabel)\n",
    "    \n",
    "\n",
    "plot_data=reduce(my_merge,dfs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0x7f376528c690>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ax=plot_data.plot.line(x=xlabel,xlim=[0,18000],ylim=[0,1],style = ['bs-','ro-','y^-'])\n",
    "ax.set_ylabel('Total Variation Distance',fontsize=18)\n",
    "ax.set_title('TVD of Label Generation',fontsize=18)\n",
    "ax.set_xlabel(xlabel,fontsize=18)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plt.savefig('tvd_vs_step.pdf')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}


================================================
FILE: causal_began/CausalBEGAN.py
================================================
from __future__ import print_function
from utils import save_image,distribute_input_data,summary_stats,make_summary
import pandas as pd
import os
import StringIO
import scipy.misc
import numpy as np
from glob import glob
from tqdm import trange
from itertools import chain
from collections import deque
from figure_scripts.pairwise import crosstab
from figure_scripts.sample import intervention2d,condition2d

import tensorflow as tf
from models import *

class CausalBEGAN(object):
    '''
    A quirk of this class:
    if the model is built with a gpu, it must
    later be loaded with a gpu in order to preserve
    tensor structure: NCHW/NHWC (number-channel-height-width/number-height-width-channel)

    in paper <-> in code
    b1,c1    <-> b_k, k_t
    b2,c2    <-> b_l, l_t
    b3,c3    <-> b_z, z_t
    '''

    def __init__(self,batch_size,config):
        '''
        batch_size: again a tensorflow placeholder
        config    : see causal_began/config.py
        '''

        self.batch_size=batch_size #a tensor
        self.config=config
        self.use_gpu = config.use_gpu
        self.data_format=self.config.data_format#NHWC or NCHW
        self.TINY = 10**-6

        #number of calls to self.g_optim
        self.step = tf.Variable(0, name='step', trainable=False)

        #optimizers
        self.g_lr = tf.Variable(config.g_lr, name='g_lr')
        self.d_lr = tf.Variable(config.d_lr, name='d_lr')

        self.g_lr_update = tf.assign(self.g_lr, self.g_lr * 0.5, name='g_lr_update')
        self.d_lr_update = tf.assign(self.d_lr, self.d_lr * 0.5, name='d_lr_update')

        optimizer = tf.train.AdamOptimizer
        self.g_optimizer, self.d_optimizer = optimizer(self.g_lr), optimizer(self.d_lr)

        self.lambda_k = config.lambda_k
        self.lambda_l = config.lambda_l
        self.lambda_z = config.lambda_z
        self.gamma = config.gamma
        self.gamma_label = config.gamma_label
        self.zeta=config.zeta
        self.z_dim = config.z_dim
        self.conv_hidden_num = config.conv_hidden_num

        self.model_dir = config.model_dir

        self.start_step = 0
        self.log_step = config.log_step
        self.max_step = config.max_step
        self.lr_update_step = config.lr_update_step
        self.is_train = config.is_train

        #Keeps track of params from different devices
        self.tower_dict=dict(
                    c_tower_grads=[],
                    dcc_tower_grads=[],
                    g_tower_grads=[],
                    d_tower_grads=[],
                    tower_g_loss_image=[],
                    tower_d_loss_real=[],
                    tower_g_loss_label=[],
                    tower_d_loss_real_label=[],
                    tower_d_loss_fake_label=[],
            )
        self.k_t = tf.get_variable(name='k_t',initializer=0.,trainable=False)
        self.l_t = tf.get_variable(name='l_t',initializer=0.,trainable=False)
        self.z_t = tf.get_variable(name='z_t',initializer=0.,trainable=False)

    def __call__(self, real_inputs, fake_inputs):
        '''
        in a multi gpu setting, self.__call__ is done once for every device with variables shared so
        that a copy of the tensorflow variables created in self.__call__ resides on
        each device. This would be run multiple times in a loop over devices.

        Parameters:
        fake_inputs : a dictionary of labels from cc
        real_inputs : also a dictionary of labels
                      with an additional key 'x' for the real image
        '''
        config=self.config

        #The keys are all the labels union 'x'
        self.real_inputs=real_inputs
        self.fake_inputs=fake_inputs
        n_labels=len(fake_inputs)#number of labels in graph, not dataset

        #[0,255] NHWC
        self.x = self.real_inputs.pop('x')

        #used to change dataformat in data queue
        if self.data_format == 'NCHW':
            #self.x = tf.transpose(self.x, [2, 0, 1])#3D
            self.x = tf.transpose(self.x, [0, 3, 1, 2])#4D
        elif self.data_format == 'NHWC':
            pass
        else:
            raise Exception("[!] Unknown data_format: {}".format(self.data_format))

        _, height, width, self.channel = \
                get_conv_shape(self.x, self.data_format)
        self.config.repeat_num= int(np.log2(height)) - 2
        self.config.channel=self.channel

        #There are two versions: "x" and "self.x".
        #    "x" is normalized for computation
        #    "self.x" is unnormalized for saving and summaries
        #    likewise for "G" and "self.G"
        #x in [-1,1]
        x = norm_img(self.x)

        self.real_labels=tf.concat(self.real_inputs.values(),-1)
        self.fake_labels=tf.concat(self.fake_inputs.values(),-1)

        #noise given to generate image in addition to labels
        self.z_gen = tf.random_uniform(
            (self.batch_size, self.z_dim), minval=-1.0, maxval=1.0)

        if self.config.round_fake_labels:#default
            self.z= tf.concat( [tf.round(self.fake_labels), self.z_gen],axis=-1,name='z')
        else:
            self.z= tf.concat( [self.fake_labels, self.z_gen],axis=-1,name='z')

        G, self.G_var = GeneratorCNN(self.z,config)
        d_out, self.D_z, self.D_var = DiscriminatorCNN(tf.concat([G, x],0),config)
        AE_G, AE_x = tf.split(d_out, 2)
        self.D_encode_G, self.D_encode_x=tf.split(self.D_z, 2)#axis=0 by default

        if not self.config.separate_labeler:
            self.D_fake_labels_logits=tf.slice(self.D_encode_G,[0,0],[-1,n_labels])
            self.D_real_labels_logits=tf.slice(self.D_encode_x,[0,0],[-1,n_labels])
        else:#default
            self.D_fake_labels_logits,self.DL_var=Discriminator_labeler(G,n_labels,config)
            self.D_real_labels_logits,_=Discriminator_labeler(x,n_labels,config,reuse=True)
            self.D_var += self.DL_var

        self.D_real_labels=tf.sigmoid(self.D_real_labels_logits)
        self.D_fake_labels=tf.sigmoid(self.D_fake_labels_logits)
        self.D_real_labels_list=tf.split(self.D_real_labels,n_labels,axis=1)
        self.D_fake_labels_list=tf.split(self.D_fake_labels,n_labels,axis=1)

        # sigmoid_cross_entropy_with_logits
        def sxe(logits,labels):
            #use zeros or ones if pass in scalar
            if not isinstance(labels,tf.Tensor):
                labels=labels*tf.ones_like(logits)
            return tf.nn.sigmoid_cross_entropy_with_logits(
                logits=logits,labels=labels)

        #Round fake labels before calc loss
        if self.config.round_fake_labels:#default
            fake_labels=tf.round(self.fake_labels)
        else:
            fake_labels=self.fake_labels

        #This is here because it's used in cross_entropy calc, but it's not used by default
        self.fake_labels_logits= -tf.log(1/(self.fake_labels+self.TINY)-1)

        #One of three label losses available
        # Default is squared loss, "squarediff"
        self.d_xe_real_label=sxe(self.D_real_labels_logits,self.real_labels)
        self.d_xe_fake_label=sxe(self.D_fake_labels_logits,fake_labels)
        self.g_xe_label=sxe(self.fake_labels_logits, self.D_fake_labels)

        self.d_absdiff_real_label=tf.abs(self.D_real_labels  - self.real_labels)
        self.d_absdiff_fake_label=tf.abs(self.D_fake_labels  - fake_labels)
        self.g_absdiff_label     =tf.abs(fake_labels  -  self.D_fake_labels)

        self.d_squarediff_real_label=tf.square(self.D_real_labels  - self.real_labels)
        self.d_squarediff_fake_label=tf.square(self.D_fake_labels  - fake_labels)
        self.g_squarediff_label     =tf.square(fake_labels  -  self.D_fake_labels)

        if self.config.label_loss=='xe':
            self.d_loss_real_label = tf.reduce_mean(self.d_xe_real_label)
            self.d_loss_fake_label = tf.reduce_mean(self.d_xe_fake_label)
            self.g_loss_label      = tf.reduce_mean(self.g_xe_label)
        elif self.config.label_loss=='absdiff':
            self.d_loss_real_label = tf.reduce_mean(self.d_absdiff_real_label)
            self.d_loss_fake_label = tf.reduce_mean(self.d_absdiff_fake_label)
            self.g_loss_label      = tf.reduce_mean(self.g_absdiff_label)
        elif self.config.label_loss=='squarediff':
            self.d_loss_real_label = tf.reduce_mean(self.d_squarediff_real_label)
            self.d_loss_fake_label = tf.reduce_mean(self.d_squarediff_fake_label)
            self.g_loss_label      = tf.reduce_mean(self.g_squarediff_label)

        #"self.G" is [0,255], "G" is [-1,1]
        self.G = denorm_img(G, self.data_format)
        self.AE_G, self.AE_x = denorm_img(AE_G, self.data_format), denorm_img(AE_x, self.data_format)

        u1=tf.abs(AE_x - x)
        u2=tf.abs(AE_G - G)
        m1=tf.reduce_mean(u1)
        m2=tf.reduce_mean(u2)
        c1=tf.reduce_mean(tf.square(u1-m1))
        c2=tf.reduce_mean(tf.square(u2-m2))
        self.eqn2 = tf.square(m1-m2)#from orig began paper
        self.eqn1 = (c1+c2-2*tf.sqrt(c1*c2))/self.eqn2#from orig began paper

        self.d_loss_real = tf.reduce_mean(u1)
        self.d_loss_fake = tf.reduce_mean(u2)
        self.g_loss_image = tf.reduce_mean(tf.abs(AE_G - G))

        self.d_loss_image=self.d_loss_real       -   self.k_t*self.d_loss_fake
        self.d_loss_label=self.d_loss_real_label -   self.l_t*self.d_loss_fake_label
        self.d_loss=self.d_loss_image+self.d_loss_label

        if not self.config.no_third_margin:#normal mode
            #Careful on z_t sign!#(z_t <==> c_3 from paper)
            self.g_loss = self.g_loss_image + self.z_t*self.g_loss_label
        else:
            print('Warning: not using third margin')
            self.g_loss = self.g_loss_image + 1.*self.g_loss_label

        # Calculate the gradients for the batch of data,
        # on this particular gpu tower.
        g_grad=self.g_optimizer.compute_gradients(self.g_loss,var_list=self.G_var)
        d_grad=self.d_optimizer.compute_gradients(self.d_loss,var_list=self.D_var)

        self.tower_dict['g_tower_grads'].append(g_grad)
        self.tower_dict['d_tower_grads'].append(d_grad)
        self.tower_dict['tower_g_loss_image'].append(self.g_loss_image)
        self.tower_dict['tower_d_loss_real'].append(self.d_loss_real)
        self.tower_dict['tower_g_loss_label'].append(self.g_loss_label)
        self.tower_dict['tower_d_loss_real_label'].append(self.d_loss_real_label)
        self.tower_dict['tower_d_loss_fake_label'].append(self.d_loss_fake_label)

        self.var=self.G_var+self.D_var+[self.step]

    def build_train_op(self):
        #Now outside gpu loop

        #attributes starting with ave_ are averaged over devices
        self.ave_d_loss_real       =tf.reduce_mean(self.tower_dict['tower_d_loss_real'])
        self.ave_g_loss_image      =tf.reduce_mean(self.tower_dict['tower_g_loss_image'])
        self.ave_d_loss_real_label =tf.reduce_mean(self.tower_dict['tower_d_loss_real_label'])
        self.ave_d_loss_fake_label =tf.reduce_mean(self.tower_dict['tower_d_loss_fake_label'])
        self.ave_g_loss_label      =tf.reduce_mean(self.tower_dict['tower_g_loss_label'])

        #recalculate balance equations (b1,b2,b3 in paper)
        self.balance_k = self.gamma * self.ave_d_loss_real - self.ave_g_loss_image
        self.balance_l = self.gamma_label * self.ave_d_loss_real_label - self.ave_d_loss_fake_label
        self.balance_z = self.zeta*tf.nn.relu(self.balance_k) - tf.nn.relu(self.balance_l)

        self.measure = self.ave_d_loss_real + tf.abs(self.balance_k)
        self.measure_complete = self.ave_d_loss_real + self.ave_d_loss_real_label + \
            tf.abs(self.balance_k)+tf.abs(self.balance_l)+tf.abs(self.balance_z)

        #update margins coefficients (c1,c2,c3 in paper)
        k_update = tf.assign(
            self.k_t, tf.clip_by_value(self.k_t + self.lambda_k*self.balance_k, 0, 1))
        l_update = tf.assign(
            self.l_t, tf.clip_by_value(self.l_t + self.lambda_l*self.balance_l, 0, 1))
        z_update = tf.assign(
            self.z_t, tf.clip_by_value(self.z_t + self.lambda_z*self.balance_z, 0, 1))

        g_grads=average_gradients(self.tower_dict['g_tower_grads'])
        d_grads=average_gradients(self.tower_dict['d_tower_grads'])

        g_optim = self.g_optimizer.apply_gradients(g_grads, global_step=self.step)
        d_optim = self.d_optimizer.apply_gradients(d_grads)

        #every time train_op is run, run k_update, l_update, z_update
        with tf.control_dependencies([k_update,l_update,z_update]):
            #when train_op is run, run [g_optim,d_optim]
            self.train_op=tf.group(g_optim, d_optim)

    def train_step(self,sess,counter):
        sess.run(self.train_op)

        if counter % self.config.lr_update_step == self.config.lr_update_step - 1:
            sess.run([self.g_lr_update, self.d_lr_update])

    def build_summary_op(self):
        names,real_labels_list=zip(*self.real_inputs.items())
        _    ,fake_labels_list=zip(*self.fake_inputs.items())
        LabelList=[names,real_labels_list,fake_labels_list,
                   self.D_fake_labels_list,self.D_real_labels_list]
        for name,rlabel,flabel,d_fake_label,d_real_label in zip(*LabelList):
            with tf.name_scope(name):

                d_flabel=tf.cast(tf.round(d_fake_label),tf.int32)
                d_rlabel=tf.cast(tf.round(d_real_label),tf.int32)
                f_acc=tf.contrib.metrics.accuracy(tf.cast(tf.round(flabel),tf.int32),d_flabel)
                r_acc=tf.contrib.metrics.accuracy(tf.cast(tf.round(rlabel),tf.int32),d_rlabel)

                summary_stats('d_fake_label',d_fake_label,hist=True)
                summary_stats('d_real_label',d_real_label,hist=True)

                tf.summary.scalar('ave_d_fake_abs_diff',tf.reduce_mean(tf.abs(flabel-d_fake_label)))
                tf.summary.scalar('ave_d_real_abs_diff',tf.reduce_mean(tf.abs(rlabel-d_real_label)))

                tf.summary.scalar('real_label_ave',tf.reduce_mean(rlabel))
                tf.summary.scalar('real_label_accuracy',r_acc)
                tf.summary.scalar('fake_label_accuracy',f_acc)

        ##Summaries picked from last gpu to run
        tf.summary.scalar('losslabel/d_loss_real_label',tf.reduce_mean(self.ave_d_loss_real_label))
        tf.summary.scalar('losslabel/d_loss_fake_label',tf.reduce_mean(self.ave_d_loss_fake_label))
        tf.summary.scalar('losslabel/g_loss_label',self.g_loss_label)

        tf.summary.image("G", self.G)
        tf.summary.image("AE_G", self.AE_G)
        tf.summary.image("AE_x", self.AE_x)

        tf.summary.scalar("loss/d_loss", self.d_loss)
        tf.summary.scalar("loss/d_loss_fake", self.d_loss_fake)
        tf.summary.scalar("loss/g_loss", self.g_loss)

        tf.summary.scalar("misc/d_lr", self.d_lr)
        tf.summary.scalar("misc/g_lr", self.g_lr)
        tf.summary.scalar("misc/eqn1", self.eqn1)#From orig BEGAN paper
        tf.summary.scalar("misc/eqn2", self.eqn2)#From orig BEGAN paper

        #summaries of gpu-averaged values
        tf.summary.scalar("loss/d_loss_real",self.ave_d_loss_real)
        tf.summary.scalar("loss/g_loss_image", self.ave_g_loss_image)
        tf.summary.scalar("balance/l", self.balance_l)
        tf.summary.scalar("balance/k", self.balance_k)
        tf.summary.scalar("balance/z", self.balance_z)
        tf.summary.scalar("misc/measure", self.measure)
        tf.summary.scalar("misc/measure_complete", self.measure_complete)
        tf.summary.scalar("misc/k_t", self.k_t)
        tf.summary.scalar("misc/l_t", self.l_t)
        tf.summary.scalar("misc/z_t", self.z_t)

        #doesn't include summaries from causal controller
        #TODO: rework so only 1 copy of summaries if multiple gpu
        self.summary_op=tf.summary.merge_all()
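

# Illustrative sketch, not part of the original repository: the balance terms
# and clipped coefficient updates built in build_train_op can be followed with
# plain floats. The function name and its arguments are hypothetical; the
# default values are copied from causal_began/config.py and the arithmetic
# mirrors the TensorFlow ops above under that assumption.
def margin_update_sketch(k_t, l_t, z_t, d_loss_real, g_loss_image,
                         d_loss_real_label, d_loss_fake_label,
                         gamma=0.5, gamma_label=0.5, zeta=0.5,
                         lambda_k=0.001, lambda_l=0.00008, lambda_z=0.01):
    def clip01(v):
        #same role as tf.clip_by_value(v, 0, 1)
        return min(max(v, 0.0), 1.0)

    def relu(v):
        return max(v, 0.0)

    balance_k = gamma * d_loss_real - g_loss_image
    balance_l = gamma_label * d_loss_real_label - d_loss_fake_label
    balance_z = zeta * relu(balance_k) - relu(balance_l)

    #each coefficient takes a small step along its balance term, then is clipped
    return (clip01(k_t + lambda_k * balance_k),
            clip01(l_t + lambda_l * balance_l),
            clip01(z_t + lambda_z * balance_z))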



================================================
FILE: causal_began/__init__.py
================================================


================================================
FILE: causal_began/config.py
================================================
#-*- coding: utf-8 -*-
import argparse

def str2bool(v):
    #return (v is True) or (v.lower() in ('true', '1'))
    return v is True or v.lower() in ('true', '1')

arg_lists = []
parser = argparse.ArgumentParser()

def add_argument_group(name):
    arg = parser.add_argument_group(name)
    arg_lists.append(arg)
    return arg


#Network
net_arg = add_argument_group('Network')
net_arg.add_argument('--c_dim',type=int, default=3,
                     help='''number of color channels. I wouldn't really change
                     this from 3''')
net_arg.add_argument('--conv_hidden_num', type=int, default=128,
                     choices=[64, 128],help='n in the paper')
net_arg.add_argument('--separate_labeler', type=str2bool, default=True)
net_arg.add_argument('--z_dim', type=int, default=64, choices=[64, 128],
                    help='''dimension of the noise input to the generator along
                    with the labels''')
net_arg.add_argument('--z_num', type=int, default=64,
                    help='''dimension of the hidden space of the autoencoder''')


# Data
data_arg = add_argument_group('Data')
data_arg.add_argument('--dataset', type=str, default='celebA')
data_arg.add_argument('--split', type=str, default='train')
data_arg.add_argument('--batch_size', type=int, default=16)

# Training / test parameters
train_arg = add_argument_group('Training')
train_arg.add_argument('--beta1', type=float, default=0.5)
train_arg.add_argument('--beta2', type=float, default=0.999)
train_arg.add_argument('--d_lr', type=float, default=0.00008)
train_arg.add_argument('--g_lr', type=float, default=0.00008)
train_arg.add_argument('--label_loss',type=str,default='squarediff',choices=['xe','absdiff','squarediff'],
                      help='''what comparison should be made between the
                       labeler output and the actual labels''')
train_arg.add_argument('--lr_update_step', type=int, default=100000, choices=[100000, 75000])
train_arg.add_argument('--max_step', type=int, default=50000)
train_arg.add_argument('--num_iter',type=int,default=250000,
                       help='the number of training iterations to run the model for')
train_arg.add_argument('--optimizer', type=str, default='adam')
train_arg.add_argument('--round_fake_labels',type=str2bool,default=True,
                       help='''Whether the label outputs of the causal
                       controller should be rounded first before calculating
                       the loss of generator or d-labeler''')
train_arg.add_argument('--use_gpu', type=str2bool, default=True)
train_arg.add_argument('--num_gpu', type=int, default=1,
                      help='specify 0 for cpu. If k specified, will default to\
                      first k of n gpus detected. If use_gpu=True but num_gpu not\
                      specified will default to 1')

margin_arg = add_argument_group('Margin')
margin_arg.add_argument('--gamma', type=float, default=0.5)
margin_arg.add_argument('--gamma_label', type=float, default=0.5)
margin_arg.add_argument('--lambda_k', type=float, default=0.001)
margin_arg.add_argument('--lambda_l', type=float, default=0.00008,
                       help='''As mentioned in the paper, this is lower because
                       this margin can be responded to more quickly than the
                        other margins. I'm not sure if it definitely needs to be lower''')
margin_arg.add_argument('--lambda_z', type=float, default=0.01)
margin_arg.add_argument('--no_third_margin', type=str2bool, default=False,
                       help='''Use True for appendix figure in paper. This is
                        used to neglect the third margin (c3,b3)''')
margin_arg.add_argument('--zeta', type=float, default=0.5,
                       help='''This is gamma_3 in the paper''')

# Misc
misc_arg = add_argument_group('Misc')
misc_arg.add_argument('--is_train',type=str2bool,default=False,
                      help='''whether to enter the image training loop''')
misc_arg.add_argument('--build_all', type=str2bool, default=False,
                     help='''Normally, specifying is_pretrain=False causes
                     the pretraining components not to be built, and likewise
                      with is_train=False only the pretrain component will
                      (possibly) be built. This is here as a debug helper to
                      enable building out the whole model without doing any
                      training''')
misc_arg.add_argument('--data_dir', type=str, default='data')
misc_arg.add_argument('--dry_run', action='store_true')
#misc_arg.add_argument('--dry_run', type=str2bool, default='False')
misc_arg.add_argument('--log_step', type=int, default=100,
                     help='''how often to log stuff. Sample images are created
                     every 10*log_step''')
misc_arg.add_argument('--num_log_samples', type=int, default=3)
misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN'])
misc_arg.add_argument('--log_dir', type=str, default='logs')



def gpu_logic(config):
    #consistency between use_gpu and num_gpu
    if config.num_gpu>0:
        config.use_gpu=True
    else:
        config.use_gpu=False
#        if config.use_gpu and config.num_gpu==0:
#            config.num_gpu=1
    return config


def get_config():
    config, unparsed = parser.parse_known_args()
    config=gpu_logic(config)

    #this has to respect gpu/cpu
    #data_format = 'NCHW'
    if config.use_gpu:
        data_format = 'NCHW'
    else:
        data_format = 'NHWC'
    setattr(config, 'data_format', data_format)


    print('Loaded ./causal_began/config.py')

    return config, unparsed

if __name__=='__main__':
    #for debug of config
    config, unparsed = get_config()
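

# Illustrative sketch, not part of the original repository: gpu_logic and
# get_config derive use_gpu and data_format from num_gpu alone, and
# parse_known_args hands back any flags this parser does not know. The function
# name and the throwaway parser below are hypothetical stand-ins for the
# module-level parser above.
import argparse

def data_format_sketch(argv):
    sketch_parser = argparse.ArgumentParser()
    sketch_parser.add_argument('--num_gpu', type=int, default=1)
    sketch_config, sketch_unparsed = sketch_parser.parse_known_args(argv)
    #num_gpu>0 implies use_gpu=True, which selects the NCHW layout
    use_gpu = sketch_config.num_gpu > 0
    return ('NCHW' if use_gpu else 'NHWC'), sketch_unparsed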



================================================
FILE: causal_began/models.py
================================================
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim


def lrelu(x,leak=0.2,name='lrelu'):
    with tf.variable_scope(name):
        f1=0.5 * (1+leak)
        f2=0.5 * (1-leak)
        return f1*x + f2*tf.abs(x)
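

# Illustrative sketch, not part of the original repository: for 0 <= leak <= 1
# the expression f1*x + f2*|x| used in lrelu equals max(x, leak*x). The helper
# name is hypothetical and it works on plain floats rather than tensors.
def lrelu_float_sketch(x, leak=0.2):
    f1 = 0.5 * (1 + leak)
    f2 = 0.5 * (1 - leak)
    #x>=0 gives (f1+f2)*x = x; x<0 gives (f1-f2)*x = leak*x
    return f1 * x + f2 * abs(x)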

def GeneratorCNN( z, config, reuse=None):
    hidden_num=config.conv_hidden_num
    output_num=config.c_dim
    repeat_num=config.repeat_num
    data_format=config.data_format

    with tf.variable_scope("G",reuse=reuse) as vs:
        x = slim.fully_connected(z, np.prod([8, 8, hidden_num]),activation_fn=None,scope='fc1')
        x = reshape(x, 8, 8, hidden_num, data_format)

        for idx in range(repeat_num):
            x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                            data_format=data_format,scope='conv'+str(idx)+'a')
            x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                            data_format=data_format,scope='conv'+str(idx)+'b')
            if idx < repeat_num - 1:
                x = upscale(x, 2, data_format)

        #output_num comes from config.c_dim (default 3)
        out = slim.conv2d(x, output_num, 3, 1, activation_fn=None,data_format=data_format,scope='conv'+str(idx+1))

    variables = tf.contrib.framework.get_variables(vs)
    return out, variables

def DiscriminatorCNN(image, config, reuse=None):
    hidden_num=config.conv_hidden_num
    data_format=config.data_format
    input_channel=config.channel

    with tf.variable_scope("D",reuse=reuse) as vs:
        # Encoder
        with tf.variable_scope('encoder'):
            x = slim.conv2d(image, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                            data_format=data_format,scope='conv0')

            prev_channel_num = hidden_num
            for idx in range(config.repeat_num):
                channel_num = hidden_num * (idx + 1)
                x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu,
                                data_format=data_format,scope='conv'+str(idx+1)+'a')
                x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu,
                                data_format=data_format,scope='conv'+str(idx+1)+'b')
                if idx < config.repeat_num - 1:
                    x = slim.conv2d(x, channel_num, 3, 2, activation_fn=tf.nn.elu,
                                    data_format=data_format,scope='conv'+str(idx+1)+'c')
                    #x = tf.contrib.layers.max_pool2d(x, [2, 2], [2, 2], padding='VALID')

            x = tf.reshape(x, [-1, np.prod([8, 8, channel_num])])
            z = x = slim.fully_connected(x, config.z_num, activation_fn=None,scope='proj')

        # Decoder
        with tf.variable_scope('decoder'):
            x = slim.fully_connected(x, np.prod([8, 8, hidden_num]), activation_fn=None)
            x = reshape(x, 8, 8, hidden_num, data_format)

            for idx in range(config.repeat_num):
                x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                                data_format=data_format,scope='conv'+str(idx)+'a')
                x = slim.conv2d(x, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                                data_format=data_format,scope='conv'+str(idx)+'b')
                if idx < config.repeat_num - 1:
                    x = upscale(x, 2, data_format)
            out = slim.conv2d(x, input_channel, 3, 1, activation_fn=None,
                              data_format=data_format,scope='proj')

    variables = tf.contrib.framework.get_variables(vs)
    return out, z, variables


def Discriminator_labeler(image, output_size, config, reuse=None):
    hidden_num=config.conv_hidden_num
    repeat_num=config.repeat_num
    data_format=config.data_format
    with tf.variable_scope("discriminator_labeler",reuse=reuse) as scope:

        x = slim.conv2d(image, hidden_num, 3, 1, activation_fn=tf.nn.elu,
                        data_format=data_format,scope='conv0')

        prev_channel_num = hidden_num
        for idx in range(repeat_num):
            channel_num = hidden_num * (idx + 1)
            x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu,
                            data_format=data_format,scope='conv'+str(idx+1)+'a')
            x = slim.conv2d(x, channel_num, 3, 1, activation_fn=tf.nn.elu,
                            data_format=data_format,scope='conv'+str(idx+1)+'b')
            if idx < repeat_num - 1:
                x = slim.conv2d(x, channel_num, 3, 2, activation_fn=tf.nn.elu,
                                data_format=data_format,scope='conv'+str(idx+1)+'c')
                #x = tf.contrib.layers.max_pool2d(x, [2, 2], [2, 2], padding='VALID')

        x = tf.reshape(x, [-1, np.prod([8, 8, channel_num])])
        label_logit = slim.fully_connected(x, output_size, activation_fn=None,scope='proj')

        variables = tf.contrib.framework.get_variables(scope)
        return label_logit,variables

def next(loader):
    return loader.next()[0].data.numpy()

def to_nhwc(image, data_format):
    if data_format == 'NCHW':
        #NCHW input is transposed to NHWC; NHWC input is returned unchanged
        new_image = nchw_to_nhwc(image)
    else:
        new_image = image
    return new_image

def to_nchw_numpy(image):
    if image.shape[3] in [1, 3]:
        new_image = image.transpose([0, 3, 1, 2])
    else:
        new_image = image
    return new_image

def norm_img(image, data_format=None):
    image = image/127.5 - 1.
    if data_format:
        image = to_nhwc(image, data_format)
    return image

def denorm_img(norm, data_format):
    return tf.clip_by_value(to_nhwc((norm + 1)*127.5, data_format), 0, 255)

def slerp(val, low, high):
    """Code from https://github.com/soumith/dcgan.torch/issues/14"""
    omega = np.arccos(np.clip(np.dot(low/np.linalg.norm(low), high/np.linalg.norm(high)), -1, 1))
    so = np.sin(omega)
    if so == 0:
        return (1.0-val) * low + val * high # L'Hopital's rule/LERP
    return np.sin((1.0-val)*omega) / so * low + np.sin(val*omega) / so * high

def int_shape(tensor):
    shape = tensor.get_shape().as_list()
    return [num if num is not None else -1 for num in shape]

def get_conv_shape(tensor, data_format):
    shape = int_shape(tensor)
    # always return [N, H, W, C]
    if data_format == 'NCHW':
        return [shape[0], shape[2], shape[3], shape[1]]
    elif data_format == 'NHWC':
        return shape

def nchw_to_nhwc(x):
    return tf.transpose(x, [0, 2, 3, 1])

def nhwc_to_nchw(x):
    return tf.transpose(x, [0, 3, 1, 2])

def reshape(x, h, w, c, data_format):
    if data_format == 'NCHW':
        x = tf.reshape(x, [-1, c, h, w])
    else:
        x = tf.reshape(x, [-1, h, w, c])
    return x

def resize_nearest_neighbor(x, new_size, data_format):
    if data_format == 'NCHW':
        x = nchw_to_nhwc(x)
        x = tf.image.resize_nearest_neighbor(x, new_size)
        x = nhwc_to_nchw(x)
    else:
        x = tf.image.resize_nearest_neighbor(x, new_size)
    return x

def upscale(x, scale, data_format):
    _, h, w, _ = get_conv_shape(x, data_format)
    return resize_nearest_neighbor(x, (h*scale, w*scale), data_format)



#https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py#L168
def average_gradients(tower_grads):
    """Calculate the average gradient for each shared variable across all towers.
    Note that this function provides a synchronization point across all towers.
    Args:
      tower_grads: List of lists of (gradient, variable) tuples. The outer list
        is over towers. The inner list is over the (gradient, variable) pairs
        computed on that tower.
    Returns:
      List of pairs of (gradient, variable) where the gradient has been
      averaged across all towers.
    """
    average_grads = []
    for grad_and_vars in zip(*tower_grads):
        # Note that each grad_and_vars looks like the following:
        #   ((grad0_gpu0, var0_gpu0), ... , (grad0_gpuN, var0_gpuN))
        grads = []
        for g, _ in grad_and_vars:
            # Add 0 dimension to the gradients to represent the tower.
            expanded_g = tf.expand_dims(g, 0)

            # Append on a 'tower' dimension which we will average over below.
            grads.append(expanded_g)

        # Average over the 'tower' dimension.
        grad = tf.concat(axis=0, values=grads)
        grad = tf.reduce_mean(grad, 0)

        # Keep in mind that the Variables are redundant because they are shared
        # across towers.  So ..  we will just return the first tower's pointer to the Variable.
        v = grad_and_vars[0][1]
        grad_and_var = (grad, v)
        average_grads.append(grad_and_var)
    return average_grads
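

# Illustrative sketch, not part of the original repository: the same
# zip(*tower_grads) regrouping as average_gradients, with plain floats in place
# of gradient tensors and strings in place of variables. The function name is
# hypothetical.
def average_gradients_sketch(tower_grads):
    averaged = []
    for grad_and_vars in zip(*tower_grads):
        #grad_and_vars holds one (grad, var) pair per tower for a single variable
        grads = [g for g, _ in grad_and_vars]
        mean_grad = sum(grads) / float(len(grads))
        #variables are shared across towers, so keep the first tower's reference
        averaged.append((mean_grad, grad_and_vars[0][1]))
    return averaged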






================================================
FILE: causal_began/utils.py
================================================
from __future__ import print_function
import tensorflow as tf
import os
from os import listdir
from os.path import isfile, join
import shutil
import sys
import math
import json
import logging
import numpy as np
from PIL import Image
from datetime import datetime
from tensorflow.core.framework import summary_pb2

def make_summary(name, val):
    return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)])

def summary_stats(name,tensor,collections=None,hist=False):
    collections=collections or [tf.GraphKeys.SUMMARIES]
    ave=tf.reduce_mean(tensor)
    std=tf.sqrt(tf.reduce_mean(tf.square(ave-tensor)))
    tf.summary.scalar(name+'_ave',ave,collections)
    tf.summary.scalar(name+'_std',std,collections)
    if hist:
        tf.summary.histogram(name+'_hist',tensor,collections)


def prepare_dirs_and_logger(config):
    formatter = logging.Formatter("%(asctime)s:%(levelname)s::%(message)s")
    logger = logging.getLogger()

    #iterate over a copy, since removeHandler mutates logger.handlers
    for hdlr in list(logger.handlers):
        logger.removeHandler(hdlr)

    handler = logging.StreamHandler()
    handler.setFormatter(formatter)

    logger.addHandler(handler)

    if config.load_path:
        if config.load_path.startswith(config.log_dir):
            config.model_dir = config.load_path
        else:
            if config.load_path.startswith(config.dataset):
                config.model_name = config.load_path
            else:
                config.model_name = "{}_{}".format(config.dataset, config.load_path)
    else:
        config.model_name = "{}_{}".format(config.dataset, get_time())

    if not hasattr(config, 'model_dir'):
        config.model_dir = os.path.join(config.log_dir, config.model_name)
    config.data_path = os.path.join(config.data_dir, config.dataset)

    if not config.load_path:
        config.log_code_dir=os.path.join(config.model_dir,'code')
        for path in [config.log_dir, config.data_dir,
                     config.model_dir, config.log_code_dir]:
            if not os.path.exists(path):
                os.makedirs(path)

        #Copy python code in directory into model_dir/code for future reference:
        code_dir=os.path.dirname(os.path.realpath(sys.argv[0]))
        model_files = [f for f in listdir(code_dir) if isfile(join(code_dir, f))]
        for f in model_files:
            if f.endswith('.py'):
                #f is a bare filename, so join it with code_dir in case cwd differs
                shutil.copy2(join(code_dir,f),config.log_code_dir)

def get_time():
    return datetime.now().strftime("%m%d_%H%M%S")

def save_config(config):
    param_path = os.path.join(config.model_dir, "params.json")

    print("[*] MODEL dir: %s" % config.model_dir)
    print("[*] PARAM path: %s" % param_path)

    with open(param_path, 'w') as fp:
        json.dump(config.__dict__, fp, indent=4, sort_keys=True)

def get_available_gpus():
    from tensorflow.python.client import device_lib
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type=='GPU']

def distribute_input_data(data_loader,num_gpu):
    '''
    data_loader is a dictionary of tensors that are fed into our model

    This function takes that dictionary of n*batch_size dimension tensors
    and breaks it up into n dictionaries with the same key of tensors with
    dimension batch_size. One is given to each gpu
    '''
    if num_gpu==0:
        return {'/cpu:0':data_loader}

    gpus=get_available_gpus()
    if num_gpu > len(gpus):
        raise ValueError('number of gpus specified={}, more than gpus available={}'.format(num_gpu,len(gpus)))

    gpus=gpus[:num_gpu]


    data_by_gpu={g:{} for g in gpus}
    for key,value in data_loader.items():
        spl_vals=tf.split(value,num_gpu)
        for gpu,val in zip(gpus,spl_vals):
            data_by_gpu[gpu][key]=val

    return data_by_gpu


def rank(array):
    return len(array.shape)

def make_grid(tensor, nrow=8, padding=2,
              normalize=False, scale_each=False):
    """Code based on https://github.com/pytorch/vision/blob/master/torchvision/utils.py"""
    nmaps = tensor.shape[0]
    xmaps = min(nrow, nmaps)
    ymaps = int(math.ceil(float(nmaps) / xmaps))
    height, width = int(tensor.shape[1] + padding), int(tensor.shape[2] + padding)
    grid = np.zeros([height * ymaps + 1 + padding // 2, width * xmaps + 1 + padding // 2, 3], dtype=np.uint8)
    k = 0
    for y in range(ymaps):
        for x in range(xmaps):
            if k >= nmaps:
                break
            h, h_width = y * height + 1 + padding // 2, height - padding
            w, w_width = x * width + 1 + padding // 2, width - padding

            grid[h:h+h_width, w:w+w_width] = tensor[k]
            k = k + 1
    return grid

def save_image(tensor, filename, nrow=8, padding=2,
               normalize=False, scale_each=False):
    ndarr = make_grid(tensor, nrow=nrow, padding=padding,
                            normalize=normalize, scale_each=scale_each)
    im = Image.fromarray(ndarr)
    im.save(filename)
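

# Illustrative sketch, not part of the original repository: the canvas size
# that make_grid allocates, computed without numpy. The function name and its
# img_h/img_w arguments are hypothetical; the arithmetic is copied from
# make_grid above.
import math

def grid_shape_sketch(nmaps, img_h, img_w, nrow=8, padding=2):
    xmaps = min(nrow, nmaps)                      #images per row
    ymaps = int(math.ceil(float(nmaps) / xmaps))  #number of rows
    height, width = img_h + padding, img_w + padding
    return (height * ymaps + 1 + padding // 2,
            width * xmaps + 1 + padding // 2, 3)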


================================================
FILE: causal_controller/ArrayDict.py
================================================
import numpy as np
class ArrayDict(object):

    '''
    This is a class for manipulating dictionaries of arrays
    or dictionaries of scalars. I find this comes up pretty often when dealing
    with tensorflow, because you can pass dictionaries to feed_dict and get
    dictionaries back. If you use a smaller batch_size, you then want to
    "concatenate" these outputs for each key.
    '''

    def __init__(self):
        self.dict={}
    def __len__(self):
        if len(self.dict)==0:
            return 0
        else:
            #list() keeps this working when values() returns a view (python 3)
            return len(list(self.dict.values())[0])
    def __repr__(self):
        return repr(self.dict)
    def keys(self):
        return self.dict.keys()
    def items(self):
        return self.dict.items()

    def validate_dict(self,a_dict):
        #Check keys
        for key,val in self.dict.items():
            if not key in a_dict.keys():
                raise ValueError('key: {} was not in a_dict.keys()'.format(key))

        for key,val in a_dict.items():
            #Check same keys
            if not key in self.dict.keys():
                raise ValueError('argument key: {} was not in self.dict'.format(key))

            my_val=self.dict[key]
            if isinstance(val,np.ndarray):
                new_shape=val.shape
            else: #scalar, stored as a [1,1] shape array
                new_shape=np.array([[val]]).shape
            #all dimensions after the first (batch) dimension must agree
            if new_shape[1:]!=my_val.shape[1:]:
                raise ValueError('key: {} value shape {} does not match existing shape {}'.format(
                    key,new_shape,my_val.shape))
    def arr_dict(self,a_dict):
        if isinstance(list(a_dict.values())[0],np.ndarray):
            return a_dict
        else:
            return {k:np.array([[v]]) for k,v in a_dict.items()}


    def concat(self,a_dict):
        if self.dict=={}:
            self.dict=self.arr_dict(a_dict)#store internally as array
        else:
            self.validate_dict(a_dict)
            self.dict={k:np.vstack([v,a_dict[k]]) for k,v in self.items()}

    def __getitem__(self,at):
        return {k:v[at] for k,v in self.items()}

#debug, run tests
if __name__=='__main__':
    out1=ArrayDict()
    d1={'Male':np.ones((3,1)),'Young':2*np.ones((3,1))}
    d2={'Male':3,'Young':33}
    d3={'Male':4*np.ones((4,1)),'Young':4*np.ones((4,1))}

    out1.concat(d1)
    out1.concat(d2)

    out2=ArrayDict()
    out2.concat(d2)
    out2.concat(d1)
    out2.concat(d3)
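

# Illustrative sketch, not part of the original repository: the row-stacking
# idea behind ArrayDict.concat, with nested lists standing in for numpy arrays.
# The function name is hypothetical. As in arr_dict, a scalar is assumed to be
# promoted to a single one-element row before it is appended.
def concat_rows_sketch(store, batch):
    for key, val in batch.items():
        rows = val if isinstance(val, list) else [[val]]
        store.setdefault(key, []).extend(rows)
    return store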



================================================
FILE: causal_controller/CausalController.py
================================================
from __future__ import print_function
from itertools import chain
import numpy as np
import tensorflow as tf
import pandas as pd
import os
slim = tf.contrib.slim
from models import lrelu,DiscriminatorW,Grad_Penalty
from utils import summary_stats,did_succeed
from ArrayDict import ArrayDict#Collector of outputs

debug=False
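

# Illustrative sketch, not part of the original repository: the nested-list
# graph format documented in CausalController.__init__ below is split into node
# names and parent lists with zip(*graph), as the constructor does. The
# function name is hypothetical.
def parents_by_node_sketch(graph):
    node_names, parent_names = zip(*graph)
    #keep the declaration order of the nodes alongside the lookup table
    return list(node_names), dict(zip(node_names, parent_names))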

class CausalController(object):
    model_type='controller'
    summs=['cc_summaries']
    def summary_scalar(self,name,ten):
        tf.summary.scalar(name,ten,collections=self.summs)
    def summary_stats(self,name,ten,hist=False):
        summary_stats(name,ten,collections=self.summs,hist=hist)

    def load(self,sess,path):
        '''
        sess is a tf.Session object
        path is the path of the file you want to load, (not the directory)
        Example
        ./checkpoint/somemodel/saved/model.ckpt-3000
        (leave off the extensions)
        '''
        if not hasattr(self,'saver'):#should have one now
            self.saver=tf.train.Saver(var_list=self.var)
        print('Attempting to load model:',path)
        self.saver.restore(sess,path)

    def __init__(self,batch_size,config):
        '''
        Args:
            config    : This carries all the arguments defined in
            causal_controller/config.py with it. It also defines config.graph,
            which is a nested list that specifies the graph

            batch_size: This is separate from config because it is actually a
            tf.placeholder so that batch_size can be set during sess.run, but
            also synchronized between the models.

        A causal graph (config.graph) is specified as follows:
            just supply a list of pairs (node, node_parents)

            Example: A->B<-C; D->E

            [ ['A',[]],
              ['B',['A','C']],
              ['C',[]],
              ['D',[]],
              ['E',['D']]
            ]

            I use a list right now instead of a dict because I don't think
            dict.keys() is guaranteed to be returned in a particular order.
            TODO: A good improvement would be to use collections.OrderedDict

            #old
            #Pass indep_causal=True to use Unif[0,1] labels
            #input_dict allows the model to take in some arbitrary input instead
            #of using tf_random_uniform nodes
            #pass reuse if constructing for a second time

            Access nodes either with:
            model.cc.node_dict['Male']
            or with:
            model.cc.Male


        Other models such as began/dcgan are intended to be built more than
        once (for example on 2 gpus), but causal_controller is just built once.

        '''

        self.config=config
        self.batch_size=batch_size #tf.placeholder_with_default
        self.graph=config.graph
        print('causal graph size:',len(self.graph))
        self.node_names, self.parent_names=zip(*self.graph)
        self.node_names=list(self.node_names)
        self.label_names=self.node_names

        #set nodeclass attributes
        if debug:
            print('Using ',self.config.cc_n_layers,'between each causal node')
        CausalNode.n_layers=self.config.cc_n_layers
        CausalNode.n_hidden=self.config.cc_n_hidden
        CausalNode.batch_size=self.batch_size

        with tf.variable_scope('causal_controller') as vs:
            self.step=tf.Variable(0, name='step', trainable=False)
            self.inc_step=tf.assign(self.step,self.step+1)

            self.nodes=[CausalNode(name=n,config=config) for n in self.node_names]

            for node,rents in zip(self.nodes,self.parent_names):
                node.parents=[n for n in self.nodes if n.name in rents]

            ##construct graph##
            #Lazy construction avoids the pain of traversing the causal graph explicitly.
            #If the graph is not a DAG, this raises a python recursion error
            for node in self.nodes:
                node.setup_tensor()

            self.labels=tf.concat(self.list_labels(),-1)
            self.fake_labels=self.labels
            self.fake_labels_logits= tf.concat( self.list_label_logits(),-1 )

        self.label_dict={n.name:n.label for n in self.nodes}
        self.node_dict={n.name:n for n in self.nodes}
        self.z_dict={n.name:n.z for n in self.nodes}

        #enable access directly. Little dangerous
        #Please don't have any nodes named "batch_size" for example
        self.__dict__.update(self.node_dict)

        #dcc variables are not saved, so if you reload in the middle of a
        #pretrain, that might be a quirk. I don't find it makes much of a
        #difference though
        self.var = tf.contrib.framework.get_variables(vs)
        trainable=tf.get_collection('trainable_variables')
        self.train_var=[v for v in self.var if v in trainable]

        #wont save dcc var
        self.saver=tf.train.Saver(var_list=self.var)
        self.model_dir=os.path.join(self.config.model_dir,self.model_type)
        self.save_model_dir=os.path.join(self.model_dir,'checkpoints')
        self.save_model_name=os.path.join(self.save_model_dir,'CC-Model')

        if not os.path.exists(self.model_dir):
            os.mkdir(self.model_dir)
        if not os.path.exists(self.save_model_dir):
            os.mkdir(self.save_model_dir)


    def build_pretrain(self,label_loader):
        '''
        This is not called if, for example, an existing model is being used.
        label_loader is a queue of only labels; it moves quickly because it
        carries no images.
        '''
        config=self.config

        #Pretraining setup
        self.DCC=DiscriminatorW

        #if self.config.pt_factorized:
            #self.DCC=FactorizedNetwork(self.graph,self.DCC,self.config)

        #reasonable alternative with equal performance
        if self.config.pt_factorized:#Each node owns a dcc
            print('CC is factorized!')
            for node in self.nodes:
                node.setup_pretrain(config,label_loader,self.DCC)

            with tf.control_dependencies([self.inc_step]):
                self.c_optim=tf.group(*[n.c_optim for n in self.nodes])
            self.dcc_optim=tf.group(*[n.dcc_optim for n in self.nodes])
            self.train_op=tf.group(self.c_optim,self.dcc_optim)

            self.c_loss=tf.reduce_sum([n.c_loss for n in self.nodes])
            self.dcc_loss=tf.reduce_sum([n.dcc_loss for n in self.nodes])

            self.summary_stats('total_c_loss',self.c_loss)
            self.summary_stats('total_dcc_loss',self.dcc_loss)

        #default.
        else:#Not factorized. CC owns dcc
            print('setting up pretrain:','CausalController')
            real_inputs=tf.concat([label_loader[n] for n in self.node_names],axis=1)
            fake_inputs=self.labels
            n_hidden=self.config.critic_hidden_size
            real_prob,self.dcc_real_logit,self._dcc_var=self.DCC(real_inputs,self.batch_size,n_hidden,self.config)
            fake_prob,self.dcc_fake_logit,_=self.DCC(fake_inputs,self.batch_size,n_hidden,self.config,reuse=True)
            grad_cost,self.dcc_slopes=Grad_Penalty(real_inputs,fake_inputs,self.DCC,self.config)

            self.dcc_diff = self.dcc_fake_logit - self.dcc_real_logit
            self.dcc_gan_loss=tf.reduce_mean(self.dcc_diff)
            self.dcc_grad_loss=grad_cost
            self.dcc_loss=self.dcc_gan_loss+self.dcc_grad_loss#
            self.c_loss=-tf.reduce_mean(self.dcc_fake_logit)#

            optimizer = tf.train.AdamOptimizer
            self.c_optimizer, self.dcc_optimizer = optimizer(config.pt_cc_lr),optimizer(config.pt_dcc_lr)

            with tf.control_dependencies([self.inc_step]):
                self.c_optim=self.c_optimizer.minimize(self.c_loss,var_list=self.train_var)
            self.dcc_optim=self.dcc_optimizer.minimize(self.dcc_loss,var_list=self.dcc_var)
            self.train_op=tf.group(self.c_optim,self.dcc_optim)

            self.summary_stats('total_c_loss',self.c_loss)
            self.summary_stats('total_dcc_loss',self.dcc_loss)


            for node in self.nodes:
                with tf.name_scope(node.name):
                    #TODO:replace with summary_stats
                    self.summary_stats(node.name+'_fake',node.label,hist=True)
                    self.summary_stats(node.name+'_real',label_loader[node.name],hist=True)


        self.summaries=tf.get_collection(self.summs[0])
        print('causalcontroller has',len(self.summaries),'summaries')
        self.summary_op=tf.summary.merge(self.summaries)


    @property
    def dcc_var(self):
        if self.config.is_pretrain:
            if self.config.pt_factorized:
                return list(chain.from_iterable([n.dcc_var for n in self.nodes]))
            else:
                return self._dcc_var
        else:
            return []


    def critic_update(self,sess):
        fetch_dict = {"critic_op":self.dcc_optim }
        for i in range(self.config.n_critic):
            result = sess.run(fetch_dict)


    def __len__(self):
        return len(self.node_dict)


    def list_placeholders(self):
        return [n.z for n in self.nodes]
    def list_labels(self):
        return [n.label for n in self.nodes]
    def list_label_logits(self):
        return [n.label_logit for n in self.nodes]

    def do2feed(self,do_dict):
        '''
        used internally to convert a dictionary to a feed_dict
        '''
        feed_dict={}
        for key,value in do_dict.items():
            feed_dict[self.label_dict[key]]=value
        return feed_dict

    def sample_label(self, sess, cond_dict=None,do_dict=None,N=None,verbose=False):
        '''
        This is a method to sample conditional and interventional
        distributions over labels. This is disconnected from
        interventions/conditioning that include the image because it is
        potentially faster. (images are not generated for rejected samples).
        The intent is to pass these labels to the image generator.

        This is low level. One experiment type(N times) per function call.
        values of dictionaries should be scalars

        Assumed that label_dict is always the fetch

        may combine conditioning and intervening
        '''

        do_dict= do_dict or {}
        cond_dict= cond_dict or {}
        fetch_dict=self.label_dict

        #boolean scalars are all that is allowed
        for v in cond_dict.values():
            assert(v==0 or v==1)
        for v in do_dict.values():
            assert(v==0 or v==1)

        arr_do_dict={k:v*np.ones([N,1]) for k,v in do_dict.items()}

        feed_dict = self.do2feed(arr_do_dict)#{tensor:array}
        feed_dict.update({self.batch_size:N})

        if verbose:
            print('feed_dict',feed_dict)
            print('fetch_dict',fetch_dict)

        #No conditioning loop needed
        if not cond_dict:
            return sess.run(fetch_dict, feed_dict)

        else:#cond_dict not None

            rows=np.arange(N)#what idx do we need
            #init
            max_fail=4000
            n_fails=0
            outputs=ArrayDict()
            iter_rows=np.arange(N)
            n_remaining=N

            ii=0
            while( n_remaining > 0 ):
                ii+=1

                #Run N samples
                out=sess.run(fetch_dict, feed_dict)

                bool_pass = did_succeed(out,cond_dict)
                pass_idx=iter_rows[bool_pass]
                pass_idx=pass_idx[:n_remaining]
                pass_dict={k:v[pass_idx] for k,v in out.items()}

                outputs.concat(pass_dict)
                n_remaining=N-len(outputs)

                #    :(
                if ii>max_fail:
                    print('WARNING: for cond_dict:',cond_dict,)
                    print('could not condition in ',max_fail*N, 'samples')
                    break

            else:
                if verbose:
                    print('for cond_dict:',cond_dict,)
                    print('conditioning finished normally with ',ii,'tries')

            return outputs.dict
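

# Illustrative sketch (not part of the original class): the conditioning branch
# of sample_label above is plain rejection sampling. The NumPy-only helper
# below mimics that loop outside of TensorFlow. `_rejection_sample_sketch` and
# `draw` are hypothetical names introduced for this example; `draw(n)` is
# assumed to return a dict {node_name: array of shape [n,1]}.
def _rejection_sample_sketch(draw, cond_dict, N, max_tries=4000):
    import numpy as np  # local import so the sketch stands alone
    kept = {}
    n_kept = 0
    for _ in range(max_tries):
        out = draw(N)  # one batch of N candidate rows
        mask = np.ones(N, dtype=bool)
        for key, target in cond_dict.items():
            # a row passes if round(label) equals the requested 0/1 value
            mask &= (np.round(np.squeeze(out[key], axis=1)) == target)
        idx = np.flatnonzero(mask)[:N - n_kept]  # never keep more than needed
        for key, val in out.items():
            kept.setdefault(key, []).append(val[idx])
        n_kept += len(idx)
        if n_kept >= N:
            break
    # may hold fewer than N rows if max_tries was exhausted
    return {k: np.concatenate(v, axis=0) for k, v in kept.items()}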




class CausalNode(object):
    '''
    A CausalNode sets up a small neural network:
    z_noise+[,other causes] -> label_logit

    Everything is defined in terms of @property
    to allow tensorflow graph to be lazily generated as called
    because I don't enforce that a node's parent tf graph
    is constructed already during class.setup_tensor

    Uniform[-1,1] + other causes passes through n_layers fully connected layers.
    '''
    train = True
    name=None
    #logit is going to be 1 dim with sigmoid
    #as opposed to 2 dim with softmax
    _label_logit=None
    _label=None
    parents=[]#list of CausalNodes
    n_layers=3
    n_hidden=10
    batch_size=-1#Must be set by cc
    summs=['cc_summaries']

    def summary_scalar(self,name,ten):
        tf.summary.scalar(name,ten,collections=self.summs)
    def summary_stats(self,name,ten,hist=False):
        summary_stats(name,ten,collections=self.summs,hist=hist)

    def __init__(self,name,config):
        self.name=name
        self.config=config

        if self.batch_size==-1:
            raise Exception('class attribute CausalNode.batch_size must be set')

        with tf.variable_scope(self.name) as vs:
            #I think config.seed would have to be passed explicitly here
            self.z=tf.random_uniform((self.batch_size,self.n_hidden),minval=-1.0,maxval=1.0)
            self.init_var = tf.contrib.framework.get_variables(vs)
            self.setup_var=[]#empty until setup_tensor runs

    def setup_tensor(self):
        if self._label is not None:#already setup
            if debug:
                #Notify that already setup (normal behavior)
                print('self.',self.name,' has declined to set up tensor again')
            return

        tf_parents=[self.z]+[node.label for node in self.parents]


        with tf.variable_scope(self.name) as vs:
            h=tf.concat(tf_parents,-1)#tensor of parent values
            for l in range(self.n_layers-1):
                h=slim.fully_connected(h,self.n_hidden,activation_fn=lrelu,scope='layer'+str(l))

            self._label_logit = slim.fully_connected(h,1,activation_fn=None,scope='proj')
            self._label=tf.nn.sigmoid( self._label_logit )
            if debug:
                print('self.',self.name,' has setup _label=',self._label)

            #There could actually be some (quiet) error here I think if one of the
            #names in the causal graph is a substring of some other name.
                #e.g. 'hair' and 'black_hair'
            #Sorry, not coded to anticipate corner case
            self.setup_var=tf.contrib.framework.get_variables(vs)
    @property
    def var(self):
        if len(self.setup_var)==0:
            print('WARN: node var was accessed before it was constructed')
        return self.init_var+self.setup_var
    @property
    def train_var(self):
        trainable=tf.get_collection('trainable_variables')
        return [v for v in self.var if v in trainable]

    @property
    def label_logit(self):
        #Less stable. Better to access labels
        #for input to another model
        if self._label_logit is not None:
            return self._label_logit
        else:
            self.setup_tensor()
            return self._label_logit
    @property
    def label(self):
        if self._label is not None:
            return self._label
        else:
            self.setup_tensor()
            return self._label


    def setup_pretrain(self,config,label_loader,DCC):
        '''
        This function is not functional because
        this only happens if cc_config.pt_factorized=True.

        In this case convergence of each node is treated like its
        own gan conditioned on the parent nodes labels.

        I couldn't bring myself to delete it, but it's not needed
        to get good convergence for the models we tested.
        '''

        print('setting up pretrain:',self.name)

        with tf.variable_scope(self.name,reuse=self.reuse) as vs:
            self.config=config
            n_hidden=self.config.critic_hidden_size

            parent_names=[p.name for p in self.parents]
            real_inputs=tf.concat([label_loader[n] for n in parent_names]+[label_loader[self.name]],axis=1)
            fake_inputs=tf.concat([p.label for p in self.parents]+[self.label],axis=1)

            real_prob,self.dcc_real_logit,self.dcc_var=DCC(real_inputs,self.batch_size,n_hidden,self.config)
            fake_prob,self.dcc_fake_logit,_=DCC(fake_inputs,self.batch_size,n_hidden,self.config,reuse=True)

            grad_cost,self.dcc_slopes=Grad_Penalty(real_inputs,fake_inputs,DCC,self.config)

            self.dcc_diff = self.dcc_fake_logit - self.dcc_real_logit
            self.dcc_gan_loss=tf.reduce_mean(self.dcc_diff)
            self.dcc_grad_loss=grad_cost
            self.dcc_loss=self.dcc_gan_loss+self.dcc_grad_loss#
            self.c_loss=-tf.reduce_mean(self.dcc_fake_logit)#

            self.summary_scalar('dcc_gan_loss',self.dcc_gan_loss)
            self.summary_scalar('dcc_grad_loss',self.dcc_grad_loss)
            self.summary_stats('dcc_slopes',self.dcc_slopes,hist=True)

            if config.optimizer == 'adam':
                optimizer = tf.train.AdamOptimizer
            else:
                raise Exception("[!] Caution! Optimizer untested {}. Only tested Adam".format(config.optimizer))
            self.c_optimizer, self.dcc_optimizer = optimizer(config.pt_cc_lr),optimizer(config.pt_dcc_lr)

            self.c_optim=self.c_optimizer.minimize(self.c_loss,var_list=self.train_var)
            self.dcc_optim=self.dcc_optimizer.minimize(self.dcc_loss,var_list=self.dcc_var)

            self.summary_stats('c_loss',self.c_loss)
            self.summary_stats('dcc_loss',self.dcc_loss)
            self.summary_stats('dcc_real_logit',self.dcc_real_logit,hist=True)
            self.summary_stats('dcc_fake_logit',self.dcc_fake_logit,hist=True)



================================================
FILE: causal_controller/__init__.py
================================================


================================================
FILE: causal_controller/config.py
================================================
'''

These are the command line parameters that pertain exclusively to the
CausalController.

'''

from __future__ import print_function
import argparse

def str2bool(v):
    #return (v is True) or (v.lower() in ('true', '1'))
    return v is True or v.lower() in ('true', '1')

arg_lists = []
parser = argparse.ArgumentParser()

def add_argument_group(name):
    arg = parser.add_argument_group(name)
    arg_lists.append(arg)
    return arg

#Pretrain network
pretrain_arg=add_argument_group('Pretrain')
pretrain_arg.add_argument('--pt_load_path', type=str, default='')
pretrain_arg.add_argument('--is_pretrain',type=str2bool,default=False,
                         help='to do pretraining')
#pretrain_arg.add_argument('--only_pretrain', action='store_true',
#                         help='simply complete pretrain and exit')

#Used to be an option, but now is solved
#pretrain_arg.add_argument('--pretrain_type',type=str,default='wasserstein',choices=['wasserstein','gan'])

pretrain_arg.add_argument('--pt_cc_lr',type=float,default=0.00008,#
                          help='learning rate for causal controller')
pretrain_arg.add_argument('--pt_dcc_lr',type=float,default=0.00008,#
                          help='learning rate for the critic (dcc) of the causal controller')
pretrain_arg.add_argument('--lambda_W',type=float,default=0.1,#
                          help='penalty for gradient of W critic')
pretrain_arg.add_argument('--n_critic',type=int,default=20,#5 for speed
                          help='number of critic iterations between gen update')
pretrain_arg.add_argument('--critic_layers',type=int,default=6,#4 usual.8 might help
                          help='number of layers in the Wasserstein discriminator')
pretrain_arg.add_argument('--critic_hidden_size',type=int,default=15,#10,15
                         help='hidden_size for critic of discriminator')

pretrain_arg.add_argument('--min_tvd',type=float,default=0.02,
                          help='if tvd<min_tvd then stop pretrain')
pretrain_arg.add_argument('--min_pretrain_iter',type=int,default=5000,
                          help='''pretrain for at least this long before
                          stopping early due to tvd convergence. This is to
                          avoid being able to get a low tvd without labels
                          being clustered near integers''')
pretrain_arg.add_argument('--pretrain_iter',type=int,default=10000,
                          help='if iter>pretrain_iter then stop pretrain')
#pretrain_arg.add_argument('--pretrain_labeler',type=str2bool,default=False,
#                          help='''whether to train the labeler on real images
#                          during pretraining''')

pretrain_arg.add_argument('--pt_factorized',type=str2bool,default=False,
                          help='''Interesting approach that seemed to stabilize
                          training, but is not needed in this application.
                          It turned out that we could get very good training without
                          this complication, so we did not include in the paper.
                          I've left it commented out here in the code.

                          Whether the discriminator should be
                          factorized according to the structure of the graph
                          to speed/stabilize convergence.
                          
                          This creates a separate discriminator for each node
                          that only looks at each causal nodes value and its
                          parents''')

#Network
net_arg = add_argument_group('Network')

net_arg.add_argument('--cc_n_layers',type=int, default=6,
                     help='''This is the number of neural network fc layers
                     between the causes of a node and the node itself.''')
net_arg.add_argument('--cc_n_hidden',type=int, default=10,
                     help='''number of neurons per layer in causal controller.
                     Also functions as the dimensionality of the uniform noise
                     input to the controller''')

# Data
data_arg = add_argument_group('Data')
data_arg.add_argument('--causal_model', type=str)
data_arg.add_argument('--dataset', type=str, default='celebA')

data_arg.add_argument('--batch_size', type=int, default=16)
data_arg.add_argument('--num_worker', type=int, default=24,
     help='number of threads to use for loading and preprocessing data')

# Training / test parameters
train_arg = add_argument_group('Training')




# Misc
misc_arg = add_argument_group('Misc')
misc_arg.add_argument('--load_path', type=str, default='')
misc_arg.add_argument('--log_step', type=int, default=100)
misc_arg.add_argument('--save_step', type=int, default=5000)
misc_arg.add_argument('--num_log_samples', type=int, default=3)
misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN'])
misc_arg.add_argument('--log_dir', type=str, default='logs')


def get_config():
    config, unparsed = parser.parse_known_args()
    print('Loaded ./causal_controller/config.py')
    return config, unparsed

if __name__=='__main__':
    #for debug of config
    config, unparsed = get_config()



================================================
FILE: causal_controller/models.py
================================================
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim


def lrelu(x,leak=0.2,name='lrelu'):
    with tf.variable_scope(name):
        #Trick that saves memory by avoiding tf.max
        f1=0.5 * (1+leak)
        f2=0.5 * (1-leak)
        return f1*x + f2*tf.abs(x)
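

# Illustrative sketch (not part of the original module): a NumPy check of the
# identity that lrelu relies on. For 0 <= leak <= 1,
#   0.5*(1+leak)*x + 0.5*(1-leak)*|x|  ==  max(x, leak*x)
# `_lrelu_identity_check` is a hypothetical helper name used only here.
def _lrelu_identity_check(x, leak=0.2):
    import numpy as np  # local import so the sketch stands alone
    x = np.asarray(x, dtype=np.float64)
    abs_form = 0.5 * (1 + leak) * x + 0.5 * (1 - leak) * np.abs(x)
    max_form = np.maximum(x, leak * x)  # the usual leaky-ReLU definition
    return bool(np.allclose(abs_form, max_form))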


def DiscriminatorW(labels,batch_size, n_hidden, config, reuse=None):
    '''
    A simple discriminator to be used with Wasserstein optimization.
    No minibatch features or batch normalization is used.
    '''
    with tf.variable_scope("WasserDisc") as scope:
        if reuse:
            scope.reuse_variables()
        h=labels
        act_fn=lrelu
        n_neurons=n_hidden
        for i in range(config.critic_layers):
            if i==config.critic_layers-1:
                act_fn=None
                n_neurons=1
            scp='WD'+str(i)
            h = slim.fully_connected(h,n_neurons,activation_fn=act_fn,scope=scp)
        variables = tf.contrib.framework.get_variables(scope)
        return tf.nn.sigmoid(h),h,variables


def Grad_Penalty(real_data,fake_data,Discriminator,config):
    '''
    Implementation from "Improved Training of Wasserstein GANs".
    Interpolation based estimation of the gradient of the discriminator.
    Used to penalize the derivative rather than explicitly constrain lipschitz.
    '''
    batch_size=config.batch_size
    LAMBDA=config.lambda_W
    n_hidden=config.critic_hidden_size
    alpha = tf.random_uniform([batch_size,1],0.,1.)
    interpolates = alpha*real_data + ((1-alpha)*fake_data)#Could do more if not fixed batch_size
    disc_interpolates = Discriminator(interpolates,batch_size,n_hidden=n_hidden,config=config, reuse=True)[1]#logits
    gradients = tf.gradients(disc_interpolates,[interpolates])[0]#orig
    slopes = tf.sqrt(tf.reduce_sum(tf.square(gradients),
                           reduction_indices=[1]))
    gradient_penalty = tf.reduce_mean((slopes-1)**2)
    grad_cost = LAMBDA*gradient_penalty
    return grad_cost,slopes
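

# Illustrative sketch (not part of the original module): the same penalty as
# Grad_Penalty, written in NumPy for a toy linear critic D(x)=x.dot(w), whose
# input gradient is known in closed form (it is w for every row). This is only
# meant to make the arithmetic concrete; `_grad_penalty_linear_critic`, `w`,
# `lam` and `seed` are hypothetical names, not part of the training code.
def _grad_penalty_linear_critic(real, fake, w, lam=0.1, seed=0):
    import numpy as np  # local import so the sketch stands alone
    rng = np.random.RandomState(seed)
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)
    # one interpolation coefficient per row, as in Grad_Penalty
    alpha = rng.uniform(0., 1., size=(real.shape[0], 1))
    interpolates = alpha * real + (1 - alpha) * fake
    # for a linear critic the gradient does not depend on the interpolate
    gradients = np.tile(w, (interpolates.shape[0], 1))
    slopes = np.sqrt(np.sum(np.square(gradients), axis=1))
    grad_cost = lam * np.mean((slopes - 1) ** 2)  # penalize slopes away from 1
    return grad_cost, slopes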



================================================
FILE: causal_controller/utils.py
================================================
from __future__ import print_function
import numpy as np
import tensorflow as tf

def summary_stats(name,tensor,collections=None,hist=False):
    collections=collections or [tf.GraphKeys.SUMMARIES]
    ave=tf.reduce_mean(tensor)
    std=tf.sqrt(tf.reduce_mean(tf.square(ave-tensor)))
    tf.summary.scalar(name+'_ave',ave,collections)
    tf.summary.scalar(name+'_std',std,collections)
    if hist:
        tf.summary.histogram(name+'_hist',tensor,collections)

def did_succeed( output_dict, cond_dict ):
    '''
    Used in rejection sampling:
    for each row, determine if cond is satisfied
    for every cond in cond_dict

    success is hardcoded as round(label) being exactly equal
    to the integer in cond_dict
    '''

    #definition success:
    def is_win(key):
        cond=np.squeeze(cond_dict[key])
        val=np.squeeze(output_dict[key])
        condition= np.round(val)==cond
        return condition

    scoreboard=[is_win(key) for key in cond_dict]
    #print('scoreboard', scoreboard)
    all_victories_bool=np.logical_and.reduce(scoreboard)
    return all_victories_bool.flatten()



================================================
FILE: causal_dcgan/CausalGAN.py
================================================
from __future__ import division,print_function
from figure_scripts.pairwise import crosstab
from figure_scripts.sample import intervention2d
import os
import time
import math
from glob import glob
import tensorflow as tf
import numpy as np
from six.moves import xrange
import pandas as pd
import sys
import scipy.stats as stats

from models import GeneratorCNN,DiscriminatorCNN,discriminator_labeler
from models import discriminator_gen_labeler,discriminator_on_z

from tensorflow.core.framework import summary_pb2
from tensorflow.contrib import slim

from ops import batch_norm,lrelu

from causal_graph import get_causal_graph

def norm_img(image):
    image = image/127.5 - 1.
    return image
def denorm_img(norm):
    return tf.clip_by_value((norm + 1)*127.5, 0, 255)

def tf_truncexpon(batch_size,rate,right):
    '''
    a tensorflow node that returns a random variable
    sampled from an Exp(rate) random variable
    which has been truncated and normalized to [0,right]

    #Leverages that log of uniform is exponential

    batch_size: a tensorflow placeholder to sync batch_size everywhere
    rate: lambda rate parameter for exponential dist
    right: float in (0,inf) where to truncate exp distribution
    '''

    uleft=tf.exp(-1*rate*right)
    U=tf.random_uniform(shape=(batch_size,1),minval=uleft,maxval=1)
    tExp=(-1/rate)*tf.log(U)

    return tExp
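

# Illustrative sketch (not part of the original module): a NumPy version of the
# inverse-CDF trick used in tf_truncexpon. If U ~ Uniform[exp(-rate*right), 1),
# then -log(U)/rate follows an Exp(rate) distribution truncated to [0,right].
# `_np_truncexpon_sketch` and `seed` are hypothetical names used only here.
def _np_truncexpon_sketch(n, rate, right, seed=0):
    import numpy as np  # local import so the sketch stands alone
    rng = np.random.RandomState(seed)
    uleft = np.exp(-rate * right)  # uniform value that maps to `right`
    U = rng.uniform(low=uleft, high=1.0, size=(n, 1))
    return (-1.0 / rate) * np.log(U)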

def add_texp_noise(batch_size,labels01):
    labels=0.3+labels01*0.4#{0.3,0.7}
    lower, upper, scale = 0, 0.2, 1/25.0
    lower_tail, upper_tail, scale_tail = 0, 0.3, 1/50.0
    #before #t = stats.truncexpon(b=(upper-lower)/scale, loc=lower, scale=scale)
    #b*scale was the right-boundary
    b=(upper-lower)/scale
    b_tail=(upper_tail-lower_tail)/scale_tail

    s=tf_truncexpon(batch_size,rate=b,right=upper)
    s_tail=tf_truncexpon(batch_size,rate=b_tail,right=upper_tail)
    labels = labels + ((0.5-labels)/0.2)*s + ((-0.5+labels)/0.2)*s_tail
    return labels, [s,s_tail]

class CausalGAN(object):
    model_type='dcgan'

    def __init__(self,batch_size,config):

        self.batch_size = batch_size #a tensor
        self.config=config
        self.model_dir=config.model_dir
        self.TINY = 10**-6

        self.step = tf.Variable(0, name='step', trainable=False)
        self.inc_step=tf.assign(self.step,self.step+1)

        #########################################
        ##### Following is not used anymore #####
        #########################################
        self.gamma_k = tf.get_variable(name='gamma_k',initializer=config.gamma_k,trainable=False)
        self.lambda_k = config.lambda_k#0.05
        self.gamma_l = config.gamma_l#self.label_loss_hyperparameter
        self.lambda_l = config.lambda_l#0.005
        self.gamma_m = 1./(self.gamma_k+self.TINY)#gamma_m#4.0 # allowing gan loss to be 8 times labelerR loss
        #self.gamma_m=config.gamma_m
        self.lambda_m =config.lambda_m#0.05
        #########################################

        self.k_t = tf.get_variable(name='k_t',initializer=1.,trainable=False) # kt is the closed loop feedback coefficient to balance the loss between LR and LG

        self.rec_loss_coeff = 0.0
        print('WARNING:CausalGAN.rec_loss_coeff=',self.rec_loss_coeff)

        self.hidden_size=config.critic_hidden_size

        self.gf_dim = config.gf_dim
        self.df_dim = config.df_dim

        self.loss_function = config.loss_function

    def __call__(self, real_inputs, fake_inputs):
        '''
        This builds the model on the inputs. Potentially this would be called
        multiple times in a multi-gpu situation. Put "setup" type stuff in
        __init__ instead.

        This is like self.build_model()

        fake inputs is a dictionary of labels from cc
        real_inputs is also a dictionary of labels
            with an additional key 'x' for the real image
        '''
        config=self.config#used many times

        #dictionaries
        self.real_inputs=real_inputs
        self.fake_inputs=fake_inputs

        n_labels=len(fake_inputs)
        self.x = self.real_inputs.pop('x')#[0,255]
        x = norm_img(self.x)#put in [-1,1]

        #These are 0,1 labels. To add noise, add noise from here.
        self.real_labels=tf.concat(list(self.real_inputs.values()),-1)
        self.fake_labels=tf.concat(list(self.fake_inputs.values()),-1)

        ##BEGIN manipulating labels##

        #Fake labels will already be nearly discrete
        if config.round_fake_labels: #default
            fake_labels=tf.round(self.fake_labels)#{0,1}
            real_labels=tf.round(self.real_labels)#should already be rounded
        else:
            fake_labels=self.fake_labels#in [0,1], not rounded
            real_labels=self.real_labels

        if config.label_type=='discrete':
            fake_labels=0.3+fake_labels*0.4#{0.3,0.7}
            real_labels=0.3+real_labels*0.4#{0.3,0.7}

        elif config.label_type=='continuous':

            #this is so that they can be set to 0 in label_interpolation
            self.noise_variables=[]

            if config.label_specific_noise:
                #TODO#uniform see above #REFERENCE
                raise Exception('label_specific_noise=True not yet implemented')
            else:#default
                fake_labels,nvfake=add_texp_noise(self.batch_size,fake_labels)
                real_labels,nvreal=add_texp_noise(self.batch_size,real_labels)
                self.noise_variables.extend(nvfake)
                self.noise_variables.extend(nvreal)

            tf.summary.histogram('noisy_fake_labels',fake_labels)
            tf.summary.histogram('noisy_real_labels',real_labels)

        self.fake_labels_logits= -tf.log(1/(fake_labels+self.TINY)-1)
        self.real_labels_logits = -tf.log(1/(real_labels+self.TINY)-1)

        self.noisy_fake_labels=fake_labels
        self.noisy_real_labels=real_labels

        if config.type_input_to_generator=='labels':
            self.fake_labels_inputs=fake_labels
            self.real_labels_inputs=real_labels#for reconstruction
        elif config.type_input_to_generator=='logits': #default
            self.fake_labels_inputs=self.fake_labels_logits
            self.real_labels_inputs=self.real_labels_logits

        ##FINISHED manipulating labels##

        self.z_gen = tf.random_uniform( [self.batch_size, config.z_dim],minval=-1.0, maxval=1.0,name='z_gen')

        self.z= tf.concat( [self.z_gen, self.fake_labels_inputs],axis=-1,name='z')

        G, self.g_vars = GeneratorCNN(self.z,config)#[-1,1]float
        self.G=denorm_img(G)#[0,255]

        #Discriminator
        D_on_real=DiscriminatorCNN(x,config)
        D_on_fake=DiscriminatorCNN(G,config,reuse=True)
        self.D, self.D_logits ,self.features_to_estimate_z_on_input ,self.d_vars=D_on_real
        self.D_,self.D_logits_,self.features_to_estimate_z_on_generated,_ =D_on_fake

        #Discriminator Labeler
        self.D_labels_for_real, self.D_labels_for_real_logits, self.dl_vars =\
                discriminator_labeler(x,n_labels,config)
        self.D_labels_for_fake, self.D_labels_for_fake_logits, _ =\
                discriminator_labeler(G,n_labels,config,reuse=True)

        #Other discriminators
        self.D_gen_labels_for_fake,self.D_gen_labels_for_fake_logits,self.dl_gen_vars=\
            discriminator_gen_labeler(G,n_labels,config)
            #discriminator_gen_labeler(self.G,n_labels,config)

        self.D_on_z_real,_ =discriminator_on_z(self.features_to_estimate_z_on_input,config)
        self.D_on_z,self.dz_vars=discriminator_on_z(self.features_to_estimate_z_on_generated,config,reuse=True)

        #order of concat matters
        self.z_for_real = tf.concat([self.D_on_z_real,self.real_labels_inputs], axis=1 , name ='z_real')
        self.inputs_reconstructed,_ = GeneratorCNN(self.z_for_real,self.config, reuse = True)
        # Reconstructability is an idea that we tried. It does not provide big improvements, hence is not used in the current version.

        tf.summary.histogram('d',self.D)
        tf.summary.histogram('d_',self.D_)
        tf.summary.image('G',self.G,max_outputs=10)

        def sigmoid_cross_entropy_with_logits(x, y):
            return tf.nn.sigmoid_cross_entropy_with_logits(logits=x, labels=y)

        # We tried different loss functions: 0,1,2 all have the order of terms in the cross entropy loss flipped, whereas 3,4,5 are not (consistent with theory).
        # Although all of them work to some extent, we have seen the sharpest images and best image quality with "loss function 1".
        # Difference between 0, 1, 2: This is to see the effect of using different GAN losses, as mentioned in the paper.
        if self.loss_function == 0:
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.fake_labels_logits,self.D_labels_for_fake))
            self.g_lossGAN = tf.reduce_mean(
              -sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_))+sigmoid_cross_entropy_with_logits(self.D_logits_, tf.ones_like(self.D_)))
        elif self.loss_function == 1:#default
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.fake_labels_logits,self.D_labels_for_fake))
            self.g_lossGAN = tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_logits_, tf.ones_like(self.D_)))
        elif self.loss_function == 2:
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.fake_labels_logits,self.D_labels_for_fake))
            self.g_lossGAN = tf.reduce_mean(-sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_)))
        elif self.loss_function == 3:
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_labels_for_fake_logits, self.fake_labels))
            self.g_lossGAN = tf.reduce_mean(
              -sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_))+sigmoid_cross_entropy_with_logits(self.D_logits_, tf.ones_like(self.D_)))
        elif self.loss_function == 4:
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_labels_for_fake_logits, self.fake_labels))
            self.g_lossGAN = tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_logits_, tf.ones_like(self.D_)))
        elif self.loss_function == 5:
            self.g_lossLabels= tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_labels_for_fake_logits, self.fake_labels))
            self.g_lossGAN = tf.reduce_mean(-sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_)))
        else:
            raise Exception('Something is wrong with the loss function. '
                            'self.loss_function={}'.format(self.loss_function))

        self.g_lossLabels_GLabeler = tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.fake_labels_logits,self.D_gen_labels_for_fake))
        tf.summary.scalar("g_loss_labelerG",self.g_lossLabels_GLabeler)

        self.g_loss_on_z = tf.reduce_mean(tf.abs(self.z_gen - self.D_on_z)**2)
        #x is the real input image
        self.real_reconstruction_loss = tf.reduce_mean(tf.abs(x-self.inputs_reconstructed)**2)

        tf.summary.scalar('real_reconstruction_loss', self.real_reconstruction_loss)

        self.d_loss_real = tf.reduce_mean(
          sigmoid_cross_entropy_with_logits(self.D_logits, tf.ones_like(self.D)))
        self.d_loss_fake = tf.reduce_mean(
          sigmoid_cross_entropy_with_logits(self.D_logits_, tf.zeros_like(self.D_)))

        if config.reconstr_loss:
            g_loss_on_z=self.g_loss_on_z
        else:
            g_loss_on_z=0.
            # Default value for now, since reconstructability is not used in the current version.

        if config.off_label_losses:
            self.g_loss = self.g_lossGAN
        else:#default
            self.g_loss = self.g_lossGAN - 1.0*self.k_t*self.g_lossLabels_GLabeler + self.g_lossLabels + g_loss_on_z

        tf.summary.scalar('g_loss_labelerR', self.g_lossLabels)
        tf.summary.scalar('g_lossGAN', self.g_lossGAN)
        tf.summary.scalar('g_loss_on_z', self.g_loss_on_z)
        tf.summary.scalar('coeff_of_negLabelerG_loss_k_t', self.k_t)
        tf.summary.scalar('gamma_k_summary', self.gamma_k)

        self.d_labelLossReal = tf.reduce_mean(sigmoid_cross_entropy_with_logits(self.D_labels_for_real_logits,self.real_labels))

        tf.summary.scalar("d_loss_real", self.d_loss_real)
        tf.summary.scalar("d_loss_fake", self.d_loss_fake)
        tf.summary.scalar("d_loss_real_label", self.d_labelLossReal)

        self.d_loss = self.d_loss_real + self.d_loss_fake

        tf.summary.scalar("g_loss", self.g_loss)
        tf.summary.scalar("d_loss", self.d_loss)

    def build_train_op(self):
        config=self.config

        self.g_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \
                  .minimize(self.g_loss, var_list=self.g_vars)

        self.d_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \
                  .minimize(self.d_loss, var_list=self.d_vars)

        self.d_label_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \
                  .minimize(self.d_labelLossReal, var_list=self.dl_vars)

        self.d_gen_label_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \
                  .minimize(self.g_lossLabels_GLabeler, var_list=self.dl_gen_vars)

        self.d_on_z_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1) \
                  .minimize(self.g_loss_on_z + self.rec_loss_coeff*self.real_reconstruction_loss, var_list=self.dz_vars)

        self.k_t_update = tf.assign(self.k_t, self.k_t*tf.exp(-1.0/config.tau) )

        self.train_op=tf.group(self.d_gen_label_optim,self.d_label_optim,self.d_optim,self.g_optim,self.d_on_z_optim)

    def build_summary_op(self):
        self.summary_op=tf.summary.merge_all()

    def train_step(self,sess,counter):
        '''
        This is a generic function that will be called by the Trainer class
        once per iteration. The simplest body for this part would be simply
        "sess.run(self.train_op)". But you may have more complications.

        Running self.summary_op is handled by Trainer.Supervisor and doesn't
        need to be addressed here

        Only counters, not epochs are explicitly kept track of
        '''

        ###You can wait until counter>N to do stuff for example:
        if self.config.pretrain_LabelerR and counter < self.config.pretrain_LabelerR_no_of_iters:
            sess.run(self.d_label_optim)

        else:
            if np.mod(counter, 3) == 0:

                sess.run(self.g_optim)
                sess.run([self.train_op,self.k_t_update,self.inc_step])#all ops

            else:
                sess.run([self.g_optim, self.k_t_update ,self.inc_step])
                sess.run(self.g_optim)

================================================
FILE: causal_dcgan/__init__.py
================================================


================================================
FILE: causal_dcgan/config.py
================================================
from __future__ import print_function
import argparse

def str2bool(v):
    return v is True or v.lower() in ('true', '1')

arg_lists = []
parser = argparse.ArgumentParser()

def add_argument_group(name):
    arg = parser.add_argument_group(name)
    arg_lists.append(arg)
    return arg

# Data
data_arg = add_argument_group('Data')
data_arg.add_argument('--batch_size', type=int, default=64,
                     help='''default batch_size when using this model and not
                      specifying the batch_size elsewhere''')



data_arg.add_argument('--label_specific_noise',type=str2bool,default=False,
                      help='whether to add noise dependent on the data mean')

#This flag doesn't function. Model is designed to take in CC.labels
data_arg.add_argument('--fakeLabels_distribution',type=str,choices=['real_joint','iid_uniform'],default='real_joint')


data_arg.add_argument('--label_type',type=str,choices=['discrete','continuous'],default='continuous')
data_arg.add_argument('--round_fake_labels',type=str2bool,default=True,
                    help='''whether to round the outputs of causal controller
                      before (possibly) adding noise to them or using them as
                      input to the image generator. I highly recommend as a
                      small improvement.''')

data_arg.add_argument('--type_input_to_generator',type=str,choices=['labels','logits'],
                      default='logits',help='''Whether to send labels or logits to the generator
                      to form images. Chris recommends labels''')

#Network
net_arg = add_argument_group('Network')

#TODO need help strings
net_arg.add_argument('--df_dim',type=int, default=64 )
net_arg.add_argument('--gf_dim',type=int, default=64,
                    help='''output dimensions [gf_dim,gf_dim] for generator''')
net_arg.add_argument('--c_dim',type=int, default=3,
                     help='''number of color channels. I wouldn't really change
                     this from 3''')

net_arg.add_argument('--z_dim',type=int,default=100,
                     help='''the number of dimensions for the noise input that
                     will be concatenated with labels and fed to the image
                     generator''')

net_arg.add_argument('--loss_function',type=int,default=1,
                     help='''which loss function to choose. See CausalGAN.py''')

net_arg.add_argument('--critic_hidden_size',type=int,default=10,
                    help='''number of neurons per fc layer in discriminator''')

net_arg.add_argument('--reconstr_loss',type=str2bool,default=False,
                     help='''whether to include g_loss_on_z in the generator
                     loss. This was True by default until recently, which is
                     why there are a lot of unnecessary networks''')


net_arg.add_argument('--stab_proj',type=str2bool,default=False,
                     help='''stabilizing projection method used for
                     discriminator. Stabilizing GAN Training with Multiple
                     Random Projections
                     https://arxiv.org/abs/1705.07831''')

net_arg.add_argument('--n_stab_proj',type=int,default=256,
                     help='''number of stabilizing projections. Need
                     stab_proj=True for this to have effect''')


# Training / test parameters
train_arg = add_argument_group('Training')
train_arg.add_argument('--num_iter',type=int,default=100000,
                       help='the number of training iterations to run the model for')
train_arg.add_argument('--learning_rate',type=float,default=0.0002,
                       help='Learning rate for adam [0.0002]')
train_arg.add_argument('--beta1',type=float,default=0.5,
                       help='Momentum term of adam [0.5]')

train_arg.add_argument('--off_label_losses',type=str2bool,default=False)

#TODO unclear on default for these two arguments
#Not yet setup. Use False
train_arg.add_argument('--pretrain_LabelerR',type=str2bool,default=False)

#counters over epochs preferred
#train_arg.add_argument('--pretrain_LabelerR_no_of_epochs',type=int,default=5)
train_arg.add_argument('--pretrain_LabelerR_no_of_iters',type=int,default=15000)


#TODO: add help strings describing params
train_arg.add_argument('--lambda_m',type=float,default=0.05,)#0.05
train_arg.add_argument('--lambda_k',type=float,default=0.05,)#0.05
train_arg.add_argument('--lambda_l',type=float,default=0.001,)#0.005
train_arg.add_argument('--gamma_m',type=float,default=-1.0,)# NOT USED!
train_arg.add_argument('--gamma_k',type=float,default=-1.0,#0.8#FLAGS.gamma_k not used
                       help='''default initial value''')
train_arg.add_argument('--gamma_l',type=float,default=-1.0,
                      )

train_arg.add_argument('--tau',type=float,default=3000,
                       help='''time constant. Each call of k_t_update multiplies
                       k_t by exp(-1/tau), so every tau calls shrink k_t by a
                       factor of e.''')


#old config file differed from implementation:
#    FLAGS.gamma_k = -1.0
#    FLAGS.gamma_m = -1.0 # set to 1/gamma_k in the code
#    FLAGS.gamma_l = -1.0 # made more extreme
#    FLAGS.lambda_k = 0.05
#    FLAGS.lambda_m = 0.05
#    FLAGS.lambda_l = 0.001


# Misc
misc_arg = add_argument_group('Misc')
misc_arg.add_argument('--is_train',type=str2bool,default=False,
                      help='''whether to enter the image training loop''')
misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN'])
misc_arg.add_argument('--log_dir', type=str, default='logs')
misc_arg.add_argument('--log_step', type=int, default=100,
                     help='''how often to log stuff. Sample images are created
                     every 10*log_step''')


##REFERENCE
#  elif model_ID == 44:
#    FLAGS.is_train = True
#    #FLAGS.graph = "big_causal_graph"
#    FLAGS.graph = "complete_big_causal_graph"
#    FLAGS.loss_function = 1
#    FLAGS.pretrain_LabelerR = False
#    FLAGS.pretrain_LabelerR_no_of_epochs = 3
#    FLAGS.fakeLabels_distribution = "real_joint"
#    FLAGS.gamma_k = -1.0
#    FLAGS.gamma_m = -1.0 # set to 1/gamma_k in the code
#    FLAGS.gamma_l = -1.0 # made more extreme
#    FLAGS.lambda_k = 0.05
#    FLAGS.lambda_m = 0.05
#    FLAGS.lambda_l = 0.001
#    FLAGS.label_type = 'continuous'
#    return FLAGS



def get_config():
    config, unparsed = parser.parse_known_args()

    print('Loaded ./causal_dcgan/config.py')
    return config, unparsed

if __name__=='__main__':
    #for debug of config
    config, unparsed = get_config()



================================================
FILE: causal_dcgan/models.py
================================================
import tensorflow as tf
import numpy as np
slim = tf.contrib.slim
import math

from ops import lrelu,linear,conv_cond_concat,batch_norm,add_minibatch_features

from ops import conv2d,deconv2d


def conv_out_size_same(size, stride):
  return int(math.ceil(float(size) / float(stride)))

def GeneratorCNN( z, config, reuse=None):
    '''
    maps z to a 64x64 image with values in [-1,1]
    uses batch normalization internally
    '''

    #trying to get around batch_size like this:
    batch_size=tf.shape(z)[0]
    #batch_size=tf.placeholder_with_default(64,[],'bs')

    with tf.variable_scope("generator",reuse=reuse) as vs:
        g_bn0 = batch_norm(name='g_bn0')
        g_bn1 = batch_norm(name='g_bn1')
        g_bn2 = batch_norm(name='g_bn2')
        g_bn3 = batch_norm(name='g_bn3')

        s_h, s_w = config.gf_dim, config.gf_dim#64,64
        s_h2, s_w2 = conv_out_size_same(s_h, 2), conv_out_size_same(s_w, 2)
        s_h4, s_w4 = conv_out_size_same(s_h2, 2), conv_out_size_same(s_w2, 2)
        s_h8, s_w8 = conv_out_size_same(s_h4, 2), conv_out_size_same(s_w4, 2)
        s_h16, s_w16 = conv_out_size_same(s_h8, 2), conv_out_size_same(s_w8, 2)



        # project `z` and reshape
        z_, self_h0_w, self_h0_b = linear(
            z, config.gf_dim*8*s_h16*s_w16, 'g_h0_lin', with_w=True)

        self_h0 = tf.reshape(
            z_, [-1, s_h16, s_w16, config.gf_dim * 8])
        h0 = tf.nn.relu(g_bn0(self_h0))

        h1, h1_w, h1_b = deconv2d(
            h0, [batch_size, s_h8, s_w8, config.gf_dim*4], name='g_h1', with_w=True)
        h1 = tf.nn.relu(g_bn1(h1))

        h2, h2_w, h2_b = deconv2d(
            h1, [batch_size, s_h4, s_w4, config.gf_dim*2], name='g_h2', with_w=True)
        h2 = tf.nn.relu(g_bn2(h2))

        h3, h3_w, h3_b = deconv2d(
            h2, [batch_size, s_h2, s_w2, config.gf_dim*1], name='g_h3', with_w=True)
        h3 = tf.nn.relu(g_bn3(h3))

        h4, h4_w, h4_b = deconv2d(
            h3, [batch_size, s_h, s_w, config.c_dim], name='g_h4', with_w=True)
        out=tf.nn.tanh(h4)

    variables = tf.contrib.framework.get_variables(vs)
    return out, variables

def DiscriminatorCNN(image, config, reuse=None):
    '''
    Discriminator for GAN model.

    image      : batch_size x 64x64x3 image
    config     : see causal_dcgan/config.py
    reuse      : pass True if not calling for first time

    returns: probabilities(real)
           : logits(real)
           : h1_ activation (output of the second conv layer), used to estimate z from
           : variables list
    '''
    with tf.variable_scope("discriminator",reuse=reuse) as vs:
        d_bn1 = batch_norm(name='d_bn1')
        d_bn2 = batch_norm(name='d_bn2')
        d_bn3 = batch_norm(name='d_bn3')

        if not config.stab_proj:
            h0 = lrelu(conv2d(image, config.df_dim, name='d_h0_conv'))#16,32,32,64

        else:#method to restrict disc from winning
            #I think this is equivalent to just not letting disc optimize first layer
            #and also removing nonlinearity

            #k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
            #paper used 8x8 kernel, but I'm using 5x5 because it is more similar to my architecture
            #n_projs=config.df_dim#64 instead of 32 in paper
            n_projs=config.n_stab_proj#default 256 (see config.py) instead of 32 in paper

            print("WARNING:STAB_PROJ active, using ",n_projs," projections")

            w_proj = tf.get_variable('w_proj', [5, 5, image.get_shape()[-1],n_projs],
                initializer=tf.truncated_normal_initializer(stddev=0.02),trainable=False)
            conv = tf.nn.conv2d(image, w_proj, strides=[1, 2, 2, 1], padding='SAME')

            b_proj = tf.get_variable('b_proj', [n_projs],#does nothing
                 initializer=tf.constant_initializer(0.0),trainable=False)
            h0=tf.nn.bias_add(conv,b_proj)


        h1_ = lrelu(d_bn1(conv2d(h0, config.df_dim*2, name='d_h1_conv')))#16,16,16,128

        h1 = add_minibatch_features(h1_, config.df_dim)
        h2 = lrelu(d_bn2(conv2d(h1, config.df_dim*4, name='d_h2_conv')))#16,8,8,256
        h3 = lrelu(d_bn3(conv2d(h2, config.df_dim*8, name='d_h3_conv')))
        #print('h3shape: ',h3.get_shape().as_list())
        #print('8df_dim:',config.df_dim*8)
        #dim3=tf.reduce_prod(tf.shape(h3)[1:])
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        h4 = linear(h3_flat, 1, 'd_h3_lin')

        prob=tf.nn.sigmoid(h4)

        variables = tf.contrib.framework.get_variables(vs,collection=tf.GraphKeys.TRAINABLE_VARIABLES)

    return prob, h4, h1_, variables


def discriminator_labeler(image, output_dim, config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2, name='dl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, output_dim, 'dl_h3_Label')
        D_labels = tf.nn.sigmoid(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels, D_labels_logits, variables

def discriminator_gen_labeler(image, output_dim, config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_gen_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dgl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2, name='dgl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dgl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dgl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, output_dim, 'dgl_h3_Label')
        D_labels = tf.nn.sigmoid(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels, D_labels_logits,variables

def discriminator_on_z(image, config, reuse=None):
    batch_size=tf.shape(image)[0]
    with tf.variable_scope("disc_z_labeler",reuse=reuse) as vs:
        dl_bn1 = batch_norm(name='dl_bn1')
        dl_bn2 = batch_norm(name='dl_bn2')
        dl_bn3 = batch_norm(name='dl_bn3')

        h0 = lrelu(conv2d(image, config.df_dim, name='dzl_h0_conv'))#16,32,32,64
        h1 = lrelu(dl_bn1(conv2d(h0, config.df_dim*2, name='dzl_h1_conv')))#16,16,16,128
        h2 = lrelu(dl_bn2(conv2d(h1, config.df_dim*4, name='dzl_h2_conv')))#16,8,8,256
        h3 = lrelu(dl_bn3(conv2d(h2, config.df_dim*8, name='dzl_h3_conv')))
        dim3=np.prod(h3.get_shape().as_list()[1:])
        h3_flat=tf.reshape(h3, [-1,dim3])
        D_labels_logits = linear(h3_flat, config.z_dim, 'dzl_h3_Label')
        D_labels = tf.nn.tanh(D_labels_logits)
        variables = tf.contrib.framework.get_variables(vs)
    return D_labels,variables





================================================
FILE: causal_dcgan/ops.py
================================================
import math
import numpy as np
import tensorflow as tf

from tensorflow.python.framework import ops

from utils import *



class batch_norm(object):
    def __init__(self, epsilon=1e-5, momentum = 0.9, name="batch_norm"):
        with tf.variable_scope(name):
            self.epsilon  = epsilon
            self.momentum = momentum
            self.name = name

    def __call__(self, x, train=True):
        return tf.contrib.layers.batch_norm(x,
                                          decay=self.momentum,
                                          updates_collections=None,
                                          epsilon=self.epsilon,
                                          scale=True,
                                          is_training=train,
                                          scope=self.name)

def conv_cond_concat(x, y):
    """Concatenate conditioning vector on feature map axis."""
    #print('input x:',x.get_shape().as_list())
    #print('input y:',y.get_shape().as_list())

    xshape=x.get_shape()
    #tile by [1,64,64,1]

    tile_shape=tf.stack([1,xshape[1],xshape[2],1])
    tile_y=tf.tile(y,tile_shape)

    #print('tile y:',tile_y.get_shape().as_list())

    return tf.concat([x,tile_y],axis=3)


    #x_shapes = x.get_shape()
    #y_shapes = y.get_shape()
    #return tf.concat([
    #x, y*tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3)


def conv2d(input_, output_dim,
       k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
       name="conv2d"):
  with tf.variable_scope(name):
    w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
              initializer=tf.truncated_normal_initializer(stddev=stddev))
    conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME')

    biases = tf.get_variable('biases', [output_dim], initializer=tf.constant_initializer(0.0))
    #conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape())
    conv=tf.nn.bias_add(conv,biases)

    return conv

def deconv2d(input_, output_shape,
       k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
       name="deconv2d", with_w=False):
    with tf.variable_scope(name):
        # filter : [height, width, output_channels, in_channels]
        w = tf.get_variable('w', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]],
                  initializer=tf.random_normal_initializer(stddev=stddev))

        tf_output_shape=tf.stack(output_shape)
        deconv = tf.nn.conv2d_transpose(input_, w, output_shape=tf_output_shape,
                strides=[1, d_h, d_w, 1])

        biases = tf.get_variable('biases', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        #deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape())
        deconv = tf.reshape(tf.nn.bias_add(deconv, biases), tf_output_shape)

        if with_w:
            return deconv, w, biases
        else:
            return deconv

def lrelu(x,leak=0.2,name='lrelu'):
    with tf.variable_scope(name):
        f1=0.5 * (1+leak)
        f2=0.5 * (1-leak)
        return f1*x + f2*tf.abs(x)

#This takes more memory than above
#def lrelu(x, leak=0.2, name="lrelu"):
#  return tf.maximum(x, leak*x)

def linear(input_, output_size, scope=None, stddev=0.02, bias_start=0.0, with_w=False):
    shape = input_.get_shape().as_list()

    #mat_shape=tf.stack([tf.shape(input_)[1],output_size])
    mat_shape=[shape[1],output_size]

    with tf.variable_scope(scope or "Linear"):
        #matrix = tf.get_variable("Matrix", [shape[1], output_size], tf.float32,
        matrix = tf.get_variable("Matrix", mat_shape, tf.float32,
                     tf.random_normal_initializer(stddev=stddev))
        bias = tf.get_variable("bias", [output_size],
                   initializer=tf.constant_initializer(bias_start))
        if with_w:
            return tf.matmul(input_, matrix) + bias, matrix, bias
        else:
            return tf.matmul(input_, matrix) + bias


#minibatch method that improves on openai
#because it doesn't fix batchsize:
#TODO: recheck when not sleepy
def add_minibatch_features(image,df_dim):
    shape = image.get_shape().as_list()
    dim = np.prod(shape[1:])            # dim = prod(9,2) = 18
    h_mb0 = lrelu(conv2d(image, df_dim, name='d_mb0_conv'))
    h_mb1 = conv2d(h_mb0, df_dim, name='d_mbh1_conv')

    dims=h_mb1.get_shape().as_list()
    conv_dims=np.prod(dims[1:])

    image_ = tf.reshape(h_mb1, tf.stack([-1, conv_dims]))
    #image_ = tf.reshape(h_mb1, tf.stack([batch_size, -1]))

    n_kernels = 300
    dim_per_kernel = 50
    x = linear(image_, n_kernels * dim_per_kernel,'d_mbLinear')
    act = tf.reshape(x, (-1, n_kernels, dim_per_kernel))
    act_tp=tf.transpose(act, [1,2,0])
    #bs x n_ker x dim_ker x bs -> bs x n_ker x bs :
    abs_dif = tf.reduce_sum(tf.abs(tf.expand_dims(act, 3) - tf.expand_dims(act_tp, 0)), 2)
    eye=tf.expand_dims( tf.eye( tf.shape(abs_dif)[0] ), 1)#bs x 1 x bs
    masked=tf.exp(-abs_dif) - eye
    f1=tf.reduce_mean( masked, 2)
    mb_features = tf.reshape(f1, [-1, 1, 1, n_kernels])
    return conv_cond_concat(image, mb_features)

## following is from https://github.com/openai/improved-gan/blob/master/imagenet/discriminator.py#L88
#def add_minibatch_features(image,df_dim,batch_size):
#    shape = image.get_shape().as_list()
#    dim = np.prod(shape[1:])            # dim = prod(9,2) = 18
#    h_mb0 = lrelu(conv2d(image, df_dim, name='d_mb0_conv'))
#    h_mb1 = conv2d(h_mb0, df_dim, name='d_mbh1_conv')
#
#    dims=h_mb1.get_shape().as_list()
#    conv_dims=np.prod(dims[1:])
#
#    image_ = tf.reshape(h_mb1, tf.stack([-1, conv_dims]))
#    #image_ = tf.reshape(h_mb1, tf.stack([batch_size, -1]))
#
#    n_kernels = 300
#    dim_per_kernel = 50
#    x = linear(image_, n_kernels * dim_per_kernel,'d_mbLinear')
#    activation = tf.reshape(x, (batch_size, n_kernels, dim_per_kernel))
#    big = np.zeros((batch_size, batch_size), dtype='float32')
#    big += np.eye(batch_size)
#    big = tf.expand_dims(big, 1)
#    abs_dif = tf.reduce_sum(tf.abs(tf.expand_dims(activation, 3) - tf.expand_dims(tf.transpose(activation, [1, 2, 0]), 0)), 2)
#    mask = 1. - big
#    masked = tf.exp(-abs_dif) * mask
#    f1 = tf.reduce_sum(masked, 2) / tf.reduce_sum(mask)
#    mb_features = tf.reshape(f1, [batch_size, 1, 1, n_kernels])
#    return conv_cond_concat(image, mb_features)







================================================
FILE: causal_dcgan/utils.py
================================================
"""
Some codes from https://github.com/Newmu/dcgan_code
"""
from __future__ import division
import math
import json
import random
import pprint
import scipy.misc
import numpy as np
from time import gmtime, strftime
from six.moves import xrange
import os

pp = pprint.PrettyPrinter()

get_stddev = lambda x, k_h, k_w: 1/math.sqrt(k_w*k_h*x.get_shape()[-1])


def get_image(image_path, input_height, input_width,
              resize_height=64, resize_width=64,
              is_crop=True, is_grayscale=False):
  image = imread(image_path, is_grayscale)
  return transform(image, input_height, input_width,
                   resize_height, resize_width, is_crop)

def save_images(images, size, image_path):
  return imsave(inverse_transform(images), size, image_path)

def imread(path, is_grayscale = False):
  # np.float was only an alias for the builtin float and was removed in NumPy 1.24
  if (is_grayscale):
    return scipy.misc.imread(path, flatten = True).astype(float)
  else:
    return scipy.misc.imread(path).astype(float)

def merge_images(images, size):
  return inverse_transform(images)

def merge(images, size):
  h, w = images.shape[1], images.shape[2]
  img = np.zeros((h * size[0], w * size[1], 3))
  for idx, image in enumerate(images):
    i = idx % size[1]
    j = idx // size[1]
    img[j*h:j*h+h, i*w:i*w+w, :] = image
  return img

def imsave(images, size, path):
  return scipy.misc.imsave(path, merge(images, size))

def center_crop(x, crop_h, crop_w,
                resize_h=64, resize_w=64):
  if crop_w is None:
    crop_w = crop_h
  h, w = x.shape[:2]
  j = int(round((h - crop_h)/2.))
  i = int(round((w - crop_w)/2.))
  return scipy.misc.imresize(
      x[j:j+crop_h, i:i+crop_w], [resize_h, resize_w])

def transform(image, input_height, input_width,
              resize_height=64, resize_width=64, is_crop=True):
  if is_crop:
    cropped_image = center_crop(
      image, input_height, input_width,
      resize_height, resize_width)
  else:
    cropped_image = scipy.misc.imresize(image, [resize_height, resize_width])
  return np.array(cropped_image)/127.5 - 1.

def inverse_transform(images):
  return (images+1.)/2.

def to_json(output_path, *layers):
  with open(output_path, "w") as layer_f:
    lines = ""
    for w, b, bn in layers:
      layer_idx = w.name.split('/')[0].split('h')[1]

      B = b.eval()

      if "lin/" in w.name:
        W = w.eval()
        depth = W.shape[1]
      else:
        W = np.rollaxis(w.eval(), 2, 0)
        depth = W.shape[0]

      biases = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(B)]}
      if bn is not None:
        gamma = bn.gamma.eval()
        beta = bn.beta.eval()

        gamma = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(gamma)]}
        beta = {"sy": 1, "sx": 1, "depth": depth, "w": ['%.2f' % elem for elem in list(beta)]}
      else:
        gamma = {"sy": 1, "sx": 1, "depth": 0, "w": []}
        beta = {"sy": 1, "sx": 1, "depth": 0, "w": []}

      if "lin/" in w.name:
        fs = []
        for w in W.T:
          fs.append({"sy": 1, "sx": 1, "depth": W.shape[0], "w": ['%.2f' % elem for elem in list(w)]})

        lines += """
          var layer_%s = {
            "layer_type": "fc",
            "sy": 1, "sx": 1,
            "out_sx": 1, "out_sy": 1,
            "stride": 1, "pad": 0,
            "out_depth": %s, "in_depth": %s,
            "biases": %s,
            "gamma": %s,
            "beta": %s,
            "filters": %s
          };""" % (layer_idx.split('_')[0], W.shape[1], W.shape[0], biases, gamma, beta, fs)
      else:
        fs = []
        for w_ in W:
          fs.append({"sy": 5, "sx": 5, "depth": W.shape[3], "w": ['%.2f' % elem for elem in list(w_.flatten())]})

        lines += """
          var layer_%s = {
            "layer_type": "deconv",
            "sy": 5, "sx": 5,
            "out_sx": %s, "out_sy": %s,
            "stride": 2, "pad": 1,
            "out_depth": %s, "in_depth": %s,
            "biases": %s,
            "gamma": %s,
            "beta": %s,
            "filters": %s
          };""" % (layer_idx, 2**(int(layer_idx)+2), 2**(int(layer_idx)+2),
               W.shape[0], W.shape[3], biases, gamma, beta, fs)
    layer_f.write(" ".join(lines.replace("'","").split()))

def make_gif(images, fname, duration=2, true_image=False):
    import moviepy.editor as mpy

    def make_frame(t):
        # pick the frame for time t; fall back to the last frame past the end
        try:
            x = images[int(len(images)/duration*t)]
        except IndexError:
            x = images[-1]

        # these returns belong to make_frame: VideoClip needs a uint8 frame
        if true_image:
            return x.astype(np.uint8)
        else:
            return ((x+1)/2*255).astype(np.uint8)

    clip = mpy.VideoClip(make_frame, duration=duration)
    clip.write_gif(fname, fps = len(images) / duration)


================================================
FILE: causal_graph.py
================================================
'''
To use a particular causal graph, just specify it here


Strings specified have to match *exactly* to keys in attribute text file


A graph lists each node and its parents in pairs

A->B, C->D, D->B:
    [['A',[]],
     ['B',['A','D']],
     ['C',[]],
     ['D',['C']]]

'''

#A reminder of what labels are available
#Make sure to use case-sensitive correct spelling
all_nodes=[
        ['5_o_Clock_Shadow',[]],
        ['Arched_Eyebrows',[]],
        ['Attractive',[]],
        ['Bags_Under_Eyes',[]],
        ['Bald',[]],
        ['Bangs',[]],
        ['Big_Lips',[]],
        ['Big_Nose',[]],
        ['Black_Hair',[]],
        ['Blond_Hair',[]],
        ['Blurry',[]],
        ['Brown_Hair',[]],
        ['Bushy_Eyebrows',[]],
        ['Chubby',[]],
        ['Double_Chin',[]],
        ['Eyeglasses',[]],
        ['Goatee',[]],
        ['Gray_Hair',[]],
        ['Heavy_Makeup',[]],
        ['High_Cheekbones',[]],
        ['Male',[]],
        ['Mouth_Slightly_Open',[]],
        ['Mustache',[]],
        ['Narrow_Eyes',[]],
        ['No_Beard',[]],
        ['Oval_Face',[]],
        ['Pale_Skin',[]],
        ['Pointy_Nose',[]],
        ['Receding_Hairline',[]],
        ['Rosy_Cheeks',[]],
        ['Sideburns',[]],
        ['Smiling',[]],
        ['Straight_Hair',[]],
        ['Wavy_Hair',[]],
        ['Wearing_Earrings',[]],
        ['Wearing_Hat',[]],
        ['Wearing_Lipstick',[]],
        ['Wearing_Necklace',[]],
        ['Wearing_Necktie',[]],
        ['Young',[]]
    ]

causal_graphs={
#'complete_all':[
#        ['Young',[]],
#        ['Male',['Young']],
#        ['Eyeglasses',['Male','Young']],
#        ['Bald',            ['Male','Young','Eyeglasses']],
#        ['Mustache',        ['Male','Young','Eyeglasses','Bald']],
#        ['Smiling',         ['Male','Young','Eyeglasses','Bald','Mustache']],
#        ['Wearing_Lipstick',['Male','Young','Eyeglasses','Bald','Mustache','Smiling']],
#        ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick']],
#        ['Narrow_Eyes',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['5_o_Clock_Shadow',['Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Arched_Eyebrows',['5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Attractive',['Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Bags_Under_Eyes',['Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Bangs',['Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Big_Lips',['Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Big_Nose',['Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Black_Hair',['Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Blond_Hair',['Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Blurry',['Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Brown_Hair',['Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Bushy_Eyebrows',['Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Chubby',['Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#        ['Double_Chin',['Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Goatee',['Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Gray_Hair',['Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Heavy_Makeup',['Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['High_Cheekbones',['Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Mouth_Slightly_Open',['High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Mustache',['Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Narrow_Eyes',['Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['No_Beard',['Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Oval_Face',['No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Pale_Skin',['Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Pointy_Nose',['Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Receding_Hairline',['Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Rosy_Cheeks',['Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Sideburns',['Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Smiling',['Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Straight_Hair',['Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wavy_Hair',['Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wearing_Earrings',['Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wearing_Hat',['Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wearing_Lipstick',['Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wearing_Necklace',['Wearing_Lipstick','Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
        #['Wearing_Necktie',['Wearing_Necklace','Wearing_Lipstick','Wearing_Hat','Wearing_Earrings','Wavy_Hair','Straight_Hair','Smiling','Sideburns','Rosy_Cheeks','Receding_Hairline','Pointy_Nose','Pale_Skin','Oval_Face','No_Beard','Narrow_Eyes','Mustache','Mouth_Slightly_Open','High_Cheekbones','Heavy_Makeup','Gray_Hair','Goatee','Double_Chin','Chubby','Bushy_Eyebrows','Brown_Hair','Blurry','Blond_Hair','Black_Hair','Big_Nose','Big_Lips','Bangs','Bags_Under_Eyes','Attractive','Arched_Eyebrows','5_o_Clock_Shadow','Narrow_Eyes','Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
#    ],

'subset1_nodes':[
    ['Bald',[]],
#        ['Blurry',[]],
#        ['Brown_Hair',[]],
#        ['Bushy_Eyebrows',[]],
#        ['Chubby',[]],
    ['Double_Chin',[]],
#        ['Eyeglasses',[]],
#        ['Goatee',[]],
#        ['Gray_Hair',[]],
    ['Male',[]],
    ['Mustache',[]],
    ['No_Beard',[]],
    ['Smiling',[]],
#        ['Straight_Hair',[]],
#        ['Wavy_Hair',[]],
    ['Wearing_Earrings',[]],
#        ['Wearing_Hat',[]],
    ['Wearing_Lipstick',[]],
    ['Young',[]]
],


'standard_graph':[
   ['Male'   , []              ],
   ['Young'  , []              ],
   ['Smiling', ['Male','Young']]
   ],

'male_causes_beard':[
    ['Male',[]],
    ['No_Beard',['Male']],
],
'male_causes_mustache':[
    ['Male',[]],
    ['Mustache',['Male']],
],

'mustache_causes_male':[
    ['Male',['Mustache']],
    ['Mustache',[]],
],

'young_causes_gray':[
    ['Young',[]],
    ['Gray_Hair',['Young']],
    ],

'gray_causes_young':[
    ['Young',['Gray_Hair']],
    ['Gray_Hair',[]],
    ],

'young_ind_gray':[
        ['Young',[]],
        ['Gray_Hair',[]],
        ],


'small_causal_graph':[
        ['Young',[]],
        ['Male',[]],
        ['Mustache',        ['Male','Young']],
        ['Smiling',         ['Male','Young']],
        ['Wearing_Lipstick',['Male','Young']],
        ['Mouth_Slightly_Open',['Male','Young','Smiling']],
        ['Narrow_Eyes',        ['Male','Young','Smiling']],
    ],


'big_causal_graph':[
        ['Young',[]],
        ['Male',[]],
        ['Eyeglasses',['Young']],
        ['Bald',            ['Male','Young']],
        ['Mustache',        ['Male','Young']],
        ['Smiling',         ['Male','Young']],
        ['Wearing_Lipstick',['Male','Young']],
        ['Mouth_Slightly_Open',['Young','Smiling']],
        ['Narrow_Eyes',        ['Male','Young','Smiling']],
    ],

'complete_big_causal_graph':[
        ['Young',[]],
        ['Male',['Young']],
        ['Eyeglasses',['Male','Young']],
        ['Bald',            ['Male','Young','Eyeglasses']],
        ['Mustache',        ['Male','Young','Eyeglasses','Bald']],
        ['Smiling',         ['Male','Young','Eyeglasses','Bald','Mustache']],
        ['Wearing_Lipstick',['Male','Young','Eyeglasses','Bald','Mustache','Smiling']],
        ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick']],
        ['Narrow_Eyes',['Male','Young','Eyeglasses','Bald','Mustache','Smiling','Wearing_Lipstick','Mouth_Slightly_Open']],
    ],

'reverse_complete_big_causal_graph':[

        ['Narrow_Eyes',        []],
        ['Mouth_Slightly_Open',['Narrow_Eyes']],
        ['Wearing_Lipstick',   ['Narrow_Eyes','Mouth_Slightly_Open']],
        ['Smiling',            ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick']],
        ['Mustache',           ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling']],
        ['Bald',               ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache']],
        ['Eyeglasses',         ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald']],
        ['Male',               ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald','Eyeglasses']],
        ['Young',              ['Narrow_Eyes','Mouth_Slightly_Open','Wearing_Lipstick','Smiling','Mustache','Bald','Eyeglasses','Male']],

    ],

'indep_big_causal_graph':[
        ['Young',[]],
        ['Male',[]],
        ['Eyeglasses',[]],
        ['Bald',            []],
        ['Mustache',        []],
        ['Smiling',         []],
        ['Wearing_Lipstick',[]],
        ['Mouth_Slightly_Open',[]],
        ['Narrow_Eyes',        []],
    ],


'complete_minimal_graph':[
        ['Young',[]],
        ['Male',['Young']],
        ['Mustache',        ['Male','Young']],
        ['Wearing_Lipstick',['Male','Young','Mustache']],
        ['Smiling',         ['Male','Young','Mustache','Wearing_Lipstick']],
    ],

'male_ind_mustache': [
        ['Male',[]],
        ['Mustache',[]]
    ],
'Smiling_MSO': [
        ['Smiling',[]],
        ['Mouth_Slightly_Open',['Smiling']]
       ],

'Male_Young_Eyeglasses':[
    ['Male',[]],
    ['Young',[]],
    ['Eyeglasses',['Male','Young']]
    ],

'MYESO':[
    ['Male',[]],
    ['Young',['Male']],
    ['Eyeglasses',['Male','Young']],
    ['Smiling',['Male','Young','Eyeglasses']],
    ['Mouth_Slightly_Open',['Male','Young','Eyeglasses','Smiling']],
    ],

'mustache':[
    ['Mustache',[]]
    ],

'male_smiling_lipstick':[
       ['Male'   , []],
       ['Wearing_Lipstick'  , ['Male']],
       ['Smiling', ['Male']]
       ],
'SLM':[
       ['Smiling'   , []],
       ['Wearing_Lipstick'  , ['Smiling']],
       ['Male', ['Smiling','Wearing_Lipstick']]
       ],
'MLS':[
       ['Male'   , []],
       ['Wearing_Lipstick'  , ['Male']],
       ['Smiling', ['Male','Wearing_Lipstick']]
       ],
'M':[
    ['Male',[]]
    ],

'MSO_smiling': [
        ['Smiling',['Mouth_Slightly_Open']],
        ['Mouth_Slightly_Open',[]]
       ],
'Male_Young_Eyeglasses_complete': [
        ['Male',[]],
        ['Young',['Male']],
        ['Eyeglasses',['Male','Young']]
        ],
'male_mustache_lipstick':[
       ['Male'   , []],
       ['Mustache', ['Male']],
       ['Wearing_Lipstick'  , ['Male','Mustache']]
       ]
}

def get_causal_graph(causal_model=None,*args,**kwargs):

    #define complete_all
    list_nodes,_=zip(*all_nodes)
    complete_all=[]
    so_far=[]
    for node in list_nodes:
        complete_all.append([node,so_far[:]])
        so_far.append(node)
    causal_graphs['complete_all']=complete_all


    if causal_model not in causal_graphs:
        raise ValueError('the specified graph: {} was not one of those '
                         'listed in {}'.format(causal_model, __file__))

    else:
        return causal_graphs[causal_model]
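

# Illustrative sketch (not part of the original module): each graph above is a
# list of [node, parents] pairs. The hypothetical helper below checks that such
# a list describes a DAG, assuming every parent must itself be listed as a node.
# Nothing else in this repo calls it.
def check_causal_graph(graph):
    '''Return the nodes of `graph` in a parents-first order.

    Raises ValueError on a duplicate node, an unknown parent, or a cycle.
    '''
    parents = {}
    for node, node_parents in graph:
        if node in parents:
            raise ValueError('duplicate node: {}'.format(node))
        parents[node] = list(node_parents)
    for node, node_parents in graph:
        for p in node_parents:
            if p not in parents:
                raise ValueError('{} has unknown parent {}'.format(node, p))

    ordered = []
    remaining = [node for node, _ in graph]
    while remaining:
        #nodes whose parents have all been placed already
        ready = [n for n in remaining if all(p in ordered for p in parents[n])]
        if not ready:
            raise ValueError('cycle among: {}'.format(remaining))
        ordered.extend(ready)
        remaining = [n for n in remaining if n not in ready]
    return ordered
# e.g. check_causal_graph(causal_graphs['standard_graph'])
# returns ['Male', 'Young', 'Smiling']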



================================================
FILE: config.py
================================================
from __future__ import print_function
import argparse

def str2bool(v):
    #return (v is True) or (v.lower() in ('true', '1'))
    return v is True or v.lower() in ('true', '1')
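
# Illustrative sketch (not part of the original config): str2bool above maps any
# string other than 'true'/'1' to False, so a typo such as 'Ture' silently
# disables a flag. The hypothetical variant below rejects unrecognized values
# instead. It is not wired into any argument in this file.
def strict_str2bool(v):
    if isinstance(v, bool):
        return v
    s = v.lower()
    if s in ('true', '1'):
        return True
    if s in ('false', '0'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got {!r}'.format(v))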

arg_lists = []
parser = argparse.ArgumentParser()

def add_argument_group(name):
    arg = parser.add_argument_group(name)
    arg_lists.append(arg)
    return arg

# Data
data_arg = add_argument_group('Data')
#data_arg.add_argument('--batch_size', type=int, default=16)#default set elsewhere
data_arg.add_argument('--causal_model', type=str,
                     help='''Matches the argument with a key in ./causal_graph.py and sets the graph attribute of cc_config to be a list of lists defining the causal graph''')
data_arg.add_argument('--data_dir', type=str, default='data')
data_arg.add_argument('--dataset', type=str, default='celebA')
data_arg.add_argument('--do_shuffle', type=str2bool, default=True)#never used
data_arg.add_argument('--input_scale_size', type=int, default=64,
                     help='input image will be resized with the given value as width and height')
data_arg.add_argument('--is_crop', type=str2bool, default=True)
data_arg.add_argument('--grayscale', type=str2bool, default=False)#never used
data_arg.add_argument('--split', type=str, default='train')#never used
data_arg.add_argument('--num_worker', type=int, default=24,
                     help='number of threads to use for loading and preprocessing data')
data_arg.add_argument('--resize_method',type=str,default='AREA',choices=['AREA','BILINEAR','BICUBIC','NEAREST_NEIGHBOR'],
                     help='''Method used to resize images to input_scale_size
                     (64x64 by default). AREA seems to work best, though
                     some scipy methods might work better. It is not clear
                     why the results differ so much between methods''')


# Training / test parameters
train_arg = add_argument_group('Training')


train_arg.add_argument('--build_train', type=str2bool, default=False,
                      help='''Build all the components needed for training
                      without doing any training right away. This arg is
                      effectively True when is_train=True''')
train_arg.add_argument('--build_pretrain', type=str2bool, default=False,
                      help='''Build all the components needed for pretraining
                      without doing any pretraining right away. This arg is
                      effectively True when is_pretrain=True''')


train_arg.add_argument('--model_type',type=str,default='',choices=['dcgan','began'],
                      help='''Which image model to use. If the argument is not
                      passed, only causal_controller is built. This overrides
                      is_train=True, since there is no image model to train''')
train_arg.add_argument('--use_gpu', type=str2bool, default=True)
train_arg.add_argument('--num_gpu', type=int, default=1,
                      help='''Number of GPUs to use; specify 0 for cpu. If k
                      is specified, the first k of the n detected GPUs are
                      used. Defaults to 1. use_gpu is set from this value:
                      True when num_gpu>0, False otherwise''')

# Misc
misc_arg = add_argument_group('Misc')
#misc_arg.add_argument('--build_all', type=str2bool, default=False,
#                     help='''normally specifying is_pretrain=False will cause
#                     the pretraining components not to be built and likewise
#                      with is_train=False only the pretrain compoenent will
#                      (possibly) be built. This is here as a debug helper to
#                      enable building out the whole model without doing any
#                      training''')

misc_arg.add_argument('--descrip', type=str, default='',help='''
                      Only use this when creating a new model. New model folder names
                      are generated automatically from the date and time, and
                      you can't rename them while the model is running. If
                      provided, this short string is appended to the end
                      of the model folder name to help keep track of what
                      that folder contains without having to look inside
                      it. No weird characters''')

misc_arg.add_argument('--dry_run', action='store_true',help='''Build and load
                      the model and all the specified components, but don't actually do
                      any pretraining/training etc. This overrides
                      --is_pretrain, --is_train. This is mostly used for
                      bringing the model into the workspace if, say, you want
                      to manipulate it in ipython''')

misc_arg.add_argument('--load_path', type=str, default='',
                     help='''This is a "global" load path. Pass the model_dir
                      of a previous run and all of its variables
                      (dcgan/began and causal_controller both) are loaded. If
                      you want to load just one component, for example the
                      pretrained part of a previous model, use pt_load_path
                      from the causal_controller.config section''')

misc_arg.add_argument('--log_step', type=int, default=100,
                     help='''this is used for generic summaries that are common
                     to both models. Use model specific config files for
                     logging done within train_step''')
#misc_arg.add_argument('--save_step', type=int, default=5000)
misc_arg.add_argument('--log_level', type=str, default='INFO', choices=['INFO', 'DEBUG', 'WARN'])
misc_arg.add_argument('--log_dir', type=str, default='logs', help='''where to store model and model results. Do not put a leading "./" out front''')

#misc_arg.add_argument('--sample_per_image', type=int, default=64,
#                      help='# of sample per image during test sample generation')

misc_arg.add_argument('--seed', type=int, default=22,help=
                      '''Not working right now: TF seed should be fixed to make sure exogenous noise for each causal node is fixed also''')

#Doesn't do anything atm
#misc_arg.add_argument('--visualize', action='store_true')


def gpu_logic(config):

    #consistency between use_gpu and num_gpu
    if config.num_gpu>0:
        config.use_gpu=True
    else:
        config.use_gpu=False
#        if config.use_gpu and config.num_gpu==0:
#            config.num_gpu=1
    return config


def get_config():
    config, unparsed = parser.parse_known_args()
    config=gpu_logic(config)
    config.num_devices=max(1,config.num_gpu)#that are used in backprop


    #Just for BEGAN:
    ##this has to respect gpu/cpu
    ##data_format = 'NCHW'
    #if config.use_gpu:
    #    data_format = 'NCHW'
    #else:
    #    data_format = 'NHWC'
    #setattr(config, 'data_format', data_format)

    print('Loaded ./config.py')

    return config, unparsed

if __name__=='__main__':
    #for debug of config
    config, unparsed = get_config()



================================================
FILE: data_loader.py
================================================
import os
import numpy as np
import pandas as pd
from PIL import Image
from glob import glob
import tensorflow as tf

from IPython.core import debugger
debug = debugger.Pdb().set_trace


def logodds(p):
    return np.log(p/(1.-p))

class DataLoader(object):
    '''Loads the images and the labels through a tensorflow queue.
    All of the labels are loaded regardless of what is specified in the graph;
    the model is GPU-bound anyway, so this shouldn't add any noticeable
    overhead.

    For multiple gpus, the strategy here is to have 1 queue with a
    2 x batch_size batch, then use tf.split within trainer.train()
    '''
    def __init__(self,label_names,config):
        self.label_names=label_names
        self.config=config
        self.scale_size=config.input_scale_size
        #self.data_format=config.data_format
        self.split=config.split
        self.do_shuffle=config.do_shuffle
        self.num_worker=config.num_worker
        self.is_crop=config.is_crop
        self.is_grayscale=config.grayscale

        attr_file= glob("{}/*.{}".format(config.data_path, 'txt'))[0]
        setattr(config,'attr_file',attr_file)

        attributes = pd.read_csv(config.attr_file,delim_whitespace=True) #values are -1/+1
        #Store all labels for reference
        self.all_attr= 0.5*(attributes+1)#map {-1,+1} to {0,1}
        self.all_label_means=self.all_attr.mean()

        #but only return desired labels in queues
        self.attr=self.all_attr[label_names]
        self.label_means=self.attr.mean()#self.attr is in {0,1}

        self.image_dir=os.path.join(config.data_path,'images')
        self.filenames=[os.path.join(self.image_dir,j) for j in self.attr.index]

        self.num_examples_per_epoch=len(self.filenames)
        self.min_fraction_of_examples_in_queue=0.001#go faster during debug
        #self.min_fraction_of_examples_in_queue=0.01
        self.min_queue_examples=int(self.num_examples_per_epoch*self.min_fraction_of_examples_in_queue)


    def get_label_queue(self,batch_size):
        tf_labels = tf.convert_to_tensor(self.attr.values, dtype=tf.uint8)#0,1

        with tf.name_scope('label_queue'):
            uint_label=tf.train.slice_input_producer([tf_labels])[0]
        label=tf.to_float(uint_label)

        #All labels, not just those in causal_model
        dict_data={sl:tl for sl,tl in
                   zip(self.label_names,tf.split(label,len(self.label_names)))}


        num_preprocess_threads = max(self.num_worker-3,1)

        data_batch = tf.train.shuffle_batch(
                dict_data,
                batch_size=batch_size,
                num_threads=num_preprocess_threads,
                capacity=self.min_queue_examples + 3 * batch_size,
                min_after_dequeue=self.min_queue_examples,
                )

        return data_batch

    def get_data_queue(self,batch_size):
        image_files = tf.convert_to_tensor(self.filenames, dtype=tf.string)
        tf_labels = tf.convert_to_tensor(self.attr.values, dtype=tf.uint8)

        with tf.name_scope('filename_queue'):
            #must be list
            str_queue=tf.train.slice_input_producer([image_files,tf_labels])
        img_filename, uint_label= str_queue

        img_contents=tf.read_file(img_filename)
        image = tf.image.decode_jpeg(img_contents, channels=3)

        image=tf.cast(image,dtype=tf.float32)
        if self.config.is_crop:#use dcgan cropping
            #dcgan center-crops the input to 108x108 and outputs 64x64. We emulate that here
            image=tf.image.resize_image_with_crop_or_pad(image,108,108)
            #image=tf.image.resize_bilinear(image,[scale_size,scale_size])#must be 4D

            resize_method=getattr(tf.image.ResizeMethod,self.config.resize_method)
            image=tf.image.resize_images(image,[self.scale_size,self.scale_size],
                    method=resize_method)
            #Some dataset enlargement. Might as well.
            image=tf.image.random_flip_left_right(image)

            ##carpedm-began crops to 128x128 starting at (50,25), then resizes to 64x64
            #image=tf.image.crop_to_bounding_box(image, 50, 25, 128, 128)
            #image=tf.image.resize_nearest_neighbor(image, [scale_size, scale_size])

            tf.summary.image('real_image',tf.expand_dims(image,0))



        label=tf.to_float(uint_label)
        #Creates a dictionary {'Male':male_tensor, 'Young':young_tensor} etc.
        dict_data={sl:tl for sl,tl in
                   zip(self.label_names,tf.split(label,len(self.label_names)))}
        assert 'x' not in dict_data#don't have a label named "x"
        dict_data['x']=image

        print ('Filling queue with %d CelebA images before starting to train. '
            'I don\'t know how long this will take' % self.min_queue_examples)
        num_preprocess_threads = max(self.num_worker,1)

        data_batch = tf.train.shuffle_batch(
                dict_data,
                batch_size=batch_size,
                num_threads=num_preprocess_threads,
                capacity=self.min_queue_examples + 3 * batch_size,
                min_after_dequeue=self.min_queue_examples,
                )
        return data_batch
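

# Illustrative sketch (not part of the original file): the DataLoader docstring
# describes drawing one large batch and splitting it across devices. The
# hypothetical helper below shows that split on plain Python sequences; the
# repo itself is described as using tf.split inside trainer.train().
def split_batch_for_devices(batch, num_devices):
    '''Split every entry of a {name: sequence} batch into num_devices equal parts.'''
    shards = [dict() for _ in range(num_devices)]
    for name, values in batch.items():
        if len(values) % num_devices:
            raise ValueError('batch size must be divisible by num_devices')
        per_device = len(values) // num_devices
        for i, shard in enumerate(shards):
            #device i gets the i-th contiguous slice of the batch
            shard[name] = values[i*per_device:(i+1)*per_device]
    return shards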



================================================
FILE: download.py
================================================
"""
Modification of
https://github.com/carpedm20/BEGAN-tensorflow/blob/master/download.py
"""
from __future__ import print_function
import os
import zipfile
import requests
import subprocess
from tqdm import tqdm
from collections import OrderedDict

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"
    session = requests.Session()

    response = session.get(URL, params={ 'id': id }, stream=True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params=params, stream=True)

    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value
    return None

def save_response_content(response, destination, chunk_size=32*1024):
    total_size = int(response.headers.get('content-length', 0))
    with open(destination, "wb") as f:
        # total_size is in bytes, so advance the bar by bytes written
        # rather than by one tick per chunk
        with tqdm(total=total_size, unit='B', unit_scale=True,
                  desc=destination) as pbar:
            for chunk in response.iter_content(chunk_size):
                if chunk: # filter out keep-alive new chunks
                    f.write(chunk)
                    pbar.update(len(chunk))

def unzip(filepath):
    print("Extracting: " + filepath)
    base_path = os.path.dirname(filepath)
    with zipfile.ZipFile(filepath) as zf:
        zf.extractall(base_path)
    os.remove(filepath)

def download_celeb_a(base_path):
    data_path = os.path.join(base_path, 'celebA')
    images_path = os.path.join(data_path, 'images')
    if os.path.exists(data_path):
        print('[!] Found celeb-A - skip')
        return

    filename, drive_id  = "img_align_celeba.zip", "0B7EVK8r0v71pZjFTYXZWM3FlRnM"
    save_path = os.path.join(base_path, filename)

    if os.path.exists(save_path):
        print('[*] {} already exists'.format(save_path))
    else:
        download_file_from_google_drive(drive_id, save_path)

    zip_dir = ''
    with zipfile.ZipFile(save_path) as zf:
        zip_dir = zf.namelist()[0]
        zf.extractall(base_path)
    if not os.path.exists(data_path):
        os.mkdir(data_path)
    os.rename(os.path.join(base_path, "img_align_celeba"), images_path)
    os.remove(save_path)

    download_attr_file(data_path)


def download_attr_file(data_path):
    attr_gdID='0B7EVK8r0v71pblRyaVFSWGxPY0U'
    attr_fname=os.path.join(data_path,'list_attr_celeba.txt')
    download_file_from_google_drive(attr_gdID, attr_fname)
    delete_top_line(attr_fname)#make pandas readable
    #Top line was just an integer saying how many samples there were

def prepare_data_dir(path = './data'):
    if not os.path.exists(path):
        os.mkdir(path)

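# if the file exists in in_dir, make a relative symlink to it in out_dir
def check_link(in_dir, basename, out_dir):
    in_file = os.path.join(in_dir, basename)
    if os.path.exists(in_file):
        link_file = os.path.join(out_dir, basename)
        rel_link = os.path.relpath(in_file, out_dir)
        # skip links left over from a previous run; os.symlink raises if one exists
        if not os.path.lexists(link_file):
            os.symlink(rel_link, link_file)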

def add_splits(base_path):
    data_path = os.path.join(base_path, 'celebA')
    images_path = os.path.join(data_path, 'images')
    train_dir = os.path.join(data_path, 'splits', 'train')
    valid_dir = os.path.join(data_path, 'splits', 'valid')
    test_dir = os.path.join(data_path, 'splits', 'test')
    if not os.path.exists(train_dir):
        os.makedirs(train_dir)
    if not os.path.exists(valid_dir):
        os.makedirs(valid_dir)
    if not os.path.exists(test_dir):
        os.makedirs(test_dir)

    # these constants based on the standard celebA splits
    NUM_EXAMPLES = 202599
    TRAIN_STOP = 162770
    VALID_STOP = 182637

    for i in range(0, TRAIN_STOP):
        basename = "{:06d}.jpg".format(i+1)
        check_link(images_path, basename, train_dir)
    for i in range(TRAIN_STOP, VALID_STOP):
        basename = "{:06d}.jpg".format(i+1)
        check_link(images_path, basename, valid_dir)
    for i in range(VALID_STOP, NUM_EXAMPLES):
        basename = "{:06d}.jpg".format(i+1)
        check_link(images_path, basename, test_dir)
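

# Illustrative sketch (not part of the original script): the split boundaries
# used in add_splits, written as a lookup from a 1-based celebA image number
# to its split. The helper name `_example_split_name` is hypothetical; the
# default constants are copied from add_splits above.
def _example_split_name(image_number, train_stop=162770, valid_stop=182637,
                        num_examples=202599):
    if not 1 <= image_number <= num_examples:
        raise ValueError('image_number must be in [1, num_examples]')
    if image_number <= train_stop:
        return 'train'
    if image_number <= valid_stop:
        return 'valid'
    return 'test'

# e.g. _example_split_name(162770) is 'train' and _example_split_name(162771)
# is 'valid', matching the file names "{:06d}.jpg".format(i+1) linked above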

def delete_top_line(txt_fname):
    #use context managers so both file handles are closed promptly
    with open(txt_fname,'r') as f:
        lines=f.readlines()
    with open(txt_fname,'w') as f:
        f.writelines(lines[1:])

if __name__ == '__main__':
    base_path = './data'
    prepare_data_dir()
    download_celeb_a(base_path)
    add_splits(base_path)


================================================
FILE: figure_scripts/__init__.py
================================================


================================================
FILE: figure_scripts/distributions.py
================================================
import tensorflow as tf
import numpy as np
import os
import scipy.misc
import pandas as pd
from tqdm import trange,tqdm
from itertools import combinations, product
import sys
from utils import save_figure_images,make_sample_dir,guess_model_step
from sample import get_joint,sample



def get_pdf(model, do_dict=None,cond_dict=None,name='',N=6400,return_discrete=True,step=''):
    str_step=str(step) or guess_model_step(model)

    joint=get_joint(model,int_do_dict=do_dict,int_cond_dict=cond_dict,N=N,return_discrete=return_discrete)

    sample_dir=make_sample_dir(model)

    if name:
        name+='_'
    f_pdf=os.path.join(sample_dir,str_step+name+'dist'+'.csv')

    pdf=pd.DataFrame.from_dict({k:val.mean() for k,val in joint.items()})

    #print 'get pdf cond_dict:',cond_dict
    if not do_dict and not cond_dict:
        data=model.attr.mean()
        pdf['data']=data
    if not do_dict and cond_dict:
        bool_cond=np.logical_and.reduce([model.attr[k]==v for k,v in cond_dict.items()])
        attr=model.attr[bool_cond]
        pdf['data']=attr.mean()

    print 'Writing to file',f_pdf
    pdf.to_csv(f_pdf)

    return pdf


TINY=1e-6
def get_interv_table(model,intrv=True):

    n_batches=25
    table_outputs=[]
    d_vals=np.linspace(TINY,0.6,n_batches)
    for name in model.cc.node_names:
        outputs=[]
        for d_val in d_vals:
            do_dict={model.cc.node_dict[name].label_logit : d_val*np.ones((model.batch_size,1))}
            outputs.append(model.sess.run(model.fake_labels,do_dict))

        out=np.vstack(outputs)
        table_outputs.append(out)

    table=np.stack(table_outputs,axis=2)

    #np.mean(np.round(table),axis=0) gives the averaged table;
    #the raw samples are returned so the caller can compute it

    return table

#dT=pd.DataFrame(index=p_names, data=T, columns=do_names)
#T=np.mean(np.round(table),axis=0)
#table=get_interv_table(model)



def record_interventional(model,step=''):
    '''
    designed for truncated exponential noise.
    For each node that could be intervened on,
    sample interventions from the continuous
    distribution that discrete intervention
    corresponds to. Collect the joint and output
    to a csv file
    '''
    make_sample_dir(model)

    str_step=str(step)
    if str_step=='':
        if hasattr(model,'step'):
            str_step=str( model.sess.run(model.step) )+'_'

    m=20
    do =lambda val: np.linspace(0,val*0.8,m)
    for name in model.cc.node_names:
        for int_val,intv in enumerate([do(-1), do(+1)]):
            do_dict={name:intv}

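            #get_joint expects a discrete intervention (1 or 0) per node name
            #and returns dataframes keyed by 'cc_labels' and 'd_fake_labels'
            joint=get_joint(model, int_do_dict={name:int_val}, N=5,return_discrete=True)

            lab_df=pd.DataFrame(data=joint['cc_labels'])
            dfl_df=pd.DataFrame(data=joint['d_fake_labels'])

            #distinct names so the second file does not overwrite the first
            lab_fname=str_step+str(name)+str(int_val)+'_cc.csv'
            dfl_fname=str_step+str(name)+str(int_val)+'_dfl.csv'

            lab_df.to_csv(lab_fname)
            dfl_df.to_csv(dfl_fname)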

    #with open(dfl_xtab_fn,'w') as dlf_f, open(lab_xtab_fn,'w') as lab_f:








================================================
FILE: figure_scripts/encode.py
================================================
#from __future__ import print_function
import tensorflow as tf
#import scipy
import scipy.misc
import numpy as np
from tqdm import trange
import os
import pandas as pd
from itertools import combinations
import sys
from Causal_controller import *
from began.models import GeneratorCNN, DiscriminatorCNN
from utils import to_nhwc,read_prepared_uint8_image,make_encode_dir

from utils import transform, inverse_transform #dcgan img norm
from utils import norm_img, denorm_img #began norm image

def var_like_z(z_ten,name):
    z_dim=z_ten.get_shape().as_list()[-1]
    return tf.get_variable(name,shape=(1,z_dim))
def noise_like_z(z_ten,name):
    z_dim=z_ten.get_shape().as_list()[-1]
    noise=tf.random_uniform([1,z_dim],minval=-1.,maxval=1.,)
    return noise


class Encoder:
    '''
    This is a class where you pass a model, and an image file
    and it creates more tensorflow variables, along with
    surrounding saving and summary functionality for encoding
    that image back into the hidden space using gradient descent
    '''
    model_name = "Encode.model"
    model_type= 'encoder'
    summ_col='encoder_summaries'
    def __init__(self,model,image,image_name=None,max_tr_steps=50000,load_path=''):
        '''
        image is assumed to be a path to a precropped 64x64x3 uint8 image
        '''

        #Some hardcoded defaults here
        self.log_step=500
        self.lr=0.0005
        self.max_tr_steps=max_tr_steps

        self.model=model
        self.load_path=load_path

        self.image_name=image_name or os.path.basename(image).replace('.','_')
        self.encode_dir=make_encode_dir(model,self.image_name)
        self.model_dir=self.encode_dir#different from self.model.model_dir
        self.save_dir=os.path.join(self.model_dir,'save')

        self.sess=self.model.sess#session should already be in progress

        if model.model_type =='dcgan':
            self.data_format='NHWC'#Don't change
        elif model.model_type == 'began':
            self.data_format=model.data_format#'NCHW' if gpu
        else:
            raise Exception('Should not happen. model_type=',model.model_type)

        #Notation:
        #self.uint_x/G ; 3D [0,255]
        #self.x/G ; 4D [-1,1]
        self.uint_x=read_prepared_uint8_image(image)#x is [0,255]

        print('Read image shape',self.uint_x.shape)
        self.x=norm_img(np.expand_dims(self.uint_x,0),self.data_format)#bs=1
        #self.x=norm_img(tf.expand_dims(self.uint_x,0),self.data_format)#bs=1
        print('Shape after norm:',self.x.get_shape().as_list())


        ##All variables created under encoder have uniform init
        vs=tf.variable_scope('encoder',
             initializer=tf.random_uniform_initializer(minval=-1.,maxval=1.),
             dtype=tf.float32)


        with vs as scope:
            #avoid creating adams params
            optimizer = tf.train.GradientDescentOptimizer
            #optimizer = tf.train.AdamOptimizer
            self.g_optimizer = optimizer(self.lr)

            encode_var={n.name:var_like_z(n.z,n.name) for n in model.cc.nodes}
            encode_var['gen']=var_like_z(model.z_gen,'gen')
            print 'encode variables created'
            self.train_var = tf.contrib.framework.get_variables(scope)
            self.step=tf.Variable(0,name='step')
            self.var = tf.contrib.framework.get_variables(scope)

        #all encode vars created by now
        self.saver = tf.train.Saver(var_list=self.var)
        print('Summaries will be written to ',self.model_dir)
        self.summary_writer = tf.summary.FileWriter(self.model_dir)

        #load or initialize encoder variables
        self.init()

        if model.model_type =='dcgan':
            self.cc=CausalController(graph=model.graph, input_dict=encode_var, reuse=True)
            self.fake_labels_logits= tf.concat( self.cc.list_label_logits(),-1 )
            self.z_fake_labels=self.fake_labels_logits
            #self.z_gen = noise_like_z( self.model.z_gen,'en_z_gen')
            self.z_gen=encode_var['gen']
            self.z= tf.concat( [self.z_gen, self.z_fake_labels], axis=1 , name='z')

            self.G=model.generator( self.z , bs=1, reuse=True)

        elif model.model_type == 'began':
            with tf.variable_scope('tower'):#reproduce variable scope
                self.cc=CausalController(graph=model.graph, input_dict=encode_var, reuse=True)

                self.fake_labels= tf.concat( self.cc.list_labels(),-1 )
                self.fake_labels_logits= tf.concat( self.cc.list_label_logits(),-1 )
                #self.z_gen = noise_like_z( self.model.z_gen,'en_z_gen')
                self.z_gen=encode_var['gen']
                self.z= tf.concat( [self.fake_labels, self.z_gen],axis=-1,name='z')

                self.G,_ = GeneratorCNN(
                        self.z, model.conv_hidden_num, model.channel,
                        model.repeat_num, model.data_format,reuse=True)

                d_out, self.D_zG, self.D_var = DiscriminatorCNN(
                        self.G, model.channel, model.z_num,
                    model.repeat_num, model.conv_hidden_num,
                    model.data_format,reuse=True)

                _   , self.D_zX, _           = DiscriminatorCNN(
                        self.x, model.channel, model.z_num,
                    model.repeat_num, model.conv_hidden_num,
                    model.data_format,reuse=True)
                self.norm_AE_G=d_out

                #AE_G, AE_x = tf.split(d_out, 2)
                self.AE_G=denorm_img(self.norm_AE_G, model.data_format)
            self.aeg_sum=tf.summary.image('encoder/AE_G',self.AE_G)

        node_summaries=[]
        for node in self.cc.nodes:
            with tf.name_scope(node.name):
                ave_label=tf.reduce_mean(node.label)
                node_summaries.append(tf.summary.scalar('ave',ave_label))


        #unclear how scope with adam param works
        #with tf.variable_scope('encoderGD') as scope:

        #use L1 loss
        #self.g_loss_image = tf.reduce_mean(tf.abs(self.x - self.G))

        #use L2 loss
        #self.g_loss_image = tf.reduce_mean(tf.square(self.x - self.G))

        #use autoencoder reconstruction loss  #3.1.1 series
        #self.g_loss_image = tf.reduce_mean(tf.abs(self.x - self.norm_AE_G))

        #use L1 in autoencoded space# 3.2
        self.g_loss_image = tf.reduce_mean(tf.abs(self.D_zX - self.D_zG))

        g_loss_sum=tf.summary.scalar( 'encoder/g_loss_image',\
                          self.g_loss_image,collections=[self.summ_col])

        self.g_loss= self.g_loss_image
        self.train_op=self.g_optimizer.minimize(self.g_loss,
               var_list=self.train_var,global_step=self.step)

        self.uint_G=tf.squeeze(denorm_img( self.G ,self.data_format))#3D[0,255]
        gimg_sum=tf.summary.image( 'encoder/Reconstruct',tf.stack([self.uint_x,self.uint_G]),\
                max_outputs=2,collections=[self.summ_col])

        #self.summary_op=tf.summary.merge_all(self.summ_col)

        if model.model_type=='dcgan':
            self.summary_op=tf.summary.merge([g_loss_sum,gimg_sum]+node_summaries)
        elif model.model_type=='began':
            self.summary_op=tf.summary.merge([g_loss_sum,gimg_sum,self.aeg_sum]+node_summaries)


        #print 'encoder summaries:',self.summ_col
        #print 'encoder summaries:',tf.get_collection(self.summ_col)


    def init(self):
        if self.load_path:
            print 'Attempting to load directly from path:',
            print self.load_path
            self.saver.restore(self.sess,self.load_path)
        else:
            print 'New ENCODE Model..init new Z parameters'
            init=tf.variables_initializer(var_list=self.var)
            print 'Initializing following variables:'
            for v in self.var:
                print v.name, v.get_shape().as_list()

            self.model.sess.run(init)

    def save(self, step=None):
        if step is None:
            step=self.sess.run(self.step)

        if not os.path.exists(self.save_dir):
            print 'Creating Directory:',self.save_dir
            os.makedirs(self.save_dir)
        savefile=os.path.join(self.save_dir,self.model_name)
        print 'Saving file:',savefile
        self.saver.save(self.model.sess,savefile,global_step=step)

    def train(self, n_step=None):
        max_step=n_step or self.max_tr_steps

        if False:#debug
            print 'a'
            self.sess.run(self.train_op)
            print 'b'
            self.sess.run(self.summary_op)
            print 'c'
            self.sess.run(self.g_loss)
            print 'd'

        print 'max_step;',max_step
        for counter in trange(max_step):

            fetch_dict = {
                "train_op": self.train_op,
            }
            if counter%self.log_step==0:
                fetch_dict.update({
                    "summary": self.summary_op,
                    "g_loss": self.g_loss,
                    "global_step":self.step
                    })

            result = self.sess.run(fetch_dict)

            if counter % self.log_step == 0:
                g_loss=result['g_loss']
                step=result['global_step']
                self.summary_writer.add_summary(result['summary'],step)
                self.summary_writer.flush()

                print("[{}/{}] Reconstr Loss_G: {:.6f}".format(counter,max_step,g_loss))

            if counter % (10.*self.log_step) == 0:
                self.save(step=step)

        self.save()



##Just for reference##
    #def load(self, checkpoint_dir):
    #    print(" [*] Reading checkpoints...")
    #    checkpoint_dir = os.path.join(checkpoint_dir, self.model_dir)
    #    ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
    #    if ckpt and ckpt.model_checkpoint_path:
    #        ckpt_name = os.path.basename(ckpt.model_checkpoint_path)
    #        self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name))
    #        print(" [*] Success to read {}".format(ckpt_name))
    #        return True
    #    else:
    #        print(" [*] Failed to find a checkpoint")
    #        return False
#def norm_img(image, data_format=None):
#    image = image/127.5 - 1.
#    if data_format:
#        image = to_nhwc(image, data_format)
#    return image
#def transform:
#    stuff
#  return np.array(cropped_image)/127.5 - 1.
#def denorm_img(norm, data_format):
#    return tf.clip_by_value(to_nhwc((norm + 1)*127.5, data_format), 0, 255)
#def inverse_transform(images):
#  return (images+1.)/2.



#if model.model_name=='began':
#    fake_labels=model.fake_labels
#    D_fake_labels=model.D_fake_labels
#    #result_dir=os.path.join('began',model.model_dir)
#    result_dir=model.model_dir
#    if str_step=='':
#        str_step=str( model.sess.run(model.step) )+'_'
#    attr=model.attr[list(model.cc.node_names)]
#elif model.model_name=='dcgan':
#    fake_labels=model.fake_labels
#    D_fake_labels=model.D_labels_for_fake
#    result_dir=model.checkpoint_dir
#    attr=0.5*(model.attributes+1)
#    attr=attr[list(model.cc.names)]



================================================
FILE: figure_scripts/high_level.py
================================================
import tensorflow as tf
import numpy as np
import os
import scipy.misc
import pandas as pd
from tqdm import trange,tqdm
from itertools import combinations, product
import sys
from utils import save_figure_images,make_sample_dir,guess_model_step
from sample import get_joint,sample,find_logit_percentile



'''
This is a file where each function creates a particular figure. No real need
for this to be configurable. Just make a new function for each figure

This uses functions in sample.py and distributions.py, which are intended to
be lower level functions that can be used more generally.

'''




def fig1(model, output_folder):
    '''
    For each attribute, this function saves 2x10 image grids
    showing the difference between conditioning
    and intervening
    '''

    str_step=guess_model_step(model)
    fname=os.path.join(output_folder,str_step+model.model_type)

    for key in ['Young','Smiling','Wearing_Lipstick','Male','Mouth_Slightly_Open','Narrow_Eyes']:
    #for key in ['Mustache','Bald']:
    #for key in ['Mustache']:
        print 'Starting ',key,
        #for key in ['Bald']:

        p50,n50=find_logit_percentile(model,key,50)
        do_dict={key:np.repeat([p50],10)}
        eps=3
        cond_dict={key:np.repeat([+eps],10)}

        out,_=sample(model,do_dict=do_dict)
        intv_images=out['G']

        out,_=sample(model,cond_dict=cond_dict)
        cond_images=out['G']

        images=np.vstack([intv_images,cond_images])
        dc_file=fname+'_'+key+'_topdo1_botcond1.pdf'
        save_figure_images(model.model_type,images,dc_file,size=[2,10])

        do_dict={key:np.repeat([p50,n50],10)}
        cond_dict={key:np.repeat([+eps,-eps],10)}

        dout,_=sample(model,do_dict=do_dict)
        cout,_=sample(model,cond_dict=cond_dict)

        itv_file  = fname+'_'+key+'_topdo1_botdo0.pdf'
        cond_file  = fname+'_'+key+'_topcond1_botcond0.pdf'

        save_figure_images(model.model_type,dout['G'],itv_file,size=[2,10])
        save_figure_images(model.model_type,cout['G'],cond_file,size=[2,10])
        print '..finished ',key

    #return images,cout['G'],dout['G']
    return key





================================================
FILE: figure_scripts/pairwise.py
================================================
from __future__ import print_function
import time
import tensorflow as tf
import os
import scipy.misc
import numpy as np
from tqdm import trange

import pandas as pd
from itertools import combinations
import sys
from sample import sample




def calc_tvd(label_dict,attr):
    '''
    attr should be a 0,1 pandas dataframe with
    columns corresponding to label names

    for example:
    names=zip(*self.graph)[0]
    calc_tvd(label_dict,attr[names])

    label_dict should be a dictionary key:1d-array of samples
    '''
    ####Calculate Total Variation####
    if np.min(attr.values)<0:
        raise ValueError('calc_tvd received attr that may not have been in {0,1}')

    #a list (not a dict view) is needed to index the dataframe and to merge on
    label_names=list(label_dict.keys())
    attr=attr[label_names]

    df2=attr.drop_duplicates()
    df2 = df2.reset_index(drop = True).reset_index()
    df2=df2.rename(columns = {'index':'ID'})
    real_data_id=pd.merge(attr,df2)
    real_counts = pd.value_counts(real_data_id['ID'])
    real_pdf=real_counts/len(attr)

    label_list_dict={k:np.round(v.ravel()) for k,v in label_dict.items()}
    df_dat=pd.DataFrame.from_dict(label_list_dict)
    dat_id=pd.merge(df_dat,df2,on=label_names,how='left')
    dat_counts=pd.value_counts(dat_id['ID'])
    dat_pdf = dat_counts / dat_counts.sum()
    diff=real_pdf.subtract(dat_pdf, fill_value=0)
    tvd=0.5*diff.abs().sum()
    return tvd
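

# Illustrative sketch (not part of the original pipeline): the quantity that
# calc_tvd reports is the total variation distance between two discrete
# distributions, TVD = 0.5 * sum_i |p_i - q_i|. The helper name
# `_example_tvd` and its inputs are hypothetical; it only makes the formula
# concrete on numbers that can be checked by hand.
import numpy as np

def _example_tvd(p, q):
    '''p and q are 1d sequences of probabilities over the same outcomes'''
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# e.g. _example_tvd([0.5, 0.5], [0.75, 0.25]) is 0.5*(0.25+0.25) = 0.25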


def crosstab(model,result_dir=None,report_tvd=True,no_save=False,N=500000):
    '''
    This is a script for outputting [0,1/2], [1/2,1] binned pdfs
    including the marginals and the pairwise comparisons

    report_tvd is given as optional because it is somewhat time consuming

    result_dir is where to save the distribution text files. defaults to
    model.cc.model_dir

    '''
    result_dir=result_dir or model.cc.model_dir
    result={}

    n_labels=len(model.cc.nodes)

    #Not really sure how this should scale
    #N=1000*n_labels
    #N=500*n_labels**2#open to ideas that avoid a while loop
    #N=12000

    #tvd will not be reported as low unless N is large
    #N=500000 #default

    print('Calculating joint distribution with N =',N,'samples')

    t0=time.time()
    label_dict=sample(model,fetch_dict=model.cc.label_dict,N=N)
    print('sampling model N=',N,' times took ',time.time()-t0,'sec')


    #fake_labels=model.cc.fake_labels

    str_step=str( model.sess.run(model.cc.step) )+'_'

    attr=model.data.attr
    attr=attr[model.cc.node_names]

    lab_xtab_fn = os.path.join(result_dir,str_step+'glabel_crosstab.txt')
    print('Writing to files:',lab_xtab_fn)

    if report_tvd:
        t0=time.time()
        tvd=calc_tvd(label_dict,attr)
        result['tvd']=tvd
        print('calculating tvd from samples took ',time.time()-t0,'sec')

        if no_save:
            return result

    t0=time.time()

    joint={}
    label_joint={}
    #for name, lab in zip(model.cc.node_names,list_labels):
    for name, lab in label_dict.items():
        joint[name]={ 'g_fake_label':lab }


    #with open(dfl_xtab_fn,'w') as dlf_f, open(lab_xtab_fn,'w') as lab_f, open(gvsd_xtab_fn,'w') as gldf_f:
    with open(lab_xtab_fn,'w') as lab_f:
        if report_tvd:
            lab_f.write('TVD:'+str(tvd)+'\n\n')
        lab_f.write('Marginals:\n')

        #Marginals
        for name in joint.keys():
            lab_f.write('Node: '+name+'\n')

            true_marg=np.mean((attr[name]>0.5).values)
            lab_marg=(joint[name]['g_fake_label'] > 0.5).astype('int')

            lab_f.write('  mean='+str(np.mean(lab_marg))+'\t'+\
                        'true mean='+str(true_marg)+'\n')

            lab_f.write('\n')


        #Pairs of labels
        lab_f.write('\nPairwise:\n')

        for node1,node2 in combinations(joint.keys(),r=2):

            lab_node1=(joint[node1]['g_fake_label']>0.5).astype('int')
            lab_node2=(joint[node2]['g_fake_label']>0.5).astype('int')
            lab_df=pd.DataFrame(data=np.hstack([lab_node1,lab_node2]),columns=[node1,node2])
            lab_ct=pd.crosstab(index=lab_df[node1],columns=lab_df[node2],margins=True,normalize=True)

            true_ct=pd.crosstab(index=attr[node1],columns=attr[node2],margins=True,normalize=True)


            lab_f.write('\n\tFake:\n')
            #write through lab_f only: appending to the same path via a
            #second handle while lab_f is open would garble the file
            lab_f.write( lab_ct.__repr__() )
            lab_f.write('\n\tReal:\n')
            lab_f.write( true_ct.__repr__() )

            lab_f.write('\n\n')

    print('calculating pairwise crosstabs and saving results took ',time.time()-t0,'sec')
    return result












================================================
FILE: figure_scripts/probability_table.txt
================================================



model: celebA_0627_200239
    graph:MLS

    [img,cc,d_fake_labels,true]

    P(M=1|S=1) = [0.28, 

    
     


================================================
FILE: figure_scripts/sample.py
================================================
from __future__ import print_function
import tensorflow as tf
import numpy as np
import os
import scipy.misc
from tqdm import trange,tqdm

import pandas as pd
from itertools import combinations, product
import sys

from utils import save_figure_images#makes grid image plots

#convenience functions
from utils import make_sample_dir,guess_model_step,infer_grid_image_shape


from IPython.core import debugger
debug = debugger.Pdb().set_trace


def find_logit_percentile(model, key, per):
    data=[]
    for _ in range(30):
        data.append(model.sess.run(model.cc.node_dict[key].label_logit))
    D=np.vstack(data)
    pos_logits,neg_logits=D[D>0], D[D<0]
    pos_tile = np.percentile(pos_logits,per)
    neg_tile = np.percentile(neg_logits,100-per)
    return pos_tile,neg_tile
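

# Illustrative sketch (not original code): the percentile split used by
# find_logit_percentile, applied to a plain numpy array instead of logits
# fetched from the session. The name `_example_logit_percentiles` is
# hypothetical. Positive and negative logits are treated separately, so the
# 50th percentile is the median logit on each side of zero.
import numpy as np

def _example_logit_percentiles(logits, per):
    logits = np.asarray(logits, dtype=float)
    pos_logits, neg_logits = logits[logits > 0], logits[logits < 0]
    pos_tile = np.percentile(pos_logits, per)
    neg_tile = np.percentile(neg_logits, 100 - per)
    return pos_tile, neg_tile

# e.g. _example_logit_percentiles([-3., -1., 1., 3.], 50) is (2.0, -2.0)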

def fixed_label_diversity(model, config,step=''):
    sample_dir=make_sample_dir(model)
    str_step=str(step) or guess_model_step(model)

    N=64#per image
    n_combo=5#n label combinations

    #0,1 label combinations
    fixed_labels=model.attr.sample(n_combo)[model.cc.node_names]
    size=infer_grid_image_shape(N)

    for j, fx_label in enumerate(fixed_labels.values):
        fx_label=np.reshape(fx_label,[1,-1])
        fx_label=np.tile(fx_label,[N,1])
        do_dict={model.cc.labels: fx_label}

        images, feed_dict= sample(model, do_dict=do_dict)
        fx_file=os.path.join(sample_dir, str_step+'fxlab'+str(j)+'.pdf')
        save_figure_images(model.model_type,images['G'],fx_file,size=size)

    #which image is what label
    fixed_labels=fixed_labels.reset_index(drop=True)
    fixed_labels.to_csv(os.path.join(sample_dir,str_step+'fxlab'+'.csv'))


def get_joint(model, int_do_dict=None,int_cond_dict=None, N=6400,return_discrete=True):
    '''
    Returns a dictionary of dataframes of samples.
    Each dataframe corresponds to a different tensor i.e. cc labels, d_labeler
    labels etc.

    int_do_dict and int_cond_dict indicate that just a simple +1 or 0 should be
    passed in
    ex: int_do_dict={'Wearing_Lipstick':+1}


    Ex: if intervention=+1 corresponds to logits uniform in [0,0.6], pass
    np.linspace(0,0.6,n)

    N is number of batches to sample at each location in logitspace (num_labels
    dimensional)
    '''

    #values are either +1 or 0 in int_cond_dict and int_do_dict

    do_dict,cond_dict={},{}
    if int_do_dict is not None:
        for key,value in int_do_dict.items():
            #Intervene in the middle of where the model is used to operating
            print('calculating percentile...')
            data=[]
            for _ in range(30):
                data.append(model.sess.run(model.cc.node_dict[key].label_logit))
            D=np.vstack(data)
            pos_logits,neg_logits=D[D>0], D[D<0]
            if value == 1:
                intv = np.percentile(pos_logits,50)
            elif value == 0:
                intv = np.percentile(neg_logits,50)
            else:
                raise ValueError('pass either +1 or 0')
            do_dict[key]=np.repeat([intv],N)


    if int_cond_dict is not None:
        for key,value in int_cond_dict.items():
            eps=3.
            if value == 1:
                cond_dict[key]=np.repeat([+eps],N)
            elif value == 0:
                cond_dict[key]=np.repeat([-eps],N)
            else:
                raise ValueError('pass either +1 or 0')

    #print 'getjoint: cond_dict:',cond_dict
    #print 'getjoint: do_dict:',do_dict

    #Terminology
    if model.model_type=='began':
        fake_labels=model.fake_labels
        D_fake_labels=model.D_fake_labels
        D_real_labels=model.D_real_labels
    elif model.model_type=='dcgan':
        fake_labels=model.fake_labels
        D_fake_labels=model.D_labels_for_fake
        D_real_labels=model.D_labels_for_real

    #fetch_dict={'cc_labels':model.cc.labels}
    fetch_dict={'d_fake_labels':D_fake_labels,
                'cc_labels':model.cc.labels}

    if model.model_type=='began':#dcgan not fully connected
        if not cond_dict and not do_dict:
            #Havent coded conditioning on real data
            fetch_dict.update({'d_real_labels':D_real_labels})


    print('Calculating joint distribution')
    result,_=sample(model, cond_dict=cond_dict, do_dict=do_dict,N=N,
                    fetch=fetch_dict,return_failures=False)
    print('fetd keys:',fetch_dict.keys())
    result={k:result[k] for k in fetch_dict.keys()}

    n_labels=len(model.cc.node_names)
    #list_labels=np.split( result['cfl'],n_labels, axis=1)
    #list_d_fake_labels=np.split(result['dfl'],n_labels, axis=1)
    #list_d_real_labels=np.split(result['drl'],n_labels, axis=1)

    for k in result.keys():
        print('valshape',result[k].shape)
        print('result',result[k])
    list_result={k:np.split(val,n_labels, axis=1) for k,val in result.items()}

    pd_joint={}
    for key,r in list_result.items():
        joint={}
        for name,val in zip(model.cc.node_names,r):
            int_val=(val>0.5).astype('int')
            joint[name]=int_val.ravel()
        pd_joint[key]=pd.DataFrame.from_dict(joint)

    return pd_joint


    for name, lab, dfl in zip(model.cc.node_names,list_labels,list_d_fake_labels):
        if return_discrete:
            cfl_val=(lab>0.5).astype('int')
            dfl_val=(dfl>0.5).astype('int')

        joint['dfl'][name]=dfl_val
        joint['cfl'][name]=cfl_val


    cfl=pd.DataFrame.from_dict( {k:val.ravel() for k,val in joint['cfl'].items()} )
    dfl=pd.DataFrame.from_dict( {k:val.ravel() for k,val in joint['cfl'].items()} )

    print('get_joint successful')
    return cfl,dfl



#__________

def take_product(do_dict):
    '''
    this function takes some dictionary like:
        {key1:1, key2:[a,b], key3:[c,d]}
    and returns the dictionary:
        {key1:[1,1,1,1], key2:[a,a,b,b], key3:[c,d,c,d]}
    computing the Cartesian product of the values
    '''
    values=[]
    for v in do_dict.values():
        if hasattr(v,'__iter__'):
            values.append(v)
        else:
            values.append([v])#allows scalar to be passed

    prod_values=np.vstack(list(product(*values)))#list(): np.vstack expects a sequence, not a generator
    return {k:np.array(v) for k,v in zip(do_dict.keys(),zip(*prod_values))}
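

# Illustrative sketch, not part of the original module: a dependency-free
# version of the expansion described in the take_product docstring above.
# The name product_grid_sketch is hypothetical, and it returns plain lists
# rather than numpy arrays.
from itertools import product as _iter_product

def product_grid_sketch(do_dict):
    keys=list(do_dict.keys())
    #wrap scalars so that every entry contributes at least one value
    values=[v if isinstance(v,(list,tuple)) else [v] for v in do_dict.values()]
    rows=list(_iter_product(*values))#one tuple per combination of values
    #column i of the product table belongs to key i
    return {k:[row[i] for row in rows] for i,k in enumerate(keys)}

# Usage (assumed inputs): product_grid_sketch({'Male':1,'Smiling':[0,1]})
# pairs the single Male value with each Smiling value in turn.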


def chunks(input_dict, chunk_size):
    """
    Return successive chunks of at most chunk_size rows.
    Takes a dictionary of equal-length iterables and makes a
    list of dictionaries
    """
    if len(input_dict)==0:
        return [{}]

    n=chunk_size
    batches=[]

    L=len(list(input_dict.values())[0])
    for i in range(0, L, n):
        fd={}
        n=n- max(0, (i+n) - L )#in case L doesn't divide evenly
        for key,value in input_dict.items():
            fd[key]=value[i:i+n]

        batches.append(fd)
    return batches
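

# Illustrative sketch, not part of the original module: the same batching
# idea as chunks() above, written without adjusting n inside the loop. It
# relies on Python slices being clipped at the end of a sequence, so the
# last batch is simply shorter. The name chunks_sketch is hypothetical.
def chunks_sketch(input_dict, chunk_size):
    if len(input_dict)==0:
        return [{}]
    L=len(next(iter(input_dict.values())))#all values assumed equal length
    return [{k:v[i:i+chunk_size] for k,v in input_dict.items()}
            for i in range(0,L,chunk_size)]

# Usage (assumed inputs): chunks_sketch({'x':[0,1,2,3,4]},2) gives three
# dictionaries holding 2, 2 and 1 rows respectively.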


def do2feed( do_dict, model, on_logits=True):
    '''
    this contains the logic for parsing "do_dict"
    into a feed dict that can actually be worked with
    '''
    feed_dict={}
    for key,value in do_dict.items():
        if isinstance(key,tf.Tensor):
            feed_dict[key]=value
        elif isinstance(key,str):
            if key in model.cc.node_names:
                node=model.cc.node_dict[key]
                if on_logits:# intervene on logits by default
                    feed_dict[node.label_logit]=value
                else:
                    feed_dict[node.label]=value
            elif hasattr(model,key):
                feed_dict[getattr(model,key)]=value
            else:
                raise ValueError('string keys must be attributes of either\
                                 model.cc or model. Got string:',key)
        else:
            raise ValueError('keys must be tensors or strings but got',type(key))

    #Make sure [64,] isn't passed to [64,1] for example
    for tensor,value in feed_dict.items():
        #Make last dims line up:
        tf_shape=tensor.get_shape().as_list()
        shape=[len(value)]+tf_shape[1:]
        try:
            feed_dict[tensor]=np.reshape(value,shape)
        except Exception as e:
            print('Unexpected difficulty reshaping inputs:',tensor.name, tf_shape, len(value), np.size(value))
            raise e
    return feed_dict

def cond2fetch( cond_dict=None, model=None, on_logits=True):
    '''
    this contains the logic for parsing "cond_dict"
    into a fetch dict that can actually be worked with.
    A fetch dict can be passed as the first argument
    of session.run and therefore has values that are all tensors
    '''
    cond_dict=cond_dict or {}

    fetch_dict={}
    for key,value in cond_dict.items():
        if isinstance(value,tf.Tensor):
            fetch_dict[key]=value#Nothing to be done
        elif isinstance(key,tf.Tensor):
            fetch_dict[key]=key#strange scenario, but possible
        elif isinstance(key,str):
            if key in model.cc.node_names:
                node=model.cc.node_dict[key]
                if on_logits:# condition on logits by default
                    fetch_dict[key]=node.label_logit
                else:
                    fetch_dict[key]=node.label
            elif hasattr(model,key):
                fetch_dict[key]=getattr(model,key)
            else:
                raise ValueError('string keys must be attributes of either\
                                 model.cc or model. Got string:',key)
        else:
            raise ValueError('keys must be tensors or strings but got',type(key))

    return fetch_dict




def interpret_dict( a_dict, model,n_times=1, on_logits=True):
    '''
    pass either a do_dict or a cond_dict.
    The rules for converting arguments to numpy arrays to pass
    to tensorflow are identical
    '''
    if a_dict is None:
        return {}
    elif len(a_dict)==0:
        return {}

    if n_times>1:
        token=tf.placeholder_with_default(2.22)
        a_dict[token]=-2.22

    p_a_dict=take_product(a_dict)

    ##Need divisible batch_size for most models
    if len(p_a_dict)>0:
        L=len(list(p_a_dict.values())[0])
    else:
        L=0
    print("L is " + str(L))
    print(p_a_dict)

    ##Check compatibility of batch_size and L
    if L>=model.batch_size:
        if not L % model.batch_size == 0:
            raise ValueError('a_dict must be divisible by batch_size\
                             but instead product of inputs was of length',L)
    elif model.batch_size % L == 0:
        p_a_dict = {key:np.repeat(value,model.batch_size//L,axis=0) for key,value in p_a_dict.items()}
    else:
        raise ValueError('No. of intervened values must divide batch_size.')
    return p_a_dict
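

# Illustrative sketch, not part of the original module: the batch-size
# compatibility rule used in interpret_dict() above, applied to one plain
# list. Fewer values than batch_size are repeated element-wise (like
# np.repeat) to fill a batch; otherwise the length must be a multiple of
# batch_size. The name fit_to_batch_sketch is hypothetical.
def fit_to_batch_sketch(values, batch_size):
    L=len(values)
    if L==0:
        raise ValueError('need at least one value')
    if L>=batch_size:
        if L % batch_size != 0:
            raise ValueError('length %d is not a multiple of batch_size'%L)
        return list(values)
    if batch_size % L != 0:
        raise ValueError('No. of values must divide batch_size.')
    reps=batch_size//L
    #each value is repeated consecutively, e.g. [0,1] -> [0,0,1,1]
    return [v for v in values for _ in range(reps)]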


def slice_dict(feed_dict, rows):
    '''
    conditional sampling requires doing only certain indices depending
    on the result of the previous iteration.
    This function takes a feed_dict and "slices" it,
    returning a dictionary with the same keys, but with values[rows,:]
    '''
    fd_out={}
    for key,value in feed_dict.items():
        fd_out[key]=value[rows]
    return fd_out


def did_succeed( output_dict, cond_dict ):
    '''
    Used in rejection sampling:
    for each row, determine if cond is satisfied
    for every cond in cond_dict

    success is hardcoded as being more extreme
    than the condition specified
    '''
    test_key=list(cond_dict.keys())[0]
    #print('output_dict:',np.squeeze(output_dict[test_key]))
    #print('cond_dict:',cond_dict[test_key])


    #definition success:
    def is_win(key):
        cond=np.squeeze(cond_dict[key])
        val=np.squeeze(output_dict[key])
        cond1=np.sign(val)==np.sign(cond)
        cond2=np.abs(val)>np.abs(cond)
        return cond1*cond2


    scoreboard=[is_win(key) for key in cond_dict]
    #print('scoreboard', scoreboard)
    all_victories_bool=np.logical_and.reduce(scoreboard)
    return all_victories_bool.flatten()
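

# Illustrative sketch, not part of the original module: the acceptance rule
# from did_succeed() above on plain Python numbers. A sampled value counts
# as a success when it has the same sign as the condition and a strictly
# larger magnitude. Both function names below are hypothetical.
def is_more_extreme_sketch(val, cond):
    same_sign=((val>0)==(cond>0)) and ((val<0)==(cond<0))
    return same_sign and abs(val)>abs(cond)

def rows_succeed_sketch(output_rows, cond):
    #output_rows: list of {key:value} dicts; cond: {key:threshold}
    #a row succeeds only if every condition is satisfied
    return [all(is_more_extreme_sketch(row[k],cond[k]) for k in cond)
            for row in output_rows]

# Usage (assumed inputs): with cond={'Male':0.5}, a sampled logit of 2.0
# is kept, while 0.3 (too small) and -2.0 (wrong sign) are rejected.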


def sample(model, cond_dict=None, do_dict=None, fetch_dict=None,N=None,
           on_logits=True,return_failures=True):
    '''
    fetch_dict should be a dict of tensors to do sess.run on
    do_dict is a dict whose keys are strings or tensors, of the form:
    {'Male':1, model.z_gen:[0,1], model.cc.Smiling:[0.1,0.9]}

    N is used only if cond_dict and do_dict are both empty or None
    '''

    do_dict= do_dict or {}
    cond_dict= cond_dict or {}
    fetch_dict=fetch_dict or {'G':model.G}

    ##Handle the case where the query length doesn't divide batch_size
    #a_dict=cond_dict or do_dict
    #if a_dict:
    #    nsamples=len(a_dict.values()[0])
    #elif N:
    #    nsamples=N
    #else:
    #    raise ValueError('either pass a dictionary or N')


    ##Pad to be batch_size divisible
    #npad=(64-nsamples)%64
    #if npad>0:
    #    print("Warn. nsamples doesnt divide batch_size, pad=",npad)
    ##N+=npad

    #if npad>0:
    #    if do_dict:
    #        for k in do_dict.keys():
    #            keypad=np.tile(do_dict[k][0],[npad])
    #            do_dict[k]=np.concatenate([do_dict[k],keypad])

    #    if cond_dict:
    #        for k in cond_dict.keys():
    #            keypad=np.tile(cond_dict[k][0],[npad])
    #            cond_dict[k]=np.concatenate([cond_dict[k],keypad])

    verbose=False
    #verbose=True



    feed_dict = do2feed(do_dict, model, on_logits=on_logits)#{tensor:array}
    cond_fetch_dict= cond2fetch(cond_dict,model,on_logits=on_logits) #{string:tensor}
    fetch_dict.update(cond_fetch_dict)


    #print('actual cond_dict', cond_dict )#{}
    #print('actual do_dict', do_dict )#{}

    if verbose:
        print('feed_dict',feed_dict)
        print('fetch_dict',fetch_dict)

    if not cond_dict and do_dict:
        #Simply do intervention w/o loop
        if verbose:
            print('sampler mode:Interventional')

        #fds=chunks(feed_dict,model.batch_size)
        fds=chunks(feed_dict,model.default_batch_size)

        outputs={k:[] for k in fetch_dict.keys()}
        for fd in fds:
            out=model.sess.run(fetch_dict, fd)
            #outputs.append(out['G'])
            for k,val in out.items():
                outputs[k].append(val)

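        for k in outputs.keys():
            #chunks() adds no padding, so there is nothing to trim here
            #(nsamples is only defined in the commented-out code above)
            outputs[k]=np.vstack(outputs[k])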
        return outputs,feed_dict
        #return np.vstack(outputs), feed_dict

    elif not cond_dict and not do_dict:
        #neither passed, but get N samples
        assert(N>0)
        if verbose:
            print('sampling model N=',N,' times')

        ##Should be variable batch_size allowed
        outputs=model.sess.run(fetch_dict,{model.batch_size:N})

        ##fds=chunks({'idx':range(npad+N)},model.batch_size)
        #fds=chunks({'idx':range(npad+N)},model.default_batch_size)

        #outputs={k:[] for k in fetch_dict.keys()}
        #for fd in fds:
        #    out=model.sess.run(fetch_dict)
        #    for k,val in out.items():
        #        outputs[k].append(val)
        #for k in outputs.keys():
        #    outputs[k]=np.vstack(outputs[k])[:nsamples]
        #return outputs, feed_dict

        return outputs


    #elif cond_dict and not do_dict:
    elif cond_dict:
    #Could also pass do_dict here to be interesting
        ##Implements r
SYMBOL INDEX (284 symbols across 35 files)

FILE: causal_began/CausalBEGAN.py
  class CausalBEGAN (line 18) | class CausalBEGAN(object):
    method __init__ (line 31) | def __init__(self,batch_size,config):
    method __call__ (line 89) | def __call__(self, real_inputs, fake_inputs):
    method build_train_op (line 248) | def build_train_op(self):
    method train_step (line 286) | def train_step(self,sess,counter):
    method build_summary_op (line 292) | def build_summary_op(self):

FILE: causal_began/config.py
  function str2bool (line 4) | def str2bool(v):
  function add_argument_group (line 11) | def add_argument_group(name):
  function gpu_logic (line 100) | def gpu_logic(config):
  function get_config (line 111) | def get_config():

FILE: causal_began/models.py
  function lrelu (line 6) | def lrelu(x,leak=0.2,name='lrelu'):
  function GeneratorCNN (line 12) | def GeneratorCNN( z, config, reuse=None):
  function DiscriminatorCNN (line 35) | def DiscriminatorCNN(image, config, reuse=None):
  function Discriminator_labeler (line 80) | def Discriminator_labeler(image, output_size, config, reuse=None):
  function next (line 107) | def next(loader):
  function to_nhwc (line 110) | def to_nhwc(image, data_format):
  function to_nchw_numpy (line 118) | def to_nchw_numpy(image):
  function norm_img (line 125) | def norm_img(image, data_format=None):
  function denorm_img (line 131) | def denorm_img(norm, data_format):
  function slerp (line 134) | def slerp(val, low, high):
  function int_shape (line 142) | def int_shape(tensor):
  function get_conv_shape (line 146) | def get_conv_shape(tensor, data_format):
  function nchw_to_nhwc (line 154) | def nchw_to_nhwc(x):
  function nhwc_to_nchw (line 157) | def nhwc_to_nchw(x):
  function reshape (line 160) | def reshape(x, h, w, c, data_format):
  function resize_nearest_neighbor (line 167) | def resize_nearest_neighbor(x, new_size, data_format):
  function upscale (line 176) | def upscale(x, scale, data_format):
  function average_gradients (line 183) | def average_gradients(tower_grads):

FILE: causal_began/utils.py
  function make_summary (line 16) | def make_summary(name, val):
  function summary_stats (line 19) | def summary_stats(name,tensor,collections=None,hist=False):
  function prepare_dirs_and_logger (line 29) | def prepare_dirs_and_logger(config):
  function get_time (line 70) | def get_time():
  function save_config (line 73) | def save_config(config):
  function get_available_gpus (line 82) | def get_available_gpus():
  function distribute_input_data (line 87) | def distribute_input_data(data_loader,num_gpu):
  function rank (line 114) | def rank(array):
  function make_grid (line 117) | def make_grid(tensor, nrow=8, padding=2,
  function save_image (line 137) | def save_image(tensor, filename, nrow=8, padding=2,

FILE: causal_controller/ArrayDict.py
  class ArrayDict (line 2) | class ArrayDict(object):
    method __init__ (line 12) | def __init__(self):
    method __len__ (line 14) | def __len__(self):
    method __repr__ (line 19) | def __repr__(self):
    method keys (line 21) | def keys(self):
    method items (line 23) | def items(self):
    method validate_dict (line 26) | def validate_dict(self,a_dict):
    method arr_dict (line 49) | def arr_dict(self,a_dict):
    method concat (line 56) | def concat(self,a_dict):
    method __getitem__ (line 63) | def __getitem__(self,at):

FILE: causal_controller/CausalController.py
  class CausalController (line 14) | class CausalController(object):
    method summary_scalar (line 17) | def summary_scalar(self,name,ten):
    method summary_stats (line 19) | def summary_stats(self,name,ten,hist=False):
    method load (line 22) | def load(self,sess,path):
    method __init__ (line 35) | def __init__(self,batch_size,config):
    method build_pretrain (line 140) | def build_pretrain(self,label_loader):
    method dcc_var (line 212) | def dcc_var(self):
    method critic_update (line 222) | def critic_update(self,sess):
    method __len__ (line 228) | def __len__(self):
    method list_placeholders (line 232) | def list_placeholders(self):
    method list_labels (line 234) | def list_labels(self):
    method list_label_logits (line 236) | def list_label_logits(self):
    method do2feed (line 239) | def do2feed(self,do_dict):
    method sample_label (line 248) | def sample_label(self, sess, cond_dict=None,do_dict=None,N=None,verbos...
  class CausalNode (line 328) | class CausalNode(object):
    method summary_scalar (line 352) | def summary_scalar(self,name,ten):
    method summary_stats (line 354) | def summary_stats(self,name,ten,hist=False):
    method __init__ (line 357) | def __init__(self,name,config):
    method setup_tensor (line 370) | def setup_tensor(self):
    method var (line 396) | def var(self):
    method train_var (line 401) | def train_var(self):
    method label_logit (line 406) | def label_logit(self):
    method label (line 415) | def label(self):
    method setup_pretrain (line 423) | def setup_pretrain(self,config,label_loader,DCC):

FILE: causal_controller/config.py
  function str2bool (line 11) | def str2bool(v):
  function add_argument_group (line 18) | def add_argument_group(name):
  function get_config (line 111) | def get_config():

FILE: causal_controller/models.py
  function lrelu (line 6) | def lrelu(x,leak=0.2,name='lrelu'):
  function DiscriminatorW (line 14) | def DiscriminatorW(labels,batch_size, n_hidden, config, reuse=None):
  function Grad_Penalty (line 35) | def Grad_Penalty(real_data,fake_data,Discriminator,config):

FILE: causal_controller/utils.py
  function summary_stats (line 5) | def summary_stats(name,tensor,collections=None,hist=False):
  function did_succeed (line 14) | def did_succeed( output_dict, cond_dict ):

FILE: causal_dcgan/CausalGAN.py
  function norm_img (line 25) | def norm_img(image):
  function denorm_img (line 28) | def denorm_img(norm):
  function tf_truncexpon (line 31) | def tf_truncexpon(batch_size,rate,right):
  function add_texp_noise (line 50) | def add_texp_noise(batch_size,labels01):
  class CausalGAN (line 64) | class CausalGAN(object):
    method __init__ (line 67) | def __init__(self,batch_size,config):
    method __call__ (line 101) | def __call__(self, real_inputs, fake_inputs):
    method build_train_op (line 281) | def build_train_op(self):
    method build_summary_op (line 303) | def build_summary_op(self):
    method train_step (line 306) | def train_step(self,sess,counter):

FILE: causal_dcgan/config.py
  function str2bool (line 4) | def str2bool(v):
  function add_argument_group (line 10) | def add_argument_group(name):
  function get_config (line 154) | def get_config():

FILE: causal_dcgan/models.py
  function conv_out_size_same (line 11) | def conv_out_size_same(size, stride):
  function GeneratorCNN (line 14) | def GeneratorCNN( z, config, reuse=None):
  function DiscriminatorCNN (line 65) | def DiscriminatorCNN(image, config, reuse=None):
  function discriminator_labeler (line 125) | def discriminator_labeler(image, output_dim, config, reuse=None):
  function discriminator_gen_labeler (line 143) | def discriminator_gen_labeler(image, output_dim, config, reuse=None):
  function discriminator_on_z (line 161) | def discriminator_on_z(image, config, reuse=None):

FILE: causal_dcgan/ops.py
  class batch_norm (line 11) | class batch_norm(object):
    method __init__ (line 12) | def __init__(self, epsilon=1e-5, momentum = 0.9, name="batch_norm"):
    method __call__ (line 18) | def __call__(self, x, train=True):
  function conv_cond_concat (line 27) | def conv_cond_concat(x, y):
  function conv2d (line 49) | def conv2d(input_, output_dim,
  function deconv2d (line 63) | def deconv2d(input_, output_shape,
  function lrelu (line 84) | def lrelu(x,leak=0.2,name='lrelu'):
  function linear (line 94) | def linear(input_, output_size, scope=None, stddev=0.02, bias_start=0.0,...
  function add_minibatch_features (line 115) | def add_minibatch_features(image,df_dim):

FILE: causal_dcgan/utils.py
  function get_image (line 20) | def get_image(image_path, input_height, input_width,
  function save_images (line 27) | def save_images(images, size, image_path):
  function imread (line 30) | def imread(path, is_grayscale = False):
  function merge_images (line 36) | def merge_images(images, size):
  function merge (line 39) | def merge(images, size):
  function imsave (line 48) | def imsave(images, size, path):
  function center_crop (line 51) | def center_crop(x, crop_h, crop_w,
  function transform (line 61) | def transform(image, input_height, input_width,
  function inverse_transform (line 71) | def inverse_transform(images):
  function to_json (line 74) | def to_json(output_path, *layers):
  function make_gif (line 137) | def make_gif(images, fname, duration=2, true_image=False):

FILE: causal_graph.py
  function get_causal_graph (line 325) | def get_causal_graph(causal_model=None,*args,**kwargs):

FILE: config.py
  function str2bool (line 4) | def str2bool(v):
  function add_argument_group (line 11) | def add_argument_group(name):
  function gpu_logic (line 113) | def gpu_logic(config):
  function get_config (line 125) | def get_config():

FILE: data_loader.py
  function logodds (line 12) | def logodds(p):
  class DataLoader (line 15) | class DataLoader(object):
    method __init__ (line 24) | def __init__(self,label_names,config):
    method get_label_queue (line 56) | def get_label_queue(self,batch_size):
    method get_data_queue (line 80) | def get_data_queue(self,batch_size):

FILE: download.py
  function download_file_from_google_drive (line 13) | def download_file_from_google_drive(id, destination):
  function get_confirm_token (line 26) | def get_confirm_token(response):
  function save_response_content (line 32) | def save_response_content(response, destination, chunk_size=32*1024):
  function unzip (line 40) | def unzip(filepath):
  function download_celeb_a (line 47) | def download_celeb_a(base_path):
  function download_attr_file (line 74) | def download_attr_file(data_path):
  function prepare_data_dir (line 81) | def prepare_data_dir(path = './data'):
  function check_link (line 86) | def check_link(in_dir, basename, out_dir):
  function add_splits (line 93) | def add_splits(base_path):
  function delete_top_line (line 121) | def delete_top_line(txt_fname):

FILE: figure_scripts/distributions.py
  function get_pdf (line 16) | def get_pdf(model, do_dict=None,cond_dict=None,name='',N=6400,return_dis...
  function get_interv_table (line 45) | def get_interv_table(model,intrv=True):
  function record_interventional (line 71) | def record_interventional(model,step=''):

FILE: figure_scripts/encode.py
  function var_like_z (line 18) | def var_like_z(z_ten,name):
  function noise_like_z (line 21) | def noise_like_z(z_ten,name):
  class Encoder (line 27) | class Encoder:
    method __init__ (line 37) | def __init__(self,model,image,image_name=None,max_tr_steps=50000,load_...
    method init (line 187) | def init(self):
    method save (line 201) | def save(self, step=None):
    method train (line 212) | def train(self, n_step=None):

FILE: figure_scripts/high_level.py
  function fig1 (line 28) | def fig1(model, output_folder):

FILE: figure_scripts/pairwise.py
  function calc_tvd (line 17) | def calc_tvd(label_dict,attr):
  function crosstab (line 53) | def crosstab(model,result_dir=None,report_tvd=True,no_save=False,N=500000):

FILE: figure_scripts/sample.py
  function find_logit_percentile (line 23) | def find_logit_percentile(model, key, per):
  function fixed_label_diversity (line 33) | def fixed_label_diversity(model, config,step=''):
  function get_joint (line 58) | def get_joint(model, int_do_dict=None,int_cond_dict=None, N=6400,return_...
  function take_product (line 176) | def take_product(do_dict):
  function chunks (line 195) | def chunks(input_dict, chunk_size):
  function do2feed (line 218) | def do2feed( do_dict, model, on_logits=True):
  function cond2fetch (line 254) | def cond2fetch( cond_dict=None, model=None, on_logits=True):
  function interpret_dict (line 289) | def interpret_dict( a_dict, model,n_times=1, on_logits=True):
  function slice_dict (line 326) | def slice_dict(feed_dict, rows):
  function did_succeed (line 339) | def did_succeed( output_dict, cond_dict ):
  function sample (line 368) | def sample(model, cond_dict=None, do_dict=None, fetch_dict=None,N=None,
  function condition2d (line 614) | def condition2d( model, cond_dict,cond_dict_name,step='', on_logits=True):
  function intervention2d (line 753) | def intervention2d(model, fetch=None, do_dict=None, do_dict_name=None, o...

FILE: figure_scripts/utils.py
  function nhwc_to_nchw (line 28) | def nhwc_to_nchw(x):
  function to_nchw_numpy (line 30) | def to_nchw_numpy(image):
  function norm_img (line 37) | def norm_img(image, data_format=None):
  function nchw_to_nhwc (line 50) | def nchw_to_nhwc(x):
  function to_nhwc (line 52) | def to_nhwc(image, data_format):
  function denorm_img (line 58) | def denorm_img(norm, data_format):
  function read_prepared_uint8_image (line 62) | def read_prepared_uint8_image(img_path):
  function make_encode_dir (line 73) | def make_encode_dir(model,image_name):
  function make_sample_dir (line 85) | def make_sample_dir(model):
  function guess_model_step (line 98) | def guess_model_step(model):
  function infer_grid_image_shape (line 108) | def infer_grid_image_shape(N):
  function save_figure_images (line 116) | def save_figure_images(model_type, tensor, filename, size, padding=2, no...
  function make_grid (line 132) | def make_grid(tensor, nrow=8, padding=2,
  function began_save_image (line 152) | def began_save_image(tensor, filename, nrow=8, padding=2,
  function get_image (line 164) | def get_image(image_path, input_height, input_width,
  function dcgan_save_images (line 171) | def dcgan_save_images(images, size, image_path):
  function imread (line 174) | def imread(path, is_grayscale = False):
  function merge_images (line 180) | def merge_images(images, size):
  function merge (line 183) | def merge(images, size):
  function imsave (line 192) | def imsave(images, size, path):
  function center_crop (line 195) | def center_crop(x, crop_h, crop_w,
  function transform (line 205) | def transform(image, input_height, input_width,
  function inverse_transform (line 215) | def inverse_transform(images):

FILE: main.py
  function get_trainer (line 23) | def get_trainer():
  function main (line 77) | def main(trainer):

FILE: synthetic/collect_stats.py
  function makeplots (line 11) | def makeplots(x_iter,tvd_datastore,show=False,save=False,save_name=None):
  function make_individual_plots (line 69) | def make_individual_plots(x_iter,tvd_datastore,smooth=True,show=False,sa...

FILE: synthetic/config.py
  function str2bool (line 3) | def str2bool(v):
  function add_argument_group (line 12) | def add_argument_group(name):
  function get_config (line 59) | def get_config():

FILE: synthetic/main.py
  function get_trainer (line 18) | def get_trainer(config):
  function main (line 34) | def main(trainer,config):
  function get_model (line 41) | def get_model(config=None):

FILE: synthetic/models.py
  function sxe (line 7) | def sxe(logits,labels):
  function linear (line 15) | def linear(input_, output_dim, scope=None, stddev=.7):
  class Arrows (line 26) | class Arrows:
    method __init__ (line 30) | def __init__(self,N):
    method build (line 40) | def build(self):
    method normalize_output (line 43) | def normalize_output(self,X):
  class Generator (line 54) | class Generator:
    method __init__ (line 56) | def __init__(self, N, hidden_size=10,z_dim=10):
    method build (line 65) | def build(self):
    method smallNN (line 67) | def smallNN(self,inputs,name='smallNN'):
  function poly (line 82) | def poly(cause,cause2=None,cause3=None,name='poly1d',reuse=None):
  class CompleteArrows (line 147) | class CompleteArrows(Arrows): # Data generated from the causal graph X1-...
    method build (line 149) | def build(self):
  class CompleteGenerator (line 159) | class CompleteGenerator(Generator):
    method build (line 161) | def build(self):
  class ColliderArrows (line 171) | class ColliderArrows(Arrows):
    method build (line 173) | def build(self):
  class ColliderGenerator (line 181) | class ColliderGenerator(Generator):
    method build (line 183) | def build(self):
  class LinearArrows (line 192) | class LinearArrows(Arrows):
    method build (line 194) | def build(self):
  class LinearGenerator (line 203) | class LinearGenerator(Generator):
    method build (line 205) | def build(self):
  class NetworkArrows (line 214) | class NetworkArrows(Arrows):
    method build (line 216) | def build(self):
  class FC3_Generator (line 226) | class FC3_Generator(Generator):
    method build (line 228) | def build(self):
  class FC5_Generator (line 236) | class FC5_Generator(Generator):
    method build (line 238) | def build(self):
  class FC10_Generator (line 248) | class FC10_Generator(Generator):
    method build (line 250) | def build(self):
  function minibatch (line 266) | def minibatch(input_, num_kernels=5, kernel_dim=3):
  function Discriminator (line 275) | def Discriminator(input_, hidden_size,minibatch_layer=True,alpha=0.5,reu...

FILE: synthetic/tboard.py
  function file2number (line 6) | def file2number(fname):

FILE: synthetic/trainer.py
  class GAN (line 20) | class GAN(object):
    method __init__ (line 21) | def __init__(self,config,gan_type,data,parent_dir):
    method build_model (line 36) | def build_model(self):
    method build_summaries (line 62) | def build_summaries(self):
    method record_losses (line 72) | def record_losses(self,sess):
    method record_tvd (line 78) | def record_tvd(self,sess):
    method record_scatter (line 86) | def record_scatter(self,sess):
    method prepare_model_dir (line 129) | def prepare_model_dir(self):
    method prepare_logger (line 134) | def prepare_logger(self):
    method log_tvd (line 141) | def log_tvd(self,step,tvd,mvd):
  class Trainer (line 146) | class Trainer(object):
    method __init__ (line 147) | def __init__(self,config,data_type):
    method data_scatterplot (line 201) | def data_scatterplot(self):
    method build_model (line 210) | def build_model(self):
    method train (line 224) | def train(self):
    method prepare_model_dir (line 270) | def prepare_model_dir(self):

FILE: synthetic/utils.py
  function make_summary (line 19) | def make_summary(name, val):
  function summary_losses (line 22) | def summary_losses(sess,model,N=1000):
  function calc_tvd (line 28) | def calc_tvd(sess,Generator,Data,N=50000,nbins=10):
  function summary_stats (line 48) | def summary_stats(name,tensor,hist=False):
  function summary_scatterplots (line 56) | def summary_scatterplots(X1,X2,X3):
  function summary_scatter2d (line 66) | def summary_scatter2d(x,y,title='2dscatterplot',xlabel=None,ylabel=None):
  function scatter2d (line 80) | def scatter2d(x,y,title='2dscatterplot',xlabel=None,ylabel=None):
  function prepare_dirs_and_logger (line 101) | def prepare_dirs_and_logger(config):
  function get_time (line 142) | def get_time():
  function save_config (line 145) | def save_config(config):
  class Timer (line 156) | class Timer(object):
    method __init__ (line 157) | def __init__(self):
    method on (line 160) | def on(self):
    method off (line 162) | def off(self):
    method __str__ (line 165) | def __str__(self):

FILE: tboard.py
  function file2number (line 6) | def file2number(fname):

FILE: trainer.py
  class Trainer (line 15) | class Trainer(object):
    method __init__ (line 17) | def __init__(self, config, cc_config, model_config=None):
    method pretrain_loop (line 157) | def pretrain_loop(self,num_iter=None):
    method train_loop (line 238) | def train_loop(self,num_iter=None):
    method sample_label (line 270) | def sample_label(self, cond_dict=None, do_dict=None,N=None):
    method label_interpolation (line 275) | def label_interpolation(self,inputs=None,save_dir=None,ext='.pdf'):
    method causal_sampling (line 311) | def causal_sampling(self, img_shape ,ext='.pdf'):
    method sample_diversity (line 399) | def sample_diversity(self,save_dir=None,ext='.pdf'):

FILE: utils.py
  function make_summary (line 20) | def make_summary(name, val):
  function summary_stats (line 23) | def summary_stats(name,tensor,collections=None,hist=False):
  function prepare_dirs_and_logger (line 33) | def prepare_dirs_and_logger(config):
  function ignore_except (line 81) | def ignore_except(src,contents,allowed_dirs):
  function get_time (line 88) | def get_time():
  function save_configs (line 91) | def save_configs(config,cc_config,dcgan_config,began_config):
  function save_config (line 100) | def save_config(config,name="params.json",where=None):
  function get_available_gpus (line 109) | def get_available_gpus():
  function distribute_input_data (line 114) | def distribute_input_data(data_loader,num_gpu):
  function rank (line 140) | def rank(array):
  function make_grid (line 143) | def make_grid(tensor, nrow=8, padding=2,
  function save_image (line 164) | def save_image(tensor, filename, nrow=8, padding=2,
Condensed preview: 51 files, each showing path, character count, and a content snippet.
[
  {
    "path": ".gitignore",
    "chars": 181,
    "preview": "data/\ndata\n.*.swp\n\nlogs\nold\n\nfinal_checkpoints\ncheckpoint/\nfigures/\n*.pyc\n.DS_Store\n.ipynb_checkpoints\n[._]*.s[a-v][a-z]"
  },
  {
    "path": "LICENSE",
    "chars": 1091,
    "preview": "MIT License\n\nCopyright (c) 2017 Murat Kocaoglu, Christopher Snyder\n\nPermission is hereby granted, free of charge, to any"
  },
  {
    "path": "README.md",
    "chars": 8498,
    "preview": "# CausalGAN/CausalBEGAN in Tensorflow\n\nTensorflow implementation of [CausalGAN: Learning Causal Implicit Generative Mode"
  },
  {
    "path": "assets/0808_112404_cbcg.csv",
    "chars": 1399,
    "preview": "Wall time,Step,Value\r\n1502209477.065396,1,0.9871935844421387\r\n1502210175.629644,1001,0.5611526370048523\r\n1502210858.0279"
  },
  {
    "path": "assets/0810_191625_bcg.csv",
    "chars": 902,
    "preview": "Wall time,Step,Value\r\n1502410626.387592,1,0.9544087648391724\r\n1502411081.292726,1001,0.5290326476097107\r\n1502411533.6229"
  },
  {
    "path": "assets/0821_213901_rcbcg.csv",
    "chars": 1357,
    "preview": "Wall time,Step,Value\r\n1503369574.677247,1,0.8920440077781677\r\n1503370041.447478,1001,0.512530505657196\r\n1503370517.21502"
  },
  {
    "path": "assets/guide_to_gifs.txt",
    "chars": 283,
    "preview": "#Approach uses imagemagick\n#Take the first 20 images in a folder and convert to gif\nls -v | head -20 | xargs cp -t newfo"
  },
  {
    "path": "assets/tvdplot.ipynb",
    "chars": 3080,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {\n    \"collapsed\": false\n   },\n   \"out"
  },
  {
    "path": "causal_began/CausalBEGAN.py",
    "chars": 15813,
    "preview": "from __future__ import print_function\nfrom utils import save_image,distribute_input_data,summary_stats,make_summary\nimpo"
  },
  {
    "path": "causal_began/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "causal_began/config.py",
    "chars": 5793,
    "preview": "#-*- coding: utf-8 -*-\nimport argparse\n\ndef str2bool(v):\n    #return (v is True) or (v.lower() in ('true', '1'))\n    ret"
  },
  {
    "path": "causal_began/models.py",
    "chars": 8573,
    "preview": "import numpy as np\nimport tensorflow as tf\nslim = tf.contrib.slim\n\n\ndef lrelu(x,leak=0.2,name='lrelu'):\n    with tf.vari"
  },
  {
    "path": "causal_began/utils.py",
    "chars": 4952,
    "preview": "from __future__ import print_function\nimport tensorflow as tf\nimport os\nfrom os import listdir\nfrom os.path import isfil"
  },
  {
    "path": "causal_controller/ArrayDict.py",
    "chars": 2673,
    "preview": "import numpy as np\nclass ArrayDict(object):\n\n    '''\n    This is a class for manipulating dictionaries of arrays\n    or "
  },
  {
    "path": "causal_controller/CausalController.py",
    "chars": 17937,
    "preview": "from __future__ import print_function\nfrom itertools import chain\nimport numpy as np\nimport tensorflow as tf\nimport pand"
  },
  {
    "path": "causal_controller/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "causal_controller/config.py",
    "chars": 5214,
    "preview": "'''\n\nThese are the command line parameters that pertain exlusively to the\nCausalController.\n\n'''\n\nfrom __future__ import"
  },
  {
    "path": "causal_controller/models.py",
    "chars": 1974,
    "preview": "import numpy as np\nimport tensorflow as tf\nslim = tf.contrib.slim\n\n\ndef lrelu(x,leak=0.2,name='lrelu'):\n    with tf.vari"
  },
  {
    "path": "causal_controller/utils.py",
    "chars": 1147,
    "preview": "from __future__ import print_function\nimport numpy as np\nimport tensorflow as tf\n\ndef summary_stats(name,tensor,collecti"
  },
  {
    "path": "causal_dcgan/CausalGAN.py",
    "chars": 14862,
    "preview": "from __future__ import division,print_function\nfrom figure_scripts.pairwise import crosstab\nfrom figure_scripts.sample i"
  },
  {
    "path": "causal_dcgan/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "causal_dcgan/config.py",
    "chars": 6541,
    "preview": "from __future__ import print_function\nimport argparse\n\ndef str2bool(v):\n    return v is True or v.lower() in ('true', '1"
  },
  {
    "path": "causal_dcgan/models.py",
    "chars": 7393,
    "preview": "import tensorflow as tf\nimport numpy as np\nslim = tf.contrib.slim\nimport math\n\nfrom ops import lrelu,linear,conv_cond_co"
  },
  {
    "path": "causal_dcgan/ops.py",
    "chars": 6379,
    "preview": "import math\nimport numpy as np\nimport tensorflow as tf\n\nfrom tensorflow.python.framework import ops\n\nfrom utils import *"
  },
  {
    "path": "causal_dcgan/utils.py",
    "chars": 4660,
    "preview": "\"\"\"\nSome codes from https://github.com/Newmu/dcgan_code\n\"\"\"\nfrom __future__ import division\nimport math\nimport json\nimpo"
  },
  {
    "path": "causal_graph.py",
    "chars": 22111,
    "preview": "'''\nTo use a particular causal graph, just specify it here\n\n\nStrings specified have to match *exactly* to keys in attrib"
  },
  {
    "path": "config.py",
    "chars": 7052,
    "preview": "from __future__ import print_function\nimport argparse\n\ndef str2bool(v):\n    #return (v is True) or (v.lower() in ('true'"
  },
  {
    "path": "data_loader.py",
    "chars": 5183,
    "preview": "import os\nimport numpy as np\nimport pandas as pd\nfrom PIL import Image\nfrom glob import glob\nimport tensorflow as tf\n\nfr"
  },
  {
    "path": "download.py",
    "chars": 4346,
    "preview": "\"\"\"\nModification of\nhttps://github.com/carpedm20/BEGAN-tensorflow/blob/master/download.py\n\"\"\"\nfrom __future__ import pri"
  },
  {
    "path": "figure_scripts/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "figure_scripts/distributions.py",
    "chars": 2997,
    "preview": "import tensorflow as tf\nimport numpy as np\nimport os\nimport scipy.misc\nimport numpy as np\nimport pandas as pd\nfrom tqdm "
  },
  {
    "path": "figure_scripts/encode.py",
    "chars": 11128,
    "preview": "#from __future__ import print_function\nimport tensorflow as tf\n#import scipy\nimport scipy.misc\nimport numpy as np\nfrom t"
  },
  {
    "path": "figure_scripts/high_level.py",
    "chars": 2175,
    "preview": "import tensorflow as tf\nimport numpy as np\nimport os\nimport scipy.misc\nimport numpy as np\nimport pandas as pd\nfrom tqdm "
  },
  {
    "path": "figure_scripts/pairwise.py",
    "chars": 4553,
    "preview": "from __future__ import print_function\nimport time\nimport tensorflow as tf\nimport os\nimport scipy.misc\nimport numpy as np"
  },
  {
    "path": "figure_scripts/probability_table.txt",
    "chars": 114,
    "preview": "\n\n\nmodel: celebA_0627_200239\n    graph:MLS\n\n    [img,cc,d_fake_labels,true]\n\n    P(M=1|S=1) = [0.28, \n\n    \n     \n"
  },
  {
    "path": "figure_scripts/sample.py",
    "chars": 28953,
    "preview": "from __future__ import print_function\nimport tensorflow as tf\nimport numpy as np\nimport os\nimport scipy.misc\nimport nump"
  },
  {
    "path": "figure_scripts/utils.py",
    "chars": 6440,
    "preview": "from __future__ import print_function,division\nimport tensorflow as tf\nimport os\nfrom os import listdir\nfrom os.path imp"
  },
  {
    "path": "main.py",
    "chars": 3084,
    "preview": "from __future__ import print_function\nimport numpy as np\nimport os\nimport tensorflow as tf\n\nfrom trainer import Trainer\n"
  },
  {
    "path": "synthetic/README.md",
    "chars": 1491,
    "preview": "# Causal(BE)GAN in Tensorflow\n\n# (test comment)\n\nSynthetic Data Figures\n<> (Tensorflow implementation of [BEGAN: Boundar"
  },
  {
    "path": "synthetic/collect_stats.py",
    "chars": 7816,
    "preview": "import pandas as pd\nimport numpy as np\nimport time\nfrom scipy import stats\nimport os\nimport matplotlib.pyplot as plt\nfro"
  },
  {
    "path": "synthetic/config.py",
    "chars": 2490,
    "preview": "import argparse\nfrom models import DataTypes\ndef str2bool(v):\n    return v is True or v.lower() in ('true', '1')\n\ndtypes"
  },
  {
    "path": "synthetic/figure_generation.ipynb",
    "chars": 7247,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {\n    \"collapsed\": false,\n    \"scrolle"
  },
  {
    "path": "synthetic/main.py",
    "chars": 1031,
    "preview": "from __future__ import print_function\nimport numpy as np\nimport tensorflow as tf\n\nfrom trainer import Trainer\nfrom confi"
  },
  {
    "path": "synthetic/models.py",
    "chars": 14587,
    "preview": "import tensorflow as tf\nimport matplotlib.pyplot as plt\nfrom utils import *\n\n#class Data3d\n\ndef sxe(logits,labels):\n    "
  },
  {
    "path": "synthetic/run_datasets.sh",
    "chars": 887,
    "preview": "#!/bin/bash\n\n#This script should be called with CUDA_VISIBLE_DEVICES\n#already set. This script runs 1 of each gan model "
  },
  {
    "path": "synthetic/tboard.py",
    "chars": 469,
    "preview": "import os\nimport sys\n\nfrom subprocess import call\n\ndef file2number(fname):\n    nums=[s for s in fname.split('_') if s.is"
  },
  {
    "path": "synthetic/trainer.py",
    "chars": 11431,
    "preview": "from __future__ import print_function\nimport tensorflow as tf\nimport logging\nimport numpy as np\nimport pandas as pd\nimpo"
  },
  {
    "path": "synthetic/utils.py",
    "chars": 5337,
    "preview": "from __future__ import print_function\nimport tensorflow as tf\nimport os\nfrom os import listdir\nfrom os.path import isfil"
  },
  {
    "path": "tboard.py",
    "chars": 469,
    "preview": "import os\nimport sys\n\nfrom subprocess import call\n\ndef file2number(fname):\n    nums=[s for s in fname.split('_') if s.is"
  },
  {
    "path": "trainer.py",
    "chars": 17168,
    "preview": "from __future__ import print_function\nimport numpy as np\nimport tensorflow as tf\nfrom causal_controller.CausalController"
  },
  {
    "path": "utils.py",
    "chars": 6108,
    "preview": "from __future__ import print_function\nimport tensorflow as tf\nfrom functools import partial\nimport os\nfrom os import lis"
  }
]

