Repository: ashafahi/free_adv_train Branch: master Commit: c42cac2c4711 Files: 12 Total size: 45.4 KB Directory structure: gitextract_de67v4c3/ ├── .gitignore ├── README.md ├── cifar100_input.py ├── cifar10_input.py ├── config.py ├── config.yaml ├── free_model.py ├── free_train.py ├── multi_restart_pgd_attack.py └── requirements/ ├── base.txt ├── cpu.txt └── gpu.txt ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # Created by https://www.gitignore.io/api/linux,macos,python,pycharm # Edit at https://www.gitignore.io/?templates=linux,macos,python,pycharm ### Linux ### *~ # temporary files which can be created if a process still has a handle open of a deleted file .fuse_hidden* # KDE directory preferences .directory # Linux trash folder which might appear on any partition or disk .Trash-* # .nfs files are created when an open file is removed but is still being accessed .nfs* ### macOS ### # General .DS_Store .AppleDouble .LSOverride # Icon must end with two \r Icon # Thumbnails ._* # Files that might appear in the root of a volume .DocumentRevisions-V100 .fseventsd .Spotlight-V100 .TemporaryItems .Trashes .VolumeIcon.icns .com.apple.timemachine.donotpresent # Directories potentially created on remote AFP share .AppleDB .AppleDesktop Network Trash Folder Temporary Items .apdisk ### PyCharm ### # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 # User-specific stuff .idea/**/workspace.xml .idea/**/tasks.xml .idea/**/usage.statistics.xml .idea/**/dictionaries .idea/**/shelf # Generated files .idea/**/contentModel.xml # Sensitive or high-churn files .idea/**/dataSources/ .idea/**/dataSources.ids .idea/**/dataSources.local.xml .idea/**/sqlDataSources.xml .idea/**/dynamic.xml .idea/**/uiDesigner.xml .idea/**/dbnavigator.xml # Gradle .idea/**/gradle.xml .idea/**/libraries # Gradle and Maven with auto-import # When using Gradle or Maven with auto-import, you should exclude module files, # since they will be recreated, and may cause churn. Uncomment if using # auto-import. 
# .idea/modules.xml # .idea/*.iml # .idea/modules # CMake cmake-build-*/ # Mongo Explorer plugin .idea/**/mongoSettings.xml # File-based project format *.iws # IntelliJ out/ # mpeltonen/sbt-idea plugin .idea_modules/ # JIRA plugin atlassian-ide-plugin.xml # Cursive Clojure plugin .idea/replstate.xml # Crashlytics plugin (for Android Studio and IntelliJ) com_crashlytics_export_strings.xml crashlytics.properties crashlytics-build.properties fabric.properties # Editor-based Rest Client .idea/httpRequests # Android studio 3.1+ serialized cache file .idea/caches/build_file_checksums.ser # JetBrains templates **___jb_tmp___ ### PyCharm Patch ### # Comment Reason: https://github.com/joeblau/gitignore.io/issues/186#issuecomment-215987721 # *.iml # modules.xml # .idea/misc.xml # *.ipr # Sonarlint plugin .idea/sonarlint ### Python ### # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ pip-wheel-metadata/ share/python-wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. *.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .nox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover .hypothesis/ .pytest_cache/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder target/ # Jupyter Notebook .ipynb_checkpoints # IPython profile_default/ ipython_config.py # pyenv .python-version # pipenv # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. # However, in case of collaboration, if having platform-specific dependencies or dependencies # having no cross-platform support, pipenv may install dependencies that don’t work, or not # install all needed dependencies. #Pipfile.lock # celery beat schedule file celerybeat-schedule # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .dmypy.json dmypy.json # Pyre type checker .pyre/ # End of https://www.gitignore.io/api/linux,macos,python,pycharm # Just to make sure pycharm does not ruin anything: .idea # To avoid uploading datasets to the github datasets # To avoid uploading models models # SWP vim .*.swp ================================================ FILE: README.md ================================================ # Free Adversarial Training This repository belongs to the [Free Adversarial Training](https://arxiv.org/abs/1904.12843 "Free Adversarial Training") paper. The implementation is inspired by [CIFAR10 Adversarial Example Challenge](https://github.com/MadryLab/cifar10_challenge "Madry's CIFAR10 Challenge") so to them we give the credit. This repo is for the CIFAR-10 and CIFAR-100 datasets and is in Tensorflow. Our Free-m models can acheive comparable performance with conventional PGD adversarial training at a fraction of the time. **__News!__**: We have released our [ImageNet implementation of Free adversarial training in Pytorch](https://github.com/mahyarnajibi/FreeAdversarialTraining) ! 
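The training loop in `free_train.py` keeps a persistent perturbation for the current mini-batch: each backward pass is reused both for an SGD step on the weights and for an FGSM-style ascent step on the perturbation (clipped to the L-inf ball of radius epsilon), and the same mini-batch is replayed `m` times. Below is a minimal, self-contained NumPy sketch of that update rule on a toy linear classifier; the toy model, sizes, and learning rate are illustrative and are not taken from the repository.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy data: a linear binary classifier on inputs scaled like CIFAR pixels ([0, 255]).
n, dim, batch, eps, lr, m = 512, 32, 128, 8.0, 1e-4, 8
X = rng.uniform(0, 255, size=(n, dim))
y = 2 * rng.randint(0, 2, size=n) - 1           # labels in {-1, +1}
w = np.zeros(dim)

def grads(w, x, y):
    """Logistic-loss gradients w.r.t. the weights and the inputs (one backward pass)."""
    margin = y * (x @ w)
    coeff = -y * np.exp(-np.logaddexp(0.0, margin))  # = -y / (1 + exp(margin)), numerically stable
    g_w = (coeff[:, None] * x).mean(axis=0)          # gradient for the descent step on w
    g_x = coeff[:, None] * w[None, :]                # gradient for the ascent step on the perturbation
    return g_w, g_x

delta = np.zeros((batch, dim))                       # perturbation persists across mini-batches
for step in range(0, 2000, m):
    idx = rng.choice(n, size=batch, replace=False)
    xb, yb = X[idx], y[idx]
    for _ in range(m):                               # replay the same mini-batch m times
        g_w, g_x = grads(w, np.clip(xb + delta, 0.0, 255.0), yb)
        w -= lr * g_w                                            # SGD step on the weights
        delta = np.clip(delta + eps * np.sign(g_x), -eps, eps)   # ascent step on the perturbation
```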
###### CIFAR-10 WRN 32-10 (L-inf epsilon=8):

| Model | Natural | PGD-100 | CW-100 | 10 restart PGD-20 | train-time (min) |
| --- | --- | --- | --- | --- | --- |
| Natural | 95.01 | 0.00 | 0.00 | 0.00 | 780 |
| Free-2 | 91.45 | 33.20 | 34.57 | 33.41 | 816 |
| Free-4 | 87.83 | 40.35 | 41.96 | 40.73 | 800 |
| **Free-8** | **85.96** | **46.19** | **46.60** | **46.33** | **785** |
| Free-10 | 83.94 | 45.79 | 45.86 | 45.94 | 785 |
| Madry 7-PGD (public model) | 87.25 | 45.29 | 46.52 | 45.53 | 5418 |

###### CIFAR-100 WRN 32-10 (L-inf epsilon=8):

| Model | Natural | PGD-20 | PGD-100 | train-time (min) |
| --- | --- | --- | --- | --- |
| Natural | 78.84 | 0.00 | 0.00 | 811 |
| Free-2 | 69.20 | 15.37 | 14.86 | 816 |
| Free-4 | 65.28 | 20.64 | 20.15 | 767 |
| **Free-8** | **62.13** | **25.88** | **25.58** | **780** |
| Free-10 | 59.27 | 25.15 | 24.88 | 776 |
| Madry 2-PGD trained | 67.94 | 17.08 | 16.50 | 2053 |
| Madry 7-PGD trained | 59.87 | 22.76 | 22.52 | 5157 |

## Demo

To train a new robust model for free, run the following command and specify the replay parameter `m`:
```bash
python free_train.py -m 8
```
To evaluate a robust model using PGD-20 with 2 random restarts, run:
```bash
python multi_restart_pgd_attack.py --model_dir $MODEL_DIR --num_restarts 2
```
Note that if you have trained a CIFAR-100 model, you should pass the dataset argument even for evaluation. For example:
```bash
python multi_restart_pgd_attack.py --model_dir $MODEL_DIR_TO_CIFAR100 --num_restarts 2 -d cifar100
```

## Requirements

To install all the requirements plus TensorFlow with GPU support, run (inspired by [Illarion Khlestov](https://github.com/ikhlestov/vision_networks "Densenet Implementation")):
```bash
pip install -r requirements/gpu.txt
```
Alternatively, to install the requirements plus TensorFlow for CPU, run:
```bash
pip install -r requirements/cpu.txt
```
To prepare the data, please see the [Datasets section](https://github.com/ashafahi/free_adv_train/tree/master/datasets "Dataset readme"); a short download helper sketch also follows at the end of this README.

If you find the paper or the code useful for your study, please consider citing the free training paper:
```
@article{shafahi2019adversarial,
  title={Adversarial Training for Free!},
  author={Shafahi, Ali and Najibi, Mahyar and Ghiasi, Amin and Xu, Zheng and Dickerson, John and Studer, Christoph and Davis, Larry S and Taylor, Gavin and Goldstein, Tom},
  journal={arXiv preprint arXiv:1904.12843},
  year={2019}
}
```
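For convenience, the sketch below (not part of the repository) shows one way to fetch CIFAR-10 into the layout that `cifar10_input.CIFAR10Data` expects under the default `--data_dir` of `datasets/`. Because `CIFAR10Data.rec_search` walks the directory tree looking for `data_batch_*` files, extracting the official tarball anywhere below `datasets/cifar10` is sufficient; paths other than the download URL are illustrative.

```python
import os
import tarfile
import urllib.request

CIFAR10_URL = 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
target_dir = os.path.join('datasets', 'cifar10')
os.makedirs(target_dir, exist_ok=True)

archive = os.path.join(target_dir, 'cifar-10-python.tar.gz')
if not os.path.exists(archive):
    urllib.request.urlretrieve(CIFAR10_URL, archive)
with tarfile.open(archive, 'r:gz') as tar:
    tar.extractall(target_dir)  # creates datasets/cifar10/cifar-10-batches-py/

# For CIFAR-100, download cifar-100-python.tar.gz instead and place the extracted
# "train", "test" and "meta" files directly inside datasets/cifar100, since
# cifar100_input.CIFAR100Data opens them from that folder without searching.
```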
================================================
FILE: cifar100_input.py
================================================
"""
Utilities for importing the CIFAR100 dataset.

Each image in the dataset is a numpy array of shape (32, 32, 3), with the
values being unsigned integers (i.e., in the range 0,1,...,255).
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import pickle
import sys

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

version = sys.version_info

import numpy as np


class CIFAR100Data(object):
    """
    Unpickles the CIFAR100 dataset from a specified folder containing a pickled
    version following the format of Krizhevsky which can be found
    [here](https://www.cs.toronto.edu/~kriz/cifar.html).

    Inputs to constructor
    =====================
        - path: path to the pickled dataset. The training data must be pickled
          into a single file named train containing all 50,000 examples, the
          test data must be pickled into a single file called test containing
          10,000 examples, and the 100 fine (or 20 coarse) class names must be
          pickled into a file called meta. The pickled training file stores an
          array of 50,000 32x32x3-shaped images together with an array of their
          50,000 true (fine) labels.
    """

    def __init__(self, path):
        #train_filenames = ['data_batch_{}'.format(ii + 1) for ii in range(5)]
        train_filename = 'train'
        eval_filename = 'test'  # 'test_batch'
        metadata_filename = 'meta'  # 'batches.meta'

        train_images = np.zeros((50000, 32, 32, 3), dtype='uint8')
        train_labels = np.zeros(50000, dtype='int32')
        #for ii, fname in enumerate(train_filenames):
        #    cur_images, cur_labels = self._load_datafile(os.path.join(path, fname))
        #    train_images[ii * 10000 : (ii+1) * 10000, ...] = cur_images
        #    train_labels[ii * 10000 : (ii+1) * 10000, ...] = cur_labels
        train_images, train_labels = self._load_datafile(
            os.path.join(path, train_filename))
        eval_images, eval_labels = self._load_datafile(
            os.path.join(path, eval_filename))

        with open(os.path.join(path, metadata_filename), 'rb') as fo:
            if version.major == 3:
                data_dict = pickle.load(fo, encoding='bytes')
            else:
                data_dict = pickle.load(fo)
            #self.label_names = data_dict[b'label_names']
            self.label_names = data_dict[b'fine_label_names']
        for ii in range(len(self.label_names)):
            self.label_names[ii] = self.label_names[ii].decode('utf-8')

        self.train_data = DataSubset(train_images, train_labels)
        self.eval_data = DataSubset(eval_images, eval_labels)

    @staticmethod
    def _load_datafile(filename):
        with open(filename, 'rb') as fo:
            if version.major == 3:
                data_dict = pickle.load(fo, encoding='bytes')
            else:
                data_dict = pickle.load(fo)

            assert data_dict[b'data'].dtype == np.uint8
            image_data = data_dict[b'data']
            #image_data = image_data.reshape((10000, 3, 32, 32)).transpose(0, 2, 3, 1)
            image_data = image_data.reshape((-1, 3, 32, 32)).transpose(0, 2, 3, 1)
            #return image_data, np.array(data_dict[b'labels'])
            return image_data, np.array(data_dict[b'fine_labels'])


class AugmentedCIFAR100Data(object):
    """
    Data augmentation wrapper over a loaded dataset.
Inputs to constructor ===================== - raw_cifar10data: the loaded CIFAR100 dataset, via the CIFAR100Data class - sess: current tensorflow session - model: current model (needed for input tensor) """ def __init__(self, raw_cifar100data, sess, model): assert isinstance(raw_cifar100data, CIFAR100Data) self.image_size = 32 # create augmentation computational graph self.x_input_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3]) padded = tf.map_fn(lambda img: tf.image.resize_image_with_crop_or_pad( img, self.image_size + 4, self.image_size + 4), self.x_input_placeholder) cropped = tf.map_fn(lambda img: tf.random_crop(img, [self.image_size, self.image_size, 3]), padded) flipped = tf.map_fn(lambda img: tf.image.random_flip_left_right(img), cropped) self.augmented = flipped self.train_data = AugmentedDataSubset(raw_cifar100data.train_data, sess, self.x_input_placeholder, self.augmented) self.eval_data = AugmentedDataSubset(raw_cifar100data.eval_data, sess, self.x_input_placeholder, self.augmented) self.label_names = raw_cifar100data.label_names class DataSubset(object): def __init__(self, xs, ys): self.xs = xs self.n = xs.shape[0] self.ys = ys self.batch_start = 0 self.cur_order = np.random.permutation(self.n) def get_next_batch(self, batch_size, multiple_passes=False, reshuffle_after_pass=True): if self.n < batch_size: raise ValueError('Batch size can be at most the dataset size') if not multiple_passes: actual_batch_size = min(batch_size, self.n - self.batch_start) if actual_batch_size <= 0: raise ValueError('Pass through the dataset is complete.') batch_end = self.batch_start + actual_batch_size batch_xs = self.xs[self.cur_order[self.batch_start : batch_end], ...] batch_ys = self.ys[self.cur_order[self.batch_start : batch_end], ...] self.batch_start += actual_batch_size return batch_xs, batch_ys actual_batch_size = min(batch_size, self.n - self.batch_start) if actual_batch_size < batch_size: if reshuffle_after_pass: self.cur_order = np.random.permutation(self.n) self.batch_start = 0 batch_end = self.batch_start + batch_size batch_xs = self.xs[self.cur_order[self.batch_start : batch_end], ...] batch_ys = self.ys[self.cur_order[self.batch_start : batch_end], ...] self.batch_start += batch_size return batch_xs, batch_ys class AugmentedDataSubset(object): def __init__(self, raw_datasubset, sess, x_input_placeholder, augmented): self.sess = sess self.raw_datasubset = raw_datasubset self.x_input_placeholder = x_input_placeholder self.augmented = augmented def get_next_batch(self, batch_size, multiple_passes=False, reshuffle_after_pass=True): raw_batch = self.raw_datasubset.get_next_batch(batch_size, multiple_passes, reshuffle_after_pass) images = raw_batch[0].astype(np.float32) return self.sess.run(self.augmented, feed_dict={self.x_input_placeholder: raw_batch[0]}), raw_batch[1] ================================================ FILE: cifar10_input.py ================================================ """ Utilities for importing the CIFAR10 dataset. Each image in the dataset is a numpy array of shape (32, 32, 3), with the values being unsigned integers (i.e., in the range 0,1,...,255). 
""" from __future__ import absolute_import from __future__ import division from __future__ import print_function import os import pickle import sys import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data import numpy as np import re version = sys.version_info class CIFAR10Data(object): """ Unpickles the CIFAR10 dataset from a specified folder containing a pickled version following the format of Krizhevsky which can be found [here](https://www.cs.toronto.edu/~kriz/cifar.html). Inputs to constructor ===================== - path: path to the pickled dataset. The training data must be pickled into five files named data_batch_i for i = 1, ..., 5, containing 10,000 examples each, the test data must be pickled into a single file called test_batch containing 10,000 examples, and the 10 class names must be pickled into a file called batches.meta. The pickled examples should be stored as a tuple of two objects: an array of 10,000 32x32x3-shaped arrays, and an array of their 10,000 true labels. """ def __init__(self, path): path = CIFAR10Data.rec_search(path) train_filenames = ['data_batch_{}'.format(ii + 1) for ii in range(5)] eval_filename = 'test_batch' metadata_filename = 'batches.meta' train_images = np.zeros((50000, 32, 32, 3), dtype='uint8') train_labels = np.zeros(50000, dtype='int32') for ii, fname in enumerate(train_filenames): cur_images, cur_labels = self._load_datafile(os.path.join(path, fname)) train_images[ii * 10000: (ii + 1) * 10000, ...] = cur_images train_labels[ii * 10000: (ii + 1) * 10000, ...] = cur_labels eval_images, eval_labels = self._load_datafile( os.path.join(path, eval_filename)) with open(os.path.join(path, metadata_filename), 'rb') as fo: if version.major == 3: data_dict = pickle.load(fo, encoding='bytes') else: data_dict = pickle.load(fo) self.label_names = data_dict[b'label_names'] for ii in range(len(self.label_names)): self.label_names[ii] = self.label_names[ii].decode('utf-8') self.train_data = DataSubset(train_images, train_labels) self.eval_data = DataSubset(eval_images, eval_labels) @staticmethod def rec_search(original_path): rx = re.compile(r'data_batch_[0-9]+') r = [] for path, _, file_names in os.walk(original_path): r.extend([os.path.join(path, x) for x in file_names if rx.search(x)]) if len(r) is 0: # TODO: Is this the best way? return original_path return os.path.dirname(r[0]) @staticmethod def _load_datafile(filename): with open(filename, 'rb') as fo: if version.major == 3: data_dict = pickle.load(fo, encoding='bytes') else: data_dict = pickle.load(fo) assert data_dict[b'data'].dtype == np.uint8 image_data = data_dict[b'data'] image_data = image_data.reshape((10000, 3, 32, 32)).transpose(0, 2, 3, 1) return image_data, np.array(data_dict[b'labels']) class AugmentedCIFAR10Data(object): """ Data augmentation wrapper over a loaded dataset. 
Inputs to constructor ===================== - raw_cifar10data: the loaded CIFAR10 dataset, via the CIFAR10Data class - sess: current tensorflow session - model: current model (needed for input tensor) """ def __init__(self, raw_cifar10data, sess, model): assert isinstance(raw_cifar10data, CIFAR10Data) self.image_size = 32 # create augmentation computational graph self.x_input_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3]) padded = tf.map_fn(lambda img: tf.image.resize_image_with_crop_or_pad( img, self.image_size + 4, self.image_size + 4), self.x_input_placeholder) cropped = tf.map_fn(lambda img: tf.random_crop(img, [self.image_size, self.image_size, 3]), padded) flipped = tf.map_fn(lambda img: tf.image.random_flip_left_right(img), cropped) self.augmented = flipped self.train_data = AugmentedDataSubset(raw_cifar10data.train_data, sess, self.x_input_placeholder, self.augmented) self.eval_data = AugmentedDataSubset(raw_cifar10data.eval_data, sess, self.x_input_placeholder, self.augmented) self.label_names = raw_cifar10data.label_names class DataSubset(object): def __init__(self, xs, ys): self.xs = xs self.n = xs.shape[0] self.ys = ys self.batch_start = 0 self.cur_order = np.random.permutation(self.n) def get_next_batch(self, batch_size, multiple_passes=False, reshuffle_after_pass=True): if self.n < batch_size: raise ValueError('Batch size can be at most the dataset size') if not multiple_passes: actual_batch_size = min(batch_size, self.n - self.batch_start) if actual_batch_size <= 0: raise ValueError('Pass through the dataset is complete.') batch_end = self.batch_start + actual_batch_size batch_xs = self.xs[self.cur_order[self.batch_start: batch_end], ...] batch_ys = self.ys[self.cur_order[self.batch_start: batch_end], ...] self.batch_start += actual_batch_size return batch_xs, batch_ys actual_batch_size = min(batch_size, self.n - self.batch_start) if actual_batch_size < batch_size: if reshuffle_after_pass: self.cur_order = np.random.permutation(self.n) self.batch_start = 0 batch_end = self.batch_start + batch_size batch_xs = self.xs[self.cur_order[self.batch_start: batch_end], ...] batch_ys = self.ys[self.cur_order[self.batch_start: batch_end], ...] 
            self.batch_start += batch_size
            return batch_xs, batch_ys


class AugmentedDataSubset(object):
    def __init__(self, raw_datasubset, sess, x_input_placeholder, augmented):
        self.sess = sess
        self.raw_datasubset = raw_datasubset
        self.x_input_placeholder = x_input_placeholder
        self.augmented = augmented

    def get_next_batch(self, batch_size, multiple_passes=False, reshuffle_after_pass=True):
        raw_batch = self.raw_datasubset.get_next_batch(batch_size, multiple_passes,
                                                       reshuffle_after_pass)
        images = raw_batch[0].astype(np.float32)
        return self.sess.run(self.augmented, feed_dict={self.x_input_placeholder:
                                                            raw_batch[0]}), raw_batch[1]

================================================
FILE: config.py
================================================
import configargparse
import pdb


def pair(arg):
    return [float(x) for x in arg.split(',')]


def get_args():
    parser = configargparse.ArgParser(default_config_files=[])

    parser.add("--config", type=str, is_config_file=True,
               help="You can store all the config args in a config file and pass the path here")

    parser.add("--model_dir", type=str, default="models/model",
               help="Path to save/load the checkpoints, default=models/model")
    parser.add("--data_dir", type=str, default="datasets/",
               help="Path to load datasets from, default=datasets")
    parser.add("--dataset", "-d", type=str, default="cifar10", choices=["cifar10", "cifar100"],
               help="Dataset to load, default=cifar10")
    parser.add("--tf_seed", type=int, default=451760341,
               help="Random seed for initializing tensorflow variables to rule out the effect of randomness in experiments, default=451760341")
    parser.add("--np_seed", type=int, default=216105420,
               help="Random seed for initializing numpy variables to rule out the effect of randomness in experiments, default=216105420")
    parser.add("--train_steps", type=int, default=80000, help="Maximum number of training steps, default=80000")
    parser.add("--out_steps", "-o", type=int, default=100, help="Number of output steps, default=100")
    parser.add("--summary_steps", type=int, default=500, help="Number of summary steps, default=500")
    parser.add("--checkpoint_steps", "-c", type=int, default=1000, help="Number of checkpoint steps, default=1000")
    parser.add("--train_batch_size", "-b", type=int, default=128, help="The training batch size, default=128")
    parser.add("--step_size_schedule", nargs='+', type=pair, default=[[0, 0.1], [40000, 0.01], [60000, 0.001]],
               help="The step size scheduling, default=[[0, 0.1], [40000, 0.01], [60000, 0.001]], use like: --step_size_schedule 0,0.1 40000,0.01 60000,0.001")
    parser.add("--weight_decay", "-w", type=float, default=0.0002, help="The weight decay parameter, default=0.0002")
    parser.add("--momentum", type=float, default=0.9, help="The momentum parameter, default=0.9")
    parser.add("--replay_m", "-m", type=int, default=8,
               help="Number of times to repeat training on the same batch, default=8")
    parser.add("--eval_examples", type=int, default=10000, help="Number of evaluation examples, default=10000")
    parser.add("--eval_size", type=int, default=128, help="Evaluation batch size, default=128")
    parser.add("--eval_cpu", type=bool, default=False,
               help="Set True to do evaluation on CPU instead of GPU, default=False")

    # params regarding attack
    parser.add("--epsilon", "-e", type=float, default=8.0,
               help="Epsilon (Lp Norm distance from the original image) for generating adversarial examples, default=8.0")
    parser.add("--pgd_steps", "-k", type=int, default=20, help="Number of steps for the PGD attack, default=20")
    parser.add("--step_size", "-s", type=float, default=2.0,
help="Step size in PGD attack for generating adversarial examples in each step, default=2.0") parser.add("--loss_func", "-f", type=str, default="xent", choices=["xent", "cw"], help="Loss function for the model, choices are [xent, cw], default=xent") parser.add("--num_restarts", type=int, default=1, help="Number of resets for the PGD attack, default=1") args = parser.parse_args() return args if __name__ == "__main__": print(get_args()) pdb.set_trace() # TODO Default for model_dir # TODO Need to update the helps ================================================ FILE: config.yaml ================================================ ================================================ FILE: free_model.py ================================================ # based on https://github.com/tensorflow/models/tree/master/resnet from __future__ import absolute_import from __future__ import division from __future__ import print_function import numpy as np import tensorflow as tf import json class Model(object): """ResNet model.""" def __init__(self, mode, dataset, train_batch_size=None): """ResNet constructor. Args: mode: One of 'train' and 'eval'. """ self.neck = None self.y_pred = None self.mode = mode self.pert = True if mode == 'train' else False self.num_classes = 100 if dataset == 'cifar100' else 10 self.train_batch_size = train_batch_size self._build_model() def add_internal_summaries(self): pass def _stride_arr(self, stride): """Map a stride scalar to the stride array for tf.nn.conv2d.""" return [1, stride, stride, 1] def _build_model(self): assert self.mode == 'train' or self.mode == 'eval' """Build the core model within the graph.""" with tf.variable_scope('input'): self.x_input = tf.placeholder( tf.float32, shape=[None, 32, 32, 3]) self.y_input = tf.placeholder(tf.int64, shape=None) if self.pert: self.pert = tf.get_variable(name='instance_perturbation', initializer=tf.zeros_initializer, shape=[self.train_batch_size, 32, 32, 3], dtype=tf.float32, trainable=True) self.final_input = self.x_input + self.pert self.final_input = tf.clip_by_value(self.final_input, 0., 255.) else: self.final_input = self.x_input input_standardized = tf.map_fn(lambda img: tf.image.per_image_standardization(img), self.final_input) x = self._conv('init_conv', input_standardized, 3, 3, 16, self._stride_arr(1)) strides = [1, 2, 2] activate_before_residual = [True, False, False] res_func = self._residual # Uncomment the following codes to use w28-10 wide residual network. # It is more memory efficient than very deep residual network and has # comparably good performance. 
# https://arxiv.org/pdf/1605.07146v1.pdf filters = [16, 160, 320, 640] # Update hps.num_residual_units to 9 with tf.variable_scope('unit_1_0'): x = res_func(x, filters[0], filters[1], self._stride_arr(strides[0]), activate_before_residual[0]) for i in range(1, 5): with tf.variable_scope('unit_1_%d' % i): x = res_func(x, filters[1], filters[1], self._stride_arr(1), False) with tf.variable_scope('unit_2_0'): x = res_func(x, filters[1], filters[2], self._stride_arr(strides[1]), activate_before_residual[1]) for i in range(1, 5): with tf.variable_scope('unit_2_%d' % i): x = res_func(x, filters[2], filters[2], self._stride_arr(1), False) with tf.variable_scope('unit_3_0'): x = res_func(x, filters[2], filters[3], self._stride_arr(strides[2]), activate_before_residual[2]) for i in range(1, 5): with tf.variable_scope('unit_3_%d' % i): x = res_func(x, filters[3], filters[3], self._stride_arr(1), False) with tf.variable_scope('unit_last'): x = self._batch_norm('final_bn', x) x = self._relu(x, 0.1) x = self._global_avg_pool(x) self.neck = x with tf.variable_scope('logit'): self.pre_softmax = self._fully_connected(x, self.num_classes) self.predictions = tf.argmax(self.pre_softmax, 1) self.y_pred = self.predictions self.correct_prediction = tf.equal(self.predictions, self.y_input) self.num_correct = tf.reduce_sum(tf.cast(self.correct_prediction, tf.int64)) self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32)) with tf.variable_scope('costs'): self.y_xent = tf.nn.sparse_softmax_cross_entropy_with_logits( logits=self.pre_softmax, labels=self.y_input) self.xent = tf.reduce_sum(self.y_xent, name='y_xent') self.mean_xent = tf.reduce_mean(self.y_xent) self.weight_decay_loss = self._decay() def _batch_norm(self, name, x): """Batch normalization.""" with tf.name_scope(name): return tf.contrib.layers.batch_norm(inputs=x, decay=.9, center=True, scale=True, activation_fn=None, updates_collections=None, is_training=(self.mode == 'train')) def _residual(self, x, in_filter, out_filter, stride, activate_before_residual=False): """Residual unit with 2 sub layers.""" if activate_before_residual: with tf.variable_scope('shared_activation'): x = self._batch_norm('init_bn', x) x = self._relu(x, 0.1) orig_x = x else: with tf.variable_scope('residual_only_activation'): orig_x = x x = self._batch_norm('init_bn', x) x = self._relu(x, 0.1) with tf.variable_scope('sub1'): x = self._conv('conv1', x, 3, in_filter, out_filter, stride) with tf.variable_scope('sub2'): x = self._batch_norm('bn2', x) x = self._relu(x, 0.1) x = self._conv('conv2', x, 3, out_filter, out_filter, [1, 1, 1, 1]) with tf.variable_scope('sub_add'): if in_filter != out_filter: orig_x = tf.nn.avg_pool(orig_x, stride, stride, 'VALID') orig_x = tf.pad( orig_x, [[0, 0], [0, 0], [0, 0], [(out_filter - in_filter) // 2, (out_filter - in_filter) // 2]]) x += orig_x tf.logging.debug('image after unit %s', x.get_shape()) return x def _decay(self): """L2 weight decay loss.""" costs = [] for var in tf.trainable_variables(): if var.op.name.find('DW') > 0: costs.append(tf.nn.l2_loss(var)) return tf.add_n(costs) def _conv(self, name, x, filter_size, in_filters, out_filters, strides): """Convolution.""" with tf.variable_scope(name): n = filter_size * filter_size * out_filters kernel = tf.get_variable( 'DW', [filter_size, filter_size, in_filters, out_filters], tf.float32, initializer=tf.random_normal_initializer( stddev=np.sqrt(2.0 / n))) return tf.nn.conv2d(x, kernel, strides, padding='SAME') def _relu(self, x, leakiness=0.0): """Relu, with optional leaky 
support.""" return tf.where(tf.less(x, 0.0), leakiness * x, x, name='leaky_relu') def _fully_connected(self, x, out_dim): """FullyConnected layer for final output.""" num_non_batch_dimensions = len(x.shape) prod_non_batch_dimensions = 1 for ii in range(num_non_batch_dimensions - 1): prod_non_batch_dimensions *= int(x.shape[ii + 1]) x = tf.reshape(x, [tf.shape(x)[0], -1]) w = tf.get_variable( 'DW', [prod_non_batch_dimensions, out_dim], initializer=tf.uniform_unit_scaling_initializer(factor=1.0)) b = tf.get_variable('biases', [out_dim], initializer=tf.constant_initializer()) return tf.nn.xw_plus_b(x, w, b) def _global_avg_pool(self, x): assert x.get_shape().ndims == 4 return tf.reduce_mean(x, [1, 2]) ================================================ FILE: free_train.py ================================================ """Trains a model, saving checkpoints and tensorboard summaries along the way.""" from __future__ import absolute_import from __future__ import division from __future__ import print_function from datetime import datetime import os import shutil from timeit import default_timer as timer import tensorflow as tf import numpy as np import sys from free_model import Model import cifar10_input import cifar100_input import pdb import config def get_path_dir(data_dir, dataset, **_): path = os.path.join(data_dir, dataset) if os.path.islink(path): path = os.readlink(path) return path def train(tf_seed, np_seed, train_steps, out_steps, summary_steps, checkpoint_steps, step_size_schedule, weight_decay, momentum, train_batch_size, epsilon, replay_m, model_dir, dataset, **kwargs): tf.set_random_seed(tf_seed) np.random.seed(np_seed) model_dir = model_dir + '%s_m%d_eps%.1f_b%d' % (dataset, replay_m, epsilon, train_batch_size) # TODO Replace with not defaults # Setting up the data and the model data_path = get_path_dir(dataset=dataset, **kwargs) if dataset == 'cifar10': raw_data = cifar10_input.CIFAR10Data(data_path) else: raw_data = cifar100_input.CIFAR100Data(data_path) global_step = tf.contrib.framework.get_or_create_global_step() model = Model(mode='train', dataset=dataset, train_batch_size=train_batch_size) # Setting up the optimizer boundaries = [int(sss[0]) for sss in step_size_schedule][1:] values = [sss[1] for sss in step_size_schedule] learning_rate = tf.train.piecewise_constant(tf.cast(global_step, tf.int32), boundaries, values) optimizer = tf.train.MomentumOptimizer(learning_rate, momentum) # Optimizing computation total_loss = model.mean_xent + weight_decay * model.weight_decay_loss grads = optimizer.compute_gradients(total_loss) # Compute new image pert_grad = [g for g, v in grads if 'perturbation' in v.name] sign_pert_grad = tf.sign(pert_grad[0]) new_pert = model.pert + epsilon * sign_pert_grad clip_new_pert = tf.clip_by_value(new_pert, -epsilon, epsilon) assigned = tf.assign(model.pert, clip_new_pert) # Train no_pert_grad = [(tf.zeros_like(v), v) if 'perturbation' in v.name else (g, v) for g, v in grads] with tf.control_dependencies([assigned]): min_step = optimizer.apply_gradients(no_pert_grad, global_step=global_step) tf.initialize_variables([model.pert]) # TODO: Removed from TF # Setting up the Tensorboard and checkpoint outputs if not os.path.exists(model_dir): os.makedirs(model_dir) saver = tf.train.Saver(max_to_keep=1) tf.summary.scalar('accuracy', model.accuracy) tf.summary.scalar('xent', model.xent / train_batch_size) tf.summary.scalar('total loss', total_loss / train_batch_size) merged_summaries = tf.summary.merge_all() gpu_options = 
tf.GPUOptions(per_process_gpu_memory_fraction=1.0) with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess: print('\n\n********** free training for epsilon=%.1f using m_replay=%d **********\n\n' % (epsilon, replay_m)) print('important params >>> \n model dir: %s \n dataset: %s \n training batch size: %d \n' % (model_dir, dataset, train_batch_size)) if dataset == 'cifar100': print('the ride for CIFAR100 is bumpy -- fasten your seatbelts! \n \ you will probably see the training and validation accuracy fluctuating a lot early in trainnig \n \ this is natural especially for large replay_m values because we see that mini-batch so many times.') # initialize data augmentation if dataset == 'cifar10': data = cifar10_input.AugmentedCIFAR10Data(raw_data, sess, model) else: data = cifar100_input.AugmentedCIFAR100Data(raw_data, sess, model) # Initialize the summary writer, global variables, and our time counter. summary_writer = tf.summary.FileWriter(model_dir + '/train', sess.graph) eval_summary_writer = tf.summary.FileWriter(model_dir + '/eval') sess.run(tf.global_variables_initializer()) # Main training loop for ii in range(train_steps): if ii % replay_m == 0: x_batch, y_batch = data.train_data.get_next_batch(train_batch_size, multiple_passes=True) nat_dict = {model.x_input: x_batch, model.y_input: y_batch} x_eval_batch, y_eval_batch = data.eval_data.get_next_batch(train_batch_size, multiple_passes=True) eval_dict = {model.x_input: x_eval_batch, model.y_input: y_eval_batch} # Output to stdout if ii % summary_steps == 0: train_acc, summary = sess.run([model.accuracy, merged_summaries], feed_dict=nat_dict) summary_writer.add_summary(summary, global_step.eval(sess)) val_acc, summary = sess.run([model.accuracy, merged_summaries], feed_dict=eval_dict) eval_summary_writer.add_summary(summary, global_step.eval(sess)) print('Step {}: ({})'.format(ii, datetime.now())) print(' training nat accuracy {:.4}% -- validation nat accuracy {:.4}%'.format(train_acc * 100, val_acc * 100)) sys.stdout.flush() # Tensorboard summaries elif ii % out_steps == 0: nat_acc = sess.run(model.accuracy, feed_dict=nat_dict) print('Step {}: ({})'.format(ii, datetime.now())) print(' training nat accuracy {:.4}%'.format(nat_acc * 100)) # Write a checkpoint if (ii+1) % checkpoint_steps == 0: saver.save(sess, os.path.join(model_dir, 'checkpoint'), global_step=global_step) # Actual training step sess.run(min_step, feed_dict=nat_dict) if __name__ == '__main__': args = config.get_args() train(**vars(args)) ================================================ FILE: multi_restart_pgd_attack.py ================================================ """ Implementation of attack methods. Running this file as a program will evaluate the model and get the validation accuracy and then apply the attack to the model specified by the config file and store the examples in an .npy file. """ from __future__ import absolute_import from __future__ import division from __future__ import print_function import tensorflow as tf import numpy as np import sys import cifar10_input import cifar100_input import config from tqdm import tqdm import os config = config.get_args() _NUM_RESTARTS = config.num_restarts class LinfPGDAttack: def __init__(self, model, epsilon, num_steps, step_size, loss_func): """Attack parameter initialization. 
The attack performs k steps of size a, while always staying within epsilon from the initial point.""" self.model = model self.epsilon = epsilon self.num_steps = num_steps self.step_size = step_size if loss_func == 'xent': loss = model.xent elif loss_func == 'cw': label_mask = tf.one_hot(model.y_input, 10, on_value=1.0, off_value=0.0, dtype=tf.float32) correct_logit = tf.reduce_sum(label_mask * model.pre_softmax, axis=1) wrong_logit = tf.reduce_max((1 - label_mask) * model.pre_softmax - 1e4 * label_mask, axis=1) loss = -tf.nn.relu(correct_logit - wrong_logit + 0) else: print('Unknown loss function. Defaulting to cross-entropy') loss = model.xent self.grad = tf.gradients(loss, model.x_input)[0] def perturb(self, x_nat, y, sess): """Given a set of examples (x_nat, y), returns a set of adversarial examples within epsilon of x_nat in l_infinity norm.""" x = x_nat + np.random.uniform(-self.epsilon, self.epsilon, x_nat.shape) x = np.clip(x, 0, 255) for i in range(self.num_steps): grad = sess.run(self.grad, feed_dict={self.model.x_input: x, self.model.y_input: y}) x = np.add(x, self.step_size * np.sign(grad), out=x, casting='unsafe') x = np.clip(x, x_nat - self.epsilon, x_nat + self.epsilon) x = np.clip(x, 0, 255) # ensure valid pixel range return x def get_path_dir(data_dir, dataset, **_): path = os.path.join(data_dir, dataset) if os.path.islink(path): path = os.readlink(path) return path if __name__ == '__main__': import sys import math from free_model import Model model_file = tf.train.latest_checkpoint(config.model_dir) if model_file is None: print('No model found') sys.exit() dataset = config.dataset data_dir = config.data_dir data_path = get_path_dir(data_dir, dataset) model = Model(mode='eval', dataset=dataset) attack = LinfPGDAttack(model, config.epsilon, config.pgd_steps, config.step_size, config.loss_func) saver = tf.train.Saver() if dataset == 'cifar10': cifar = cifar10_input.CIFAR10Data(data_path) else: cifar = cifar100_input.CIFAR100Data(data_path) with tf.Session() as sess: # Restore the checkpoint saver.restore(sess, model_file) # Iterate over the samples batch-by-batch num_eval_examples = config.eval_examples eval_batch_size = config.eval_size num_batches = int(math.ceil(num_eval_examples / eval_batch_size)) x_adv = [] # adv accumulator print('getting clean validation accuracy') total_corr = 0 for ibatch in tqdm(range(num_batches)): bstart = ibatch * eval_batch_size bend = min(bstart + eval_batch_size, num_eval_examples) x_batch = cifar.eval_data.xs[bstart:bend, :].astype(np.float32) y_batch = cifar.eval_data.ys[bstart:bend] dict_val = {model.x_input: x_batch, model.y_input: y_batch} cur_corr = sess.run(model.num_correct, feed_dict=dict_val) total_corr += cur_corr print('** validation accuracy: %.3f **\n\n' % (total_corr / float(num_eval_examples) * 100)) print('Iterating over {} batches'.format(num_batches)) total_corr, total_num = 0, 0 for ibatch in range(num_batches): bstart = ibatch * eval_batch_size bend = min(bstart + eval_batch_size, num_eval_examples) curr_num = bend - bstart total_num += curr_num print('mini batch: {}/{} -- batch size: {}'.format(ibatch + 1, num_batches, curr_num)) sys.stdout.flush() x_batch = cifar.eval_data.xs[bstart:bend, :].astype(np.float32) y_batch = cifar.eval_data.ys[bstart:bend] best_batch_adv = np.copy(x_batch) dict_adv = {model.x_input: best_batch_adv, model.y_input: y_batch} cur_corr, y_pred_batch, best_loss = sess.run([model.num_correct, model.predictions, model.y_xent], feed_dict=dict_adv) for ri in range(_NUM_RESTARTS): x_batch_adv = 
                attack.perturb(x_batch, y_batch, sess)
                dict_adv = {model.x_input: x_batch_adv, model.y_input: y_batch}
                cur_corr, y_pred_batch, this_loss = sess.run([model.num_correct, model.predictions, model.y_xent],
                                                             feed_dict=dict_adv)

                bw = best_loss < this_loss
                best_batch_adv[bw, :, :, :] = x_batch_adv[bw, :, :, :]

                best_corr, y_pred_batch, best_loss = sess.run([model.num_correct, model.predictions, model.y_xent],
                                                              feed_dict={model.x_input: best_batch_adv,
                                                                         model.y_input: y_batch})
                print('restart %d: num correct: %d -- loss:%.4f' % (ri, best_corr, np.mean(best_loss)))

            total_corr += best_corr
            print('accuracy till now {:.4}% \n\n'.format(float(total_corr) / total_num * 100))
            x_adv.append(best_batch_adv)

        x_adv = np.concatenate(x_adv, axis=0)
        # Store the accumulated adversarial examples as promised in the module
        # docstring (the file name is illustrative).
        np.save(os.path.join(config.model_dir, 'adv_examples.npy'), x_adv)

================================================
FILE: requirements/base.txt
================================================
ConfigArgParse==0.14.0
tqdm==4.31.1

================================================
FILE: requirements/cpu.txt
================================================
-r base.txt
tensorflow>=0.10.0,<2.0

================================================
FILE: requirements/gpu.txt
================================================
-r base.txt
tensorflow-gpu>=0.10.0,<2.0
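As a quick sanity check of the mini-batch cycling implemented by `DataSubset.get_next_batch` (used by both input pipelines), the stand-alone snippet below, which is not part of the repository, feeds a tiny toy array through the class. It assumes the repository's TensorFlow 1.x environment so that `cifar10_input` imports cleanly; the toy sizes are illustrative.

```python
import numpy as np
from cifar10_input import DataSubset

xs = np.arange(10).reshape(10, 1)   # ten toy "images"
ys = np.arange(10)                  # ten toy labels
subset = DataSubset(xs, ys)

seen = []
for _ in range(5):                  # batches of 4 from a dataset of 10, so the pass wraps
    bx, by = subset.get_next_batch(4, multiple_passes=True, reshuffle_after_pass=False)
    assert len(bx) == 4             # a full batch is returned even when the pass wraps
    seen.extend(by.tolist())

# After a wrap the pointer restarts from the beginning of the (fixed) ordering,
# so every call still yields a complete batch of the requested size.
print(seen)
```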