Repository: deepmind/interval-bound-propagation
Branch: master
Commit: 217a14d12686
Files: 42
Total size: 391.2 KB
Directory structure:
gitextract_zmfehsxp/
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── examples/
│   ├── eval.py
│   ├── language/
│   │   ├── README.md
│   │   ├── config.py
│   │   ├── data/
│   │   │   ├── character_substitution_enkey_sub1.json
│   │   │   ├── sst_binary_character_vocabulary_sorted.txt
│   │   │   └── sst_binary_character_vocabulary_sorted_pad.txt
│   │   ├── exhaustive_verification.py
│   │   ├── interactive_example.py
│   │   ├── models.py
│   │   ├── robust_model.py
│   │   ├── robust_train.py
│   │   └── utils.py
│   └── train.py
├── interval_bound_propagation/
│   ├── __init__.py
│   ├── src/
│   │   ├── __init__.py
│   │   ├── attacks.py
│   │   ├── bounds.py
│   │   ├── crown.py
│   │   ├── fastlin.py
│   │   ├── layer_utils.py
│   │   ├── layers.py
│   │   ├── loss.py
│   │   ├── model.py
│   │   ├── relative_bounds.py
│   │   ├── simplex_bounds.py
│   │   ├── specification.py
│   │   ├── utils.py
│   │   └── verifiable_wrapper.py
│   └── tests/
│       ├── attacks_test.py
│       ├── bounds_test.py
│       ├── crown_test.py
│       ├── fastlin_test.py
│       ├── layers_test.py
│       ├── loss_test.py
│       ├── model_test.py
│       ├── relative_bounds_test.py
│       ├── simplex_bounds_test.py
│       └── specification_test.py
└── setup.py
================================================
FILE CONTENTS
================================================
================================================
FILE: CONTRIBUTING.md
================================================
# How to Contribute
We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.
## Contributor License Agreement
Contributions to this project must be accompanied by a Contributor License
Agreement. You (or your employer) retain the copyright to your contribution;
this simply gives us permission to use and redistribute your contributions as
part of the project. Head over to <https://cla.developers.google.com/> to see
your current agreements on file or to sign a new one.
You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.
## Code reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose. Consult
[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
information on using pull requests.
## Community Guidelines
This project follows
[Google's Open Source Community Guidelines](https://opensource.google.com/conduct/).
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: README.md
================================================
# Interval Bound Propagation for Training Verifiably Robust Models
This repository contains a simple implementation of Interval Bound Propagation
(IBP) using TensorFlow:
[https://arxiv.org/abs/1810.12715](https://arxiv.org/abs/1810.12715).
It also contains an implementation of CROWN-IBP:
[https://arxiv.org/abs/1906.06316](https://arxiv.org/abs/1906.06316).
It also contains a sentiment analysis example under [`examples/language`](https://github.com/deepmind/interval-bound-propagation/tree/master/examples/language)
for [https://arxiv.org/abs/1909.01492](https://arxiv.org/abs/1909.01492).
This is not an official Google product.
## Installation
IBP can be installed with the following command:
```bash
pip install git+https://github.com/deepmind/interval-bound-propagation
```
IBP works with both the CPU and GPU versions of TensorFlow and dm-sonnet. To
allow for that, it does not list TensorFlow as a requirement, so you need to
install TensorFlow and Sonnet separately if you haven't already done so.
## Usage
The following command trains a small model on MNIST with epsilon set to 0.3:
```bash
cd examples
python train.py --model=small --output_dir=/tmp/small_model
```
## Pretrained Models
Models trained using IBP and CROWN-IBP can be downloaded
[here](https://drive.google.com/open?id=1lovI-fUabgs3swMgIe7MLRvHB9KtjzNT).
### IBP models:
| Dataset | Test epsilon | Model path | Clean accuracy | Verified accuracy | Accuracy under attack |
|----------|--------------|----------------------------|----------------|-------------------|-----------------------|
| MNIST | 0.1 | ibp/mnist_0.2_medium | 98.94% | 97.08% | 97.99% |
| MNIST | 0.2 | ibp/mnist_0.4_large_200 | 98.34% | 95.47% | 97.06% |
| MNIST | 0.3 | ibp/mnist_0.4_large_200 | 98.34% | 91.79% | 96.03% |
| MNIST | 0.4 | ibp/mnist_0.4_large_200 | 98.34% | 84.99% | 94.56% |
| CIFAR-10 | 2/255 | ibp/cifar_2-255_large_200 | 70.21% | 44.12% | 56.53% |
| CIFAR-10 | 8/255 | ibp/cifar_8-255_large | 49.49% | 31.56% | 39.53% |
### CROWN-IBP models:
| Dataset | Test epsilon | Model path | Clean accuracy | Verified accuracy | Accuracy under attack |
|----------|--------------|------------------------------|----------------|-------------------|-----------------------|
| MNIST | 0.1 | crown-ibp/mnist_0.2_large | 99.03% | 97.75% | 98.34% |
| MNIST | 0.2 | crown-ibp/mnist_0.4_large | 98.38% | 96.13% | 97.28% |
| MNIST | 0.3 | crown-ibp/mnist_0.4_large | 98.38% | 93.32% | 96.38% |
| MNIST | 0.4 | crown-ibp/mnist_0.4_large | 98.38% | 87.51% | 94.95% |
| CIFAR-10 | 2/255 | crown-ibp/cifar_2-255_large | 71.52% | 53.97% | 59.72% |
| CIFAR-10 | 8/255 | crown-ibp/cifar_8-255_large | 47.14% | 33.30% | 36.81% |
| CIFAR-10 | 16/255 | crown-ibp/cifar_16-255_large | 34.19% | 23.08% | 26.55% |
In these tables, we evaluated the verified accuracy using IBP only.
We evaluated the accuracy under attack using a 20-step untargeted PGD attack.
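For intuition about what "verified accuracy using IBP" measures, here is a minimal pure-Python sketch of interval bound propagation through a toy two-layer ReLU network. It does not use this library: the function names (`linear_bounds`, `relu_bounds`, `is_verified`), the weights, and the input are all made up for illustration, and the final check is a simplified one rather than the repository's specification code.

```python
def linear_bounds(lower, upper, weights, bias):
    """Bounds of y = x @ weights + bias for lower <= x <= upper (elementwise)."""
    out_lower, out_upper = [], []
    for j in range(len(bias)):
        lo = hi = bias[j]
        for i in range(len(lower)):
            w = weights[i][j]
            # A non-negative weight maps the input lower bound to the output
            # lower bound; a negative weight swaps the roles of the two bounds.
            lo += w * (lower[i] if w >= 0 else upper[i])
            hi += w * (upper[i] if w >= 0 else lower[i])
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotonic, so it can be applied to each bound separately."""
    return [max(v, 0.0) for v in lower], [max(v, 0.0) for v in upper]


def is_verified(logit_lower, logit_upper, label):
    """True if the true-class lower bound exceeds every other upper bound."""
    return all(logit_lower[label] > logit_upper[k]
               for k in range(len(logit_lower)) if k != label)


# Toy input with two pixels in [0, 1] and an L-infinity radius of 0.1.
x, epsilon = [0.5, 0.2], 0.1
lower = [max(v - epsilon, 0.0) for v in x]
upper = [min(v + epsilon, 1.0) for v in x]

hidden_lower, hidden_upper = relu_bounds(
    *linear_bounds(lower, upper, [[1.0, -2.0], [0.5, 1.0]], [0.0, 0.1]))
logit_lower, logit_upper = linear_bounds(
    hidden_lower, hidden_upper, [[2.0, -1.0], [0.0, 1.0]], [0.0, 0.0])
print(is_verified(logit_lower, logit_upper, label=0))
```

An example counts toward verified accuracy only when such a check succeeds, so verified accuracy can never exceed clean accuracy or accuracy under attack, which matches the ordering of the columns in the tables above.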
You can evaluate these models yourself using `eval.py`, for example:
```bash
cd examples
python eval.py --model_dir pretrained_models/ibp/mnist_0.4_large_200/ \
--epsilon 0.3
```
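The command above sets neither `--dataset` nor `--model`. Both flags default to `auto`, in which case `eval.py` infers them from the checkpoint directory path. The standalone sketch below mirrors that substring matching; the helper names `guess_dataset` and `guess_model` are illustrative and are not part of the library.

```python
def guess_dataset(model_dir):
    """Infers the dataset name from a checkpoint directory path."""
    if 'mnist' in model_dir:
        return 'mnist'
    if 'cifar' in model_dir:
        return 'cifar10'
    raise ValueError('Cannot guess the dataset name.')


def guess_model(model_dir):
    """Infers the model size from a checkpoint directory path."""
    # 'large_200' must be tested before 'large', because 'large' is a
    # substring of 'large_200'.
    for name in ('large_200', 'large', 'medium', 'small', 'tiny'):
        if name in model_dir:
            return name
    raise ValueError('Cannot guess the model name.')


path = 'pretrained_models/ibp/mnist_0.4_large_200/'
print(guess_dataset(path), guess_model(path))
```

If a checkpoint lives in a directory whose name contains none of these substrings, pass `--dataset` and `--model` explicitly.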
Note that we evaluated the CIFAR-10 2/255 CROWN-IBP model using CROWN-IBP
(instead of pure IBP). You can do so yourself by setting the flag
`--bound_method=crown-ibp`:
```bash
python eval.py --model_dir pretrained_models/crown-ibp/cifar_2-255_large/ \
--epsilon 0.00784313725490196 --bound_method=crown-ibp
```
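The `--epsilon` value in the command above is the "2/255" from the tables written as a decimal; `eval.py` uses input bounds of (0, 1), so epsilon is expressed on that scale. A short sketch for converting the other CIFAR-10 radii (the variable name `epsilons` is only illustrative):

```python
# Convert the CIFAR-10 test radii from the tables into --epsilon values.
epsilons = {numerator: numerator / 255 for numerator in (2, 8, 16)}

for numerator, epsilon in sorted(epsilons.items()):
    print('{}/255 -> --epsilon {!r}'.format(numerator, epsilon))
```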
## Giving credit
If you use this code in your work, we ask that you cite this paper:
Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin,
Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli.
"On the Effectiveness of Interval Bound Propagation for Training Verifiably
Robust Models." _arXiv preprint arXiv:1810.12715 (2018)_.
If you use CROWN-IBP, we also ask that you cite:
Huan Zhang, Hongge Chen, Chaowei Xiao, Sven Gowal, Robert Stanforth, Bo Li,
Duane Boning, Cho-Jui Hsieh.
"Towards Stable and Efficient Training of Verifiably Robust Neural Networks."
_arXiv preprint arXiv:1906.06316 (2019)_.
If you use the sentiment analysis example, please cite:
Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli.
"Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation."
_EMNLP 2019_.
## Acknowledgements
In addition to the people involved in the original IBP publication, we would
like to thank Huan Zhang, Sumanth Dathathri and Johannes Welbl for their
contributions.
================================================
FILE: examples/eval.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Evaluates a verifiable model on Mnist or CIFAR-10."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl import app
from absl import flags
from absl import logging
import interval_bound_propagation as ibp
import tensorflow.compat.v1 as tf
FLAGS = flags.FLAGS
flags.DEFINE_enum('dataset', 'auto', ['auto', 'mnist', 'cifar10'], 'Dataset '
'("auto", "mnist" or "cifar10"). When set to "auto", '
'the dataset is inferred from the model directory path.')
flags.DEFINE_enum('model', 'auto', ['auto', 'tiny', 'small', 'medium',
'large_200', 'large'], 'Model size. '
'When set to "auto", the model name is inferred from the '
'model directory path.')
flags.DEFINE_string('model_dir', None, 'Model checkpoint directory.')
flags.DEFINE_enum('bound_method', 'ibp', ['ibp', 'crown-ibp'],
'Bound propagation method. For models trained with CROWN-IBP '
'and beta_final=1 (e.g., CIFAR 2/255), use "crown-ibp". '
'Otherwise use "ibp".')
flags.DEFINE_integer('batch_size', 200, 'Batch size.')
flags.DEFINE_float('epsilon', .3, 'Target epsilon.')
def layers(model_size):
  """Returns the layer specification for a given model name."""
  if model_size == 'tiny':
    return (
        ('linear', 100),
        ('activation', 'relu'))
  elif model_size == 'small':
    return (
        ('conv2d', (4, 4), 16, 'VALID', 2),
        ('activation', 'relu'),
        ('conv2d', (4, 4), 32, 'VALID', 1),
        ('activation', 'relu'),
        ('linear', 100),
        ('activation', 'relu'))
  elif model_size == 'medium':
    return (
        ('conv2d', (3, 3), 32, 'VALID', 1),
        ('activation', 'relu'),
        ('conv2d', (4, 4), 32, 'VALID', 2),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 64, 'VALID', 1),
        ('activation', 'relu'),
        ('conv2d', (4, 4), 64, 'VALID', 2),
        ('activation', 'relu'),
        ('linear', 512),
        ('activation', 'relu'),
        ('linear', 512),
        ('activation', 'relu'))
  elif model_size == 'large_200':
    # Some old large checkpoints have 200 hidden neurons in the last linear
    # layer.
    return (
        ('conv2d', (3, 3), 64, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 64, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 2),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 1),
        ('activation', 'relu'),
        ('linear', 200),
        ('activation', 'relu'))
  elif model_size == 'large':
    return (
        ('conv2d', (3, 3), 64, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 64, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 2),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 1),
        ('activation', 'relu'),
        ('conv2d', (3, 3), 128, 'SAME', 1),
        ('activation', 'relu'),
        ('linear', 512),
        ('activation', 'relu'))
  else:
    raise ValueError('Unknown model: "{}"'.format(model_size))
def show_metrics(metric_values, bound_method='ibp'):
  if bound_method == 'crown-ibp':
    verified_accuracy = metric_values.crown_ibp_verified_accuracy
  else:
    verified_accuracy = metric_values.verified_accuracy
  print('nominal accuracy = {:.2f}%, '
        'verified accuracy = {:.2f}%, '
        'accuracy under PGD attack = {:.2f}%'.format(
            metric_values.nominal_accuracy * 100.,
            verified_accuracy * 100.,
            metric_values.attack_accuracy * 100.))
def main(unused_args):
  dataset = FLAGS.dataset
  if FLAGS.dataset == 'auto':
    if 'mnist' in FLAGS.model_dir:
      dataset = 'mnist'
    elif 'cifar' in FLAGS.model_dir:
      dataset = 'cifar10'
    else:
      raise ValueError('Cannot guess the dataset name. Please specify '
                       '--dataset manually.')
  model_name = FLAGS.model
  if FLAGS.model == 'auto':
    model_names = ['large_200', 'large', 'medium', 'small', 'tiny']
    for name in model_names:
      if name in FLAGS.model_dir:
        model_name = name
        logging.info('Using guessed model name "%s".', model_name)
        break
    if model_name == 'auto':
      raise ValueError('Cannot guess the model name. Please specify --model '
                       'manually.')
  checkpoint_path = tf.train.latest_checkpoint(FLAGS.model_dir)
  if checkpoint_path is None:
    raise OSError('Cannot find a valid checkpoint in {}.'.format(
        FLAGS.model_dir))
  # Dataset.
  input_bounds = (0., 1.)
  num_classes = 10
  if dataset == 'mnist':
    data_train, data_test = tf.keras.datasets.mnist.load_data()
  else:
    assert dataset == 'cifar10', (
        'Unknown dataset "{}"'.format(dataset))
    data_train, data_test = tf.keras.datasets.cifar10.load_data()
    data_train = (data_train[0], data_train[1].flatten())
    data_test = (data_test[0], data_test[1].flatten())
  # Base predictor network.
  original_predictor = ibp.DNN(num_classes, layers(model_name))
  predictor = original_predictor
  if dataset == 'cifar10':
    mean = (0.4914, 0.4822, 0.4465)
    std = (0.2023, 0.1994, 0.2010)
    predictor = ibp.add_image_normalization(original_predictor, mean, std)
  if FLAGS.bound_method == 'crown-ibp':
    predictor = ibp.crown.VerifiableModelWrapper(predictor)
  else:
    predictor = ibp.VerifiableModelWrapper(predictor)
  # Test using while loop.
  def get_test_metrics(batch_size, attack_builder=ibp.UntargetedPGDAttack):
    """Returns the test metrics."""
    num_test_batches = len(data_test[0]) // batch_size
    assert len(data_test[0]) % batch_size == 0, (
        'Test data is not a multiple of batch size.')
    def cond(i, *unused_args):
      return i < num_test_batches
    def body(i, metrics):
      """Compute the sum of all metrics."""
      test_data = ibp.build_dataset(data_test, batch_size=batch_size,
                                    sequential=True)
      predictor(test_data.image, override=True, is_training=False)
      input_interval_bounds = ibp.IntervalBounds(
          tf.maximum(test_data.image - FLAGS.epsilon, input_bounds[0]),
          tf.minimum(test_data.image + FLAGS.epsilon, input_bounds[1]))
      predictor.propagate_bounds(input_interval_bounds)
      test_specification = ibp.ClassificationSpecification(
          test_data.label, num_classes)
      test_attack = attack_builder(predictor, test_specification, FLAGS.epsilon,
                                   input_bounds=input_bounds,
                                   optimizer_builder=ibp.UnrolledAdam)
      # Use CROWN-IBP bound or IBP bound.
      if FLAGS.bound_method == 'crown-ibp':
        test_losses = ibp.crown.Losses(predictor, test_specification,
                                       test_attack, use_crown_ibp=True,
                                       crown_bound_schedule=tf.constant(1.))
      else:
        test_losses = ibp.Losses(predictor, test_specification, test_attack)
      test_losses(test_data.label)
      new_metrics = []
      for m, n in zip(metrics, test_losses.scalar_metrics):
        new_metrics.append(m + n)
      return i + 1, new_metrics
    if FLAGS.bound_method == 'crown-ibp':
      metrics = ibp.crown.ScalarMetrics
    else:
      metrics = ibp.ScalarMetrics
    total_count = tf.constant(0, dtype=tf.int32)
    total_metrics = [tf.constant(0, dtype=tf.float32)
                     for _ in range(len(metrics._fields))]
    total_count, total_metrics = tf.while_loop(
        cond,
        body,
        loop_vars=[total_count, total_metrics],
        back_prop=False,
        parallel_iterations=1)
    total_count = tf.cast(total_count, tf.float32)
    test_metrics = []
    for m in total_metrics:
      test_metrics.append(m / total_count)
    return metrics(*test_metrics)
  test_metrics = get_test_metrics(
      FLAGS.batch_size, ibp.UntargetedPGDAttack)
  # Prepare to load the pretrained-model.
  saver = tf.compat.v1.train.Saver(original_predictor.get_variables())
  # Run everything.
  tf_config = tf.ConfigProto()
  tf_config.gpu_options.allow_growth = True
  with tf.train.SingularMonitoredSession(config=tf_config) as sess:
    logging.info('Restoring from checkpoint "%s".', checkpoint_path)
    saver.restore(sess, checkpoint_path)
    logging.info('Evaluating at epsilon = %f.', FLAGS.epsilon)
    metric_values = sess.run(test_metrics)
    show_metrics(metric_values, FLAGS.bound_method)
if __name__ == '__main__':
  flags.mark_flag_as_required('model_dir')
  app.run(main)
================================================
FILE: examples/language/README.md
================================================
# Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation
This directory contains an implementation of
[Achieving Verified Robustness to Symbol Substitutions via Interval Bound
Propagation](https://arxiv.org/abs/1909.01492).
## Installation
The installation can be done with the following commands:
```bash
pip3 install "tensorflow-gpu<2" "dm-sonnet<2" "tensorflow-probability==0.7.0" "tensorflow-datasets" "absl-py"
pip3 install git+https://github.com/deepmind/interval-bound-propagation
```
## Usage
The following command reproduces the [SST](https://nlp.stanford.edu/sentiment/)
character level experiments using perturbation radius of 3:
```bash
cd examples/language
python3 robust_train.py
```
You should expect to see output like the following at the end of training
(note that we use only the SST dev set for evaluation here, so the dev and test
numbers coincide):
```bash
step: 149900, train loss: 0.392112, verifiable train loss: 0.826042,
train accuracy: 0.850000, dev accuracy: 0.747619, test accuracy: 0.747619,
Train Bound = -0.42432, train verified: 0.800,
dev verified: 0.695, test verified: 0.695
best dev acc 0.780952 best test acc 0.780952
best verified dev acc 0.716667 best verified test acc 0.716667
```
We can verify the model in
`config['model_location']='/tmp/robust_model/checkpoint/final'` using IBP.
For example, after changing `config['delta']=1.`, we can evaluate the IBP
verified accuracy with perturbation radius of 1:
```bash
python3 robust_train.py --analysis --batch_size=1
```
We expect to see results like the following:
```bash
test final correct: 0.748, verified: 0.722
{'datasplit': 'test', 'nominal': 0.7477064220183486,
'verify': 0.7224770642201835, 'delta': 1.0,
'num_perturbations': 268,
'model_location': '/tmp/robust_model/checkpoint/final', 'final': True}
```
We can also verify the model exhaustively by searching over all valid
perturbations:
```bash
python3 exhaustive_verification.py --num_examples=0
```
We should expect the following results:
```bash
verified_proportion: 0.7350917431192661
{'delta': 1, 'character_level': True, 'mode': 'validation', 'checkpoint_path': '/tmp/robust_model/checkpoint/final', 'verified_proportion': 0.7350917431192661}
```
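For intuition, exhaustive verification has to consider every input reachable with at most `delta` character substitutions. The sketch below enumerates such inputs for a toy string. It is not the code in `exhaustive_verification.py`: the function name `enumerate_perturbations` is hypothetical, and the mapping is a three-entry subset of `data/character_substitution_enkey_sub1.json`.

```python
import itertools

# Subset of the substitution table shipped in the data directory.
SUBSTITUTIONS = {"a": ["x"], "b": ["g"], "c": ["d"]}


def enumerate_perturbations(text, substitutions, delta):
    """Returns all strings obtained by substituting at most `delta` characters."""
    positions = [i for i, ch in enumerate(text) if ch in substitutions]
    results = []
    for num_changes in range(delta + 1):
        # Choose which positions to perturb, then which substitute to use.
        for chosen in itertools.combinations(positions, num_changes):
            options = [substitutions[text[i]] for i in chosen]
            for replacement in itertools.product(*options):
                chars = list(text)
                for i, new_char in zip(chosen, replacement):
                    chars[i] = new_char
                results.append("".join(chars))
    return results


candidates = enumerate_perturbations("abc", SUBSTITUTIONS, delta=1)
print(candidates)
print(len(enumerate_perturbations("abc", SUBSTITUTIONS, delta=2)))
```

The number of candidates grows combinatorially with the sentence length and with `delta`, which is why a bound-based method such as IBP is attractive for larger budgets.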
The IBP verified accuracy, `0.7224770642201835`, is a lower bound on the
exhaustive verification result, `0.7350917431192661`.
Furthermore, we can compare the per-example predictions of IBP verification
and exhaustive verification. There should be no example that IBP verifies
(no attack can change the prediction) but exhaustive verification does not
(there exists an attack that changes the prediction), since IBP provides a
lower bound on the true robust accuracy (obtained via exhaustive search).
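The per-example comparison described above can be sketched as follows. The two lists of flags are hypothetical placeholders, not outputs of the scripts in this directory, and `find_inconsistencies` is an illustrative helper name.

```python
def find_inconsistencies(ibp_verified, exhaustive_verified):
    """Returns indices that IBP verifies but exhaustive search does not."""
    return [index
            for index, (ibp_ok, exhaustive_ok)
            in enumerate(zip(ibp_verified, exhaustive_verified))
            if ibp_ok and not exhaustive_ok]


# Hypothetical per-example verification results for four test sentences.
ibp_verified = [True, False, True, False]
exhaustive_verified = [True, True, True, False]

bad_indices = find_inconsistencies(ibp_verified, exhaustive_verified)
ibp_rate = sum(ibp_verified) / len(ibp_verified)
exhaustive_rate = sum(exhaustive_verified) / len(exhaustive_verified)

# A sound bound never verifies an example that exhaustive search rejects,
# so the list must be empty and the IBP rate cannot exceed the exact rate.
assert not bad_indices
assert ibp_rate <= exhaustive_rate
print(ibp_rate, exhaustive_rate)
```

A non-empty `bad_indices` would point to a mismatch between the two evaluations, such as different checkpoints or different perturbation budgets.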
## Reference
If you use this code in your work, please cite the accompanying paper:
```
@inproceedings{huang-2019-achieving,
title = "Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation",
author = "Po-Sen Huang and
Robert Stanforth and
Johannes Welbl and
Chris Dyer and
Dani Yogatama and
Sven Gowal and
Krishnamurthy Dvijotham and
Pushmeet Kohli",
booktitle = "Empirical Methods in Natural Language Processing (EMNLP)",
year = "2019",
pages = "4081--4091",
}
```
## Disclaimer
This is not an official Google product.
================================================
FILE: examples/language/config.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Configuration parameters for sentence representation models."""
def get_config():
  """Returns the default configuration as a dict."""
  config = {}
  config['dataset'] = 'sst'
  # Convolutional architecture.
  # Format: Tuple/List for a Conv layer (filters, kernel_size, pooling_size)
  # Otherwise, nonlinearity.
  config['conv_architecture'] = ((100, 5, 1), 'relu')
  # Fully connected layer 1 hidden sizes (0 means no layer).
  config['conv_fc1'] = 0
  # Fully connected layer 2 hidden sizes (0 means no layer).
  config['conv_fc2'] = 0
  # Number of allowable perturbations.
  # (delta specifies the budget, i.e., how many may be used at once.)
  config['delta'] = 3.0
  # Allow each character to be changed to another character.
  config['synonym_filepath'] = 'data/character_substitution_enkey_sub1.json'
  config['max_padded_length'] = 268
  # (~1*268) Max num_perturbations.
  # seqlen * max_number_synonyms (total number of elementary perturbations)
  config['num_perturbations'] = 268
  config['vocab_filename'] = 'data/sst_binary_character_vocabulary_sorted.txt'
  # Need to add pad for analysis (which is what is used after
  # utils.get_merged_vocabulary_file).
  config['vocab_filename_pad'] = (
      'data/sst_binary_character_vocabulary_sorted_pad.txt')
  config['embedding_dim'] = 150
  config['delta_schedule'] = True
  config['verifiable_loss_schedule'] = True
  # Ratio between the task loss and verifiable loss.
  config['verifiable_loss_ratio'] = 0.75
  # Aggregated loss of the verifiable training objective
  # (among softmax, mean, max).
  config['verifiable_training_aggregation'] = 'softmax'
  config['data_id'] = 1
  config['model_location'] = '/tmp/robust_model/checkpoint/final'
  return config
================================================
FILE: examples/language/data/character_substitution_enkey_sub1.json
================================================
{"z": ["x"], "y": ["t"], "x": ["s"], "w": ["d"], "v": ["c"], "u": ["8"], "t": ["f"], "s": ["e"], "r": ["g"], "q": ["s"], "p": [";"], "o": ["k"], "n": ["m"], "m": ["j"], "l": ["p"], "k": ["."], "j": ["i"], "i": ["u"], "h": ["n"], "g": ["v"], "f": ["c"], "e": ["r"], "d": ["f"], "c": ["d"], "b": ["g"], "a": ["x"]}
================================================
FILE: examples/language/data/sst_binary_character_vocabulary_sorted.txt
================================================
!
#
$
%
&
'
(
)
*
+
,
-
.
/
0
1
2
3
4
5
6
7
8
9
:
;
=
?
`
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
================================================
FILE: examples/language/data/sst_binary_character_vocabulary_sorted_pad.txt
================================================
<PAD>
!
#
$
%
&
'
(
)
*
+
,
-
.
/
0
1
2
3
4
5
6
7
8
9
:
;
=
?
`
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
================================================
FILE: examples/language/exhaustive_verification.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Functionality for exhaustive adversarial attacks on synonym perturbations.
Models restored from checkpoint can be tested w.r.t their robustness to
exhaustive-search adversaries, which have a fixed perturbation budget with which
they can flip words to synonyms.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import copy
import imp
import json
import pprint
from absl import app
from absl import flags
from absl import logging
import numpy as np
import tensorflow.compat.v1 as tf
import tensorflow_datasets as tfds
import tqdm
import interactive_example
flags.DEFINE_boolean('character_level', True, 'Character level model.')
flags.DEFINE_boolean('debug_mode', False, 'Debug mode.')
flags.DEFINE_string('checkpoint_path', '/tmp/robust_model/checkpoint/final',
'Checkpoint path.')
flags.DEFINE_string('dataset', 'sst', 'Dataset name.')
flags.DEFINE_string('mode', 'validation',
                    'Dataset split: train, validation, or test.')
flags.DEFINE_string('config_path', './config.py',
'Path to training configuration file.')
flags.DEFINE_string('task', 'sst', 'One of snli, mnli, sick, sst.')
flags.DEFINE_integer('batch_size', 30, 'Batch size.')
flags.DEFINE_string('pooling', 'average', 'One of average, sum, max, last.')
flags.DEFINE_boolean('fine_tune_embeddings', True, 'Finetune embeddings.')
flags.DEFINE_integer('num_oov_buckets', 1, 'Number of out-of-vocab buckets.')
flags.DEFINE_integer('delta', 1, 'Maximum perturbation radius')
flags.DEFINE_integer('skip_batches', 0, 'Number of examples to skip before '
                     'analysis.')
flags.DEFINE_integer('num_examples', 100, 'Number of examples to analyze. '
                     '0 means the whole dataset.')
flags.DEFINE_integer('truncated_len', 0, 'Truncated sentence length. '
                     '0 means the whole sentence.')
flags.DEFINE_integer('max_padded_length', 0, 'Overrides max_padded_length from '
                     'the config. 0 means no change.')
flags.DEFINE_integer('num_perturbations', 0, 'Overrides num_perturbations from '
                     'the config. 0 means no change.')
FLAGS = flags.FLAGS
def load_synonyms(synonym_filepath=None):
"""Loads synonym dictionary. Returns as defaultdict(list)."""
with tf.gfile.Open(synonym_filepath) as f:
synonyms = json.load(f)
synonyms_ = collections.defaultdict(list)
synonyms_.update(synonyms)
return synonyms_
def load_dataset(mode='validation', character_level=False):
  """Loads the SST dataset as character sequences.
  Data is read through tensorflow_datasets (`glue/sst2`).
  Args:
    mode: string. Dataset split: train, validation, or test.
    character_level: bool. Only used for logging in this function; sentences
      are always returned as lists of characters.
  Returns:
    List of (input, output) pairs, where input is a list of single-character
    strings, and output is an integer (categorical label in [0,1]).
  """
message = 'Loading SST {}, character_level {}'.format(mode,
str(character_level))
logging.info(message)
dataset = tfds.load(name='glue/sst2', split=mode)
minibatch = dataset.batch(1).make_one_shot_iterator().get_next()
label_list, input_list = [], []
  with tf.train.SingularMonitoredSession() as session:
    # The loop ends when the one-shot iterator is exhausted: the resulting
    # OutOfRangeError is swallowed by the monitored session on exit.
    while True:
      output_nodes = (minibatch['label'], minibatch['sentence'])
      label, sentence = session.run(output_nodes)
      label_list.append(label[0])
      input_list.append([chr(i) for i in sentence[0]])
  # Pair each input with its label.
  dataset = list(zip(input_list, label_list))
  return dataset
def expand_by_one_perturbation(original_tokenized_sentence,
                               tokenized_sentence, synonym_dict):
  """Expands given sentence by all possible single-position substitutions.
  Each returned sentence replaces exactly one position of `tokenized_sentence`
  with a synonym of the token at that position in
  `original_tokenized_sentence`. Other mentions of the same token are left
  unchanged.
  Args:
    original_tokenized_sentence: List[str]. Unperturbed tokens, used to look up
      synonyms.
    tokenized_sentence: List[str]. Current (possibly already perturbed) tokens,
      into which the substitution is made.
    synonym_dict: dict, mapping words (str) to lists of synonyms (list of str)
  Returns:
    new_sentences_list: List[List[str]]. Outer list is across different synonym
      replacements. Inner list is over (str) tokens.
  """
new_sentences_list = []
for i_outer, (original_token, _) in enumerate(zip(
original_tokenized_sentence, tokenized_sentence)):
synonyms = synonym_dict[original_token]
for synonym in synonyms: # replace only one particular mention
new_sentence = copy.copy(tokenized_sentence)
new_sentence[i_outer] = synonym
new_sentences_list.append(new_sentence)
return new_sentences_list
def find_up_to_depth_k_perturbations(
    original_tokenized_sentence, tokenized_sentence, synonym_dict, k):
  """Finds all sentences reachable using up to k token perturbations.
  The unperturbed sentence is not added here; callers that need it (e.g.
  verify_exhaustively) append it themselves.
  Args:
    original_tokenized_sentence: List[str]. Unperturbed tokens, used to look up
      synonyms.
    tokenized_sentence: List[str]. Current (possibly already perturbed) tokens.
    synonym_dict: dict, mapping words (str) to lists of synonyms (list of str)
    k: int. perturbation depth parameter.
  Returns:
    output_sentences: List[List[str]]. List of tokenised sentences.
  """
# Case: recursion ends - no further perturbations.
if k == 0:
return [tokenized_sentence]
else:
# Expand by one level.
expanded_sentences = expand_by_one_perturbation(original_tokenized_sentence,
tokenized_sentence,
synonym_dict)
# Call recursive function one level deeper for each expanded sentence.
expanded_sentences_deeper = []
for sentence in expanded_sentences:
new_sentences = find_up_to_depth_k_perturbations(
original_tokenized_sentence, sentence, synonym_dict, k-1)
expanded_sentences_deeper.extend(new_sentences)
output_sentences = expanded_sentences + expanded_sentences_deeper
output_sentences = remove_duplicates(output_sentences)
return output_sentences
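# Illustrative sketch, not part of the original module. The recursion above
# appears to reach every sentence that differs from the original in 1..k
# positions, where each changed position takes a synonym of the original token.
# Under that reading, the same set can be enumerated directly from position
# subsets. The helper name `enumerate_perturbations_sketch` is hypothetical.
import itertools


def enumerate_perturbations_sketch(tokens, synonym_dict, k):
  """Enumerates sentences with between 1 and k synonym substitutions."""
  # Only positions whose original token has at least one synonym can change.
  positions = [i for i, token in enumerate(tokens) if synonym_dict.get(token)]
  results = set()
  for depth in range(1, k + 1):
    for chosen in itertools.combinations(positions, depth):
      options = [synonym_dict[tokens[i]] for i in chosen]
      for replacement in itertools.product(*options):
        new_sentence = list(tokens)
        for i, synonym in zip(chosen, replacement):
          new_sentence[i] = synonym
        results.add(tuple(new_sentence))
  # Sorted only to make the order deterministic; the original does not sort.
  return [list(sentence) for sentence in sorted(results)]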
def remove_duplicates(list_of_list_of_tokens):
  """Removes duplicate sentences. The order of the result is unspecified."""
  # Tuples are hashable, so a set can de-duplicate them directly. Unlike joining
  # on a separator character, this is safe for tokens that contain '|'.
  unique_sentences = set(tuple(s) for s in list_of_list_of_tokens)
  return [list(s) for s in unique_sentences]  # Convert to original format.
def verify_exhaustively(sample, synonym_dict, sst_model, delta,
                        truncated_len=0):
  """Exhaustively checks a sample against all perturbations up to delta.
  Args:
    sample: a 2-tuple (x,y), where x is a tokenised sentence (List[str]), and y
      is a label (int).
    synonym_dict: str -> List[str]. Keys are words, values are word lists with
      synonyms for the key word.
    sst_model: InteractiveSentimentPredictor instance. Used to make predictions.
    delta: int. How many synonym perturbations to maximally allow.
    truncated_len: int. Truncate sentence to truncated_len. 0 for unchanged.
  Returns:
    verified: bool. Whether all possible perturbed versions of input sentence x
      up to perturbation radius delta have the correct prediction.
    counter_example: str or None. Space-joined tokens of the first misclassified
      sentence found, if any.
    counter_prediction: the model prediction for the counter example, or None.
    num_forward_passes: int. Number of candidate sentences, including x.
  """
(x, y) = sample
counter_example = None
counter_prediction = None
  # Truncate the sentence if requested.
  if truncated_len > 0:
    x = x[: truncated_len]
  # Create (potentially long) list of perturbed sentences from x.
  altered_sentences = find_up_to_depth_k_perturbations(x, x, synonym_dict,
                                                       delta)
  # Add original sentence.
  altered_sentences = altered_sentences + [x]
# Form batches of these altered sentences.
batch = []
num_forward_passes = len(altered_sentences)
for sentence in altered_sentences:
any_prediction_wrong = False
batch.append(sentence)
# When batch_size is reached, make predictions, break if any label flip
if len(batch) == sst_model.batch_size:
# np array of size [batch_size]
predictions, _ = sst_model.batch_predict_sentiment(
batch, is_tokenised=True)
# Check any prediction that is different from the true label.
any_prediction_wrong = np.any(predictions != y)
if any_prediction_wrong:
wrong_index = np.where(predictions != y)[0].tolist()[0]
counter_example = ' '.join([str(c) for c in batch[wrong_index]])
        if FLAGS.debug_mode:
          # Log the unperturbed sentence x and its true label, not the last
          # perturbed sentence added to the batch.
          logging.info('\nOriginal example: %s, label: %d',
                       ' '.join([str(c) for c in x]), y)
          logging.info('\ncounter example: %s, prediction: %s',
                       counter_example, predictions[wrong_index].tolist())
counter_prediction = predictions[wrong_index]
# Break. No need to evaluate further.
return False, counter_example, counter_prediction, num_forward_passes
# Start filling up the next batch.
batch = []
if not batch:
# No remainder, not previously broken the loop.
return True, None, None, num_forward_passes
else:
# Remainder -- what didn't fit into a full batch of size batch_size.
# We use the first altered_sentence to pad.
batch += [altered_sentences[0]]*(sst_model.batch_size-len(batch))
assert len(batch) == sst_model.batch_size
predictions, _ = sst_model.batch_predict_sentiment(batch, is_tokenised=True)
any_prediction_wrong = np.any(predictions != y)
if any_prediction_wrong:
wrong_index = np.where(predictions != y)[0].tolist()[0]
counter_example = ' '.join([str(c) for c in batch[wrong_index]])
      if FLAGS.debug_mode:
        # Log the unperturbed sentence x and its true label.
        logging.info('\nOriginal example: %s, label: %d',
                     ' '.join([str(c) for c in x]), y)
        logging.info('\ncounter example: %s, prediction: %s', counter_example,
                     predictions[wrong_index].tolist())
counter_prediction = predictions[wrong_index]
return (not any_prediction_wrong, counter_example,
counter_prediction, num_forward_passes)
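# Illustrative sketch, not part of the original module. verify_exhaustively
# feeds the model full batches and pads the final, partial batch, because the
# model expects exactly batch_size inputs. That batching step can be isolated
# as a generator. The name `iter_padded_batches_sketch` and the returned
# `num_real` count are assumptions made for this example.
def iter_padded_batches_sketch(items, batch_size, pad_item):
  """Yields (batch, num_real); the last batch is padded up to batch_size."""
  for start in range(0, len(items), batch_size):
    batch = list(items[start:start + batch_size])
    num_real = len(batch)  # Entries before this index are genuine inputs.
    batch.extend([pad_item] * (batch_size - num_real))
    yield batch, num_real
# The original pads with altered_sentences[0], which is itself a genuine
# candidate, so predictions on the padding never produce a spurious
# counter-example.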
def verify_dataset(dataset, config_dict, model_location, synonym_dict, delta):
"""Tries to verify against perturbation attacks up to delta."""
sst_model = interactive_example.InteractiveSentimentPredictor(
config_dict, model_location,
max_padded_length=FLAGS.max_padded_length,
num_perturbations=FLAGS.num_perturbations)
verified_list = [] # Holds boolean entries, across dataset.
samples = []
labels = []
counter_examples = []
counter_predictions = []
total_num_forward_passes = []
logging.info('dataset size: %d', len(dataset))
num_examples = FLAGS.num_examples if FLAGS.num_examples else len(dataset)
logging.info('skip_batches: %d', FLAGS.skip_batches)
logging.info('num_examples: %d', num_examples)
logging.info('new dataset size: %d',
len(dataset[FLAGS.skip_batches:FLAGS.skip_batches+num_examples]))
for i, sample in tqdm.tqdm(enumerate(
dataset[FLAGS.skip_batches:FLAGS.skip_batches+num_examples])):
if FLAGS.debug_mode:
logging.info('index: %d', i)
(verified_bool, counter_example, counter_prediction, num_forward_passes
) = verify_exhaustively(
sample, synonym_dict, sst_model, delta, FLAGS.truncated_len)
samples.append(''.join(sample[0]))
labels.append(sample[1])
counter_examples.append(counter_example)
counter_predictions.append(counter_prediction)
total_num_forward_passes.append(num_forward_passes)
else:
verified_bool, _, _, num_forward_passes = verify_exhaustively(
sample, synonym_dict, sst_model, delta, FLAGS.truncated_len)
verified_list.append(verified_bool)
verified_proportion = np.mean(verified_list)
assert len(verified_list) == len(
dataset[FLAGS.skip_batches:FLAGS.skip_batches+num_examples])
return (verified_proportion, verified_list, samples, counter_examples,
counter_predictions, total_num_forward_passes)
def example(synonym_dict, dataset, k=2):
"""Example usage of functions above."""
  # The below example x has these synonyms:
  # 'decree' --> ['edict', 'order'];
  # 'tubes' --> ['pipes'];
  # 'refrigerated' --> ['cooled', 'chilled'].
x = ['the', 'refrigerated', 'decree', 'tubes']
# Example: 1 perturbation.
new_x = expand_by_one_perturbation(x, x, synonym_dict)
pprint.pprint(sorted(new_x))
# Example: up to k perturbations.
new_x = find_up_to_depth_k_perturbations(x, x, synonym_dict, k)
pprint.pprint(sorted(new_x))
# Statistics: how large is the combinatorial space of perturbations?
total_x = []
size_counter = collections.Counter()
for (x, _) in tqdm.tqdm(dataset):
new_x = find_up_to_depth_k_perturbations(x, x, synonym_dict, k)
size_counter[len(new_x)] += 1
total_x.extend(new_x)
# Histogram for perturbation space size, computed across dataset.
  pprint.pprint(sorted(size_counter.items()))
# Total number of inputs for forward pass if comprehensively evaluated.
pprint.pprint(len(total_x))
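# Illustrative sketch, not part of the original module. The size of the
# perturbation space measured above can also be bounded in closed form: if each
# of n positions has s possible substitutes, then at most
# sum_{d=1..k} C(n, d) * s**d sentences differ from the original in 1..k
# positions. The helper name `count_perturbations_sketch` is hypothetical.
def count_perturbations_sketch(num_positions, num_synonyms, k):
  """Counts sentences with 1..k substitutions, assuming uniform synonyms."""
  total = 0
  binom = 1  # C(num_positions, 0).
  for depth in range(1, k + 1):
    # C(n, d) = C(n, d - 1) * (n - d + 1) / d; the division is always exact.
    binom = binom * (num_positions - depth + 1) // depth
    total += binom * num_synonyms ** depth
  return total
# For 268 positions, one substitute each and k = 3, the bound is 3208362
# candidate sentences per example.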
def main(args):
del args
# Read the config file into a new ad-hoc module.
with open(FLAGS.config_path, 'r') as config_file:
config_code = config_file.read()
config_module = imp.new_module('config')
exec(config_code, config_module.__dict__) # pylint: disable=exec-used
config = config_module.get_config()
config_dict = {'task': FLAGS.task,
'batch_size': FLAGS.batch_size,
'pooling': FLAGS.pooling,
'learning_rate': 0.,
'config': config,
'embedding_dim': config['embedding_dim'],
'fine_tune_embeddings': FLAGS.fine_tune_embeddings,
'num_oov_buckets': FLAGS.num_oov_buckets,
'max_grad_norm': 0.}
# Maximum verification range.
delta = FLAGS.delta
character_level = FLAGS.character_level
mode = FLAGS.mode
model_location = FLAGS.checkpoint_path
# Load synonyms.
synonym_filepath = config['synonym_filepath']
synonym_dict = load_synonyms(synonym_filepath)
# Load data.
dataset = load_dataset(mode, character_level)
# Compute verifiable accuracy on dataset.
(verified_proportion, _, _, _, _, _) = verify_dataset(dataset, config_dict,
model_location,
synonym_dict, delta)
logging.info('verified_proportion:')
logging.info(str(verified_proportion))
logging.info({
'delta': FLAGS.delta,
'character_level': FLAGS.character_level,
'mode': FLAGS.mode,
'checkpoint_path': FLAGS.checkpoint_path,
'verified_proportion': verified_proportion
})
if __name__ == '__main__':
logging.set_stderrthreshold('info')
app.run(main)
================================================
FILE: examples/language/interactive_example.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Minimal code to interact with a pretrained Stanford Sentiment Treebank model.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import numpy as np
from six.moves import range
import tensorflow.compat.v1 as tf
import robust_model
SparseTensorValue = collections.namedtuple(
'SparseTensorValue', ['indices', 'values', 'dense_shape'])
class InteractiveSentimentPredictor(object):
"""Can be used to interact with a trained sentiment analysis model."""
def __init__(self, config_dict, model_location, max_padded_length=0,
num_perturbations=0):
self.graph_tensor_producer = robust_model.RobustModel(**config_dict)
self.batch_size = self.graph_tensor_producer.batch_size
if max_padded_length:
self.graph_tensor_producer.config.max_padded_length = max_padded_length
if num_perturbations:
self.graph_tensor_producer.config.num_perturbations = num_perturbations
self.graph_tensors = self.graph_tensor_producer()
network_saver = tf.train.Saver(self.graph_tensor_producer.variables)
self.open_session = tf.Session()
self.open_session.run(tf.tables_initializer())
network_saver.restore(self.open_session, model_location)
def batch_predict_sentiment(self, list_of_sentences, is_tokenised=True):
"""Computes sentiment predictions for a batch of sentences.
Note: the model batch size is usually hard-coded in the model (e.g. at 64).
We require that len(list_of_sentences)==self.batch_size.
If padding is necessary to reach as many sentences, this should happen
outside of this function.
Important: we assume that each sentence has the same number of tokens.
Args:
list_of_sentences: List[str] in case is_tokenised is False, or
List[List[str]] in case is_tokenised is True. Holds inputs whose
sentiment is to be classified.
is_tokenised: bool. Whether sentences are already tokenised. If not,
naive whitespace splitting tokenisation is applied.
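    Returns:
      batch_label_predictions: np.array of shape [self.batch_size] holding
        integers, representing model predictions for each input.
      logits: np.array of shape [self.batch_size, n_labels] holding the model
        outputs from which the predictions are derived.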
"""
# Prepare inputs.
tokenised_sentence_list = []
for sentence in list_of_sentences:
if not is_tokenised:
tokenised_sentence = sentence.lower().split(' ')
else:
tokenised_sentence = sentence
tokenised_sentence_list.append(tokenised_sentence)
length = len(tokenised_sentence_list[0])
assert all([len(x) == length for x in tokenised_sentence_list])
assert len(tokenised_sentence_list) == self.batch_size
# Construct sparse tensor holding token information.
indices = np.zeros([self.batch_size*length, 2])
dense_shape = [self.batch_size, length]
# Loop over words. All sentences have the same length.
for j, _ in enumerate(tokenised_sentence_list[0]):
for i in range(self.batch_size): # Loop over samples.
offset = i*length + j
indices[offset, 0] = i
indices[offset, 1] = j
# Define sparse tensor values.
tokenised_sentence_list = [word for sentence in tokenised_sentence_list # pylint:disable=g-complex-comprehension
for word in sentence]
values = np.array(tokenised_sentence_list)
mb_tokens = SparseTensorValue(indices=indices, values=values,
dense_shape=dense_shape)
mb_num_tokens = np.array([length]*self.batch_size)
# Fill feed_dict with input token information.
feed_dict = {}
feed_dict[self.graph_tensors['dev']['tokens']] = mb_tokens
feed_dict[self.graph_tensors['dev']['num_tokens']] = mb_num_tokens
# Generate model predictions [batch_size x n_labels].
logits = self.open_session.run(self.graph_tensors['dev']['predictions'],
feed_dict)
batch_label_predictions = np.argmax(logits, axis=1)
return batch_label_predictions, logits
def predict_sentiment(self, sentence, tokenised=False):
"""Computes sentiment of a sentence."""
# Create inputs to tensorflow graph.
if tokenised:
inputstring_tokenised = sentence
else:
assert isinstance(sentence, str)
# Simple tokenisation.
inputstring_tokenised = sentence.lower().split(' ')
length = len(inputstring_tokenised)
# Construct inputs to sparse tensor holding token information.
indices = np.zeros([self.batch_size*length, 2])
dense_shape = [self.batch_size, length]
for j, _ in enumerate(inputstring_tokenised):
for i in range(self.batch_size):
offset = i*length + j
indices[offset, 0] = i
indices[offset, 1] = j
values = inputstring_tokenised*self.batch_size
mb_tokens = SparseTensorValue(indices=indices, values=np.array(values),
dense_shape=dense_shape)
mb_num_tokens = np.array([length]*self.batch_size)
    # Fill feed_dict with input token information.
feed_dict = {}
feed_dict[self.graph_tensors['dev']['tokens']] = mb_tokens
feed_dict[self.graph_tensors['dev']['num_tokens']] = mb_num_tokens
# Generate predictions.
logits = self.open_session.run(self.graph_tensors['dev']['predictions'],
feed_dict)
predicted_label = np.argmax(logits, axis=1)
final_prediction = predicted_label[0]
# Check that prediction same everywhere (had batch of identical inputs).
assert np.all(predicted_label == final_prediction)
return final_prediction, logits
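# Illustrative sketch, not part of the original module. The nested loops in
# batch_predict_sentiment and predict_sentiment fill a [batch_size * length, 2]
# array so that row i * length + j holds the pair (i, j). The same row-major
# layout can be produced in one NumPy call. The helper name
# `dense_indices_sketch` is hypothetical, and it returns integers where the
# original array is float-valued.
import numpy as np


def dense_indices_sketch(batch_size, length):
  """Returns row-major (sample, position) index pairs for a dense batch."""
  # np.indices gives one grid per axis; flattening each grid in C order and
  # pairing them up puts the pair (i, j) at row i * length + j.
  return np.indices((batch_size, length)).reshape(2, -1).T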
================================================
FILE: examples/language/models.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Models for sentence representation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sonnet as snt
import tensorflow.compat.v1 as tf
def _max_pool_1d(x, pool_size=2, name='max_pool_1d'):
with tf.name_scope(name, 'MaxPool1D', [x, pool_size]):
return tf.squeeze(
tf.nn.max_pool(tf.expand_dims(x, 1),
[1, 1, pool_size, 1],
[1, 1, pool_size, 1],
'VALID'),
axis=1)
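# Illustrative sketch, not part of the original module. _max_pool_1d applies
# non-overlapping max pooling along the time axis with 'VALID' padding, so an
# incomplete trailing window is dropped. A NumPy reference with the same
# intended semantics for inputs of shape [batch, time, channels] is shown here.
# The helper name `max_pool_1d_numpy_sketch` is hypothetical.
import numpy as np


def max_pool_1d_numpy_sketch(x, pool_size=2):
  """NumPy reference for non-overlapping 1D max pooling with VALID padding."""
  batch, time, channels = x.shape
  num_windows = time // pool_size  # VALID: drop the incomplete last window.
  trimmed = x[:, :num_windows * pool_size, :]
  windows = trimmed.reshape(batch, num_windows, pool_size, channels)
  return windows.max(axis=2)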
class SentenceRepresenterConv(snt.AbstractModule):
"""Use stacks of 1D Convolutions to build a sentence representation."""
def __init__(self,
config,
keep_prob=1.,
pooling='max',
name='sentence_rep_conv'):
super(SentenceRepresenterConv, self).__init__(name=name)
self._config = config
self._pooling = pooling
self._keep_prob = keep_prob
def _build(self, padded_word_embeddings, length):
x = padded_word_embeddings
for layer in self._config['conv_architecture']:
      if isinstance(layer, (tuple, list)):
filters, kernel_size, pooling_size = layer
conv = snt.Conv1D(
output_channels=filters,
kernel_shape=kernel_size)
x = conv(x)
if pooling_size and pooling_size > 1:
x = _max_pool_1d(x, pooling_size)
elif layer == 'relu':
x = tf.nn.relu(x)
if self._keep_prob < 1:
x = tf.nn.dropout(x, keep_prob=self._keep_prob)
else:
raise RuntimeError('Bad layer type {} in conv'.format(layer))
# Final layer pools over the remaining sequence length to get a
# fixed sized vector.
if self._pooling == 'max':
x = tf.reduce_max(x, axis=1)
elif self._pooling == 'average':
x = tf.reduce_sum(x, axis=1)
lengths = tf.expand_dims(tf.cast(length, tf.float32), axis=1)
x = x / lengths
if self._config['conv_fc1']:
fc1_layer = snt.Linear(output_size=self._config['conv_fc1'])
x = tf.nn.relu(fc1_layer(x))
if self._keep_prob < 1:
x = tf.nn.dropout(x, keep_prob=self._keep_prob)
if self._config['conv_fc2']:
fc2_layer = snt.Linear(output_size=self._config['conv_fc2'])
x = tf.nn.relu(fc2_layer(x))
if self._keep_prob < 1:
x = tf.nn.dropout(x, keep_prob=self._keep_prob)
return x
================================================
FILE: examples/language/robust_model.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Train verifiable robust models."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
from absl import logging
import interval_bound_propagation as ibp
import numpy as np
import six
import sonnet as snt
import tensorflow.compat.v1 as tf
import tensorflow_datasets as tfds
import tensorflow_probability as tfp
from tensorflow.contrib import lookup as contrib_lookup
import models
import utils
EmbeddedDataset = collections.namedtuple(
'EmbeddedDataset',
['embedded_inputs', 'length', 'input_tokens', 'sentiment'])
Dataset = collections.namedtuple(
'Dataset',
['tokens', 'num_tokens', 'sentiment'])
Perturbation = collections.namedtuple(
'Perturbation',
['positions', 'tokens'])
def _pad_fixed(x, axis, padded_length):
"""Pads a tensor to a fixed size (rather than batch-specific)."""
pad_shape = x.shape.as_list()
pad_shape[axis] = tf.maximum(padded_length - tf.shape(x)[axis], 0)
# Pad zero as in utils.get_padded_indexes.
padded = tf.concat([x, tf.zeros(dtype=x.dtype, shape=pad_shape)], axis=axis)
assert axis == 1
padded = padded[:, :padded_length]
padded_shape = padded.shape.as_list()
padded_shape[axis] = padded_length
padded.set_shape(padded_shape)
return padded
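# Illustrative sketch, not part of the original module. _pad_fixed brings the
# sequence axis to exactly padded_length: shorter batches are zero-padded and
# longer ones are cropped. The same rule on plain Python lists is shown here.
# The helper name `pad_or_crop_rows_sketch` is hypothetical.
def pad_or_crop_rows_sketch(rows, padded_length, pad_value=0):
  """Pads each row with pad_value, or crops it, to exactly padded_length."""
  fixed = []
  for row in rows:
    row = list(row[:padded_length])  # Crop rows that are too long.
    row.extend([pad_value] * (padded_length - len(row)))  # Pad short rows.
    fixed.append(row)
  return fixed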
class GeneratedDataset(snt.AbstractModule):
"""A dataset wrapper for data_gen such that it behaves like sst_binary."""
def __init__(self, data_gen, batch_size, mode='train',
num_examples=0,
dataset_name='glue/sst2',
name='generated_dataset'):
super(GeneratedDataset, self).__init__(name=name)
self._data_gen = data_gen
self._batch_size = batch_size
self._mode = mode
    self._shuffle = mode == 'train'
self._num_examples = num_examples
self._dataset_name = dataset_name
def get_row_lengths(self, sparse_tensor_input):
# sparse_tensor_input is a tf.SparseTensor
# In RaggedTensor, row_lengths is a vector with shape `[nrows]`,
# which specifies the length of each row.
rt = tf.RaggedTensor.from_sparse(sparse_tensor_input)
return rt.row_lengths()
def _build(self):
dataset = tfds.load(name=self._dataset_name, split=self._mode)
minibatch = dataset.map(parse).repeat()
if self._shuffle:
minibatch = minibatch.shuffle(self._batch_size*100)
minibatch = minibatch.batch(
self._batch_size).make_one_shot_iterator().get_next()
minibatch['sentiment'].set_shape([self._batch_size])
minibatch['sentence'] = tf.SparseTensor(
indices=minibatch['sentence'].indices,
values=minibatch['sentence'].values,
dense_shape=[self._batch_size, minibatch['sentence'].dense_shape[1]])
# minibatch.sentence sparse tensor with dense shape
# [batch_size x seq_length], length: [batch_size]
return Dataset(
tokens=minibatch['sentence'],
num_tokens=self.get_row_lengths(minibatch['sentence']),
sentiment=minibatch['sentiment'],
)
@property
def num_examples(self):
return self._num_examples
def parse(data_dict):
"""Parse dataset from _data_gen into the same format as sst_binary."""
sentiment = data_dict['label']
sentence = data_dict['sentence']
dense_chars = tf.decode_raw(sentence, tf.uint8)
dense_chars.set_shape((None,))
chars = tfp.math.dense_to_sparse(dense_chars)
if six.PY3:
safe_chr = lambda c: '?' if c >= 128 else chr(c)
else:
safe_chr = chr
to_char = np.vectorize(safe_chr)
chars = tf.SparseTensor(indices=chars.indices,
values=tf.py_func(to_char, [chars.values], tf.string),
dense_shape=chars.dense_shape)
return {'sentiment': sentiment,
'sentence': chars}
class RobustModel(snt.AbstractModule):
"""Model for applying sentence representations for different tasks."""
def __init__(self,
task,
batch_size,
pooling,
learning_rate,
config,
embedding_dim,
fine_tune_embeddings=False,
num_oov_buckets=1000,
max_grad_norm=5.0,
name='robust_model'):
super(RobustModel, self).__init__(name=name)
self.config = config
self.task = task
self.batch_size = batch_size
self.pooling = pooling
self.learning_rate = learning_rate
self.embedding_dim = embedding_dim
self.fine_tune_embeddings = fine_tune_embeddings
self.num_oov_buckets = num_oov_buckets
self.max_grad_norm = max_grad_norm
self.linear_classifier = None
def add_representer(self, vocab_filename, padded_token=None):
"""Add sentence representer to the computation graph.
Args:
      vocab_filename: path to the vocabulary file.
      padded_token: padding token to add to the vocabulary, if any.
"""
self.embed_pad = utils.EmbedAndPad(
self.batch_size,
[self._lines_from_file(vocab_filename)],
embedding_dim=self.embedding_dim,
num_oov_buckets=self.num_oov_buckets,
fine_tune_embeddings=self.fine_tune_embeddings,
padded_token=padded_token)
self.keep_prob = tf.placeholder(tf.float32, shape=None, name='keep_prob')
# Model to get a sentence representation from embeddings.
self.sentence_representer = models.SentenceRepresenterConv(
self.config, keep_prob=self.keep_prob, pooling=self.pooling)
def add_dataset(self):
"""Add datasets.
Returns:
train_data, dev_data, test_data, num_classes
"""
if self.config.get('dataset', '') == 'sst':
train_data = GeneratedDataset(None, self.batch_size, mode='train',
num_examples=67349)
dev_data = GeneratedDataset(None, self.batch_size, mode='validation',
num_examples=872)
test_data = GeneratedDataset(None, self.batch_size, mode='validation',
num_examples=872)
num_classes = 2
return train_data, dev_data, test_data, num_classes
else:
raise ValueError('Not supported dataset')
def get_representation(self, tokens, num_tokens):
if tokens.dtype == tf.float32:
return self.sentence_representer(tokens, num_tokens)
else: # dtype == tf.string
return self.sentence_representer(self.embed_pad(tokens), num_tokens)
def add_representation(self, minibatch):
"""Compute sentence representations.
Args:
minibatch: a minibatch of sequences of embeddings.
Returns:
joint_rep: representation of sentences or concatenation of
sentence vectors.
"""
joint_rep = self.get_representation(minibatch.tokens, minibatch.num_tokens)
result = {'representation1': joint_rep}
return joint_rep, result
def add_train_ops(self,
num_classes,
joint_rep,
minibatch):
"""Add ops for training in the computation graph.
Args:
num_classes: number of classes to predict in the task.
joint_rep: the joint sentence representation if the input is sentence
pairs or the representation for the sentence if the input is a single
sentence.
minibatch: a minibatch of sequences of embeddings.
Returns:
train_accuracy: the accuracy on the training dataset
loss: training loss.
opt_step: training op.
"""
if self.linear_classifier is None:
classifier_layers = []
classifier_layers.append(snt.Linear(num_classes))
self.linear_classifier = snt.Sequential(classifier_layers)
logits = self.linear_classifier(joint_rep)
# Losses and optimizer.
def get_loss(logits, labels):
return tf.reduce_mean(
tf.nn.sparse_softmax_cross_entropy_with_logits(
labels=labels, logits=logits))
loss = get_loss(logits, minibatch.sentiment)
train_accuracy = utils.get_accuracy(logits, minibatch.sentiment)
opt_step = self._add_optimize_op(loss)
return train_accuracy, loss, opt_step
def create_perturbation_ops(self, minibatch, synonym_values, vocab_table):
"""Perturb data_batch using synonym_values."""
data_batch = _pad_fixed(
utils.get_padded_indexes(vocab_table, minibatch.tokens,
self.batch_size), axis=1,
padded_length=self.config['max_padded_length'])
# synonym_values: [vocab_size x max_num_synonyms]
# data_batch: [batch_size x seq_length]
# [batch_size x seq_length x max_num_synonyms] - synonyms for each token.
# Defaults to same word in case of no other synonyms.
synonym_ids = tf.gather(synonym_values, data_batch, axis=0)
# Split along batchsize. Elements shape: [seq_length x max_num_synonyms].
synonym_ids_per_example = tf.unstack(synonym_ids, axis=0)
# Loop across batch.
# synonym_ids_this_example shape: [seq_length x max_num_synonyms]
sequence_positions_across_batch, values_across_batch = [], []
for i_sample, synonym_ids_this_example in enumerate(
synonym_ids_per_example):
# [num_nonzero, 2]. The rows are pairs of (t,s), where t is an index for
# a time step, and s is an index into the max_num_synonyms dimension.
nonzero_indices = tf.where(synonym_ids_this_example)
# shape [num_nonzero]. Corresponding to the entries at nonzero_indices
synonym_tokens = tf.gather_nd(params=synonym_ids_this_example,
indices=nonzero_indices)
# [num_nonzero] - Of the (t,s) pairs in nonzero_indices, pick only the
# time dimension (t), corresponding to perturbation positions in the
# sequence.
perturbation_positions_this_example = nonzero_indices[:, 0]
# The main logic is done. Now follows padding to a fixed length of
# num_perturbations. However, this cannot be done with 0-padding, as it
# would introduce a new (zero) vertex. Instead, we duplicate existing
# tokens as perturbations (which have no effect), until we have reached a
# total of num_perturbations perturbations. In this case, the padded
# tokens are the original tokens from the data_batch. The padded positions
# are all the positions (using range) corresponding to the padded tokens.
# How often seq-length fits into maximum num perturbations
padding_multiplier = tf.floordiv(self.config['num_perturbations'],
tf.cast(minibatch.num_tokens[i_sample],
tf.int32)) + 1
# original tokens # [seq_length]
original_tokens = data_batch[i_sample, :minibatch.num_tokens[i_sample]]
# [padding_multiplier * seq_length]. Repeat several times, use as padding.
padding_tokens = tf.tile(original_tokens, multiples=[padding_multiplier])
synonym_tokens_padded = tf.concat([synonym_tokens, tf.cast(padding_tokens,
dtype=tf.int64)
], axis=0)
# Crop at exact num_perturbations size.
synonym_tokens_padded = synonym_tokens_padded[
:self.config['num_perturbations']]
# [seq_length] padding sequence positions with tiles of range()
pad_positions = tf.range(minibatch.num_tokens[i_sample], delta=1)
# [padding_multiplier*seq_length]
padding_positions = tf.tile(pad_positions, multiples=[padding_multiplier])
perturbation_positions_this_example_padded = tf.concat(
[perturbation_positions_this_example, tf.cast(padding_positions,
dtype=tf.int64)],
axis=0)
# Crop at exact size num_perturbations.
sequence_positions_padded = perturbation_positions_this_example_padded[
:self.config['num_perturbations']]
# Collect across the batch for tf.stack later.
sequence_positions_across_batch.append(sequence_positions_padded)
values_across_batch.append(synonym_tokens_padded)
# Both [batch_size x max_n_perturbations]
perturbation_positions = tf.stack(sequence_positions_across_batch, axis=0)
perturbation_tokens = tf.stack(values_across_batch, axis=0)
# Explicitly setting the shape to self.config['num_perturbations']
perturbation_positions_shape = perturbation_positions.shape.as_list()
perturbation_positions_shape[1] = self.config['num_perturbations']
perturbation_positions.set_shape(perturbation_positions_shape)
perturbation_tokens_shape = perturbation_tokens.shape.as_list()
perturbation_tokens_shape[1] = self.config['num_perturbations']
perturbation_tokens.set_shape(perturbation_tokens_shape)
return Perturbation(
positions=perturbation_positions,
tokens=perturbation_tokens)
def _add_optimize_op(self, loss):
"""Add ops for training."""
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.Variable(self.learning_rate, trainable=False)
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvars),
self.max_grad_norm)
opt = tf.train.AdamOptimizer(learning_rate)
opt_step = opt.apply_gradients(zip(grads, tvars),
global_step=global_step)
return opt_step
def embed_dataset(self, minibatch, vocab_table):
return EmbeddedDataset(
embedded_inputs=_pad_fixed(
self.embed_pad(minibatch.tokens),
axis=1,
padded_length=self.config['max_padded_length']),
input_tokens=_pad_fixed(
utils.get_padded_indexes(vocab_table, minibatch.tokens,
self.batch_size),
axis=1,
padded_length=self.config['max_padded_length']),
length=tf.minimum(self.config['max_padded_length'],
tf.cast(minibatch.num_tokens, tf.int32)),
sentiment=minibatch.sentiment)
def compute_mask_vertices(self, data_batch, perturbation):
"""Compute perturbation masks and perturbed vertices.
Args:
data_batch: EmbeddedDataset object.
perturbation: Perturbation object.
Returns:
masks: Positions where there are perturbations.
vertices: The resulting embeddings of the perturbed inputs.
"""
# The following are all shaped (after broadcasting) as:
# (batch_size, num_perturbations, seq_length, embedding_size).
embedding = self.embed_pad._embeddings # pylint: disable=protected-access
# (batch_size, 1, seq_length, emb_dim)
original_vertices = tf.expand_dims(data_batch.embedded_inputs, axis=1)
# (batch_size, num_perturbations, 1, emb_dim)
perturbation_vertices = tf.gather(
embedding, tf.expand_dims(perturbation.tokens, axis=2))
# (batch_size, num_perturbations, seq_length, 1)
mask = tf.expand_dims(
tf.one_hot(perturbation.positions,
depth=self.config['max_padded_length']), axis=3)
# (batch_size, num_perturbations, seq_length, embedding_size)
vertices = (1 - mask) * original_vertices + mask * perturbation_vertices
return mask, vertices
def preprocess_databatch(self, minibatch, vocab_table, perturbation):
data_batch = self.embed_dataset(minibatch, vocab_table)
mask, vertices = self.compute_mask_vertices(data_batch, perturbation)
return data_batch, mask, vertices
def add_verifiable_objective(self,
minibatch,
vocab_table,
perturbation,
stop_gradient=False):
# pylint: disable=g-missing-docstring
data_batch = self.embed_dataset(minibatch, vocab_table)
_, vertices = self.compute_mask_vertices(data_batch, perturbation)
def classifier(embedded_inputs):
representation = self.sentence_representer(embedded_inputs,
data_batch.length)
return self.linear_classifier(representation)
# Verification graph.
network = ibp.VerifiableModelWrapper(classifier)
network(data_batch.embedded_inputs)
input_bounds = ibp.SimplexBounds(
vertices=vertices,
nominal=data_batch.embedded_inputs,
r=(self.delta if not stop_gradient else self.config['delta']))
network.propagate_bounds(input_bounds)
# Calculate the verifiable objective.
verifiable_obj = verifiable_objective(
network, data_batch.sentiment, margin=1.)
return verifiable_obj
def run_classification(self, inputs, labels, length):
prediction = self.run_prediction(inputs, length)
correct = tf.cast(tf.equal(labels, tf.argmax(prediction, 1)),
dtype=tf.float32)
return correct
def compute_verifiable_loss(self, verifiable_obj, labels):
"""Compute verifiable training objective.
Args:
verifiable_obj: Verifiable training objective.
labels: Ground truth labels.
Returns:
verifiable_loss: Aggregated loss of the verifiable training objective.
"""
# Three options: reduce max, reduce mean, and softmax.
if self.config['verifiable_training_aggregation'] == 'mean':
verifiable_loss = tf.reduce_mean(
verifiable_obj) # average across all target labels
elif self.config['verifiable_training_aggregation'] == 'max':
# Worst target label only.
verifiable_loss = tf.reduce_mean(tf.reduce_max(verifiable_obj, axis=0))
elif self.config['verifiable_training_aggregation'] == 'softmax':
# This assumes that entries in verifiable_obj belonging to the true class
# are set to a (large) negative value, so as not to affect the softmax much.
# [batch_size]. Compute x-entropy against one-hot distrib. for true label.
verifiable_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
logits=tf.transpose(verifiable_obj), labels=labels)
verifiable_loss = tf.reduce_mean(
verifiable_loss) # aggregation across batch
else:
logging.info(self.config['verifiable_training_aggregation'])
raise ValueError(
'Bad input argument for verifiable_training_aggregation used.')
return verifiable_loss
def compute_verifiable_verified(self, verifiable_obj):
# Overall upper bound is maximum over all incorrect target classes.
bound = tf.reduce_max(verifiable_obj, axis=0)
verified = tf.cast(bound <= 0, dtype=tf.float32)
return bound, verified
def run_prediction(self, inputs, length):
representation = self.sentence_representer(inputs, length)
prediction = self.linear_classifier(representation)
return prediction
def sentiment_accuracy_op(self, minibatch):
"""Compute accuracy of dev/test set on the task of sentiment analysis.
Args:
minibatch: a batch of sequences of embeddings.
Returns:
num_correct: the number of examples that are predicted correctly on the
given dataset.
"""
rep = self.get_representation(minibatch.tokens, minibatch.num_tokens)
logits = self.linear_classifier(rep)
num_correct = utils.get_num_correct_predictions(logits,
minibatch.sentiment)
return num_correct
def add_dev_eval_ops(self, minibatch):
"""Add ops for evaluating on the dev/test set.
Args:
minibatch: a batch of sequences of embeddings.
Returns:
num_correct: the number of examples that are predicted correctly.
"""
num_correct = self.sentiment_accuracy_op(minibatch)
return num_correct
def _build(self):
"""Build the computation graph.
Returns:
graph_tensors: list of ops that are to be executed during
training/evaluation.
"""
train_data, dev_data, test_data, num_classes = self.add_dataset()
train_minibatch = train_data()
dev_minibatch = dev_data()
test_minibatch = test_data()
# Load the vocab without padded_token and add it to the add_representer
# later. Otherwise, it will be sorted.
vocab_filename = self.config['vocab_filename']
self.add_representer(vocab_filename, padded_token=b'<PAD>')
graph_tensors = self._build_graph_with_datasets(
train_minibatch, dev_minibatch, test_minibatch, num_classes)
graph_tensors['dev_num_examples'] = dev_data.num_examples
graph_tensors['test_num_examples'] = test_data.num_examples
return graph_tensors
def _build_graph_with_datasets(self,
train_minibatch,
dev_minibatch,
test_minibatch,
num_classes):
"""Returns the training/evaluation ops."""
self.keep_prob = 1. # Using literal 1 (not placeholder) skips dropout op.
self.sentence_representer._keep_prob = 1. # pylint:disable=protected-access
# Build the graph as per the base class.
(train_joint_rep, _) = self.add_representation(train_minibatch)
(train_accuracy,
loss,
opt_step) = self.add_train_ops(num_classes, train_joint_rep,
train_minibatch)
dev_num_correct = self.add_dev_eval_ops(dev_minibatch)
test_num_correct = self.add_dev_eval_ops(test_minibatch)
graph_tensors = {
'loss': loss,
'train_op': opt_step,
'train_accuracy': train_accuracy,
'dev_num_correct': dev_num_correct,
'test_num_correct': test_num_correct,
'keep_prob': self.keep_prob
}
vocab_table = self.embed_pad.vocab_table
vocab_size = self.embed_pad.vocab_size
verifiable_loss_ratio = tf.constant(
self.config['verifiable_loss_ratio'],
dtype=tf.float32,
name='verifiable_loss_ratio')
self.delta = tf.constant(self.config['delta'],
dtype=tf.float32, name='delta')
lookup_token = tf.placeholder(tf.string, shape=None, name='lookup_token')
indices = vocab_table.lookup(lookup_token)
self.vocab_list = contrib_lookup.index_to_string_table_from_file(
self.config['vocab_filename_pad'])
lookup_token_index = tf.placeholder(tf.int64, shape=None,
name='lookup_token_index')
lookup_token_string = self.vocab_list.lookup(lookup_token_index)
synonym_values = tf.placeholder(tf.int64, shape=[None, None],
name='synonym_values')
synonym_counts = tf.placeholder(tf.int64, shape=[None],
name='synonym_counts')
train_perturbation = self.create_perturbation_ops(
train_minibatch, synonym_values, vocab_table)
train_data_batch, _, _ = self.preprocess_databatch(
train_minibatch, vocab_table, train_perturbation)
train_words = self.vocab_list.lookup(train_data_batch.input_tokens)
# [num_targets x batchsize]
verifiable_obj = self.add_verifiable_objective(
train_minibatch, vocab_table, train_perturbation, stop_gradient=False)
train_nominal = self.run_classification(train_data_batch.embedded_inputs,
train_data_batch.sentiment,
train_data_batch.length)
train_bound, train_verified = self.compute_verifiable_verified(
verifiable_obj)
verifiable_loss = self.compute_verifiable_loss(verifiable_obj,
train_minibatch.sentiment)
if (self.config['verifiable_loss_ratio']) > 1.0:
raise ValueError('Loss ratios sum up to more than 1.0')
total_loss = (1 - verifiable_loss_ratio) * graph_tensors['loss']
if self.config['verifiable_loss_ratio'] != 0:
total_loss += verifiable_loss_ratio * verifiable_loss
# Attack on dev/test set.
dev_perturbation = self.create_perturbation_ops(
dev_minibatch, synonym_values, vocab_table)
# [num_targets x batchsize]
dev_verifiable_obj = self.add_verifiable_objective(
dev_minibatch, vocab_table, dev_perturbation, stop_gradient=True)
dev_bound, dev_verified = self.compute_verifiable_verified(
dev_verifiable_obj)
dev_data_batch, _, _ = self.preprocess_databatch(
dev_minibatch, vocab_table, dev_perturbation)
test_perturbation = self.create_perturbation_ops(
test_minibatch, synonym_values, vocab_table)
# [num_targets x batchsize]
test_verifiable_obj = self.add_verifiable_objective(
test_minibatch, vocab_table, test_perturbation, stop_gradient=True)
test_bound, test_verified = self.compute_verifiable_verified(
test_verifiable_obj)
test_data_batch, _, _ = self.preprocess_databatch(
test_minibatch, vocab_table, test_perturbation)
dev_words = self.vocab_list.lookup(dev_data_batch.input_tokens)
test_words = self.vocab_list.lookup(test_data_batch.input_tokens)
dev_nominal = self.run_classification(dev_data_batch.embedded_inputs,
dev_data_batch.sentiment,
dev_data_batch.length)
test_nominal = self.run_classification(test_data_batch.embedded_inputs,
test_data_batch.sentiment,
test_data_batch.length)
dev_predictions = self.run_prediction(dev_data_batch.embedded_inputs,
dev_data_batch.length)
test_predictions = self.run_prediction(test_data_batch.embedded_inputs,
test_data_batch.length)
with tf.control_dependencies([train_verified, test_verified, dev_verified]):
opt_step = self._add_optimize_op(total_loss)
graph_tensors['total_loss'] = total_loss
graph_tensors['verifiable_loss'] = verifiable_loss
graph_tensors['train_op'] = opt_step
graph_tensors['indices'] = indices
graph_tensors['lookup_token_index'] = lookup_token_index
graph_tensors['lookup_token_string'] = lookup_token_string
graph_tensors['lookup_token'] = lookup_token
graph_tensors['vocab_size'] = vocab_size
graph_tensors['synonym_values'] = synonym_values
graph_tensors['synonym_counts'] = synonym_counts
graph_tensors['verifiable_loss_ratio'] = verifiable_loss_ratio
graph_tensors['delta'] = self.delta
graph_tensors['train'] = {
'bound': train_bound,
'verified': train_verified,
'words': train_words,
'sentiment': train_minibatch.sentiment,
'correct': train_nominal,
}
graph_tensors['dev'] = {
'predictions': dev_predictions,
'data_batch': dev_data_batch,
'tokens': dev_minibatch.tokens,
'num_tokens': dev_minibatch.num_tokens,
'minibatch': dev_minibatch,
'bound': dev_bound,
'verified': dev_verified,
'words': dev_words,
'sentiment': dev_minibatch.sentiment,
'correct': dev_nominal,
}
graph_tensors['test'] = {
'predictions': test_predictions,
'data_batch': test_data_batch,
'tokens': test_minibatch.tokens,
'num_tokens': test_minibatch.num_tokens,
'minibatch': test_minibatch,
'bound': test_bound,
'verified': test_verified,
'words': test_words,
'sentiment': test_minibatch.sentiment,
'correct': test_nominal,
}
return graph_tensors
def _lines_from_file(self, filename):
with open(filename, 'rb') as f:
return f.read().splitlines()
def verifiable_objective(network, labels, margin=0.):
"""Computes the verifiable objective.
Args:
network: `ibp.VerifiableModelWrapper` for the network to verify.
labels: 1D integer tensor of shape (batch_size) of labels for each
input example.
margin: Verifiable objective values for correct class will be forced to
`-margin`, thus disregarding large negative bounds when maximising. By
default this is set to 0.
Returns:
2D tensor of shape (num_classes, batch_size) containing verifiable objective
for each target class, for each example.
"""
last_layer = network.output_module
# Objective, elided with final linear layer.
obj_w, obj_b = targeted_objective(
last_layer.module.w, last_layer.module.b, labels)
# Relative bounds on the objective.
per_neuron_objective = tf.maximum(
obj_w * last_layer.input_bounds.lower_offset,
obj_w * last_layer.input_bounds.upper_offset)
verifiable_obj = tf.reduce_sum(
per_neuron_objective,
axis=list(range(2, per_neuron_objective.shape.ndims)))
# Constant term (objective layer bias).
verifiable_obj += tf.reduce_sum(
obj_w * last_layer.input_bounds.nominal,
axis=list(range(2, obj_w.shape.ndims)))
verifiable_obj += obj_b
# Filter out cases in which the target class is the correct class.
# `filter_correct_class` sets these irrelevant target=correct entries to
# `-margin` (non-positive for margin >= 0), so they cannot register as
# violations under the reduce_max.
num_classes = last_layer.output_bounds.shape[-1]
verifiable_obj = filter_correct_class(
verifiable_obj, num_classes, labels, margin=margin)
return verifiable_obj
def targeted_objective(final_w, final_b, labels):
"""Determines final layer weights for attacks targeting each class.
Args:
final_w: 2D tensor of shape (last_hidden_layer_size, num_classes)
containing the weights for the final linear layer.
final_b: 1D tensor of shape (num_classes) containing the biases for the
final linear layer.
labels: 1D integer tensor of shape (batch_size) of labels for each
input example.
Returns:
obj_w: Tensor of shape (num_classes, batch_size, last_hidden_layer_size)
containing weights (to use in place of final linear layer weights)
for targeted attacks.
obj_b: Tensor of shape (num_classes, batch_size) containing bias
(to use in place of final linear layer biases) for targeted attacks.
"""
# Elide objective with final linear layer.
final_wt = tf.transpose(final_w)
obj_w = tf.expand_dims(final_wt, axis=1) - tf.gather(final_wt, labels, axis=0)
obj_b = tf.expand_dims(final_b, axis=1) - tf.gather(final_b, labels, axis=0)
return obj_w, obj_b
def filter_correct_class(verifiable_obj, num_classes, labels, margin):
"""Filters out the objective when the target class is the true label.
Args:
verifiable_obj: 2D tensor of shape (num_classes, batch_size) containing
verifiable objectives.
num_classes: number of target classes.
labels: 1D tensor of shape (batch_size) containing the labels for each
example in the batch.
margin: Verifiable objective values for correct class will be forced to
`-margin`, thus disregarding large negative bounds when maximising.
Returns:
2D tensor of shape (num_classes, batch_size) containing the corrected
verifiable objective values for each (class, example).
"""
targets_to_filter = tf.expand_dims(
tf.range(num_classes, dtype=labels.dtype), axis=1)
neq = tf.not_equal(targets_to_filter, labels)
verifiable_obj = tf.where(neq, verifiable_obj, -margin *
tf.ones_like(verifiable_obj))
return verifiable_obj
================================================
FILE: examples/language/robust_train.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Train verifiably robust models."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import imp
import json
import os
from absl import app
from absl import flags
from absl import logging
import numpy as np
from six.moves import range
import tensorflow.compat.v1 as tf
import robust_model
flags.DEFINE_string('config_path', 'config.py',
'Path to training configuration file.')
flags.DEFINE_integer('batch_size', 40, 'Batch size.')
flags.DEFINE_integer('num_train_steps', 150000, 'Number of training steps.')
flags.DEFINE_integer('num_oov_buckets', 1,
'Number of out of vocabulary buckets.')
flags.DEFINE_integer('report_every', 100,
'Report test loss every N batches.')
flags.DEFINE_float('schedule_ratio', 0.8,
'The final delta and verifiable_loss_ratio are reached when '
'the number of steps equals schedule_ratio * '
'num_train_steps.')
flags.DEFINE_float('learning_rate', 0.001, 'Learning rate.')
flags.DEFINE_float('max_grad_norm', 5.0, 'Maximum norm of gradients.')
flags.DEFINE_boolean('fine_tune_embeddings', True, 'Finetune embeddings.')
flags.DEFINE_string('task', 'sst', 'One of snli, mnli, sick, sst.')
flags.DEFINE_string('pooling', 'average', 'One of average, sum, max, last.')
flags.DEFINE_boolean('analysis', False, 'Analysis mode.')
flags.DEFINE_string('analysis_split', 'test', 'Analysis dataset split.')
flags.DEFINE_string('experiment_root',
'/tmp/robust_model/',
'Path to save trained models.')
flags.DEFINE_string(
'tensorboard_dir', None,
'Tensorboard folder. If not specified, set under experiment_root')
FLAGS = flags.FLAGS
def load_synonyms(synonym_filepath):
"""Loads the synonym dictionary from a JSON file."""
with open(synonym_filepath) as f:
return json.load(f)
def construct_synonyms(synonym_filepath):
synonyms = load_synonyms(synonym_filepath)
synonym_keys = list(synonyms.keys())
synonym_values = [synonyms[k] for k in synonym_keys]
max_synoynm_counts = max([len(s) for s in synonym_values])
synonym_value_lens = [len(x) for x in synonym_values]
# Add 0 for the first starting point.
synonym_value_lens_cum = np.cumsum([0] + synonym_value_lens)
synonym_values_list = [word for val in synonym_values for word in val] # pylint: disable=g-complex-comprehension
return synonym_keys, max_synoynm_counts, synonym_value_lens_cum, synonym_values_list
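def linear_schedule(step, init_step, final_step, init_value, final_value):
"""Linearly interpolates between `init_value` and `final_value`.
Args:
step: current training step.
init_step: step at which the schedule starts.
final_step: step at which the schedule ends.
init_value: value returned at or before `init_step`.
final_value: value returned at or after `final_step`.
Returns:
The interpolated value, clipped to lie between `init_value` and
`final_value`.
"""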
assert final_step >= init_step
if init_step == final_step:
return final_value
rate = np.float32(step - init_step) / float(final_step - init_step)
linear_value = rate * (final_value - init_value) + init_value
return np.clip(linear_value, min(init_value, final_value),
max(init_value, final_value))
def config_train_summary(task, train_accuracy, loss):
"""Add ops for summary in the computation graph.
Args:
task: string name of task being trained for.
train_accuracy: training accuracy.
loss: training loss.
Returns:
train_summary: merged summary op for training accuracy and loss.
"""
train_acc_summ = tf.summary.scalar(('%s_train_accuracy' % task),
train_accuracy)
loss_summ = tf.summary.scalar('loss', loss)
train_summary = tf.summary.merge([train_acc_summ, loss_summ])
return train_summary
def write_tf_summary(writer, step, tag, value):
summary = tf.Summary()
summary.value.add(tag=tag, simple_value=value)
writer.add_summary(summary, step)
def train(config_dict, synonym_filepath,
batch_size, num_train_steps, schedule_ratio, report_every,
checkpoint_path, tensorboard_dir):
"""Model training."""
graph_tensor_producer = robust_model.RobustModel(**config_dict)
graph_tensors = graph_tensor_producer()
synonym_keys, max_synoynm_counts, synonym_value_lens_cum, \
synonym_values_list = construct_synonyms(synonym_filepath)
train_summary = config_train_summary(config_dict['task'],
graph_tensors['train_accuracy'],
graph_tensors['loss'])
tf.gfile.MakeDirs(checkpoint_path)
best_dev_accuracy = 0.0
best_test_accuracy = 0.0
best_verified_dev_accuracy = 0.0
best_verified_test_accuracy = 0.0
network_saver = tf.train.Saver(graph_tensor_producer.variables)
with tf.train.SingularMonitoredSession() as session:
logging.info('Initialize parameters...')
writer = tf.summary.FileWriter(tensorboard_dir, session.graph)
input_feed = {}
# Tokenize synonyms.
tokenize_synonyms = [[] for _ in range(graph_tensors['vocab_size'])]
lookup_indices_keys = session.run(graph_tensors['indices'],
feed_dict={graph_tensors['lookup_token']:
synonym_keys})
lookup_indices_values = session.run(graph_tensors['indices'],
feed_dict={
graph_tensors['lookup_token']:
synonym_values_list})
for i, key_index in enumerate(lookup_indices_keys):
tokenize_synonyms[key_index] = lookup_indices_values[
synonym_value_lens_cum[i]:synonym_value_lens_cum[i+1]].tolist()
synonym_values_np = np.zeros([graph_tensors['vocab_size'],
max_synoynm_counts])
for i in range(graph_tensors['vocab_size']):
# Fail-safe case: a token without synonyms maps to itself.
synonym_values_np[i][0] = i
for j in range(len(tokenize_synonyms[i])):
synonym_values_np[i][j] = tokenize_synonyms[i][j]
synonym_counts_np = [len(s) for s in tokenize_synonyms]
input_feed[graph_tensors['synonym_values']] = synonym_values_np
input_feed[graph_tensors['synonym_counts']] = synonym_counts_np
warmup_steps = 0
for step in range(num_train_steps):
config = config_dict['config']
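if config['delta'] > 0.0 and config['delta_schedule']:
delta = linear_schedule(
step, 0., schedule_ratio * num_train_steps,
0., config['delta'])
input_feed[graph_tensors['delta']] = delta
else:
# No delta schedule: use the configured value so that `delta` is always
# defined for the verifiable_loss_ratio schedule below.
delta = config['delta']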
if (config['verifiable_loss_ratio'] > 0.0 and
config['verifiable_loss_schedule']):
if delta > 0.0 and warmup_steps == 0:
warmup_steps = step
if delta > 0.0:
verifiable_loss_ratio = linear_schedule(
step, warmup_steps, schedule_ratio * num_train_steps,
0., config['verifiable_loss_ratio'])
else:
verifiable_loss_ratio = 0.0
input_feed[
graph_tensors['verifiable_loss_ratio']] = verifiable_loss_ratio
total_loss_np, loss_np, verifiable_loss_np, train_accuracy_np, \
train_bound, train_verified, \
verifiable_loss_ratio_val, delta_val, \
train_summary_py, _ = session.run(
[graph_tensors['total_loss'],
graph_tensors['loss'],
graph_tensors['verifiable_loss'],
graph_tensors['train_accuracy'],
graph_tensors['train']['bound'],
graph_tensors['train']['verified'],
graph_tensors['verifiable_loss_ratio'],
graph_tensors['delta'],
train_summary,
graph_tensors['train_op']], input_feed)
writer.add_summary(train_summary_py, step)
if step % report_every == 0 or step == num_train_steps - 1:
dev_total_num_correct = 0.0
test_total_num_correct = 0.0
dev_verified_count = 0.0
test_verified_count = 0.0
dev_num_batches = graph_tensors['dev_num_examples'] // batch_size
test_num_batches = graph_tensors['test_num_examples'] // batch_size
dev_total_num_examples = dev_num_batches * batch_size
test_total_num_examples = test_num_batches * batch_size
for _ in range(dev_num_batches):
correct, verified = session.run(
[graph_tensors['dev_num_correct'],
graph_tensors['dev']['verified']], input_feed)
dev_total_num_correct += correct
dev_verified_count += np.sum(verified)
for _ in range(test_num_batches):
correct, verified = session.run(
[graph_tensors['test_num_correct'],
graph_tensors['test']['verified']], input_feed)
test_total_num_correct += correct
test_verified_count += np.sum(verified)
dev_accuracy = dev_total_num_correct / dev_total_num_examples
test_accuracy = test_total_num_correct / test_total_num_examples
dev_verified_accuracy = dev_verified_count / dev_total_num_examples
test_verified_accuracy = test_verified_count / test_total_num_examples
write_tf_summary(writer, step, tag='dev_accuracy', value=dev_accuracy)
write_tf_summary(writer, step, tag='test_accuracy', value=test_accuracy)
write_tf_summary(writer, step, tag='train_bound_summary',
value=np.mean(train_bound))
write_tf_summary(writer, step, tag='train_verified_summary',
value=np.mean(train_verified))
write_tf_summary(writer, step, tag='dev_verified_summary',
value=np.mean(dev_verified_accuracy))
write_tf_summary(writer, step, tag='test_verified_summary',
value=np.mean(test_verified_accuracy))
write_tf_summary(writer, step, tag='total_loss_summary',
value=total_loss_np)
write_tf_summary(writer, step, tag='verifiable_train_loss_summary',
value=verifiable_loss_np)
logging.info('verifiable_loss_ratio: %f, delta: %f',
verifiable_loss_ratio_val, delta_val)
logging.info('step: %d, '
'train loss: %f, '
'verifiable train loss: %f, '
'train accuracy: %f, '
'dev accuracy: %f, '
'test accuracy: %f, ', step, loss_np,
verifiable_loss_np, train_accuracy_np,
dev_accuracy, test_accuracy)
dev_verified_accuracy_mean = np.mean(dev_verified_accuracy)
test_verified_accuracy_mean = np.mean(test_verified_accuracy)
logging.info('Train Bound = %.05f, train verified: %.03f, '
'dev verified: %.03f, test verified: %.03f',
np.mean(train_bound),
np.mean(train_verified), dev_verified_accuracy_mean,
test_verified_accuracy_mean)
if dev_accuracy > best_dev_accuracy:
# Store most accurate model so far.
network_saver.save(session.raw_session(),
os.path.join(checkpoint_path, 'best'))
best_dev_accuracy = dev_accuracy
best_test_accuracy = test_accuracy
logging.info('best dev acc\t%f\tbest test acc\t%f',
best_dev_accuracy, best_test_accuracy)
if dev_verified_accuracy_mean > best_verified_dev_accuracy:
# Store model with best verified accuracy so far.
network_saver.save(session.raw_session(),
os.path.join(checkpoint_path, 'best_verified'))
best_verified_dev_accuracy = dev_verified_accuracy_mean
best_verified_test_accuracy = test_verified_accuracy_mean
logging.info('best verified dev acc\t%f\tbest verified test acc\t%f',
best_verified_dev_accuracy, best_verified_test_accuracy)
network_saver.save(session.raw_session(),
os.path.join(checkpoint_path, 'model'))
writer.flush()
# Store model at end of training.
network_saver.save(session.raw_session(),
os.path.join(checkpoint_path, 'final'))
def analysis(config_dict, synonym_filepath,
model_location, batch_size, batch_offset=0,
total_num_batches=0, datasplit='test', delta=3.0,
num_perturbations=5, max_padded_length=0):
"""Run analysis."""
tf.reset_default_graph()
if datasplit not in ['train', 'dev', 'test']:
raise ValueError('Invalid datasplit: %s' % datasplit)
logging.info('model_location: %s', model_location)
logging.info('num_perturbations: %d', num_perturbations)
logging.info('delta: %f', delta)
logging.info('Run analysis, datasplit: %s, batch %d', datasplit, batch_offset)
synonym_keys, max_synoynm_counts, synonym_value_lens_cum, \
synonym_values_list = construct_synonyms(synonym_filepath)
graph_tensor_producer = robust_model.RobustModel(**config_dict)
# Use new batch size.
graph_tensor_producer.batch_size = batch_size
# Overwrite the config originally in the saved checkpoint.
logging.info('old delta %f, old num_perturbations: %d',
graph_tensor_producer.config['delta'],
graph_tensor_producer.config['num_perturbations'])
graph_tensor_producer.config['delta'] = delta
graph_tensor_producer.config['num_perturbations'] = num_perturbations
if max_padded_length > 0:
graph_tensor_producer.config['max_padded_length'] = max_padded_length
logging.info('new delta %f, num_perturbations: %d, max_padded_length: %d',
graph_tensor_producer.config['delta'],
graph_tensor_producer.config['num_perturbations'],
graph_tensor_producer.config['max_padded_length'])
logging.info('graph_tensors.config: %s', graph_tensor_producer.config)
graph_tensors = graph_tensor_producer()
network_saver = tf.train.Saver(graph_tensor_producer.variables)
with tf.train.SingularMonitoredSession() as session:
network_saver.restore(session.raw_session(), model_location)
for _ in range(batch_offset):
# Seek to the correct batch.
session.run(graph_tensors[datasplit]['sentiment'])
input_feed = {}
# Tokenize synonyms.
tokenize_synonyms = [[] for _ in range(graph_tensors['vocab_size'])]
lookup_indices_keys = session.run(graph_tensors['indices'],
feed_dict={graph_tensors['lookup_token']:
synonym_keys})
lookup_indices_values = session.run(graph_tensors['indices'],
feed_dict={
graph_tensors['lookup_token']:
synonym_values_list})
for i, key_index in enumerate(lookup_indices_keys):
tokenize_synonyms[key_index] = lookup_indices_values[
synonym_value_lens_cum[i]:synonym_value_lens_cum[i+1]].tolist()
synonym_values_np = np.zeros([graph_tensors['vocab_size'],
max_synoynm_counts])
for i in range(graph_tensors['vocab_size']):
# Fail-safe case: a token without synonyms maps to itself.
synonym_values_np[i][0] = i
for j in range(len(tokenize_synonyms[i])):
synonym_values_np[i][j] = tokenize_synonyms[i][j]
synonym_counts_np = [len(s) for s in tokenize_synonyms]
input_feed[graph_tensors['synonym_values']] = synonym_values_np
input_feed[graph_tensors['synonym_counts']] = synonym_counts_np
total_num_batches = (
graph_tensors['%s_num_examples' % datasplit] //
batch_size) if total_num_batches == 0 else total_num_batches
total_num_examples = total_num_batches * batch_size
logging.info('total number of examples %d', total_num_examples)
logging.info('total number of batches %d', total_num_batches)
total_correct, total_verified = 0.0, 0.0
for ibatch in range(total_num_batches):
results = session.run(graph_tensors[datasplit], input_feed)
logging.info('batch: %d, %s bound = %.05f, verified: %.03f,'
' nominally correct: %.03f',
ibatch, datasplit, np.mean(results['bound']),
np.mean(results['verified']),
np.mean(results['correct']))
total_correct += sum(results['correct'])
total_verified += sum(results['verified'])
total_correct /= total_num_examples
total_verified /= total_num_examples
logging.info('%s final correct: %.03f, verified: %.03f',
datasplit, total_correct, total_verified)
logging.info({
'datasplit': datasplit,
'nominal': total_correct,
'verify': total_verified,
'delta': delta,
'num_perturbations': num_perturbations,
'model_location': model_location,
'final': True
})
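# Illustrative sketch (not part of the original file): the synonym table built
# in `analysis` above, rewritten in plain Python so the padding and the
# fail-safe slot are easy to see. `build_synonym_table` and its argument names
# are hypothetical; it assumes `max_synonym_count >= 1` and that no token has
# more synonyms than `max_synonym_count`.

```python
def build_synonym_table(token_synonyms, max_synonym_count):
  """Returns (values, counts) for a zero-padded synonym table.

  token_synonyms[i] is the (possibly empty) list of synonym ids for token i.
  """
  vocab_size = len(token_synonyms)
  values = [[0] * max_synonym_count for _ in range(vocab_size)]
  counts = []
  for i, synonyms in enumerate(token_synonyms):
    # Fail-safe: slot 0 holds the token itself unless a synonym overwrites it.
    values[i][0] = i
    for j, synonym in enumerate(synonyms):
      values[i][j] = synonym
    counts.append(len(synonyms))
  return values, counts


# Token 0 has one synonym, token 1 has none, token 2 has two.
values, counts = build_synonym_table([[2], [], [0, 1]], max_synonym_count=2)
```

# As in the loop above, a token that has synonyms loses its own id from slot 0,
# while a token with none keeps it and reports a count of zero.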
def main(_):
# Read the config file into a new ad-hoc module.
with open(FLAGS.config_path, 'r') as config_file:
config_code = config_file.read()
config_module = imp.new_module('config')
exec(config_code, config_module.__dict__) # pylint: disable=exec-used
config = config_module.get_config()
config_dict = {'task': FLAGS.task,
'batch_size': FLAGS.batch_size,
'pooling': FLAGS.pooling,
'learning_rate': FLAGS.learning_rate,
'config': config,
'embedding_dim': config['embedding_dim'],
'fine_tune_embeddings': FLAGS.fine_tune_embeddings,
'num_oov_buckets': FLAGS.num_oov_buckets,
'max_grad_norm': FLAGS.max_grad_norm}
if FLAGS.analysis:
logging.info('Analyze model location: %s', config['model_location'])
base_batch_offset = 0
analysis(config_dict, config['synonym_filepath'], config['model_location'],
FLAGS.batch_size, base_batch_offset,
0, datasplit=FLAGS.analysis_split,
delta=config['delta'],
num_perturbations=config['num_perturbations'],
max_padded_length=config['max_padded_length'])
else:
checkpoint_path = os.path.join(FLAGS.experiment_root, 'checkpoint')
if FLAGS.tensorboard_dir is None:
tensorboard_dir = os.path.join(FLAGS.experiment_root, 'tensorboard')
else:
tensorboard_dir = FLAGS.tensorboard_dir
train(config_dict, config['synonym_filepath'],
FLAGS.batch_size,
num_train_steps=FLAGS.num_train_steps,
schedule_ratio=FLAGS.schedule_ratio,
report_every=FLAGS.report_every,
checkpoint_path=checkpoint_path,
tensorboard_dir=tensorboard_dir)
if __name__ == '__main__':
app.run(main)
================================================
FILE: examples/language/utils.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Utilities for sentence representation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tempfile
from absl import logging
import sonnet as snt
import tensorflow as tf
from tensorflow.contrib import lookup as contrib_lookup
def get_padded_embeddings(embeddings,
vocabulary_table,
tokens, batch_size,
token_indexes=None):
"""Reshapes and pads 'raw' word embeddings.
Say we have batch of B tokenized sentences, of variable length, with a total
of W tokens. For example, B = 2 and W = 3 + 4 = 7:
[['The', 'cat', 'eats'],
[ 'A', 'black', 'cat', 'jumps']]
Since rows have variable length, this cannot be represented as a tf.Tensor.
It is represented as a tf.SparseTensor, with 7 values & indexes:
indices: [[0,0], [0,1], [0,2], [1,0], [1,1], [1,2], [1,3]]
values: ['The', 'cat', 'eats', 'A', 'black', 'cat', 'jumps']
We have also built a vocabulary table:
vocabulary table: ['cat', 'The', 'A', 'black', 'eats', 'jumps']
We also have the embeddings, a VxD matrix of floats (V being the vocabulary
size) representing each word in the vocabulary table as a normal tf.Tensor.
For example, with D=3, embeddings could be:
[[0.4, 0.5, -0.6], # This is the embedding for word 0 = 'cat'
[0.1, -0.3, 0.6], # This is the embedding for word 1 = 'The'
[0.7, 0.8, -0.9], # This is the embedding for word 2 = 'A'
[-0.1, 0.9, 0.7], # This is the embedding for word 3 = 'black'
[-0.2, 0.4, 0.7], # This is the embedding for word 4 = 'eats'
[0.3, -0.5, 0.2]] # This is the embedding for word 5 = 'jumps'
This function builds a normal tf.Tensor containing the embeddings for the
tokens provided, in the correct order, with appropriate 0 padding.
In our example, the returned tensor would be:
[[[0.1, -0.3, 0.6], [0.4, 0.5, -0.6], [-0.2, 0.4, 0.7], [0.0, 0.0, 0.0]],
[[0.7, 0.8, -0.9], [-0.1, 0.9, 0.7], [0.4, 0.5, -0.6], [0.3, -0.5, 0.2]]]
Note that since the first sentence has only 3 words, the 4th embedding gets
replaced by a D-dimensional vector of 0.
Args:
embeddings: [V, D] Tensor of floats, where V is the vocabulary size,
containing the embeddings, initialized with the same vocabulary file as
vocabulary_table.
vocabulary_table: a tf.contrib.lookup.LookupInterface,
containing the vocabulary, initialized with the same vocabulary file as
embeddings.
tokens: [B, ?] SparseTensor of strings, the tokens.
batch_size: Python integer.
token_indexes: Optional Tensor of token ids. If provided, it is used
directly instead of looking up `tokens.values` in `vocabulary_table`.
Returns:
[B, L, D] Tensor of floats: the embeddings in the correct order,
appropriately padded with 0.0, where L = max(num_tokens) and B = batch_size
"""
embedding_dim = embeddings.get_shape()[1].value # D in docstring above.
num_tokens_in_batch = tf.shape(tokens.indices)[0] # W in the docstring above.
max_length = tokens.dense_shape[1] # This is L in the docstring above.
# Get indices of tokens in vocabulary_table.
if token_indexes is not None:
indexes = token_indexes
else:
indexes = vocabulary_table.lookup(tokens.values)
# Get word embeddings.
tokens_embeddings = tf.gather(embeddings, indexes)
# Shape of the return tensor.
new_shape = tf.cast(
tf.stack([batch_size, max_length, embedding_dim], axis=0), tf.int32)
# Build the vector of indices for the return Tensor.
# In the example above, indices_final would be:
# [[[0,0,0], [0,0,1], [0,0,2]],
# [[0,1,0], [0,1,1], [0,1,2]],
# [[0,2,0], [0,2,1], [0,2,2]],
# [[1,0,0], [1,0,1], [1,0,2]],
# [[1,1,0], [1,1,1], [1,1,2]],
# [[1,2,0], [1,2,1], [1,2,2]],
# [[1,3,0], [1,3,1], [1,3,2]]]
tiled = tf.tile(tokens.indices, [1, embedding_dim])
indices_tiled = tf.cast(
tf.reshape(tiled, [num_tokens_in_batch * embedding_dim, 2]), tf.int32)
indices_linear = tf.expand_dims(
tf.tile(tf.range(0, embedding_dim), [num_tokens_in_batch]), axis=1)
indices_final = tf.concat([indices_tiled, indices_linear], axis=1)
# Build the dense Tensor.
embeddings_padded = tf.sparse_to_dense(
sparse_indices=indices_final,
output_shape=new_shape,
sparse_values=tf.reshape(tokens_embeddings,
[num_tokens_in_batch * embedding_dim]))
embeddings_padded.set_shape((batch_size, None, embedding_dim))
return embeddings_padded
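# Illustrative sketch (not part of the original file): the docstring example of
# get_padded_embeddings reproduced with plain Python lists instead of
# tf.SparseTensor and tf.sparse_to_dense. `pad_embeddings` is a hypothetical
# helper; it assumes at least one token and that token ids are already looked
# up in the vocabulary table.

```python
def pad_embeddings(indices, token_ids, embeddings, batch_size):
  """Scatters per-token embeddings into a zero-padded [B, L, D] nested list."""
  dim = len(embeddings[0])
  max_length = 1 + max(col for _, col in indices)
  padded = [[[0.0] * dim for _ in range(max_length)]
            for _ in range(batch_size)]
  for (row, col), token_id in zip(indices, token_ids):
    padded[row][col] = list(embeddings[token_id])
  return padded


# Vocabulary table: ['cat', 'The', 'A', 'black', 'eats', 'jumps'].
embeddings = [[0.4, 0.5, -0.6], [0.1, -0.3, 0.6], [0.7, 0.8, -0.9],
              [-0.1, 0.9, 0.7], [-0.2, 0.4, 0.7], [0.3, -0.5, 0.2]]
# 'The cat eats' and 'A black cat jumps' as sparse indices plus token ids.
indices = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (1, 3)]
token_ids = [1, 0, 4, 2, 3, 0, 5]
padded = pad_embeddings(indices, token_ids, embeddings, batch_size=2)
```

# The first sentence has three tokens, so padded[0][3] stays a zero vector.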
def get_padded_indexes(vocabulary_table,
tokens, batch_size,
token_indexes=None):
"""Get the indices of tokens from vocabulary table.
Args:
vocabulary_table: a tf.contrib.lookup.LookupInterface,
containing the vocabulary, initialized with the same vocabulary file as
embeddings.
tokens: [B, ?] SparseTensor of strings, the tokens.
batch_size: Python integer.
token_indexes: Optional Tensor of token ids. If provided, it is used
directly instead of looking up `tokens.values` in `vocabulary_table`.
Returns:
[B, L] Tensor of integers: indices of tokens in the correct order,
appropriately padded with 0, where L = max(num_tokens) and B = batch_size
"""
num_tokens_in_batch = tf.shape(tokens.indices)[0]
max_length = tokens.dense_shape[1]
# Get indices of tokens in vocabulary_table.
if token_indexes is not None:
indexes = token_indexes
else:
indexes = vocabulary_table.lookup(tokens.values)
# Build the dense Tensor.
indexes_padded = tf.sparse_to_dense(
sparse_indices=tokens.indices,
output_shape=[batch_size, max_length],
sparse_values=tf.reshape(indexes,
[num_tokens_in_batch]))
indexes_padded.set_shape((batch_size, None))
return indexes_padded
class EmbedAndPad(snt.AbstractModule):
"""Embed and pad tokenized words.
This class's primary functionality is similar to get_padded_embeddings.
It stores references to the embeddings and vocabulary table for convenience,
so that the user does not have to keep and pass them around.
"""
def __init__(self,
batch_size,
vocabularies,
embedding_dim,
num_oov_buckets=1000,
fine_tune_embeddings=False,
padded_token=None,
name='embed_and_pad'):
super(EmbedAndPad, self).__init__(name=name)
self._batch_size = batch_size
vocab_file, vocab_size = get_merged_vocabulary_file(vocabularies,
padded_token)
self._vocab_size = vocab_size
self._num_oov_buckets = num_oov_buckets
# Load vocabulary table for index lookup.
self._vocabulary_table = contrib_lookup.index_table_from_file(
vocabulary_file=vocab_file,
num_oov_buckets=num_oov_buckets,
vocab_size=self._vocab_size)
def create_initializer(initializer_range=0.02):
"""Creates a `truncated_normal_initializer` with the given range."""
# The default value is chosen from language/bert/modeling.py.
return tf.truncated_normal_initializer(stddev=initializer_range)
self._embeddings = tf.get_variable('embeddings_matrix',
[self._vocab_size + num_oov_buckets,
embedding_dim],
trainable=fine_tune_embeddings,
initializer=create_initializer())
def _build(self, tokens):
padded_embeddings = get_padded_embeddings(
self._embeddings, self._vocabulary_table, tokens, self._batch_size)
return padded_embeddings
@property
def vocab_table(self):
return self._vocabulary_table
@property
def vocab_size(self):
return self._vocab_size + self._num_oov_buckets
def get_accuracy(logits, labels):
"""Top 1 accuracy from logits and labels."""
return tf.reduce_mean(tf.cast(tf.nn.in_top_k(logits, labels, 1), tf.float32))
def get_num_correct_predictions(logits, labels):
"""Get the number of correct predictions over a batch."""
predictions = tf.cast(tf.argmax(logits, axis=1), tf.int64)
evals = tf.equal(predictions, labels)
num_correct = tf.reduce_sum(tf.cast(evals, tf.float64))
return num_correct
def get_merged_vocabulary_file(vocabularies, padded_token=None):
"""Merges several vocabulary files into one temporary file.
The TF object that loads the embedding expects a vocabulary file, to know
which embeddings it should load.
See tf.contrib.embedding.load_embedding_initializer.
When we want to train/test on several datasets simultaneously we need to merge
their vocabulary files into a single file.
Args:
vocabularies: Iterable of vocabularies. Each vocabulary should be
a list of tokens.
padded_token: If not None, add the padded_token to the first index.
Returns:
outfilename: Name of the merged file. Contains the union of all tokens in
vocabularies, without duplicates, one token per line.
vocabulary_size: Count of tokens in the merged file.
"""
uniques = [set(vocabulary) for vocabulary in vocabularies]
unique_merged = frozenset().union(*uniques)
unique_merged_sorted = sorted(unique_merged)
if padded_token is not None:
# Add padded token as 0 index.
unique_merged_sorted = [padded_token] + unique_merged_sorted
vocabulary_size = len(unique_merged_sorted)
outfile = tempfile.NamedTemporaryFile(delete=False)
outfile.write(b'\n'.join(unique_merged_sorted))
outfilename = outfile.name
logging.info('Merged vocabulary file with %d tokens: %s', vocabulary_size,
outfilename)
outfile.close()
return outfilename, vocabulary_size
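# Illustrative sketch (not part of the original file): the merge step of
# get_merged_vocabulary_file without the temporary file, to show the resulting
# token order. `merge_vocabularies` is a hypothetical name; tokens are assumed
# to be mutually comparable (all str or all bytes).

```python
def merge_vocabularies(vocabularies, padded_token=None):
  """Returns the sorted union of all tokens, with an optional pad token first."""
  merged = sorted(set().union(*vocabularies))
  if padded_token is not None:
    # The pad token takes index 0 and shifts every other token by one.
    merged = [padded_token] + merged
  return merged


merged = merge_vocabularies([['b', 'a'], ['c', 'a']], padded_token='<pad>')
```

# Sorting makes the index of each token independent of the order in which the
# vocabularies are passed in.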
================================================
FILE: examples/train.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Trains a verifiable model on MNIST or CIFAR-10."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
from absl import app
from absl import flags
from absl import logging
import interval_bound_propagation as ibp
import tensorflow.compat.v1 as tf
FLAGS = flags.FLAGS
flags.DEFINE_enum('dataset', 'mnist', ['mnist', 'cifar10'],
'Dataset (either "mnist" or "cifar10").')
flags.DEFINE_enum('model', 'tiny', ['tiny', 'small', 'medium', 'large'],
'Model size.')
flags.DEFINE_string('output_dir', '/tmp/ibp_model', 'Output directory.')
# Options.
flags.DEFINE_integer('steps', 60001, 'Number of steps in total.')
flags.DEFINE_integer('test_every_n', 2000,
'Number of steps between testing iterations.')
flags.DEFINE_integer('warmup_steps', 2000, 'Number of warm-up steps.')
flags.DEFINE_integer('rampup_steps', 10000, 'Number of ramp-up steps.')
flags.DEFINE_integer('batch_size', 200, 'Batch size.')
flags.DEFINE_float('epsilon', .3, 'Target epsilon.')
flags.DEFINE_float('epsilon_train', .33, 'Train epsilon.')
flags.DEFINE_string('learning_rate', '1e-3,1e-4@15000,1e-5@25000',
'Learning rate schedule of the form: '
'initial_learning_rate[,learning_rate@steps]*. E.g., "1e-3" or '
'"1e-3,1e-4@15000,1e-5@25000".')
flags.DEFINE_float('nominal_xent_init', 1.,
'Initial weight for the nominal cross-entropy.')
flags.DEFINE_float('nominal_xent_final', .5,
'Final weight for the nominal cross-entropy.')
flags.DEFINE_float('verified_xent_init', 0.,
'Initial weight for the verified cross-entropy.')
flags.DEFINE_float('verified_xent_final', .5,
'Final weight for the verified cross-entropy.')
flags.DEFINE_float('crown_bound_init', 0.,
'Initial weight for mixing the CROWN bound with the IBP '
'bound in the verified cross-entropy.')
flags.DEFINE_float('crown_bound_final', 0.,
'Final weight for mixing the CROWN bound with the IBP '
'bound in the verified cross-entropy.')
flags.DEFINE_float('attack_xent_init', 0.,
'Initial weight for the attack cross-entropy.')
flags.DEFINE_float('attack_xent_final', 0.,
'Final weight for the attack cross-entropy.')
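# Illustrative sketch (not part of the original file): one way to read the
# `learning_rate` flag format, e.g. "1e-3,1e-4@15000,1e-5@25000". This is not
# ibp.parse_learning_rate; `parse_schedule` and `learning_rate_at` are
# hypothetical helpers, and both the piecewise-constant interpretation and the
# switch happening exactly at the listed step are assumptions.

```python
def parse_schedule(spec):
  """Splits 'lr0[,lr@step]*' into (initial_rate, sorted [(step, rate), ...])."""
  parts = spec.split(',')
  initial = float(parts[0])
  changes = []
  for part in parts[1:]:
    rate, step = part.split('@')
    changes.append((int(step), float(rate)))
  return initial, sorted(changes)


def learning_rate_at(spec, step):
  """Returns the rate in effect at `step` under a piecewise-constant reading."""
  rate, changes = parse_schedule(spec)
  for boundary, new_rate in changes:
    if step >= boundary:
      rate = new_rate
  return rate


default_spec = '1e-3,1e-4@15000,1e-5@25000'
rates = [learning_rate_at(default_spec, s) for s in (0, 15000, 30000)]
```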
def show_metrics(step_value, metric_values, loss_value=None):
print('{}: {}nominal accuracy = {:.2f}%, '
'verified = {:.2f}%, attack = {:.2f}%'.format(
step_value,
'loss = {}, '.format(loss_value) if loss_value is not None else '',
metric_values.nominal_accuracy * 100.,
metric_values.verified_accuracy * 100.,
metric_values.attack_accuracy * 100.))
def layers(model_size):
"""Returns the layer specification for a given model name."""
if model_size == 'tiny':
return (
('linear', 100),
('activation', 'relu'))
elif model_size == 'small':
return (
('conv2d', (4, 4), 16, 'VALID', 2),
('activation', 'relu'),
('conv2d', (4, 4), 32, 'VALID', 1),
('activation', 'relu'),
('linear', 100),
('activation', 'relu'))
elif model_size == 'medium':
return (
('conv2d', (3, 3), 32, 'VALID', 1),
('activation', 'relu'),
('conv2d', (4, 4), 32, 'VALID', 2),
('activation', 'relu'),
('conv2d', (3, 3), 64, 'VALID', 1),
('activation', 'relu'),
('conv2d', (4, 4), 64, 'VALID', 2),
('activation', 'relu'),
('linear', 512),
('activation', 'relu'),
('linear', 512),
('activation', 'relu'))
elif model_size == 'large':
return (
('conv2d', (3, 3), 64, 'SAME', 1),
('activation', 'relu'),
('conv2d', (3, 3), 64, 'SAME', 1),
('activation', 'relu'),
('conv2d', (3, 3), 128, 'SAME', 2),
('activation', 'relu'),
('conv2d', (3, 3), 128, 'SAME', 1),
('activation', 'relu'),
('conv2d', (3, 3), 128, 'SAME', 1),
('activation', 'relu'),
('linear', 512),
('activation', 'relu'))
else:
raise ValueError('Unknown model: "{}"'.format(model_size))
def main(unused_args):
logging.info('Training IBP on %s...', FLAGS.dataset.upper())
step = tf.train.get_or_create_global_step()
# Learning rate.
learning_rate = ibp.parse_learning_rate(step, FLAGS.learning_rate)
# Dataset.
input_bounds = (0., 1.)
num_classes = 10
if FLAGS.dataset == 'mnist':
data_train, data_test = tf.keras.datasets.mnist.load_data()
else:
assert FLAGS.dataset == 'cifar10', (
'Unknown dataset "{}"'.format(FLAGS.dataset))
data_train, data_test = tf.keras.datasets.cifar10.load_data()
data_train = (data_train[0], data_train[1].flatten())
data_test = (data_test[0], data_test[1].flatten())
data = ibp.build_dataset(data_train, batch_size=FLAGS.batch_size,
sequential=False)
if FLAGS.dataset == 'cifar10':
data = data._replace(image=ibp.randomize(
data.image, (32, 32, 3), expand_shape=(40, 40, 3),
crop_shape=(32, 32, 3), vertical_flip=True))
# Base predictor network.
original_predictor = ibp.DNN(num_classes, layers(FLAGS.model))
predictor = original_predictor
if FLAGS.dataset == 'cifar10':
mean = (0.4914, 0.4822, 0.4465)
std = (0.2023, 0.1994, 0.2010)
predictor = ibp.add_image_normalization(original_predictor, mean, std)
if FLAGS.crown_bound_init > 0 or FLAGS.crown_bound_final > 0:
logging.info('Using CROWN-IBP loss.')
model_wrapper = ibp.crown.VerifiableModelWrapper
loss_helper = ibp.crown.create_classification_losses
else:
model_wrapper = ibp.VerifiableModelWrapper
loss_helper = ibp.create_classification_losses
predictor = model_wrapper(predictor)
# Training.
train_losses, train_loss, _ = loss_helper(
step,
data.image,
data.label,
predictor,
FLAGS.epsilon_train,
loss_weights={
'nominal': {
'init': FLAGS.nominal_xent_init,
'final': FLAGS.nominal_xent_final,
'warmup': FLAGS.verified_xent_init + FLAGS.nominal_xent_init
},
'attack': {
'init': FLAGS.attack_xent_init,
'final': FLAGS.attack_xent_final
},
'verified': {
'init': FLAGS.verified_xent_init,
'final': FLAGS.verified_xent_final,
'warmup': 0.
},
'crown_bound': {
'init': FLAGS.crown_bound_init,
'final': FLAGS.crown_bound_final,
'warmup': 0.
},
},
warmup_steps=FLAGS.warmup_steps,
rampup_steps=FLAGS.rampup_steps,
input_bounds=input_bounds)
saver = tf.train.Saver(original_predictor.get_variables())
optimizer = tf.train.AdamOptimizer(learning_rate)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
train_op = optimizer.minimize(train_loss, step)
# Test using while loop.
def get_test_metrics(batch_size, attack_builder=ibp.UntargetedPGDAttack):
"""Returns the test metrics."""
num_test_batches = len(data_test[0]) // batch_size
assert len(data_test[0]) % batch_size == 0, (
'Test data is not a multiple of batch size.')
def cond(i, *unused_args):
return i < num_test_batches
def body(i, metrics):
"""Compute the sum of all metrics."""
test_data = ibp.build_dataset(data_test, batch_size=batch_size,
sequential=True)
predictor(test_data.image, override=True, is_training=False)
input_interval_bounds = ibp.IntervalBounds(
tf.maximum(test_data.image - FLAGS.epsilon, input_bounds[0]),
tf.minimum(test_data.image + FLAGS.epsilon, input_bounds[1]))
predictor.propagate_bounds(input_interval_bounds)
test_specification = ibp.ClassificationSpecification(
test_data.label, num_classes)
test_attack = attack_builder(predictor, test_specification, FLAGS.epsilon,
input_bounds=input_bounds,
optimizer_builder=ibp.UnrolledAdam)
test_losses = ibp.Losses(predictor, test_specification, test_attack)
test_losses(test_data.label)
new_metrics = []
for m, n in zip(metrics, test_losses.scalar_metrics):
new_metrics.append(m + n)
return i + 1, new_metrics
total_count = tf.constant(0, dtype=tf.int32)
total_metrics = [tf.constant(0, dtype=tf.float32)
for _ in range(len(ibp.ScalarMetrics._fields))]
total_count, total_metrics = tf.while_loop(
cond,
body,
loop_vars=[total_count, total_metrics],
back_prop=False,
parallel_iterations=1)
total_count = tf.cast(total_count, tf.float32)
test_metrics = []
for m in total_metrics:
test_metrics.append(m / total_count)
return ibp.ScalarMetrics(*test_metrics)
test_metrics = get_test_metrics(
FLAGS.batch_size, ibp.UntargetedPGDAttack)
summaries = []
for f in test_metrics._fields:
summaries.append(
tf.summary.scalar(f, getattr(test_metrics, f)))
test_summaries = tf.summary.merge(summaries)
test_writer = tf.summary.FileWriter(os.path.join(FLAGS.output_dir, 'test'))
# Run everything.
tf_config = tf.ConfigProto()
tf_config.gpu_options.allow_growth = True
with tf.train.SingularMonitoredSession(config=tf_config) as sess:
for _ in range(FLAGS.steps):
iteration, loss_value, _ = sess.run(
[step, train_losses.scalar_losses.nominal_cross_entropy, train_op])
if iteration % FLAGS.test_every_n == 0:
metric_values, summary = sess.run([test_metrics, test_summaries])
test_writer.add_summary(summary, iteration)
show_metrics(iteration, metric_values, loss_value=loss_value)
saver.save(sess._tf_sess(), # pylint: disable=protected-access
os.path.join(FLAGS.output_dir, 'model'),
global_step=FLAGS.steps - 1)
if __name__ == '__main__':
app.run(main)
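# Illustrative sketch (not part of the original file): the input interval that
# the test loop above passes to ibp.IntervalBounds, written for a flat list of
# pixel values. `input_interval` is a hypothetical helper; the clipping range
# (0., 1.) mirrors `input_bounds` in main.

```python
def input_interval(pixels, epsilon, input_bounds=(0.0, 1.0)):
  """Returns elementwise (lower, upper) bounds of the clipped epsilon ball."""
  lower = [max(p - epsilon, input_bounds[0]) for p in pixels]
  upper = [min(p + epsilon, input_bounds[1]) for p in pixels]
  return lower, upper


lower, upper = input_interval([0.0, 0.5, 1.0], epsilon=0.25)
```

# Pixels near 0 or 1 get a narrower interval because perturbations that leave
# the valid image range are clipped away.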
================================================
FILE: interval_bound_propagation/__init__.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Library to train verifiably robust neural networks.
For more details see paper: On the Effectiveness of Interval Bound Propagation
for Training Verifiably Robust Models.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from interval_bound_propagation.src.attacks import MemoryEfficientMultiTargetedPGDAttack
from interval_bound_propagation.src.attacks import MultiTargetedPGDAttack
from interval_bound_propagation.src.attacks import pgd_attack
from interval_bound_propagation.src.attacks import RestartedAttack
from interval_bound_propagation.src.attacks import UnrolledAdam
from interval_bound_propagation.src.attacks import UnrolledFGSMDescent
from interval_bound_propagation.src.attacks import UnrolledGradientDescent
from interval_bound_propagation.src.attacks import UnrolledSPSAAdam
from interval_bound_propagation.src.attacks import UnrolledSPSAFGSMDescent
from interval_bound_propagation.src.attacks import UnrolledSPSAGradientDescent
from interval_bound_propagation.src.attacks import UntargetedAdaptivePGDAttack
from interval_bound_propagation.src.attacks import UntargetedPGDAttack
from interval_bound_propagation.src.attacks import UntargetedTop5PGDAttack
from interval_bound_propagation.src.bounds import AbstractBounds
from interval_bound_propagation.src.bounds import IntervalBounds
import interval_bound_propagation.src.crown as crown
from interval_bound_propagation.src.fastlin import RelativeSymbolicBounds
from interval_bound_propagation.src.fastlin import SymbolicBounds
import interval_bound_propagation.src.layer_utils as layer_utils
from interval_bound_propagation.src.layers import BatchNorm
from interval_bound_propagation.src.layers import ImageNorm
from interval_bound_propagation.src.loss import Losses
from interval_bound_propagation.src.loss import ScalarLosses
from interval_bound_propagation.src.loss import ScalarMetrics
from interval_bound_propagation.src.model import DNN
from interval_bound_propagation.src.model import StandardModelWrapper
from interval_bound_propagation.src.model import VerifiableModelWrapper
from interval_bound_propagation.src.relative_bounds import RelativeIntervalBounds
from interval_bound_propagation.src.simplex_bounds import SimplexBounds
from interval_bound_propagation.src.specification import ClassificationSpecification
from interval_bound_propagation.src.specification import LeastLikelyClassificationSpecification
from interval_bound_propagation.src.specification import LinearSpecification
from interval_bound_propagation.src.specification import RandomClassificationSpecification
from interval_bound_propagation.src.specification import Specification
from interval_bound_propagation.src.specification import TargetedClassificationSpecification
from interval_bound_propagation.src.utils import add_image_normalization
from interval_bound_propagation.src.utils import build_dataset
from interval_bound_propagation.src.utils import create_attack
from interval_bound_propagation.src.utils import create_classification_losses
from interval_bound_propagation.src.utils import create_specification
from interval_bound_propagation.src.utils import get_attack_builder
from interval_bound_propagation.src.utils import linear_schedule
from interval_bound_propagation.src.utils import parse_learning_rate
from interval_bound_propagation.src.utils import randomize
from interval_bound_propagation.src.utils import smooth_schedule
from interval_bound_propagation.src.verifiable_wrapper import BatchFlattenWrapper
from interval_bound_propagation.src.verifiable_wrapper import BatchNormWrapper
from interval_bound_propagation.src.verifiable_wrapper import BatchReshapeWrapper
from interval_bound_propagation.src.verifiable_wrapper import ConstWrapper
from interval_bound_propagation.src.verifiable_wrapper import ImageNormWrapper
from interval_bound_propagation.src.verifiable_wrapper import IncreasingMonotonicWrapper
from interval_bound_propagation.src.verifiable_wrapper import LinearConv1dWrapper
from interval_bound_propagation.src.verifiable_wrapper import LinearConv2dWrapper
from interval_bound_propagation.src.verifiable_wrapper import LinearConvWrapper
from interval_bound_propagation.src.verifiable_wrapper import LinearFCWrapper
from interval_bound_propagation.src.verifiable_wrapper import ModelInputWrapper
from interval_bound_propagation.src.verifiable_wrapper import PiecewiseMonotonicWrapper
from interval_bound_propagation.src.verifiable_wrapper import VerifiableWrapper
__version__ = '1.10'
================================================
FILE: interval_bound_propagation/src/__init__.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Library to train verifiably robust neural networks."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
================================================
FILE: interval_bound_propagation/src/attacks.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Utilities to define attacks."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import abc
import collections
import six
import sonnet as snt
import tensorflow.compat.v1 as tf
nest = tf.nest
@six.add_metaclass(abc.ABCMeta)
class UnrolledOptimizer(object):
"""In-graph optimizer to be used in tf.while_loop."""
def __init__(self, colocate_gradients_with_ops=False):
self._colocate_gradients_with_ops = colocate_gradients_with_ops
@abc.abstractmethod
def minimize(self, loss, x, optim_state):
"""Compute a new value of `x` to minimize `loss`.
Args:
loss: A scalar Tensor, the value to be minimized. `loss` should be a
continuous function of `x` which supports gradients, `loss = f(x)`.
x: A list of Tensors, the values to be updated. This is analogous to the
`var_list` argument in standard TF Optimizer.
optim_state: A (possibly nested) dict, containing any state info needed
for the optimizer.
Returns:
new_x: A list of Tensors, the same length as `x`, holding the updated
values.
new_optim_state: A new state, with the same structure as `optim_state`,
which has been updated.
"""
@abc.abstractmethod
def init_state(self, x):
"""Returns the initial state of the optimizer.
Args:
x: A list of Tensors, which will be optimized.
Returns:
Any structured output.
"""
class UnrolledGradientDescent(UnrolledOptimizer):
"""Vanilla gradient descent optimizer."""
_State = collections.namedtuple('State', ['iteration']) # pylint: disable=invalid-name
def __init__(self, lr=.1, lr_fn=None, fgsm=False,
colocate_gradients_with_ops=False):
super(UnrolledGradientDescent, self).__init__(
colocate_gradients_with_ops=colocate_gradients_with_ops)
self._lr_fn = (lambda i: lr) if lr_fn is None else lr_fn
self._fgsm = fgsm
def init_state(self, unused_x):
return self._State(tf.constant(0, dtype=tf.int64))
def minimize(self, loss, x, optim_state):
"""Refer to parent class documentation."""
lr = self._lr_fn(optim_state.iteration)
grads = self.gradients(loss, x)
if self._fgsm:
grads = [tf.sign(g) for g in grads]
new_x = [None] * len(x)
for i in range(len(x)):
new_x[i] = x[i] - lr * grads[i]
new_optim_state = self._State(optim_state.iteration + 1)
return new_x, new_optim_state
def gradients(self, loss, x):
return tf.gradients(
loss, x, colocate_gradients_with_ops=self._colocate_gradients_with_ops)
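# Illustrative sketch (not part of the original file): the update rule of
# UnrolledGradientDescent.minimize on plain Python floats, with the gradients
# passed in rather than computed by tf.gradients. `descent_step` and `sign` are
# hypothetical helpers.

```python
def sign(value):
  """Returns -1, 0 or 1, matching the behaviour of tf.sign on scalars."""
  return (value > 0) - (value < 0)


def descent_step(x, grads, lr=0.1, fgsm=False):
  """One descent step; with fgsm=True only the sign of each gradient is used."""
  if fgsm:
    grads = [sign(g) for g in grads]
  return [xi - lr * gi for xi, gi in zip(x, grads)]


plain = descent_step([1.0, -2.0], [4.0, -0.5], lr=0.5)
signed = descent_step([1.0, -2.0], [4.0, -0.5], lr=0.5, fgsm=True)
```

# With fgsm=True every coordinate moves by exactly `lr`, regardless of the
# gradient magnitude.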
# Syntactic sugar.
class UnrolledFGSMDescent(UnrolledGradientDescent):
"""Identical to UnrolledGradientDescent but forces FGSM steps."""
def __init__(self, lr=.1, lr_fn=None,
colocate_gradients_with_ops=False):
super(UnrolledFGSMDescent, self).__init__(
lr, lr_fn, True, colocate_gradients_with_ops)
class UnrolledAdam(UnrolledOptimizer):
"""The Adam optimizer defined in https://arxiv.org/abs/1412.6980."""
_State = collections.namedtuple('State', ['t', 'm', 'u']) # pylint: disable=invalid-name
def __init__(self, lr=0.1, lr_fn=None, beta1=0.9, beta2=0.999, epsilon=1e-9,
colocate_gradients_with_ops=False):
super(UnrolledAdam, self).__init__(
colocate_gradients_with_ops=colocate_gradients_with_ops)
self._lr_fn = (lambda i: lr) if lr_fn is None else lr_fn
self._beta1 = beta1
self._beta2 = beta2
self._epsilon = epsilon
def init_state(self, x):
return self._State(
t=tf.constant(0, dtype=tf.int64),
m=[tf.zeros_like(v) for v in x],
u=[tf.zeros_like(v) for v in x])
def _apply_gradients(self, grads, x, optim_state):
"""Applies gradients."""
lr = self._lr_fn(optim_state.t)
new_optim_state = self._State(
t=optim_state.t + 1,
m=[None] * len(x),
u=[None] * len(x))
t = tf.cast(new_optim_state.t, tf.float32)
new_x = [None] * len(x)
for i in range(len(x)):
g = grads[i]
m_old = optim_state.m[i]
u_old = optim_state.u[i]
new_optim_state.m[i] = self._beta1 * m_old + (1. - self._beta1) * g
new_optim_state.u[i] = self._beta2 * u_old + (1. - self._beta2) * g * g
m_hat = new_optim_state.m[i] / (1. - tf.pow(self._beta1, t))
u_hat = new_optim_state.u[i] / (1. - tf.pow(self._beta2, t))
new_x[i] = x[i] - lr * m_hat / (tf.sqrt(u_hat) + self._epsilon)
return new_x, new_optim_state
def minimize(self, loss, x, optim_state):
grads = self.gradients(loss, x)
return self._apply_gradients(grads, x, optim_state)
def gradients(self, loss, x):
return tf.gradients(
loss, x, colocate_gradients_with_ops=self._colocate_gradients_with_ops)
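# Illustrative sketch (not part of the original file): the arithmetic of
# UnrolledAdam._apply_gradients on plain Python floats, with the state passed
# explicitly. `adam_step` is a hypothetical helper; the default hyperparameters
# copy those of UnrolledAdam.__init__.

```python
import math


def adam_step(x, grads, m, u, t, lr=0.1, beta1=0.9, beta2=0.999,
              epsilon=1e-9):
  """Returns (new_x, new_m, new_u, new_t) after one bias-corrected Adam step."""
  t = t + 1
  new_x, new_m, new_u = [], [], []
  for xi, g, mi, ui in zip(x, grads, m, u):
    mi = beta1 * mi + (1.0 - beta1) * g        # First-moment estimate.
    ui = beta2 * ui + (1.0 - beta2) * g * g    # Second-moment estimate.
    m_hat = mi / (1.0 - beta1 ** t)            # Bias correction.
    u_hat = ui / (1.0 - beta2 ** t)
    new_x.append(xi - lr * m_hat / (math.sqrt(u_hat) + epsilon))
    new_m.append(mi)
    new_u.append(ui)
  return new_x, new_m, new_u, t


# Starting from zero moments, as in UnrolledAdam.init_state.
new_x, new_m, new_u, new_t = adam_step([1.0], [3.0], m=[0.0], u=[0.0], t=0)
```

# On the first step the bias-corrected ratio is close to sign(g), so the move
# has magnitude close to `lr` whatever the gradient scale.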
def _spsa_gradients(loss_fn, x, delta=0.01, num_samples=16, num_iterations=4):
"""Compute gradient estimates using SPSA.
Args:
loss_fn: Callable that takes a single argument of shape [batch_size, ...]
and returns the loss contribution of each element of the batch as a
tensor of shape [batch_size].
x: List of tensors with a single element. We only support computation of
the gradient of the loss with respect to x[0]. We take a list as input to
keep the same API call as tf.gradients.
delta: Perturbation size. The gradient is estimated from the losses
evaluated at x - delta and x + delta.
num_samples: The total number of random samples used to compute the gradient
is `num_samples` times `num_iterations`. `num_samples` contributes to the
gradient by tiling `x` `num_samples` times.
num_iterations: The total number of random samples used to compute the
gradient is `num_samples` times `num_iterations`. `num_iterations`
contributes to the gradient by iterating using a `tf.while_loop`.
Returns:
List of tensors with a single element corresponding to the gradient of
loss_fn(x[0]) with respect to x[0].
"""
if len(x) != 1:
raise NotImplementedError('SPSA gradients with respect to multiple '
'variables is not supported.')
# loss_fn takes a single argument.
tensor = x[0]
def _get_delta(x):
return delta * tf.sign(
tf.random_uniform(tf.shape(x), minval=-1., maxval=1., dtype=x.dtype))
# Process batch_size samples at a time.
def cond(i, *_):
return tf.less(i, num_iterations)
def loop_body(i, total_grad):
"""Compute gradient estimate."""
batch_size = tf.shape(tensor)[0]
# The tiled tensor has shape [num_samples, batch_size, ...]
tiled_tensor = tf.expand_dims(tensor, axis=0)
tiled_tensor = tf.tile(tiled_tensor,
[num_samples] + [1] * len(tensor.shape))
# The tiled tensor now has shape [2, num_samples, batch_size, ...].
delta = _get_delta(tiled_tensor)
tiled_tensor = tf.stack(
[tiled_tensor + delta, tiled_tensor - delta], axis=0)
# Compute loss with shape [2, num_samples, batch_size].
losses = loss_fn(
tf.reshape(tiled_tensor,
[2 * num_samples, batch_size] + tensor.shape.as_list()[1:]))
losses = tf.reshape(losses, [2, num_samples, batch_size])
# Compute approximate gradient using broadcasting.
shape = losses.shape.as_list() + [1] * (len(tensor.shape) - 1)
shape = [(s or -1) for s in shape] # Remove None.
losses = tf.reshape(losses, shape)
g = tf.reduce_mean((losses[0] - losses[1]) / (2. * delta), axis=0)
return [i + 1, g / num_iterations + total_grad]
_, g = tf.while_loop(
cond,
loop_body,
loop_vars=[tf.constant(0.), tf.zeros_like(tensor)],
parallel_iterations=1,
back_prop=False)
return [g]
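# Illustrative sketch, not part of the library: the two-sided SPSA estimate
# used by `_spsa_gradients`, in plain Python for a single input given as a list
# of floats. `spsa_gradient` and its `rng` argument are hypothetical names
# introduced for this example.
```python
import random


def spsa_gradient(loss_fn, x, delta=0.01, num_samples=16, rng=None):
  """Estimates the gradient of loss_fn at x (a list of floats) using SPSA."""
  rng = rng or random.Random(0)
  grad = [0.] * len(x)
  for _ in range(num_samples):
    # Random sign perturbation: every coordinate moves by +delta or -delta.
    d = [delta * rng.choice((-1., 1.)) for _ in x]
    plus = loss_fn([xi + di for xi, di in zip(x, d)])
    minus = loss_fn([xi - di for xi, di in zip(x, d)])
    # The same loss difference is shared by all coordinates; only the
    # per-coordinate divisor changes.
    for k, dk in enumerate(d):
      grad[k] += (plus - minus) / (2. * dk) / num_samples
  return grad


# In one dimension the estimate reduces to a central finite difference.
estimate = spsa_gradient(lambda v: v[0] ** 2, [3.0])
```
# With several coordinates each sample is noisy, since the other coordinates'
# perturbations leak into the estimate, which is why the results are averaged
# over `num_samples` draws.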
@six.add_metaclass(abc.ABCMeta)
class UnrolledSPSA(object):
"""Abstract class that represents an optimizer based on SPSA."""
class UnrolledSPSAGradientDescent(UnrolledGradientDescent, UnrolledSPSA):
"""Optimizer for gradient-free attacks in https://arxiv.org/abs/1802.05666.
Gradient estimates are computed using Simultaneous Perturbation Stochastic
Approximation (SPSA).
"""
def __init__(self, lr=0.1, lr_fn=None, fgsm=False,
colocate_gradients_with_ops=False, delta=0.01, num_samples=32,
num_iterations=4, loss_fn=None):
super(UnrolledSPSAGradientDescent, self).__init__(
lr, lr_fn, fgsm, colocate_gradients_with_ops)
assert num_samples % 2 == 0, 'Number of samples must be even'
self._delta = delta
self._num_samples = num_samples // 2 # Since we mirror +/- delta later.
self._num_iterations = num_iterations
assert loss_fn is not None, 'loss_fn must be specified.'
self._loss_fn = loss_fn
def gradients(self, loss, x):
return _spsa_gradients(self._loss_fn, x, self._delta, self._num_samples,
self._num_iterations)
# Syntactic sugar.
class UnrolledSPSAFGSMDescent(UnrolledSPSAGradientDescent):
"""Identical to UnrolledSPSAGradientDescent but forces FGSM steps."""
def __init__(self, lr=.1, lr_fn=None,
colocate_gradients_with_ops=False, delta=0.01, num_samples=32,
num_iterations=4, loss_fn=None):
super(UnrolledSPSAFGSMDescent, self).__init__(
lr, lr_fn, True, colocate_gradients_with_ops, delta, num_samples,
num_iterations, loss_fn)
class UnrolledSPSAAdam(UnrolledAdam, UnrolledSPSA):
"""Optimizer for gradient-free attacks in https://arxiv.org/abs/1802.05666.
Gradient estimates are computed using Simultaneous Perturbation Stochastic
Approximation (SPSA), combined with the Adam update rule.
"""
def __init__(self, lr=0.1, lr_fn=None, beta1=0.9, beta2=0.999, epsilon=1e-9,
colocate_gradients_with_ops=False, delta=0.01, num_samples=32,
num_iterations=4, loss_fn=None):
super(UnrolledSPSAAdam, self).__init__(lr, lr_fn, beta1, beta2, epsilon,
colocate_gradients_with_ops)
assert num_samples % 2 == 0, 'Number of samples must be even'
self._delta = delta
self._num_samples = num_samples // 2 # Since we mirror +/- delta later.
self._num_iterations = num_iterations
assert loss_fn is not None, 'loss_fn must be specified.'
self._loss_fn = loss_fn
def gradients(self, loss, x):
return _spsa_gradients(self._loss_fn, x, self._delta, self._num_samples,
self._num_iterations)
def _is_spsa_optimizer(cls):
return issubclass(cls, UnrolledSPSA)
def wrap_optimizer(cls, **default_kwargs):
"""Wraps an optimizer such that __init__ uses the specified kwargs."""
class WrapperUnrolledOptimizer(cls):
def __init__(self, *args, **kwargs):
new_kwargs = default_kwargs.copy()
new_kwargs.update(kwargs)
super(WrapperUnrolledOptimizer, self).__init__(*args, **new_kwargs)
return WrapperUnrolledOptimizer
def _project_perturbation(perturbation, epsilon, input_image, image_bounds):
"""Project `perturbation` onto L-infinity ball of radius `epsilon`."""
clipped_perturbation = tf.clip_by_value(perturbation, -epsilon, epsilon)
new_image = tf.clip_by_value(input_image + clipped_perturbation,
image_bounds[0], image_bounds[1])
return new_image - input_image
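# Illustrative sketch, not part of the library: the two clipping steps of
# `_project_perturbation` on plain Python lists. The function name
# `project_perturbation_sketch` and the sample values are assumptions made for
# this example.
```python
def project_perturbation_sketch(perturbation, epsilon, image, bounds=(0., 1.)):
  """Clips to the L-infinity ball, then to the valid pixel range."""
  projected = []
  for p, x in zip(perturbation, image):
    # Step 1: keep the perturbation within [-epsilon, epsilon].
    p = max(-epsilon, min(epsilon, p))
    # Step 2: keep the perturbed pixel within the image bounds.
    new_x = max(bounds[0], min(bounds[1], x + p))
    # Return the perturbation that is actually applied.
    projected.append(new_x - x)
  return projected


result = project_perturbation_sketch(
    [0.5, -0.5, 0.05], 0.1, [0.95, 0.5, 0.5])
```
# The first pixel shows why both steps matter: the perturbation is first cut
# to epsilon and then shrunk again so that the pixel does not exceed 1.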
def pgd_attack(loss_fn, input_image, epsilon, num_steps,
optimizer=UnrolledGradientDescent(),
project_perturbation=_project_perturbation,
image_bounds=None, random_init=1.):
"""Projected gradient descent for generating adversarial images.
Args:
loss_fn: A callable which takes a batch of (perturbed) images as its single
argument and returns the loss, a scalar Tensor, which will be minimized.
input_image: Tensor, a batch of images
epsilon: float, the L-infinity norm of the maximum allowable perturbation
num_steps: int, the number of steps of gradient descent
optimizer: An `UnrolledOptimizer` object
project_perturbation: A function, which will be used to enforce some
constraint. It should have the same signature as `_project_perturbation`.
Note that if you use a custom projection function, you should double-check
your implementation, since an incorrect implementation will not error,
and will appear to work fine.
image_bounds: A pair of floats: minimum and maximum pixel value. If None
(default), the bounds are assumed to be 0 and 1.
random_init: Probability of starting from random location rather than
nominal input image.
Returns:
Adversarial version of `input_image`, with L-infinity difference at most
epsilon, which tries to minimize loss_fn.
"""
image_bounds = image_bounds or (0., 1.)
random_shape = [tf.shape(input_image)[0]] + [1] * (len(input_image.shape) - 1)
use_random_init = tf.cast(
tf.random_uniform(random_shape) < float(random_init), tf.float32)
init_perturbation = use_random_init * tf.random_uniform(
tf.shape(input_image), minval=-epsilon, maxval=epsilon)
init_perturbation = project_perturbation(init_perturbation,
epsilon, input_image, image_bounds)
init_optim_state = optimizer.init_state([init_perturbation])
def loop_body(i, perturbation, flat_optim_state):
"""Update perturbation to input image."""
optim_state = nest.pack_sequence_as(structure=init_optim_state,
flat_sequence=flat_optim_state)
loss = loss_fn(input_image + perturbation)
new_perturbation_list, new_optim_state = optimizer.minimize(
loss, [perturbation], optim_state)
projected_perturbation = project_perturbation(
new_perturbation_list[0], epsilon, input_image, image_bounds)
return i + 1, projected_perturbation, nest.flatten(new_optim_state)
def cond(i, *_):
return tf.less(i, num_steps)
flat_init_optim_state = nest.flatten(init_optim_state)
_, final_perturbation, _ = tf.while_loop(
cond,
loop_body,
loop_vars=[tf.constant(0.), init_perturbation, flat_init_optim_state],
parallel_iterations=1,
back_prop=False)
adversarial_image = input_image + final_perturbation
return tf.stop_gradient(adversarial_image)
@six.add_metaclass(abc.ABCMeta)
class Attack(snt.AbstractModule):
"""Defines an attack as a Sonnet module."""
def __init__(self, predictor, specification, name, predictor_kwargs=None):
super(Attack, self).__init__(name=name)
self._predictor = predictor
self._specification = specification
if predictor_kwargs is None:
self._kwargs = {'intermediate': {}, 'final': {}}
else:
self._kwargs = predictor_kwargs
self._forced_mode = None
self._target_class = None
def _eval_fn(self, x, mode='intermediate'):
"""Computes the logits corresponding to `x`.
Args:
x: input to the predictor network.
mode: Either "intermediate" or "final". Selects the desired predictor
arguments.
Returns:
Tensor of logits.
"""
if self._forced_mode is not None:
mode = self._forced_mode
return self._predictor(x, **self._kwargs[mode])
@abc.abstractmethod
def _build(self, inputs, labels):
"""Returns the adversarial attack around inputs."""
@abc.abstractproperty
def logits(self):
"""Returns the logits corresponding to the best attack."""
@abc.abstractproperty
def attack(self):
"""Returns the best attack."""
@abc.abstractproperty
def success(self):
"""Returns whether the attack was successful."""
def force_mode(self, mode):
"""Only used by RestartedAttack to force the evaluation mode."""
self._forced_mode = mode
@property
def target_class(self):
"""Returns the target class if this attack is a targeted attack."""
return self._target_class
@target_class.setter
def target_class(self, t):
self._target_class = t
@six.add_metaclass(abc.ABCMeta)
class PGDAttack(Attack):
"""Defines a PGD attack."""
def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
num_steps=20, num_restarts=1, input_bounds=(0., 1.),
random_init=1., optimizer_builder=UnrolledGradientDescent,
project_perturbation=_project_perturbation,
predictor_kwargs=None):
super(PGDAttack, self).__init__(predictor, specification, name='pgd',
predictor_kwargs=predictor_kwargs)
self._num_steps = num_steps
self._num_restarts = num_restarts
self._epsilon = epsilon
self._lr = lr
self._lr_fn = lr_fn
self._input_bounds = input_bounds
self._random_init = random_init
self._optimizer_builder = optimizer_builder
self._project_perturbation = project_perturbation
# Helper functions.
def prepare_inputs(self, inputs):
"""Tiles inputs according to number of restarts."""
batch_size = tf.shape(inputs)[0]
input_shape = list(inputs.shape.as_list()[1:])
duplicated_inputs = tf.expand_dims(inputs, axis=0)
# Shape is [num_restarts, batch_size, ...]
duplicated_inputs = tf.tile(
duplicated_inputs,
[self._num_restarts, 1] + [1] * len(input_shape))
# Shape is [num_restarts * batch_size, ...]
duplicated_inputs = tf.reshape(
duplicated_inputs, [self._num_restarts * batch_size] + input_shape)
return batch_size, input_shape, duplicated_inputs
def prepare_labels(self, labels):
"""Tiles labels according to number of restarts."""
return tf.tile(labels, [self._num_restarts])
def find_worst_attack(self, objective_fn, adversarial_input, batch_size,
input_shape):
"""Returns the attack that maximizes objective_fn."""
adversarial_objective = objective_fn(adversarial_input)
adversarial_objective = tf.reshape(adversarial_objective, [-1, batch_size])
adversarial_input = tf.reshape(adversarial_input,
[-1, batch_size] + input_shape)
i = tf.argmax(adversarial_objective, axis=0)
j = tf.cast(tf.range(tf.shape(adversarial_objective)[1]), i.dtype)
ij = tf.stack([i, j], axis=1)
return tf.gather_nd(adversarial_input, ij)
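# Illustrative sketch, not part of the library: what `find_worst_attack`
# computes, on nested Python lists. For each batch element it keeps the restart
# with the largest objective. `pick_worst_attack` and the sample data are
# hypothetical.
```python
def pick_worst_attack(objective, attacks):
  """Selects, per batch element, the attack with the highest objective.

  Both arguments are indexed as [restart][batch_element].
  """
  num_restarts, batch_size = len(objective), len(objective[0])
  best = []
  for b in range(batch_size):
    # Equivalent to an argmax over the restart axis.
    r = max(range(num_restarts), key=lambda r: objective[r][b])
    best.append(attacks[r][b])
  return best


objective = [[0.1, 0.9],   # Restart 0.
             [0.5, 0.2]]   # Restart 1.
attacks = [['a0', 'b0'],
           ['a1', 'b1']]
chosen = pick_worst_attack(objective, attacks)
```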
def _maximize_margin(bounds):
# Bounds has shape [num_restarts, batch_size, num_specs].
return tf.reduce_max(bounds, axis=-1)
def _any_greater(bounds):
# Bounds has shape [batch_size, num_specs].
bounds = tf.reduce_max(bounds, axis=-1)
return bounds > 0.
def _maximize_topk_hinge_margin(bounds, k=5, margin=.1):
# Bounds has shape [num_restarts, batch_size, num_specs].
b = tf.nn.top_k(bounds, k=k, sorted=False).values
return tf.reduce_sum(tf.minimum(b, margin), axis=-1)
def _topk_greater(bounds, k=5):
# Bounds has shape [batch_size, num_specs].
b = tf.nn.top_k(bounds, k=k, sorted=False).values
return tf.reduce_min(b, axis=-1) > 0.
class UntargetedPGDAttack(PGDAttack):
"""Defines an untargeted PGD attack."""
def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
num_steps=20, num_restarts=1, input_bounds=(0., 1.),
random_init=1., optimizer_builder=UnrolledGradientDescent,
project_perturbation=_project_perturbation,
objective_fn=_maximize_margin, success_fn=_any_greater,
predictor_kwargs=None):
super(UntargetedPGDAttack, self).__init__(
predictor, specification, epsilon, lr, lr_fn, num_steps, num_restarts,
input_bounds, random_init, optimizer_builder, project_perturbation,
predictor_kwargs)
self._objective_fn = objective_fn
self._success_fn = success_fn
def _build(self, inputs, labels):
batch_size, input_shape, duplicated_inputs = self.prepare_inputs(inputs)
duplicated_labels = self.prepare_labels(labels)
# Define objectives.
def objective_fn(x):
model_logits = self._eval_fn(x) # [restarts * batch_size, output].
model_logits = tf.reshape(
model_logits, [self._num_restarts, batch_size, -1])
bounds = self._specification.evaluate(model_logits)
# Output has dimension [num_restarts, batch_size].
return self._objective_fn(bounds)
# Only used for SPSA.
# The input to this loss is the perturbation (not the image).
# The first dimension corresponds to the number of SPSA samples.
# Shape of perturbations is [num_samples, restarts * batch_size, ...]
def spsa_loss_fn(perturbation):
"""Computes the loss per SPSA sample."""
x = tf.reshape(
perturbation + tf.expand_dims(duplicated_inputs, axis=0),
[-1] + duplicated_inputs.shape.as_list()[1:])
model_logits = self._eval_fn(x)
num_outputs = tf.shape(model_logits)[1]
model_logits = tf.reshape(
model_logits, [-1, batch_size, num_outputs])
bounds = self._specification.evaluate(model_logits)
losses = -self._objective_fn(bounds)
return tf.reshape(losses, [-1])
def reduced_loss_fn(x):
# Negate as we minimize; objective_fn(x) has shape [num_restarts, batch_size].
return -tf.reduce_sum(objective_fn(x))
# SPSA optimizers additionally need a loss function to estimate gradients.
if _is_spsa_optimizer(self._optimizer_builder):
optimizer = self._optimizer_builder(lr=self._lr, lr_fn=self._lr_fn,
loss_fn=spsa_loss_fn)
else:
optimizer = self._optimizer_builder(lr=self._lr, lr_fn=self._lr_fn)
adversarial_input = pgd_attack(
reduced_loss_fn, duplicated_inputs, epsilon=self._epsilon,
num_steps=self._num_steps, image_bounds=self._input_bounds,
random_init=self._random_init, optimizer=optimizer,
project_perturbation=self._project_perturbation)
adversarial_input = self.adapt(duplicated_inputs, adversarial_input,
duplicated_labels)
self._attack = self.find_worst_attack(objective_fn, adversarial_input,
batch_size, input_shape)
self._logits = self._eval_fn(self._attack, mode='final')
self._success = self._success_fn(self._specification.evaluate(self._logits))
return self._attack
@property
def logits(self):
self._ensure_is_connected()
return self._logits
@property
def attack(self):
self._ensure_is_connected()
return self._attack
@property
def success(self):
self._ensure_is_connected()
return self._success
def adapt(self, original_inputs, adversarial_inputs, labels):
"""Function called after PGD to adapt adversarial examples."""
return adversarial_inputs
class UntargetedTop5PGDAttack(UntargetedPGDAttack):
"""Defines an untargeted PGD attack on top-5."""
def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
num_steps=20, num_restarts=1, input_bounds=(0., 1.),
random_init=1., optimizer_builder=UnrolledGradientDescent,
project_perturbation=_project_perturbation,
objective_fn=_maximize_topk_hinge_margin, predictor_kwargs=None):
super(UntargetedTop5PGDAttack, self).__init__(
predictor, specification, epsilon, lr=lr, lr_fn=lr_fn,
num_steps=num_steps, num_restarts=num_restarts,
input_bounds=input_bounds, random_init=random_init,
optimizer_builder=optimizer_builder,
project_perturbation=project_perturbation, objective_fn=objective_fn,
success_fn=_topk_greater, predictor_kwargs=predictor_kwargs)
class UntargetedAdaptivePGDAttack(UntargetedPGDAttack):
"""Uses an adaptive scheme to pick attacks that are just strong enough."""
def adapt(self, original_inputs, adversarial_inputs, labels):
"""Runs binary search to find the first misclassified input."""
batch_size = tf.shape(original_inputs)[0]
binary_search_iterations = 10
def cond(i, *_):
return tf.less(i, binary_search_iterations)
def get(m):
m = tf.reshape(m, [batch_size] + [1] * (len(original_inputs.shape) - 1))
return (adversarial_inputs - original_inputs) * m + original_inputs
def is_attack_successful(m):
logits = self._eval_fn(get(m))
return self._success_fn(self._specification.evaluate(logits))
def loop_body(i, lower, upper):
m = (lower + upper) * .5
success = is_attack_successful(m)
new_lower = tf.where(success, lower, m)
new_upper = tf.where(success, m, upper)
return i + 1, new_lower, new_upper
lower = tf.zeros(shape=[batch_size])
upper = tf.ones(shape=[batch_size])
_, lower, upper = tf.while_loop(
cond,
loop_body,
loop_vars=[tf.constant(0.), lower, upper],
parallel_iterations=1,
back_prop=False)
# If lower is incorrectly classified, pick lower; otherwise pick upper.
success = is_attack_successful(lower)
return get(tf.where(success, lower, upper))
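# Illustrative sketch, not part of the library: the bisection performed by
# `UntargetedAdaptivePGDAttack.adapt`, for a single example in plain Python.
# It assumes that success is monotone in the interpolation factor, which the
# original code does not guarantee. `smallest_successful_scale` is a
# hypothetical name.
```python
def smallest_successful_scale(is_successful, iterations=10):
  """Bisects [0, 1] for the smallest factor at which the attack succeeds."""
  lower, upper = 0., 1.
  for _ in range(iterations):
    m = (lower + upper) * .5
    if is_successful(m):
      upper = m  # The attack still works, so try a weaker one.
    else:
      lower = m  # Too weak, so move towards the full perturbation.
  # Mirrors the final selection: prefer lower if it already succeeds.
  return lower if is_successful(lower) else upper


# Toy predicate: the attack succeeds once 30% of the perturbation is applied.
scale = smallest_successful_scale(lambda m: m >= 0.3)
never = smallest_successful_scale(lambda m: False)
```
# If the attack never succeeds, the search falls back to the full perturbation
# (factor 1).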
class MultiTargetedPGDAttack(PGDAttack):
"""Runs targeted attacks for each specification."""
def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
num_steps=20, num_restarts=1, input_bounds=(0., 1.),
random_init=1., optimizer_builder=UnrolledGradientDescent,
project_perturbation=_project_perturbation,
max_specifications=0, random_specifications=False,
predictor_kwargs=None):
super(MultiTargetedPGDAttack, self).__init__(
predictor, specification, epsilon, lr=lr, lr_fn=lr_fn,
num_steps=num_steps, num_restarts=num_restarts,
input_bounds=input_bounds, random_init=random_init,
optimizer_builder=optimizer_builder,
project_perturbation=project_perturbation,
predictor_kwargs=predictor_kwargs)
self._max_specifications = max_specifications
self._random_specifications = random_specifications
def _build(self, inputs, labels):
batch_size = tf.shape(inputs)[0]
num_specs = self._specification.num_specifications
if self._max_specifications > 0 and self._max_specifications < num_specs:
model_logits = self._eval_fn(inputs)
bounds = self._specification.evaluate(model_logits)
_, idx = tf.math.top_k(bounds, k=self._max_specifications, sorted=False)
if self._random_specifications:
idx = tf.random.uniform(shape=tf.shape(idx),
maxval=self._specification.num_specifications,
dtype=idx.dtype)
idx = tf.tile(tf.expand_dims(idx, 0), [self._num_restarts, 1, 1])
select_fn = lambda x: tf.gather(x, idx, batch_dims=len(idx.shape) - 1)
else:
select_fn = lambda x: x
input_shape = list(inputs.shape.as_list()[1:])
duplicated_inputs = tf.expand_dims(inputs, axis=0)
# Shape is [num_restarts * num_specifications, batch_size, ...]
duplicated_inputs = tf.tile(
duplicated_inputs,
[self._num_restarts * num_specs, 1] + [1] * len(input_shape))
# Shape is [num_restarts * num_specifications * batch_size, ...]
duplicated_inputs = tf.reshape(duplicated_inputs, [-1] + input_shape)
def objective_fn(x):
# Output has shape [restarts * num_specs * batch_size, output].
model_logits = self._eval_fn(x)
model_logits = tf.reshape(
model_logits, [self._num_restarts, num_specs, batch_size, -1])
# Output has shape [num_restarts, batch_size, num_specs].
return self._specification.evaluate(model_logits)
def reduced_loss_fn(x):
# Negate as we minimize.
return -tf.reduce_sum(select_fn(objective_fn(x)))
# Use targeted attacks as specified by the specification.
if _is_spsa_optimizer(self._optimizer_builder):
raise ValueError('"UnrolledSPSA*" unsupported in '
'MultiTargetedPGDAttack')
optimizer = self._optimizer_builder(lr=self._lr, lr_fn=self._lr_fn)
adversarial_input = pgd_attack(
reduced_loss_fn, duplicated_inputs,
epsilon=self._epsilon, num_steps=self._num_steps,
image_bounds=self._input_bounds, random_init=self._random_init,
optimizer=optimizer, project_perturbation=self._project_perturbation)
# Get best attack.
adversarial_objective = objective_fn(adversarial_input)
adversarial_objective = tf.transpose(adversarial_objective, [0, 2, 1])
adversarial_objective = tf.reshape(adversarial_objective, [-1, batch_size])
adversarial_input = tf.reshape(adversarial_input,
[-1, batch_size] + input_shape)
i = tf.argmax(adversarial_objective, axis=0)
j = tf.cast(tf.range(tf.shape(adversarial_objective)[1]), i.dtype)
ij = tf.stack([i, j], axis=1)
self._attack = tf.gather_nd(adversarial_input, ij)
self._logits = self._eval_fn(self._attack, mode='final')
# Flag the samples that violate any specification.
bounds = tf.reduce_max(self._specification.evaluate(self._logits), axis=1)
self._success = (bounds > 0.)
return self._attack
@property
def logits(self):
self._ensure_is_connected()
return self._logits
@property
def attack(self):
self._ensure_is_connected()
return self._attack
@property
def success(self):
self._ensure_is_connected()
return self._success
class MemoryEfficientMultiTargetedPGDAttack(PGDAttack):
"""Defines a targeted PGD attack for each specification using while_loop."""
def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
num_steps=20, num_restarts=1, input_bounds=(0., 1.),
random_init=1., optimizer_builder=UnrolledGradientDescent,
project_perturbation=_project_perturbation,
max_specifications=0, random_specifications=False,
predictor_kwargs=None):
super(MemoryEfficientMultiTargetedPGDAttack, self).__init__(
predictor, specification, epsilon, lr=lr, lr_fn=lr_fn,
num_steps=num_steps, num_restarts=num_restarts,
input_bounds=input_bounds, random_init=random_init,
optimizer_builder=optimizer_builder,
project_perturbation=project_perturbation,
predictor_kwargs=predictor_kwargs)
self._max_specifications = max_specifications
self._random_specifications = random_specifications
def _build(self, inputs, labels):
batch_size, input_shape, duplicated_inputs = self.prepare_inputs(inputs)
if (self._max_specifications > 0 and
self._max_specifications < self._specification.num_specifications):
num_specs = self._max_specifications
model_logits = self._eval_fn(inputs)
bounds = self._specification.evaluate(model_logits)
_, idx = tf.math.top_k(bounds, k=num_specs, sorted=False)
if self._random_specifications:
idx = tf.random.uniform(shape=tf.shape(idx),
maxval=self._specification.num_specifications,
dtype=idx.dtype)
idx = tf.tile(tf.expand_dims(idx, 0), [self._num_restarts, 1, 1])
def select_fn(x, i):
return tf.squeeze(
tf.gather(x, tf.expand_dims(idx[:, :, i], -1),
batch_dims=len(idx.shape) - 1),
axis=-1)
else:
num_specs = self._specification.num_specifications
select_fn = lambda x, i: x[:, :, i]
def objective_fn(x):
model_logits = self._eval_fn(x) # [restarts * batch_size, output].
model_logits = tf.reshape(
model_logits, [self._num_restarts, batch_size, -1])
# Output has dimension [num_restarts, batch_size, num_specifications].
return self._specification.evaluate(model_logits)
def flat_objective_fn(x):
return _maximize_margin(objective_fn(x))
def build_loss_fn(idx):
def _reduced_loss_fn(x):
# Negate as we minimize; the selected objective has shape
# [num_restarts, batch_size].
return -tf.reduce_sum(select_fn(objective_fn(x), idx))
return _reduced_loss_fn
if _is_spsa_optimizer(self._optimizer_builder):
raise ValueError('"UnrolledSPSA*" unsupported in '
'MemoryEfficientMultiTargetedPGDAttack')
optimizer = self._optimizer_builder(lr=self._lr, lr_fn=self._lr_fn)
# Run a separate PGD attack for each specification.
def cond(spec_idx, unused_attack, success):
# If we are already successful, we break.
return tf.logical_and(spec_idx < num_specs,
tf.logical_not(tf.reduce_all(success)))
def body(spec_idx, attack, success):
"""Runs a separate PGD attack for each specification."""
adversarial_input = pgd_attack(
build_loss_fn(spec_idx), duplicated_inputs,
epsilon=self._epsilon, num_steps=self._num_steps,
image_bounds=self._input_bounds, random_init=self._random_init,
optimizer=optimizer, project_perturbation=self._project_perturbation)
new_attack = self.find_worst_attack(flat_objective_fn, adversarial_input,
batch_size, input_shape)
new_logits = self._eval_fn(new_attack)
# Flag the samples that violate any specification.
new_success = _any_greater(self._specification.evaluate(new_logits))
# The first iteration always sets the attack and logits.
use_new_values = tf.logical_or(tf.equal(spec_idx, 0), new_success)
print_op = tf.print('Processed specification #', spec_idx)
with tf.control_dependencies([print_op]):
new_spec_idx = spec_idx + 1
return (new_spec_idx,
tf.where(use_new_values, new_attack, attack),
tf.logical_or(success, new_success))
_, self._attack, self._success = tf.while_loop(
cond, body, back_prop=False, parallel_iterations=1,
loop_vars=[
tf.constant(0, dtype=tf.int32),
inputs,
tf.zeros([tf.shape(inputs)[0]], dtype=tf.bool),
])
self._logits = self._eval_fn(self._attack, mode='final')
return self._attack
@property
def logits(self):
self._ensure_is_connected()
return self._logits
@property
def attack(self):
self._ensure_is_connected()
return self._attack
@property
def success(self):
self._ensure_is_connected()
return self._success
class RestartedAttack(Attack):
"""Wraps an attack to run it multiple times using a tf.while_loop."""
def __init__(self, inner_attack, num_restarts=1):
super(RestartedAttack, self).__init__(
inner_attack._predictor, # pylint: disable=protected-access
inner_attack._specification, # pylint: disable=protected-access
name='restarted_' + inner_attack.module_name,
predictor_kwargs=inner_attack._kwargs) # pylint: disable=protected-access
self._inner_attack = inner_attack
self._num_restarts = num_restarts
# Prevent the inner attack from updating batch normalization statistics.
self._inner_attack.force_mode('intermediate')
def _build(self, inputs, labels):
def cond(i, unused_attack, success):
# If we are already successful, we break.
return tf.logical_and(i < self._num_restarts,
tf.logical_not(tf.reduce_all(success)))
def body(i, attack, success):
new_attack = self._inner_attack(inputs, labels)
new_success = self._inner_attack.success
# The first iteration always sets the attack.
use_new_values = tf.logical_or(tf.equal(i, 0), new_success)
return (i + 1,
tf.where(use_new_values, new_attack, attack),
tf.logical_or(success, new_success))
_, self._attack, self._success = tf.while_loop(
cond, body, back_prop=False, parallel_iterations=1,
loop_vars=[
tf.constant(0, dtype=tf.int32),
inputs,
tf.zeros([tf.shape(inputs)[0]], dtype=tf.bool),
])
self._logits = self._eval_fn(self._attack, mode='final')
return self._attack
@property
def logits(self):
self._ensure_is_connected()
return self._logits
@property
def attack(self):
self._ensure_is_connected()
return self._attack
@property
def success(self):
self._ensure_is_connected()
return self._success
================================================
FILE: interval_bound_propagation/src/bounds.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Definition of input bounds to each layer."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import abc
import itertools
import six
import sonnet as snt
import tensorflow.compat.v1 as tf
@six.add_metaclass(abc.ABCMeta)
class AbstractBounds(object):
"""Abstract bounds class."""
def __init__(self):
self._update_cache_op = None
@classmethod
@abc.abstractmethod
def convert(cls, bounds):
"""Converts another bound type to this type."""
@abc.abstractproperty
def shape(self):
"""Returns shape (as list) of the tensor, including batch dimension."""
def concretize(self):
return self
def _raise_not_implemented(self, name):
raise NotImplementedError(
'{} modules are not supported by "{}".'.format(
name, self.__class__.__name__))
def apply_linear(self, wrapper, w, b): # pylint: disable=unused-argument
self._raise_not_implemented('snt.Linear')
def apply_conv1d(self, wrapper, w, b, padding, stride): # pylint: disable=unused-argument
self._raise_not_implemented('snt.Conv1D')
def apply_conv2d(self, wrapper, w, b, padding, strides): # pylint: disable=unused-argument
self._raise_not_implemented('snt.Conv2D')
def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **parameters): # pylint: disable=unused-argument
self._raise_not_implemented(fn.__name__)
def apply_piecewise_monotonic_fn(self, wrapper, fn, boundaries, *args): # pylint: disable=unused-argument
self._raise_not_implemented(fn.__name__)
def apply_batch_norm(self, wrapper, mean, variance, scale, bias, epsilon): # pylint: disable=unused-argument
self._raise_not_implemented('ibp.BatchNorm')
def apply_batch_reshape(self, wrapper, shape): # pylint: disable=unused-argument
self._raise_not_implemented('snt.BatchReshape')
def apply_softmax(self, wrapper): # pylint: disable=unused-argument
self._raise_not_implemented('tf.nn.softmax')
@property
def update_cache_op(self):
"""TF op to update cached bounds for re-use across session.run calls."""
if self._update_cache_op is None:
raise ValueError('Bounds not cached: enable_caching() not called.')
return self._update_cache_op
def enable_caching(self):
"""Enables caching the bounds for re-use across session.run calls."""
if self._update_cache_op is not None:
raise ValueError('Bounds already cached: enable_caching() called twice.')
self._update_cache_op = self._set_up_cache()
def _set_up_cache(self):
"""Replace fields with cached versions.
Returns:
TensorFlow op to update the cache.
"""
return tf.no_op() # By default, don't cache.
def _cache_with_update_op(self, tensor):
"""Creates non-trainable variable to cache the tensor across sess.run calls.
Args:
tensor: Tensor to cache.
Returns:
cached_tensor: Non-trainable variable to contain the cached value
of `tensor`.
update_op: TensorFlow op to re-evaluate `tensor` and assign the result
to `cached_tensor`.
"""
cached_tensor = tf.get_variable(
tensor.name.replace(':', '__') + '_ibp_cache',
shape=tensor.shape, dtype=tensor.dtype, trainable=False)
update_op = tf.assign(cached_tensor, tensor)
return cached_tensor, update_op


class IntervalBounds(AbstractBounds):
  """Axis-aligned bounding box."""

  def __init__(self, lower, upper):
    super(IntervalBounds, self).__init__()
    self._lower = lower
    self._upper = upper

  @property
  def lower(self):
    return self._lower

  @property
  def upper(self):
    return self._upper

  @property
  def shape(self):
    return self.lower.shape.as_list()

  def __iter__(self):
    yield self.lower
    yield self.upper

  @classmethod
  def convert(cls, bounds):
    if isinstance(bounds, tf.Tensor):
      return cls(bounds, bounds)
    bounds = bounds.concretize()
    if not isinstance(bounds, cls):
      raise ValueError('Cannot convert "{}" to "{}"'.format(bounds,
                                                            cls.__name__))
    return bounds

  def apply_linear(self, wrapper, w, b):
    return self._affine(w, b, tf.matmul)

  def apply_conv1d(self, wrapper, w, b, padding, stride):
    return self._affine(w, b, tf.nn.conv1d, padding=padding, stride=stride)

  def apply_conv2d(self, wrapper, w, b, padding, strides):
    return self._affine(w, b, tf.nn.convolution,
                        padding=padding, strides=strides)

  def _affine(self, w, b, fn, **kwargs):
    c = (self.lower + self.upper) / 2.
    r = (self.upper - self.lower) / 2.
    c = fn(c, w, **kwargs)
    if b is not None:
      c = c + b
    r = fn(r, tf.abs(w), **kwargs)
    return IntervalBounds(c - r, c + r)

  def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **parameters):
    args_lower = [self.lower] + [a.lower for a in args]
    args_upper = [self.upper] + [a.upper for a in args]
    return IntervalBounds(fn(*args_lower), fn(*args_upper))

  def apply_piecewise_monotonic_fn(self, wrapper, fn, boundaries, *args):
    valid_values = []
    for a in [self] + list(args):
      vs = []
      vs.append(a.lower)
      vs.append(a.upper)
      for b in boundaries:
        vs.append(
            tf.maximum(a.lower, tf.minimum(a.upper, b * tf.ones_like(a.lower))))
      valid_values.append(vs)
    outputs = []
    for inputs in itertools.product(*valid_values):
      outputs.append(fn(*inputs))
    outputs = tf.stack(outputs, axis=-1)
    return IntervalBounds(tf.reduce_min(outputs, axis=-1),
                          tf.reduce_max(outputs, axis=-1))

  def apply_batch_norm(self, wrapper, mean, variance, scale, bias, epsilon):
    # Element-wise multiplier.
    multiplier = tf.rsqrt(variance + epsilon)
    if scale is not None:
      multiplier *= scale
    w = multiplier
    # Element-wise bias.
    b = -multiplier * mean
    if bias is not None:
      b += bias
    b = tf.squeeze(b, axis=0)
    # Because the scale might be negative, we need to apply a strategy similar
    # to linear.
    c = (self.lower + self.upper) / 2.
    r = (self.upper - self.lower) / 2.
    c = tf.multiply(c, w) + b
    r = tf.multiply(r, tf.abs(w))
    return IntervalBounds(c - r, c + r)

  def apply_batch_reshape(self, wrapper, shape):
    return IntervalBounds(snt.BatchReshape(shape)(self.lower),
                          snt.BatchReshape(shape)(self.upper))

  def apply_softmax(self, wrapper):
    ub = self.upper
    lb = self.lower
    # Keep diagonal and take opposite bound for non-diagonals.
    lbs = tf.matrix_diag(lb) + tf.expand_dims(ub, axis=-2) - tf.matrix_diag(ub)
    ubs = tf.matrix_diag(ub) + tf.expand_dims(lb, axis=-2) - tf.matrix_diag(lb)
    # Get diagonal entries after softmax operation.
    ubs = tf.matrix_diag_part(tf.nn.softmax(ubs))
    lbs = tf.matrix_diag_part(tf.nn.softmax(lbs))
    return IntervalBounds(lbs, ubs)

  def _set_up_cache(self):
    self._lower, update_lower_op = self._cache_with_update_op(self._lower)
    self._upper, update_upper_op = self._cache_with_update_op(self._upper)
    return tf.group([update_lower_op, update_upper_op])
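

# Illustrative sketch (not part of the library): the centre/radius arithmetic
# of `IntervalBounds._affine`, rewritten in plain Python so that it can be run
# without TensorFlow. The helper name `interval_affine` and the list-based
# data layout are assumptions made for this example only. The centre of the
# box goes through the affine map unchanged, while the radius is multiplied by
# |w|, which accounts for weights of either sign.

```python
def interval_affine(lower, upper, w, b=None):
  """Propagates the box [lower, upper] through x -> x @ w + b.

  `lower` and `upper` are lists of length n_in; `w` is a list of n_in rows,
  each holding n_out columns; `b` is an optional list of length n_out.
  """
  n_out = len(w[0])
  center = [(l + u) / 2. for l, u in zip(lower, upper)]
  radius = [(u - l) / 2. for l, u in zip(lower, upper)]
  # The centre is mapped by w, the radius by the element-wise |w|.
  out_c = [sum(c * row[j] for c, row in zip(center, w)) for j in range(n_out)]
  out_r = [sum(r * abs(row[j]) for r, row in zip(radius, w))
           for j in range(n_out)]
  if b is not None:
    out_c = [c + bj for c, bj in zip(out_c, b)]
  return ([c - r for c, r in zip(out_c, out_r)],
          [c + r for c, r in zip(out_c, out_r)])


# Hypothetical inputs: x0 in [-1, 1], x1 in [0, 2].
out_lower, out_upper = interval_affine(
    lower=[-1., 0.], upper=[1., 2.],
    w=[[1., -2.], [3., 1.]], b=[0.5, -0.5])
```

# For an affine layer the resulting box is tight per output coordinate: each
# bound is attained at some corner of the input box.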
================================================
FILE: interval_bound_propagation/src/crown.py
================================================
# coding=utf-8
# Copyright 2019 The Interval Bound Propagation Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""CROWN-IBP implementation."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
from absl import logging
from interval_bound_propagation.src import bounds
from interval_bound_propagation.src import fastlin
from interval_bound_propagation.src import loss
from interval_bound_propagation.src import model
from interval_bound_propagation.src import specification as specification_lib
from interval_bound_propagation.src import utils
from interval_bound_propagation.src import verifiable_wrapper
import tensorflow.compat.v1 as tf


class BackwardBounds(bounds.AbstractBounds):
  """Implementation of backward bound propagation used by CROWN."""

  def __init__(self, lower, upper):
    super(BackwardBounds, self).__init__()
    # Setting "lower" or "upper" to None will avoid creating the computation
    # graph for CROWN lower or upper bounds. For verifiable training, only the
    # upper bound is necessary.
    self._lower = lower
    self._upper = upper

  @property
  def lower(self):
    return self._lower

  @property
  def upper(self):
    return self._upper

  @property
  def shape(self):
    return self.lower.shape.as_list()

  def concretize(self):
    """Returns lower and upper interval bounds."""
    lb = ub = None
    if self.lower is not None:
      lb = (
          tf.einsum('nsi,ni->ns',
                    self._reshape_to_rank(tf.maximum(self.lower.w, 0), 3),
                    self._reshape_to_rank(self.lower.lower, 2)) +
          tf.einsum('nsi,ni->ns',
                    self._reshape_to_rank(tf.minimum(self.lower.w, 0), 3),
                    self._reshape_to_rank(self.lower.upper, 2)))
      lb += self.lower.b
    if self.upper is not None:
      ub = (
          tf.einsum('nsi,ni->ns',
                    self._reshape_to_rank(tf.maximum(self.upper.w, 0), 3),
                    self._reshape_to_rank(self.upper.upper, 2)) +
          tf.einsum('nsi,ni->ns',
                    self._reshape_to_rank(tf.minimum(self.upper.w, 0), 3),
                    self._reshape_to_rank(self.upper.lower, 2)))
      ub += self.upper.b
    return bounds.IntervalBounds(lb, ub)

  @classmethod
  def convert(cls, other_bounds):
    if isinstance(other_bounds, cls):
      return other_bounds
    raise RuntimeError('BackwardBounds does not support conversion from any '
                       'other bound type.')

  def apply_linear(self, wrapper, w, b):
    """Propagate CROWN bounds backward through a linear layer."""
    def _linear_propagate(bound):
      """Propagate one side of the bound."""
      new_bound_w = tf.einsum('nsk,lk->nsl', bound.w, w)
      # Default to a zero bias so that layers without a bias term do not hit
      # an undefined `bias` below (mirrors apply_conv2d).
      bias = 0
      if b is not None:
        bias = tf.tensordot(bound.w, b, axes=1)
      return fastlin.LinearExpression(w=new_bound_w, b=bias + bound.b,
                                      lower=wrapper.input_bounds.lower,
                                      upper=wrapper.input_bounds.upper)
    ub_expr = _linear_propagate(self.upper) if self.upper else None
    lb_expr = _linear_propagate(self.lower) if self.lower else None
    return BackwardBounds(lb_expr, ub_expr)

  def apply_conv2d(self, wrapper, w, b, padding, strides):
    """Propagate CROWN bounds backward through a convolution layer."""
    def _conv2d_propagate(bound):
      """Propagate one side of the bound."""
      s = tf.shape(bound.w)
      # Variable bound.w has shape (batch_size, num_specs, H, W, C),
      # resize it to (batch_size * num_specs, H, W, C) for batch processing.
      effective_batch_size = tf.reshape(s[0] * s[1], [1])
      batched_shape = tf.concat([effective_batch_size, s[2:]], 0)
      # The output of a deconvolution is the input shape of the corresponding
      # convolution.
      output_shape = wrapper.input_bounds.lower.shape
      batched_output_shape = tf.concat([effective_batch_size, output_shape[1:]],
                                       0)
      # Batched transpose convolution for efficiency.
      bound_batch = tf.nn.conv2d_transpose(tf.reshape(bound.w, batched_shape),
                                           filter=w,
                                           output_shape=batched_output_shape,
                                           strides=[1] + list(strides) + [1],
                                           padding=padding)
      # Reshape results to (batch_size, num_specs, new_H, new_W, new_C).
      new_shape = tf.concat(
          [tf.reshape(s[0], [1]), tf.reshape(s[1], [1]), output_shape[1:]], 0)
      new_bound_w = tf.reshape(bound_batch, new_shape)
      # If this convolution has bias, multiplies it with current w.
      bias = 0
      if b is not None:
        # Variable bound.w has dimension (batch_size, num_specs, H, W, C),
        # accumulate H and W, and do a dot product for each channel C.
        bias = tf.tensordot(tf.reduce_sum(bound.w, [2, 3]), b, axes=1)
      return fastlin.LinearExpression(w=new_bound_w, b=bias + bound.b,
                                      lower=wrapper.input_bounds.lower,
                                      upper=wrapper.input_bounds.upper)
    ub_expr = _conv2d_propagate(self.upper) if self.upper else None
    lb_expr = _conv2d_propagate(self.lower) if self.lower else None
    return BackwardBounds(lb_expr, ub_expr)

  def _get_monotonic_fn_bound(self, wrapper, fn):
    """Compute CROWN upper and lower linear bounds for a given function fn."""
    # Get lower and upper bounds from forward IBP pass.
    lb, ub = wrapper.input_bounds.lower, wrapper.input_bounds.upper
    if fn.__name__ == 'relu':
      # CROWN upper and lower linear bounds for ReLU.
      f_lb = tf.minimum(lb, 0)
      f_ub = tf.maximum(ub, 0)
      # When both ub and lb are very close to 0 we might have NaN issue,
      # so we have to avoid this happening.
      f_ub = tf.maximum(f_ub, f_lb + 1e-8)
      # CROWN upper/lower scaling matrices and biases.
      ub_scaling_matrix = f_ub / (f_ub - f_lb)
      ub_bias = -f_lb * ub_scaling_matrix
      # Expand dimension for using broadcast later.
      ub_scaling_matrix = tf.expand_dims(ub_scaling_matrix, 1)
      lb_scaling_matrix = tf.cast(tf.greater(ub_scaling_matrix, .5),
                                  dtype=tf.float32)
      lb_bias = 0.
    # For 'apply' fn we need to differentiate them through the wrapper.
    elif isinstance(wrapper, verifiable_wrapper.ImageNormWrapper):
      inner_module = wrapper.inner_module
      ub_scaling_matrix = lb_scaling_matrix = inner_module.scale
      ub_bias = - inner_module.offset * inner_module.scale
      lb_bias = ub_bias
    else:
      raise NotImplementedError('monotonic fn {} is not supported '
                                'by BackwardBounds'.format(fn.__name__))
    return ub_scaling_matrix, lb_scaling_matrix, ub_bias, lb_bias

  def apply_increasing_monotonic_fn(self, wrapper, fn, *args):
    """Propagate CROWN bounds backward through an increasing monotonic fn."""
    # Function _get_monotonic_fn_bound returns matrix and bias term for linear
    # relaxation.
    (ub_scaling_matrix, lb_scaling_matrix,
     ub_bias, lb_bias) = self._get_monotonic_fn_bound(wrapper, fn)
    def _propagate_monotonic_fn(bound, ub_mult, lb_mult):
      # Matrix multiplication by a diagonal matrix.
      new_bound_w = ub_mult * ub_scaling_matrix + lb_mult * lb_scaling_matrix
      # Matrix vector product for the bias term. ub_bias or lb_bias might be 0
      # or a constant, or need broadcast. They will be handled optimally.
      b = self._matvec(ub_mult, ub_bias) + self._matvec(lb_mult, lb_bias)
      return fastlin.LinearExpression(w=new_bound_w, b=bound.b + b,
                                      lower=wrapper.input_bounds.lower,
                                      upper=wrapper.input_bounds.upper)
    # Multiplies w by the upper or lower scaling terms according to its sign.
    ub_expr = _propagate_monotonic_fn(
        self.upper, tf.maximum(self.upper.w, 0),
        tf.minimum(self.upper.w, 0)) if self.upper else None
    lb_expr = _propagate_monotonic_fn(
        self.lower, tf.minimum(self.lower.w, 0),
        tf.maximum(self.lower.w, 0)) if self.lower else None
    return BackwardBounds(lb_expr, ub_expr)

  def apply_batch_reshape(self, wrapper, shape):
    """Propagate CROWN bounds backward through a reshape layer."""
    input_shape = wrapper.input_bounds.lower.shape[1:]
    def _propagate_batch_flatten(bound):
      new_bound_w = tf.reshape(
          bound.w, tf.concat([tf.shape(bound.w)[:2], input_shape], 0))
      return fastlin.LinearExpression(w=new_bound_w, b=bound.b,
                                      lower=wrapper.input_bounds.lower,
                                      upper=wrapper.input_bounds.upper)
    ub_expr = _propagate_batch_flatten(self.upper) if self.upper else None
    lb_expr = _propagate_batch_flatten(self.lower) if self.lower else None
    return BackwardBounds(lb_expr, ub_expr)

  @staticmethod
  def _reshape_to_rank(a, rank):
    """Reshapes to the given rank while keeping the first (rank-1) dims."""
    shape = tf.concat([tf.shape(a)[0:(rank - 1)], [-1]], axis=-1)
    return tf.reshape(a, shape)

  @staticmethod
  def _matvec(a, b):
    """Specialized matvec detecting the case where b is 0 or constant."""
    if isinstance(b, int) or isinstance(b, float):
      if b == 0:
        # For efficiency we directly return constant 0, no graph generated.
        return 0
      else:
        # Broadcasting a constant.
        return a * b
    elif len(b.shape) == 1:
      # Need to broadcast against all examples in the batch. This can be done
      # using an einsum "tf.einsum('ns...c,c->ns', a, b)" but it currently
      # triggers a compiler bug on TPUs, thus we use the following instead.
      return tf.einsum('nsc,c->ns', tf.reduce_sum(a, [2, 3]), b)
    else:
      # Normal 1D or 3D mat-vec product.
      return tf.einsum('nsi,ni->ns',
                       BackwardBounds._reshape_to_rank(a, 3),
                       BackwardBounds._reshape_to_rank(b, 2))
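

# Illustrative sketch (not part of the library): a scalar, plain-Python
# version of the ReLU branch of `BackwardBounds._get_monotonic_fn_bound`.
# The helper name `relu_relaxation` and its return layout are assumptions made
# for this example; the real method works on tensors and also expands
# dimensions for broadcasting. Given pre-activation bounds [lb, ub], it returns
# a linear upper bound (slope, bias) and a linear lower bound (slope, bias)
# on relu(x) that hold for every x in [lb, ub].

```python
def relu_relaxation(lb, ub, eps=1e-8):
  """Returns (upper_slope, upper_bias, lower_slope, lower_bias) for ReLU."""
  f_lb = min(lb, 0.)
  f_ub = max(ub, 0.)
  # Guard against a zero denominator when both bounds are (close to) zero.
  f_ub = max(f_ub, f_lb + eps)
  # Upper bound: the chord joining (f_lb, 0) and (f_ub, f_ub).
  upper_slope = f_ub / (f_ub - f_lb)
  upper_bias = -f_lb * upper_slope
  # Lower bound: the line y = x when the chord slope exceeds 0.5,
  # and the line y = 0 otherwise.
  lower_slope = 1. if upper_slope > .5 else 0.
  return upper_slope, upper_bias, lower_slope, 0.


# Hypothetical unstable neuron whose pre-activation lies in [-1, 3].
upper_slope, upper_bias, lower_slope, lower_bias = relu_relaxation(-1., 3.)
```

# A neuron that is always active (lb >= 0) gets slope 1 on both sides, and one
# that is always inactive (ub <= 0) gets slope 0, so the relaxation is exact
# in those two cases.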


ScalarMetrics = collections.namedtuple('ScalarMetrics', [
    'nominal_accuracy',
    # Verified accuracy using pure IBP bounds.
    'verified_accuracy',
    # Verified accuracy using CROWN and IBP mixture.
    'crown_ibp_verified_accuracy',
    'attack_accuracy',
    'attack_success'])


ScalarLosses = collections.namedtuple('ScalarLosses', [
    'nominal_cross_entropy',
    'attack_cross_entropy',
    'verified_loss'])


class Losses(loss.Losses):
  """Helper to compute CROWN-IBP losses."""

  def __init__(self, predictor, specification=None, pgd_attack=None,
               interval_bounds_loss_type='xent',
               interval_bounds_hinge_margin=10.,
               label_smoothing=0.,
               use_crown_ibp=False,
               crown_bound_schedule=None):
    super(Losses, self).__init__(predictor, specification, pgd_attack,
                                 interval_bounds_loss_type,
                                 interval_bounds_hinge_margin,
                                 label_smoothing)
    self._use_crown_ibp = use_crown_ibp
    self._crown_bound_schedule = crown_bound_schedule

  def _get_specification_bounds(self):
    """Get upper bounds on specification. Used for building verified loss."""
    ibp_bounds = self._specification(self._predictor.modules)
    # Compute verified accuracy using IBP bounds.
    v = tf.reduce_max(ibp_bounds, axis=1)
    self._interval_bounds_accuracy = tf.reduce_mean(
        tf.cast(v <= 0., tf.float32))
    # CROWN-IBP bounds.
    if self._use_crown_ibp:
      logging.info('CROWN-IBP active')
      def _build_crown_ibp_bounds():
        """Create the computationally expensive CROWN bounds for tf.cond."""
        predictor = self._predictor
        # CROWN is computed backwards, so we need to start with an
        # initial bound related to the specification.
        init_crown_bounds = create_initial_backward_bounds(self._specification,
                                                           predictor.modules)
        # Now propagate the specification matrix layer by layer;
        # we only need the CROWN upper bound, not the lower bound.
        crown_bound = predictor.propagate_bound_backward(init_crown_bounds,
                                                         compute_upper=True,
                                                         compute_lower=False)
        # A linear mixture of the two bounds with a schedule.
        return self._crown_bound_schedule * crown_bound.upper + \
               (1. - self._crown_bound_schedule) * ibp_bounds
      # If the coefficient for CROWN bound is close to 0, compute IBP only.
      mixture_bounds = tf.cond(self._crown_bound_schedule < 1e-6,
                               lambda: ibp_bounds, _build_crown_ibp_bounds)
      v = tf.reduce_max(mixture_bounds, axis=1)
      self._crown_ibp_accuracy = tf.reduce_mean(tf.cast(v <= 0., tf.float32))
    else:
      mixture_bounds = ibp_bounds
      self._crown_ibp_accuracy = tf.constant(0.)
    return mixture_bounds

  @property
  def scalar_metrics(self):
    self._ensure_is_connected()
    return ScalarMetrics(self._nominal_accuracy,
                         self._interval_bounds_accuracy,
                         self._crown_ibp_accuracy,
                         self._attack_accuracy,
                         self._attack_success)

  @property
  def scalar_losses(self):
    self._ensure_is_connected()
    return ScalarLosses(self._cross_entropy,
                        self._attack_cross_entropy,
                        self._verified_loss)


class VerifiableModelWrapper(model.VerifiableModelWrapper):
  """Model wrapper with CROWN-IBP backward bound propagation."""

  def _propagate(self, current_module, current_bounds):
    """Propagate CROWN bounds in a backwards manner."""
    # Construct bounds for this layer.
    if isinstance(current_module, verifiable_wrapper.ModelInputWrapper):
      if current_module.index != 0:
        raise NotImplementedError('CROWN backpropagation does not support '
                                  'multiple inputs.')
      return current_bounds
    # Propagate the bounds through the current layer.
    new_bounds = current_module.propagate_bounds(current_bounds)
    prev_modules = self._module_depends_on[current_module]
    # We assume that each module only depends on one module.
    if len(prev_modules) != 1:
      raise NotImplementedError('CROWN for non-sequential networks is not '
                                'implemented.')
    return self._propagate(prev_modules[0], new_bounds)

  def propagate_bound_backward(self, initial_bound,
                               compute_upper=True, compute_lower=False):
    """Propagates CROWN bounds backward through the network.

    This function assumes that we have obtained bounds for all intermediate
    layers using IBP. Currently only sequential networks are implemented.

    Args:
      initial_bound: A BackwardBounds object containing the initial matrices
        and biases to start bound propagation.
      compute_upper: Set to True to construct the computation graph for the
        CROWN upper bound. For verified training, only the upper bound is
        needed. Default is True.
      compute_lower: Set to True to construct the computation graph for the
        CROWN lower bound. Default is False.

    Returns:
      IntervalBound instance corresponding to bounds on the specification.
    """
    if (not compute_upper) and (not compute_lower):
      raise ValueError('At least one of "compute_upper" or "compute_lower" '
                       'needs to be True')
    self._ensure_is_connected()
    # We start bound propagation from the logit layer.
    logit_layer = self._produced_by[self._logits.name]
    # If only one of ub or lb is needed, we set the unnecessary one to None.
    ub = initial_bound.upper if compute_upper else N
SYMBOL INDEX (506 symbols across 33 files)
FILE: examples/eval.py
function layers (line 46) | def layers(model_size):
function show_metrics (line 108) | def show_metrics(metric_values, bound_method='ibp'):
function main (line 121) | def main(unused_args):
FILE: examples/language/config.py
function get_config (line 19) | def get_config():
FILE: examples/language/exhaustive_verification.py
function load_synonyms (line 73) | def load_synonyms(synonym_filepath=None):
function load_dataset (line 82) | def load_dataset(mode='validation', character_level=False):
function expand_by_one_perturbation (line 112) | def expand_by_one_perturbation(original_tokenized_sentence,
function find_up_to_depth_k_perturbations (line 138) | def find_up_to_depth_k_perturbations(
function remove_duplicates (line 171) | def remove_duplicates(list_of_list_of_tokens):
function verify_exhaustively (line 179) | def verify_exhaustively(sample, synonym_dict, sst_model, delta,
function verify_dataset (line 258) | def verify_dataset(dataset, config_dict, model_location, synonym_dict, d...
function example (line 300) | def example(synonym_dict, dataset, k=2):
function main (line 332) | def main(args):
FILE: examples/language/interactive_example.py
class InteractiveSentimentPredictor (line 36) | class InteractiveSentimentPredictor(object):
method __init__ (line 39) | def __init__(self, config_dict, model_location, max_padded_length=0,
method batch_predict_sentiment (line 55) | def batch_predict_sentiment(self, list_of_sentences, is_tokenised=True):
method predict_sentiment (line 117) | def predict_sentiment(self, sentence, tokenised=False):
FILE: examples/language/models.py
function _max_pool_1d (line 25) | def _max_pool_1d(x, pool_size=2, name='max_pool_1d'):
class SentenceRepresenterConv (line 35) | class SentenceRepresenterConv(snt.AbstractModule):
method __init__ (line 38) | def __init__(self,
method _build (line 48) | def _build(self, padded_word_embeddings, length):
FILE: examples/language/robust_model.py
function _pad_fixed (line 50) | def _pad_fixed(x, axis, padded_length):
class GeneratedDataset (line 65) | class GeneratedDataset(snt.AbstractModule):
method __init__ (line 68) | def __init__(self, data_gen, batch_size, mode='train',
method get_row_lengths (line 80) | def get_row_lengths(self, sparse_tensor_input):
method _build (line 87) | def _build(self):
method num_examples (line 109) | def num_examples(self):
function parse (line 113) | def parse(data_dict):
class RobustModel (line 132) | class RobustModel(snt.AbstractModule):
method __init__ (line 135) | def __init__(self,
method add_representer (line 158) | def add_representer(self, vocab_filename, padded_token=None):
method add_dataset (line 180) | def add_dataset(self):
method get_representation (line 198) | def get_representation(self, tokens, num_tokens):
method add_representation (line 204) | def add_representation(self, minibatch):
method add_train_ops (line 218) | def add_train_ops(self,
method create_perturbation_ops (line 251) | def create_perturbation_ops(self, minibatch, synonym_values, vocab_tab...
method _add_optimize_op (line 341) | def _add_optimize_op(self, loss):
method embed_dataset (line 353) | def embed_dataset(self, minibatch, vocab_table):
method compute_mask_vertices (line 368) | def compute_mask_vertices(self, data_batch, perturbation):
method preprocess_databatch (line 395) | def preprocess_databatch(self, minibatch, vocab_table, perturbation):
method add_verifiable_objective (line 400) | def add_verifiable_objective(self,
method run_classification (line 430) | def run_classification(self, inputs, labels, length):
method compute_verifiable_loss (line 436) | def compute_verifiable_loss(self, verifiable_obj, labels):
method compute_verifiable_verified (line 469) | def compute_verifiable_verified(self, verifiable_obj):
method run_prediction (line 475) | def run_prediction(self, inputs, length):
method sentiment_accuracy_op (line 480) | def sentiment_accuracy_op(self, minibatch):
method add_dev_eval_ops (line 496) | def add_dev_eval_ops(self, minibatch):
method _build (line 507) | def _build(self):
method _build_graph_with_datasets (line 530) | def _build_graph_with_datasets(self,
method _lines_from_file (line 696) | def _lines_from_file(self, filename):
function verifiable_objective (line 701) | def verifiable_objective(network, labels, margin=0.):
function targeted_objective (line 746) | def targeted_objective(final_w, final_b, labels):
function filter_correct_class (line 771) | def filter_correct_class(verifiable_obj, num_classes, labels, margin):
FILE: examples/language/robust_train.py
function load_synonyms (line 64) | def load_synonyms(synonym_filepath=None):
function construct_synonyms (line 71) | def construct_synonyms(synonym_filepath):
function linear_schedule (line 83) | def linear_schedule(step, init_step, final_step, init_value, final_value):
function config_train_summary (line 94) | def config_train_summary(task, train_accuracy, loss):
function write_tf_summary (line 113) | def write_tf_summary(writer, step, tag, value):
function train (line 119) | def train(config_dict, synonym_filepath,
function analysis (line 291) | def analysis(config_dict, synonym_filepath,
function main (line 392) | def main(_):
FILE: examples/language/utils.py
function get_padded_embeddings (line 29) | def get_padded_embeddings(embeddings,
function get_padded_indexes (line 128) | def get_padded_indexes(vocabulary_table,
class EmbedAndPad (line 166) | class EmbedAndPad(snt.AbstractModule):
method __init__ (line 174) | def __init__(self,
method _build (line 206) | def _build(self, tokens):
method vocab_table (line 212) | def vocab_table(self):
method vocab_size (line 216) | def vocab_size(self):
function get_accuracy (line 220) | def get_accuracy(logits, labels):
function get_num_correct_predictions (line 225) | def get_num_correct_predictions(logits, labels):
function get_merged_vocabulary_file (line 233) | def get_merged_vocabulary_file(vocabularies, padded_token=None):
FILE: examples/train.py
function show_metrics (line 71) | def show_metrics(step_value, metric_values, loss_value=None):
function layers (line 81) | def layers(model_size):
function main (line 127) | def main(unused_args):
FILE: interval_bound_propagation/src/attacks.py
class UnrolledOptimizer (line 33) | class UnrolledOptimizer(object):
method __init__ (line 36) | def __init__(self, colocate_gradients_with_ops=False):
method minimize (line 40) | def minimize(self, loss, x, optim_state):
method init_state (line 58) | def init_state(self, x):
class UnrolledGradientDescent (line 69) | class UnrolledGradientDescent(UnrolledOptimizer):
method __init__ (line 74) | def __init__(self, lr=.1, lr_fn=None, fgsm=False,
method init_state (line 81) | def init_state(self, unused_x):
method minimize (line 84) | def minimize(self, loss, x, optim_state):
method gradients (line 96) | def gradients(self, loss, x):
class UnrolledFGSMDescent (line 102) | class UnrolledFGSMDescent(UnrolledGradientDescent):
method __init__ (line 105) | def __init__(self, lr=.1, lr_fn=None,
class UnrolledAdam (line 111) | class UnrolledAdam(UnrolledOptimizer):
method __init__ (line 116) | def __init__(self, lr=0.1, lr_fn=None, beta1=0.9, beta2=0.999, epsilon...
method init_state (line 125) | def init_state(self, x):
method _apply_gradients (line 131) | def _apply_gradients(self, grads, x, optim_state):
method minimize (line 151) | def minimize(self, loss, x, optim_state):
method gradients (line 155) | def gradients(self, loss, x):
function _spsa_gradients (line 160) | def _spsa_gradients(loss_fn, x, delta=0.01, num_samples=16, num_iteratio...
class UnrolledSPSA (line 232) | class UnrolledSPSA(object):
class UnrolledSPSAGradientDescent (line 236) | class UnrolledSPSAGradientDescent(UnrolledGradientDescent, UnrolledSPSA):
method __init__ (line 243) | def __init__(self, lr=0.1, lr_fn=None, fgsm=False,
method gradients (line 255) | def gradients(self, loss, x):
class UnrolledSPSAFGSMDescent (line 261) | class UnrolledSPSAFGSMDescent(UnrolledSPSAGradientDescent):
method __init__ (line 264) | def __init__(self, lr=.1, lr_fn=None,
class UnrolledSPSAAdam (line 272) | class UnrolledSPSAAdam(UnrolledAdam, UnrolledSPSA):
method __init__ (line 279) | def __init__(self, lr=0.1, lr_fn=None, beta1=0.9, beta2=0.999, epsilon...
method gradients (line 291) | def gradients(self, loss, x):
function _is_spsa_optimizer (line 296) | def _is_spsa_optimizer(cls):
function wrap_optimizer (line 300) | def wrap_optimizer(cls, **default_kwargs):
function _project_perturbation (line 312) | def _project_perturbation(perturbation, epsilon, input_image, image_boun...
function pgd_attack (line 320) | def pgd_attack(loss_fn, input_image, epsilon, num_steps,
class Attack (line 384) | class Attack(snt.AbstractModule):
method __init__ (line 387) | def __init__(self, predictor, specification, name, predictor_kwargs=No...
method _eval_fn (line 398) | def _eval_fn(self, x, mode='intermediate'):
method _build (line 414) | def _build(self, inputs, labels):
method logits (line 418) | def logits(self):
method attack (line 422) | def attack(self):
method success (line 426) | def success(self):
method force_mode (line 429) | def force_mode(self, mode):
method target_class (line 434) | def target_class(self):
method target_class (line 439) | def target_class(self, t):
class PGDAttack (line 444) | class PGDAttack(Attack):
method __init__ (line 447) | def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
method prepare_inputs (line 465) | def prepare_inputs(self, inputs):
method prepare_labels (line 479) | def prepare_labels(self, labels):
method find_worst_attack (line 483) | def find_worst_attack(self, objective_fn, adversarial_input, batch_size,
function _maximize_margin (line 496) | def _maximize_margin(bounds):
function _any_greater (line 501) | def _any_greater(bounds):
function _maximize_topk_hinge_margin (line 507) | def _maximize_topk_hinge_margin(bounds, k=5, margin=.1):
function _topk_greater (line 513) | def _topk_greater(bounds, k=5):
class UntargetedPGDAttack (line 519) | class UntargetedPGDAttack(PGDAttack):
method __init__ (line 522) | def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
method _build (line 535) | def _build(self, inputs, labels):
method logits (line 591) | def logits(self):
method attack (line 596) | def attack(self):
method success (line 601) | def success(self):
method adapt (line 605) | def adapt(self, original_inputs, adversarial_inputs, labels):
class UntargetedTop5PGDAttack (line 610) | class UntargetedTop5PGDAttack(UntargetedPGDAttack):
method __init__ (line 613) | def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
class UntargetedAdaptivePGDAttack (line 627) | class UntargetedAdaptivePGDAttack(UntargetedPGDAttack):
method adapt (line 630) | def adapt(self, original_inputs, adversarial_inputs, labels):
class MultiTargetedPGDAttack (line 666) | class MultiTargetedPGDAttack(PGDAttack):
method __init__ (line 669) | def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
method _build (line 685) | def _build(self, inputs, labels):
method logits (line 749) | def logits(self):
method attack (line 754) | def attack(self):
method success (line 759) | def success(self):
class MemoryEfficientMultiTargetedPGDAttack (line 764) | class MemoryEfficientMultiTargetedPGDAttack(PGDAttack):
method __init__ (line 767) | def __init__(self, predictor, specification, epsilon, lr=.1, lr_fn=None,
method _build (line 783) | def _build(self, inputs, labels):
method logits (line 864) | def logits(self):
method attack (line 869) | def attack(self):
method success (line 874) | def success(self):
class RestartedAttack (line 879) | class RestartedAttack(Attack):
method __init__ (line 882) | def __init__(self, inner_attack, num_restarts=1):
method _build (line 893) | def _build(self, inputs, labels):
method logits (line 920) | def logits(self):
method attack (line 925) | def attack(self):
method success (line 930) | def success(self):
FILE: interval_bound_propagation/src/bounds.py
class AbstractBounds (line 31) | class AbstractBounds(object):
method __init__ (line 34) | def __init__(self):
method convert (line 39) | def convert(cls, bounds):
method shape (line 43) | def shape(self):
method concretize (line 46) | def concretize(self):
method _raise_not_implemented (line 49) | def _raise_not_implemented(self, name):
method apply_linear (line 54) | def apply_linear(self, wrapper, w, b): # pylint: disable=unused-argument
method apply_conv1d (line 57) | def apply_conv1d(self, wrapper, w, b, padding, stride): # pylint: dis...
method apply_conv2d (line 60) | def apply_conv2d(self, wrapper, w, b, padding, strides): # pylint: di...
method apply_increasing_monotonic_fn (line 63) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
method apply_piecewise_monotonic_fn (line 66) | def apply_piecewise_monotonic_fn(self, wrapper, fn, boundaries, *args)...
method apply_batch_norm (line 69) | def apply_batch_norm(self, wrapper, mean, variance, scale, bias, epsil...
method apply_batch_reshape (line 72) | def apply_batch_reshape(self, wrapper, shape): # pylint: disable=unus...
method apply_softmax (line 75) | def apply_softmax(self, wrapper): # pylint: disable=unused-argument
method update_cache_op (line 79) | def update_cache_op(self):
method enable_caching (line 85) | def enable_caching(self):
method _set_up_cache (line 91) | def _set_up_cache(self):
method _cache_with_update_op (line 99) | def _cache_with_update_op(self, tensor):
class IntervalBounds (line 118) | class IntervalBounds(AbstractBounds):
method __init__ (line 121) | def __init__(self, lower, upper):
method lower (line 127) | def lower(self):
method upper (line 131) | def upper(self):
method shape (line 135) | def shape(self):
method __iter__ (line 138) | def __iter__(self):
method convert (line 143) | def convert(cls, bounds):
method apply_linear (line 152) | def apply_linear(self, wrapper, w, b):
method apply_conv1d (line 155) | def apply_conv1d(self, wrapper, w, b, padding, stride):
method apply_conv2d (line 158) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method _affine (line 162) | def _affine(self, w, b, fn, **kwargs):
method apply_increasing_monotonic_fn (line 171) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
method apply_piecewise_monotonic_fn (line 176) | def apply_piecewise_monotonic_fn(self, wrapper, fn, boundaries, *args):
method apply_batch_norm (line 193) | def apply_batch_norm(self, wrapper, mean, variance, scale, bias, epsil...
method apply_batch_reshape (line 212) | def apply_batch_reshape(self, wrapper, shape):
method apply_softmax (line 216) | def apply_softmax(self, wrapper):
method _set_up_cache (line 227) | def _set_up_cache(self):
FILE: interval_bound_propagation/src/crown.py
class BackwardBounds (line 35) | class BackwardBounds(bounds.AbstractBounds):
method __init__ (line 38) | def __init__(self, lower, upper):
method lower (line 47) | def lower(self):
method upper (line 51) | def upper(self):
method shape (line 55) | def shape(self):
method concretize (line 58) | def concretize(self):
method convert (line 82) | def convert(cls, other_bounds):
method apply_linear (line 88) | def apply_linear(self, wrapper, w, b):
method apply_conv2d (line 102) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method _get_monotonic_fn_bound (line 139) | def _get_monotonic_fn_bound(self, wrapper, fn):
method apply_increasing_monotonic_fn (line 169) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args):
method apply_batch_reshape (line 193) | def apply_batch_reshape(self, wrapper, shape):
method _reshape_to_rank (line 207) | def _reshape_to_rank(a, rank):
method _matvec (line 213) | def _matvec(a, b):
class Losses (line 249) | class Losses(loss.Losses):
method __init__ (line 252) | def __init__(self, predictor, specification=None, pgd_attack=None,
method _get_specification_bounds (line 265) | def _get_specification_bounds(self):
method scalar_metrics (line 301) | def scalar_metrics(self):
method scalar_losses (line 310) | def scalar_losses(self):
class VerifiableModelWrapper (line 317) | class VerifiableModelWrapper(model.VerifiableModelWrapper):
method _propagate (line 320) | def _propagate(self, current_module, current_bounds):
method propagate_bound_backward (line 337) | def propagate_bound_backward(self, initial_bound,
function create_initial_backward_bounds (line 371) | def create_initial_backward_bounds(spec, modules):
function create_classification_losses (line 391) | def create_classification_losses(
FILE: interval_bound_propagation/src/fastlin.py
class SymbolicBounds (line 50) | class SymbolicBounds(basic_bounds.AbstractBounds):
method __init__ (line 53) | def __init__(self, lower, upper):
method lower (line 61) | def lower(self):
method upper (line 65) | def upper(self):
method shape (line 69) | def shape(self):
method concretize (line 72) | def concretize(self):
method with_priors (line 86) | def with_priors(self, existing_bounds):
method convert (line 94) | def convert(cls, bounds):
method apply_linear (line 107) | def apply_linear(self, wrapper, w, b):
method apply_conv1d (line 122) | def apply_conv1d(self, wrapper, w, b, padding, stride):
method apply_conv2d (line 136) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method apply_increasing_monotonic_fn (line 150) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
method apply_batch_reshape (line 181) | def apply_batch_reshape(self, wrapper, shape):
method _add_bias (line 188) | def _add_bias(expr, b):
method _add_expression (line 196) | def _add_expression(expr_a, expr_b):
method _scale_expression (line 202) | def _scale_expression(expr, w):
method _conv1d_expression (line 209) | def _conv1d_expression(expr, w, padding, stride):
method _conv2d_expression (line 221) | def _conv2d_expression(expr, w, padding, strides):
method _batch_reshape_expression (line 233) | def _batch_reshape_expression(expr, shape):
method _concretize_bounds (line 239) | def _concretize_bounds(lower, upper):
method _initial_symbolic_bounds (line 259) | def _initial_symbolic_bounds(lb, ub):
class RelativeSymbolicBounds (line 276) | class RelativeSymbolicBounds(SymbolicBounds):
method __init__ (line 279) | def __init__(self, lower_offset, upper_offset, nominal):
method concretize (line 283) | def concretize(self):
method convert (line 299) | def convert(cls, bounds):
method apply_linear (line 315) | def apply_linear(self, wrapper, w, b):
method apply_conv1d (line 327) | def apply_conv1d(self, wrapper, w, b, padding, stride):
method apply_conv2d (line 340) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method apply_increasing_monotonic_fn (line 353) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
method apply_batch_reshape (line 397) | def apply_batch_reshape(self, wrapper, shape):
FILE: interval_bound_propagation/src/layer_utils.py
function conv_output_shape (line 27) | def conv_output_shape(input_shape, w, padding, strides):
function materialise_conv (line 59) | def materialise_conv(w, b, input_shape, padding, strides):
function _materialise_conv2d (line 90) | def _materialise_conv2d(w, b, input_height, input_width, padding, strides):
function _materialise_conv1d (line 145) | def _materialise_conv1d(w, b, input_length, padding, stride):
function decode_batchnorm (line 197) | def decode_batchnorm(batchnorm_module):
function combine_with_batchnorm (line 249) | def combine_with_batchnorm(w, b, batchnorm_module):
FILE: interval_bound_propagation/src/layers.py
class BatchNorm (line 34) | class BatchNorm(snt.BatchNorm):
method __init__ (line 37) | def __init__(self, axis=None, offset=True, scale=False,
method _build_statistics (line 48) | def _build_statistics(self, input_batch, axis, use_batch_stats, stat_d...
method _build (line 54) | def _build(self, input_batch, is_training=True, test_local_stats=False,
method scale (line 86) | def scale(self):
method bias (line 91) | def bias(self):
method mean (line 96) | def mean(self):
method variance (line 101) | def variance(self):
method epsilon (line 106) | def epsilon(self):
class ImageNorm (line 111) | class ImageNorm(snt.AbstractModule):
method __init__ (line 114) | def __init__(self, mean, std, name='image_norm'):
method _build (line 131) | def _build(self, inputs):
method scale (line 135) | def scale(self):
method offset (line 139) | def offset(self):
method apply (line 143) | def apply(self, inputs):
FILE: interval_bound_propagation/src/loss.py
class Losses (line 44) | class Losses(snt.AbstractModule):
method __init__ (line 47) | def __init__(self, predictor, specification=None, pgd_attack=None,
method _build (line 84) | def _build(self, labels):
method _build_nominal_loss (line 89) | def _build_nominal_loss(self, labels):
method _get_specification_bounds (line 111) | def _get_specification_bounds(self):
method _build_verified_loss (line 120) | def _build_verified_loss(self, labels):
method _build_attack_loss (line 169) | def _build_attack_loss(self, labels):
method scalar_metrics (line 194) | def scalar_metrics(self):
method scalar_losses (line 202) | def scalar_losses(self):
FILE: interval_bound_propagation/src/model.py
class VerifiableModelWrapper (line 59) | class VerifiableModelWrapper(snt.AbstractModule):
method __init__ (line 62) | def __init__(self, net_builder, name='verifiable_predictor'):
method wrapped_network (line 75) | def wrapped_network(self):
method output_size (line 79) | def output_size(self):
method logits (line 84) | def logits(self):
method inputs (line 89) | def inputs(self):
method input_wrappers (line 94) | def input_wrappers(self):
method modules (line 99) | def modules(self):
method dependencies (line 103) | def dependencies(self, module):
method output_module (line 108) | def output_module(self):
method fanout_of (line 112) | def fanout_of(self, node):
method _build (line 125) | def _build(self, *z0, **kwargs):
method _observer (line 178) | def _observer(self, subgraph):
method _inputs_for_observed_module (line 194) | def _inputs_for_observed_module(self, subgraph):
method _wrapper_for_observed_module (line 222) | def _wrapper_for_observed_module(self, subgraph):
method _backtrack (line 251) | def _backtrack(self, node, max_depth=100):
method _wrap_node (line 259) | def _wrap_node(self, node, **kwargs):
method _add_module (line 458) | def _add_module(self, wrapper, node, *input_nodes, **kwargs):
method propagate_bounds (line 475) | def propagate_bounds(self, *input_bounds):
class StandardModelWrapper (line 506) | class StandardModelWrapper(snt.AbstractModule):
method __init__ (line 509) | def __init__(self, net_builder, name='verifiable_predictor'):
method wrapped_network (line 525) | def wrapped_network(self):
method output_size (line 529) | def output_size(self):
method logits (line 534) | def logits(self):
method inputs (line 539) | def inputs(self):
method modules (line 544) | def modules(self):
method propagate_bounds (line 548) | def propagate_bounds(self, *input_bounds):
method _build (line 552) | def _build(self, *z0, **kwargs):
class DNN (line 583) | class DNN(snt.AbstractModule):
method __init__ (line 586) | def __init__(self, num_classes, layer_types, l2_regularization_scale=0.,
method _build (line 613) | def _build(self, z0, is_training=True, test_local_stats=False, reuse=F...
function _create_conv2d_initializer (line 675) | def _create_conv2d_initializer(
function _create_linear_initializer (line 684) | def _create_linear_initializer(input_size, output_size, dtype=tf.float32...
FILE: interval_bound_propagation/src/relative_bounds.py
class RelativeIntervalBounds (line 27) | class RelativeIntervalBounds(basic_bounds.AbstractBounds):
method __init__ (line 30) | def __init__(self, lower_offset, upper_offset, nominal):
method lower_offset (line 37) | def lower_offset(self):
method upper_offset (line 42) | def upper_offset(self):
method nominal (line 47) | def nominal(self):
method lower (line 51) | def lower(self):
method upper (line 56) | def upper(self):
method shape (line 61) | def shape(self):
method convert (line 65) | def convert(cls, bounds):
method apply_batch_reshape (line 74) | def apply_batch_reshape(self, wrapper, shape):
method apply_linear (line 90) | def apply_linear(self, wrapper, w, b):
method apply_conv1d (line 116) | def apply_conv1d(self, wrapper, w, b, padding, stride):
method apply_conv2d (line 149) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method apply_increasing_monotonic_fn (line 182) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
method apply_batch_norm (line 223) | def apply_batch_norm(self, wrapper, mean, variance, scale, bias, epsil...
method _set_up_cache (line 251) | def _set_up_cache(self):
function _maxpool_bounds (line 259) | def _maxpool_bounds(module, kernel_shape, strides, lb_in, ub_in,
function _activation_bounds (line 301) | def _activation_bounds(nl_fun, lb_in, ub_in, nominal_in, parameters=None):
FILE: interval_bound_propagation/src/simplex_bounds.py
class SimplexBounds (line 28) | class SimplexBounds(basic_bounds.AbstractBounds):
method __init__ (line 31) | def __init__(self, vertices, nominal, r):
method vertices (line 51) | def vertices(self):
method nominal (line 55) | def nominal(self):
method r (line 59) | def r(self):
method shape (line 63) | def shape(self):
method convert (line 67) | def convert(cls, bounds):
method apply_batch_reshape (line 73) | def apply_batch_reshape(self, wrapper, shape):
method apply_linear (line 83) | def apply_linear(self, wrapper, w, b):
method apply_conv1d (line 95) | def apply_conv1d(self, wrapper, w, b, padding, stride):
method apply_conv2d (line 123) | def apply_conv2d(self, wrapper, w, b, padding, strides):
method apply_increasing_monotonic_fn (line 150) | def apply_increasing_monotonic_fn(self, wrapper, fn, *args, **paramete...
function _simplex_bounds (line 172) | def _simplex_bounds(mapped_vertices, mapped_centres, r, axis):
FILE: interval_bound_propagation/src/specification.py
class Specification (line 34) | class Specification(snt.AbstractModule):
method __init__ (line 37) | def __init__(self, name, collapse=True):
method _build (line 42) | def _build(self, modules):
method evaluate (line 46) | def evaluate(self, logits):
method num_specifications (line 64) | def num_specifications(self):
method collapse (line 68) | def collapse(self):
class LinearSpecification (line 72) | class LinearSpecification(Specification):
method __init__ (line 75) | def __init__(self, c, d=None, prune_irrelevant=True, collapse=True):
method _build (line 97) | def _build(self, modules):
method evaluate (line 129) | def evaluate(self, logits):
method num_specifications (line 142) | def num_specifications(self):
method c (line 146) | def c(self):
method d (line 150) | def d(self):
class ClassificationSpecification (line 154) | class ClassificationSpecification(Specification):
method __init__ (line 161) | def __init__(self, label, num_classes, collapse=True):
method _build (line 174) | def _build(self, modules):
method evaluate (line 207) | def evaluate(self, logits):
method num_specifications (line 239) | def num_specifications(self):
method correct_idx (line 243) | def correct_idx(self):
method wrong_idx (line 247) | def wrong_idx(self):
method _build_indices (line 250) | def _build_indices(self, label, indices):
class TargetedClassificationSpecification (line 261) | class TargetedClassificationSpecification(ClassificationSpecification):
method __init__ (line 264) | def __init__(self, label, num_classes, target_class, collapse=True):
method target_class (line 281) | def target_class(self):
method num_specifications (line 286) | def num_specifications(self):
class RandomClassificationSpecification (line 290) | class RandomClassificationSpecification(TargetedClassificationSpecificat...
method __init__ (line 293) | def __init__(self, label, num_classes, num_targets=1, seed=None,
class LeastLikelyClassificationSpecification (line 306) | class LeastLikelyClassificationSpecification(
method __init__ (line 310) | def __init__(self, label, num_classes, logits, num_targets=1, collapse...
FILE: interval_bound_propagation/src/utils.py
function build_dataset (line 39) | def build_dataset(raw_data, batch_size=50, sequential=True):
function randomize (line 52) | def randomize(images, init_shape, expand_shape=None, crop_shape=None,
function linear_schedule (line 73) | def linear_schedule(step, init_step, final_step, init_value, final_value):
function smooth_schedule (line 84) | def smooth_schedule(step, init_step, final_step, init_value, final_value,
function build_loss_schedule (line 111) | def build_loss_schedule(step, warmup_steps, rampup_steps, init, final,
function add_image_normalization (line 143) | def add_image_normalization(model, mean, std):
function create_specification (line 149) | def create_specification(label, num_classes, logits,
function create_classification_losses (line 170) | def create_classification_losses(
function get_attack_builder (line 298) | def get_attack_builder(logits, label, name='UntargetedPGDAttack',
function create_attack (line 461) | def create_attack(attack_config, predictor, label, epsilon,
function parse_learning_rate (line 531) | def parse_learning_rate(step, learning_rate):
function _change_parameters (line 593) | def _change_parameters(attack_cls, **updated_kwargs):
function _get_random_class (line 600) | def _get_random_class(label, num_classes, seed=None):
function _get_least_likely_class (line 608) | def _get_least_likely_class(label, num_classes, logits):
function _maximize_cross_entropy (line 616) | def _maximize_cross_entropy(specification_bounds):
function _minimize_cross_entropy (line 632) | def _minimize_cross_entropy(specification_bounds):
function _maximize_margin (line 636) | def _maximize_margin(specification_bounds):
function _minimize_margin (line 641) | def _minimize_margin(specification_bounds):
function _all_smaller (line 645) | def _all_smaller(specification_bounds):
function _get_projection (line 650) | def _get_projection(p):
FILE: interval_bound_propagation/src/verifiable_wrapper.py
class VerifiableWrapper (line 33) | class VerifiableWrapper(object):
method __init__ (line 36) | def __init__(self, module):
method input_bounds (line 42) | def input_bounds(self):
method output_bounds (line 47) | def output_bounds(self):
method module (line 51) | def module(self):
method __str__ (line 54) | def __str__(self):
method propagate_bounds (line 65) | def propagate_bounds(self, *input_bounds):
method _propagate_through (line 78) | def _propagate_through(self, module, *input_bounds):
class ModelInputWrapper (line 90) | class ModelInputWrapper(object):
method __init__ (line 93) | def __init__(self, index):
method index (line 99) | def index(self):
method output_bounds (line 103) | def output_bounds(self):
method output_bounds (line 107) | def output_bounds(self, bounds):
method __str__ (line 110) | def __str__(self):
class ConstWrapper (line 114) | class ConstWrapper(VerifiableWrapper):
method _propagate_through (line 117) | def _propagate_through(self, module):
class LinearFCWrapper (line 122) | class LinearFCWrapper(VerifiableWrapper):
method __init__ (line 125) | def __init__(self, module):
method _propagate_through (line 130) | def _propagate_through(self, module, input_bounds):
class LinearConvWrapper (line 136) | class LinearConvWrapper(VerifiableWrapper):
class LinearConv1dWrapper (line 140) | class LinearConv1dWrapper(LinearConvWrapper):
method __init__ (line 143) | def __init__(self, module):
method _propagate_through (line 149) | def _propagate_through(self, module, input_bounds):
class LinearConv2dWrapper (line 157) | class LinearConv2dWrapper(LinearConvWrapper):
method __init__ (line 160) | def __init__(self, module):
method _propagate_through (line 166) | def _propagate_through(self, module, input_bounds):
class IncreasingMonotonicWrapper (line 174) | class IncreasingMonotonicWrapper(VerifiableWrapper):
method __init__ (line 177) | def __init__(self, module, **parameters):
method parameters (line 182) | def parameters(self):
method _propagate_through (line 185) | def _propagate_through(self, module, main_bounds, *other_input_bounds):
class SoftmaxWrapper (line 191) | class SoftmaxWrapper(VerifiableWrapper):
method __init__ (line 194) | def __init__(self):
method _propagate_through (line 197) | def _propagate_through(self, module, input_bounds):
class PiecewiseMonotonicWrapper (line 201) | class PiecewiseMonotonicWrapper(VerifiableWrapper):
method __init__ (line 204) | def __init__(self, module, boundaries=()):
method boundaries (line 209) | def boundaries(self):
method _propagate_through (line 212) | def _propagate_through(self, module, main_bounds, *other_input_bounds):
class ImageNormWrapper (line 218) | class ImageNormWrapper(IncreasingMonotonicWrapper):
method __init__ (line 221) | def __init__(self, module):
method inner_module (line 228) | def inner_module(self):
class BatchNormWrapper (line 232) | class BatchNormWrapper(VerifiableWrapper):
method __init__ (line 235) | def __init__(self, module):
method _propagate_through (line 241) | def _propagate_through(self, module, input_bounds):
class BatchReshapeWrapper (line 275) | class BatchReshapeWrapper(VerifiableWrapper):
method __init__ (line 278) | def __init__(self, module, shape):
method shape (line 286) | def shape(self):
method _propagate_through (line 289) | def _propagate_through(self, module, input_bounds):
class BatchFlattenWrapper (line 293) | class BatchFlattenWrapper(BatchReshapeWrapper):
method __init__ (line 296) | def __init__(self, module):
FILE: interval_bound_propagation/tests/attacks_test.py
class MockWithIsTraining (line 29) | class MockWithIsTraining(object):
method __init__ (line 32) | def __init__(self, module, test):
method __call__ (line 36) | def __call__(self, z0, is_training=False):
class MockWithoutIsTraining (line 42) | class MockWithoutIsTraining(object):
method __init__ (line 45) | def __init__(self, module, test):
method __call__ (line 49) | def __call__(self, z0):
class AttacksTest (line 53) | class AttacksTest(parameterized.TestCase, tf.test.TestCase):
method testEndToEnd (line 72) | def testEndToEnd(self, predictor_cls, attack_cls, optimizer_cls, epsilon,
FILE: interval_bound_propagation/tests/bounds_test.py
class IntervalBoundsTest (line 29) | class IntervalBoundsTest(parameterized.TestCase, tf.test.TestCase):
method testFCIntervalBounds (line 31) | def testFCIntervalBounds(self):
method testConv1dIntervalBounds (line 49) | def testConv1dIntervalBounds(self):
method testConv2dIntervalBounds (line 74) | def testConv2dIntervalBounds(self):
method testReluIntervalBounds (line 99) | def testReluIntervalBounds(self):
method testMulIntervalBounds (line 110) | def testMulIntervalBounds(self):
method testSubIntervalBounds (line 121) | def testSubIntervalBounds(self):
method testSoftmaxIntervalBounds (line 137) | def testSoftmaxIntervalBounds(self, axis, expected_outputs):
method testBatchNormIntervalBounds (line 152) | def testBatchNormIntervalBounds(self):
method testCaching (line 173) | def testCaching(self):
FILE: interval_bound_propagation/tests/crown_test.py
function _generate_identity_spec (line 28) | def _generate_identity_spec(modules, shape, dimension=1):
class CROWNBoundsTest (line 35) | class CROWNBoundsTest(tf.test.TestCase):
method testFCBackwardBounds (line 37) | def testFCBackwardBounds(self):
method testConv2dBackwardBounds (line 67) | def testConv2dBackwardBounds(self):
method testReluBackwardBounds (line 95) | def testReluBackwardBounds(self):
FILE: interval_bound_propagation/tests/fastlin_test.py
class SymbolicBoundsTest (line 29) | class SymbolicBoundsTest(parameterized.TestCase, tf.test.TestCase):
method testConvertSymbolicBounds (line 31) | def testConvertSymbolicBounds(self):
method testFCSymbolicBounds (line 41) | def testFCSymbolicBounds(self):
method testConv2dSymbolicBounds (line 70) | def testConv2dSymbolicBounds(self):
method testConv1dSymbolicBounds (line 97) | def testConv1dSymbolicBounds(self):
method testReluSymbolicBounds (line 124) | def testReluSymbolicBounds(self):
FILE: interval_bound_propagation/tests/layers_test.py
function _get_inputs (line 27) | def _get_inputs(dtype=tf.float32):
class LayersTest (line 34) | class LayersTest(tf.test.TestCase):
method assertBetween (line 36) | def assertBetween(self, value, minv, maxv):
method testBatchNormUpdateImproveStatistics (line 42) | def testBatchNormUpdateImproveStatistics(self):
method testImageNorm (line 62) | def testImageNorm(self):
FILE: interval_bound_propagation/tests/loss_test.py
class FixedNN (line 27) | class FixedNN(snt.AbstractModule):
method _build (line 29) | def _build(self, z0, is_training=False):
class LossTest (line 37) | class LossTest(tf.test.TestCase):
method testEndToEnd (line 39) | def testEndToEnd(self):
FILE: interval_bound_propagation/tests/model_test.py
function _build_model (line 30) | def _build_model():
class ModelTest (line 40) | class ModelTest(parameterized.TestCase, tf.test.TestCase):
method testDNN (line 42) | def testDNN(self):
method _propagation_test (line 60) | def _propagation_test(self, wrapper, inputs, outputs):
method testVerifiableModelWrapperDNN (line 69) | def testVerifiableModelWrapperDNN(self):
method testVerifiableModelWrapperResnet (line 103) | def testVerifiableModelWrapperResnet(self):
method testVerifiableModelWrapperPool (line 125) | def testVerifiableModelWrapperPool(self):
method testVerifiableModelWrapperConcat (line 139) | def testVerifiableModelWrapperConcat(self):
method testVerifiableModelWrapperExpandAndSqueeze (line 152) | def testVerifiableModelWrapperExpandAndSqueeze(self):
method testVerifiableModelWrapperSimple (line 175) | def testVerifiableModelWrapperSimple(self, fn, expected_modules):
method testPointlessReshape (line 188) | def testPointlessReshape(self):
method testLeakyRelu (line 204) | def testLeakyRelu(self):
method testMultipleInputs (line 219) | def testMultipleInputs(self):
FILE: interval_bound_propagation/tests/relative_bounds_test.py
class RelativeIntervalBoundsTest (line 30) | class RelativeIntervalBoundsTest(tf.test.TestCase, parameterized.TestCase):
method test_linear_bounds_shape (line 34) | def test_linear_bounds_shape(self, dtype):
method test_linear_bounds (line 56) | def test_linear_bounds(self, dtype, tol):
method test_conv2d_bounds_shape (line 78) | def test_conv2d_bounds_shape(self, dtype):
method test_conv2d_bounds (line 116) | def test_conv2d_bounds(self, dtype, tol):
method test_conv1d_bounds_shape (line 157) | def test_conv1d_bounds_shape(self, dtype):
method test_conv1d_bounds (line 192) | def test_conv1d_bounds(self, dtype, tol):
method test_batchnorm_bounds (line 236) | def test_batchnorm_bounds(self, batchnorm_class, dtype, tol, is_traini...
function _materialised_conv_bounds (line 294) | def _materialised_conv_bounds(w, b, padding, strides, bounds_in):
FILE: interval_bound_propagation/tests/simplex_bounds_test.py
class SimplexBoundsTest (line 29) | class SimplexBoundsTest(tf.test.TestCase, parameterized.TestCase):
method test_linear_simplex_bounds_shape (line 33) | def test_linear_simplex_bounds_shape(self, dtype):
method test_linear_bounds_on_embedding_layer (line 56) | def test_linear_bounds_on_embedding_layer(self, dtype, tol):
method test_conv1d_simplex_bounds_shape (line 81) | def test_conv1d_simplex_bounds_shape(self, dtype):
method test_conv1d_simplex_bounds (line 116) | def test_conv1d_simplex_bounds(self, dtype, tol):
function _materialised_conv_simplex_bounds (line 152) | def _materialised_conv_simplex_bounds(w, b, padding, strides, bounds_in):
FILE: interval_bound_propagation/tests/specification_test.py
function _build_spec_input (line 34) | def _build_spec_input():
function _build_classification_specification (line 49) | def _build_classification_specification(label, num_classes, collapse):
class SpecificationTest (line 67) | class SpecificationTest(tf.test.TestCase):
method testLinearSpecification (line 69) | def testLinearSpecification(self):
method testEquivalenceLinearClassification (line 84) | def testEquivalenceLinearClassification(self):
FILE: setup.py
function ibp_test_suite (line 35) | def ibp_test_suite():
Condensed preview: 42 files, each showing path, character count, and a content snippet.
[
{
"path": "CONTRIBUTING.md",
"chars": 1101,
"preview": "# How to Contribute\n\nWe'd love to accept your patches and contributions to this project. There are\njust a few small guid"
},
{
"path": "LICENSE",
"chars": 11358,
"preview": "\n Apache License\n Version 2.0, January 2004\n "
},
{
"path": "README.md",
"chars": 5184,
"preview": "# Interval Bound Propagation for Training Verifiably Robust Models\n\nThis repository contains a simple implementation of "
},
{
"path": "examples/eval.py",
"chars": 9366,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/README.md",
"chars": 3357,
"preview": "# Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation\n\nHere contains an implementation "
},
{
"path": "examples/language/config.py",
"chars": 2366,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/data/character_substitution_enkey_sub1.json",
"chars": 313,
"preview": "{\"z\": [\"x\"], \"y\": [\"t\"], \"x\": [\"s\"], \"w\": [\"d\"], \"v\": [\"c\"], \"u\": [\"8\"], \"t\": [\"f\"], \"s\": [\"e\"], \"r\": [\"g\"], \"q\": [\"s\"],"
},
{
"path": "examples/language/data/sst_binary_character_vocabulary_sorted.txt",
"chars": 127,
"preview": " \n!\n#\n$\n%\n&\n'\n(\n)\n*\n+\n,\n-\n.\n/\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n:\n;\n=\n?\n`\na\nb\nc\nd\ne\nf\ng\nh\ni\nj\nk\nl\nm\nn\no\np\nq\nr\ns\nt\nu\nv\nw\nx\ny\nz\n\n\n\n\n\n\n\n\n"
},
{
"path": "examples/language/data/sst_binary_character_vocabulary_sorted_pad.txt",
"chars": 133,
"preview": "<PAD>\n \n!\n#\n$\n%\n&\n'\n(\n)\n*\n+\n,\n-\n.\n/\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n:\n;\n=\n?\n`\na\nb\nc\nd\ne\nf\ng\nh\ni\nj\nk\nl\nm\nn\no\np\nq\nr\ns\nt\nu\nv\nw\nx\ny\nz\n\n\n"
},
{
"path": "examples/language/exhaustive_verification.py",
"chars": 15347,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/interactive_example.py",
"chars": 6220,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/models.py",
"chars": 3035,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/robust_model.py",
"chars": 31845,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/robust_train.py",
"chars": 19275,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/language/utils.py",
"chars": 10260,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "examples/train.py",
"chars": 10867,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/__init__.py",
"chars": 5157,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/__init__.py",
"chars": 786,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/attacks.py",
"chars": 36662,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/bounds.py",
"chars": 7795,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/crown.py",
"chars": 18913,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/fastlin.py",
"chars": 16201,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/layer_utils.py",
"chars": 10655,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/layers.py",
"chars": 5266,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/loss.py",
"chars": 8325,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/model.py",
"chars": 24484,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/relative_bounds.py",
"chars": 14034,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/simplex_bounds.py",
"chars": 7609,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/specification.py",
"chars": 12165,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/utils.py",
"chars": 26644,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/src/verifiable_wrapper.py",
"chars": 9206,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/attacks_test.py",
"chars": 4312,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/bounds_test.py",
"chars": 8285,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/crown_test.py",
"chars": 4319,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/fastlin_test.py",
"chars": 5231,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/layers_test.py",
"chars": 2758,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/loss_test.py",
"chars": 3197,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/model_test.py",
"chars": 8855,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/relative_bounds_test.py",
"chars": 14261,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/simplex_bounds_test.py",
"chars": 8114,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "interval_bound_propagation/tests/specification_test.py",
"chars": 5279,
"preview": "# coding=utf-8\n# Copyright 2019 The Interval Bound Propagation Authors.\n#\n# Licensed under the Apache License, Version 2"
},
{
"path": "setup.py",
"chars": 1885,
"preview": "# Copyright 2018 The Interval Bound Propagation Authors. All Rights Reserved.\n#\n# Licensed under the Apache License, Ver"
}
]
About this extraction
This page contains the full source code of the deepmind/interval-bound-propagation GitHub repository, extracted and formatted as plain text. The extraction includes 42 files (391.2 KB), approximately 97.6k tokens, and a symbol index with 506 extracted functions, classes, methods, constants, and types.