Repository: PacktPublishing/Deep-Learning-for-Computer-Vision
Branch: master
Commit: b8dfcf3f4860
Files: 52
Total size: 146.6 KB
Directory structure:
gitextract_77peer6f/
├── .gitignore
├── Chapter01/
│ ├── 1_hello_tensorflow.py
│ ├── 2_add.py
│ └── 3_add_tensorboard.py
├── Chapter02/
│ ├── 1_mnist_tf_perceptron.py
│ ├── 2_mnist_cnn.py
│ ├── 3_mnist_keras.py
│ ├── 4_cat_vs_dog_data_prep.py
│ ├── 5_cat_vs_dog_cnn.py
│ ├── 6_cat_vs_dog_augmentation.py
│ ├── 7_cat_vs_dog_bottleneck.py
│ └── 8_cat_vs_dog_fine_tune.py
├── Chapter03/
│ ├── 1_embedding_vis.py
│ ├── 2_guided_back_prop.py
│ ├── 3_deep_dream.py
│ ├── 4_export_model.py
│ ├── 5_serving_client.py
│ ├── 6_bottleneck_features.py
│ ├── 7_annoy.py
│ ├── 8_auto_encoder.py
│ └── 9_denoising.py
├── Chapter04/
│ ├── 1_iou.py
│ ├── 2_overfeat.py
│ ├── 3_object_detection_api.py
│ ├── 4_yolo.py
│ └── pascal_voc.py
├── Chapter05/
│ ├── 1_segnet.py
│ ├── 2_nerve_segmentation.py
│ ├── 3_satellite.py
│ └── data.py
├── Chapter06/
│ ├── 1_contrastive_loss.py
│ ├── 2_siamese_network.py
│ ├── 3_triplet_loss.py
│ ├── 4_triplet_mining.py
│ ├── 5_fiducial_points.py
│ └── 6_extract_features.py
├── Chapter07/
│ └── 1_caption_attention.py
├── Chapter08/
│ ├── 1_style_transfer.py
│ ├── 2_vanilla_gan.py
│ ├── 3_conditional_gan.py
│ ├── 4_adverserial_loss.py
│ ├── 5_image_translation.py
│ ├── 6_infogan.py
│ ├── utils.py
│ └── vgg16_avg.py
├── Chapter09/
│ ├── 1_video_to_frames_1.py
│ ├── 2_parallel_stream.py
│ ├── 3_lstm_after_cnn.py
│ └── 4_3d_convolution.py
├── Chapter10/
│ └── 1_ios.py
├── LICENSE
└── README.md
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*MNIST_data*
Chapter02/test/*
Chapter02/train/*
Chapter03/inception5h.zip
Chapter03/classify_image_graph_def.pb
Chapter03/imagenet_2012_challenge_label_map_proto.pbtxt
Chapter03/imagenet_comp_graph_label_strings.txt
Chapter03/imagenet_synset_to_human_label_map.txt
Chapter03/inception-2015-12-05.tgz
Chapter03/LICENSE
Chapter03/tensorflow_inception_graph.pb
Chapter03/stitched_filters_3x3.png
Chapter03/cropped_panda.jpg
Chapter08/gen_*
.idea/*
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# dotenv
.env
# virtualenv
.venv
venv/
ENV/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
================================================
FILE: Chapter01/1_hello_tensorflow.py
================================================
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
session = tf.Session()
print(session.run(hello))
================================================
FILE: Chapter01/2_add.py
================================================
import tensorflow as tf
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
z = x + y
session = tf.Session()
values = {x: 5.0, y: 4.0}
result = session.run([z], values)
print(result)
================================================
FILE: Chapter01/3_add_tensorboard.py
================================================
import tensorflow as tf
x = tf.placeholder(tf.float32, name='x')
y = tf.placeholder(tf.float32, name='y')
z = tf.add(x, y, name='sum')
session = tf.Session()
summary_writer = tf.summary.FileWriter('/tmp/1', session.graph)
summary_writer.flush()
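# Visualise the exported graph with: tensorboard --logdir=/tmp/1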
================================================
FILE: Chapter02/1_mnist_tf_perceptron.py
================================================
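# Single-layer softmax classifier (perceptron) on MNIST, trained with plain gradient descent.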
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 200
x_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
weights = tf.Variable(tf.random_normal([input_size, no_classes]))
bias = tf.Variable(tf.random_normal([no_classes]))
logits = tf.matmul(x_input, weights) + bias
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_input, logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy)
optimiser = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, train_labels = mnist_batch[0], mnist_batch[1]
_, loss_value = session.run([optimiser, loss_operation], feed_dict={x_input: train_images,
y_input: train_labels})
print(loss_value)
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
test_images, test_labels = mnist_data.test.images, mnist_data.test.labels
accuracy_value = session.run(accuracy_operation, feed_dict={x_input: test_images,
y_input: test_labels})
print('Accuracy : ', accuracy_value)
session.close()
================================================
FILE: Chapter02/2_mnist_cnn.py
================================================
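# Convolutional MNIST classifier that logs layer statistics, loss and accuracy to TensorBoard (/tmp/train and /tmp/test).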
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 200
x_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1],
name='input_reshape')
def convolution_layer(input_layer, filters, kernel_size=[3, 3],
activation=tf.nn.relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer, units, activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
convolution_layer_1 = convolution_layer(x_input_reshape, 64)
pooling_layer_1 = pooling_layer(convolution_layer_1)
convolution_layer_2 = convolution_layer(pooling_layer_1, 128)
pooling_layer_2 = pooling_layer(convolution_layer_2)
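# Two 3x3 'valid' convolutions and two 2x2 poolings shrink the 28x28 input to 5x5 with 128 channels.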
flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128],
name='flattened_pool')
dense_layer_bottleneck = dense_layer(flattened_pool, 1024)
dropout_bool = tf.placeholder(tf.bool)
dropout_layer = tf.layers.dropout(
inputs=dense_layer_bottleneck,
rate=0.4,
training=dropout_bool
)
logits = dense_layer(dropout_layer, no_classes)
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=y_input, logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
with tf.name_scope('accuracy'):
accuracy_operation = tf.reduce_mean(
tf.cast(correct_predictions, tf.float32))
tf.summary.scalar('accuracy', accuracy_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')
test_images, test_labels = mnist_data.test.images, mnist_data.test.labels
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, train_labels = mnist_batch[0], mnist_batch[1]
_, merged_summary = session.run([optimiser, merged_summary_operation],
feed_dict={
x_input: train_images,
y_input: train_labels,
dropout_bool: True
})
train_summary_writer.add_summary(merged_summary, batch_no)
if batch_no % 10 == 0:
merged_summary, _ = session.run([merged_summary_operation,
accuracy_operation], feed_dict={
x_input: test_images,
y_input: test_labels,
dropout_bool: False
})
test_summary_writer.add_summary(merged_summary, batch_no)
================================================
FILE: Chapter02/3_mnist_keras.py
================================================
import tensorflow as tf
batch_size = 128
no_classes = 10
epochs = 50
image_height, image_width = 28, 28
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], image_height, image_width, 1)
x_test = x_test.reshape(x_test.shape[0], image_height, image_width, 1)
input_shape = (image_height, image_width, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = tf.keras.utils.to_categorical(y_train, no_classes)
y_test = tf.keras.utils.to_categorical(y_test, no_classes)
def simple_cnn(input_shape):
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(
filters=64,
kernel_size=(3, 3),
activation='relu',
input_shape=input_shape
))
model.add(tf.keras.layers.Conv2D(
filters=128,
kernel_size=(3, 3),
activation='relu'
))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=1024, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax'))
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
return model
simple_cnn_model = simple_cnn(input_shape)
simple_cnn_model.fit(x_train, y_train, batch_size, epochs, validation_data=(x_test, y_test))
train_loss, train_accuracy = simple_cnn_model.evaluate(
x_train, y_train, verbose=0)
print('Train data loss:', train_loss)
print('Train data accuracy:', train_accuracy)
test_loss, test_accuracy = simple_cnn_model.evaluate(
x_test, y_test, verbose=0)
print('Test data loss:', test_loss)
print('Test data accuracy:', test_accuracy)
================================================
FILE: Chapter02/4_cat_vs_dog_data_prep.py
================================================
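# Splits the Kaggle Dogs vs. Cats 'train' folder (dog.0.jpg, cat.0.jpg, ...) into data/train and data/test subsets; set work_dir to the dataset location.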
import os
import shutil
work_dir = ''
image_names = sorted(os.listdir(os.path.join(work_dir, 'train')))
def copy_files(prefix_str, range_start, range_end, target_dir):
image_paths = [os.path.join(work_dir, 'train', prefix_str + '.' + str(i) + '.jpg')
for i in range(range_start, range_end)]
dest_dir = os.path.join(work_dir, 'data', target_dir, prefix_str)
os.makedirs(dest_dir)
for image_path in image_paths:
shutil.copy(image_path, dest_dir)
copy_files('dog', 0, 1000, 'train')
copy_files('cat', 0, 1000, 'train')
copy_files('dog', 1000, 1400, 'test')
copy_files('cat', 1000, 1400, 'test')
================================================
FILE: Chapter02/5_cat_vs_dog_cnn.py
================================================
import numpy as np
import os
import tensorflow as tf
work_dir = ''
image_height, image_width = 150, 150
train_dir = os.path.join(work_dir, 'train')
test_dir = os.path.join(work_dir, 'test')
no_classes = 2
no_validation = 800
epochs = 2
batch_size = 200
no_train = 2000
no_test = 800
input_shape = (image_height, image_width, 3)
epoch_steps = no_train // batch_size
test_steps = no_test // batch_size
def simple_cnn(input_shape):
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(
filters=64,
kernel_size=(3, 3),
activation='relu',
input_shape=input_shape
))
model.add(tf.keras.layers.Conv2D(
filters=128,
kernel_size=(3, 3),
activation='relu'
))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=1024, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax'))
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
return model
simple_cnn_model = simple_cnn(input_shape)
generator_train = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
train_images = generator_train.flow_from_directory(
train_dir,
batch_size=batch_size,
target_size=(image_width, image_height))
test_images = generator_test.flow_from_directory(
test_dir,
batch_size=batch_size,
target_size=(image_width, image_height))
simple_cnn_model.fit_generator(
train_images,
steps_per_epoch=epoch_steps,
epochs=epochs,
validation_data=test_images,
validation_steps=test_steps)
================================================
FILE: Chapter02/6_cat_vs_dog_augmentation.py
================================================
import tensorflow as tf
import os
work_dir = ''
image_height, image_width = 150, 150
train_dir = os.path.join(work_dir, 'train')
test_dir = os.path.join(work_dir, 'test')
no_classes = 2
no_validation = 800
epochs = 50
batch_size = 32
no_train = 2000
no_test = 800
input_shape = (image_height, image_width, 3)
epoch_steps = no_train // batch_size
test_steps = no_test // batch_size
def simple_cnn(input_shape):
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(
filters=64,
kernel_size=(3, 3),
activation='relu',
input_shape=input_shape
))
model.add(tf.keras.layers.Conv2D(
filters=128,
kernel_size=(3, 3),
activation='relu'
))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=1024, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax'))
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
return model
simple_cnn_model = simple_cnn(input_shape)
generator_train = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1. / 255,
horizontal_flip=True,
zoom_range=0.3,
shear_range=0.3,)
generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
train_images = generator_train.flow_from_directory(
train_dir,
batch_size=batch_size,
target_size=(image_width, image_height))
test_images = generator_test.flow_from_directory(
test_dir,
batch_size=batch_size,
target_size=(image_width, image_height))
simple_cnn_model.fit_generator(
train_images,
steps_per_epoch=epoch_steps,
epochs=epochs,
validation_data=test_images,
validation_steps=test_steps)
================================================
FILE: Chapter02/7_cat_vs_dog_bottleneck.py
================================================
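# Extracts VGG16 bottleneck features for the cat/dog images and trains a small classifier on top of them.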
import numpy as np
import os
import tensorflow as tf
work_dir = ''
image_height, image_width = 150, 150
train_dir = os.path.join(work_dir, 'train')
test_dir = os.path.join(work_dir, 'test')
no_classes = 2
no_validation = 800
epochs = 50
batch_size = 32
no_train = 2000
no_test = 800
input_shape = (image_height, image_width, 3)
epoch_steps = no_train // batch_size
test_steps = no_test // batch_size
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
model = tf.keras.applications.VGG16(include_top=False)
train_images = generator.flow_from_directory(
train_dir,
batch_size=batch_size,
target_size=(image_width, image_height),
class_mode=None,
shuffle=False
)
train_bottleneck_features = model.predict_generator(train_images, epoch_steps)
test_images = generator.flow_from_directory(
test_dir,
batch_size=batch_size,
target_size=(image_width, image_height),
class_mode=None,
shuffle=False
)
test_bottleneck_features = model.predict_generator(test_images, test_steps)
# predict_generator yields epoch_steps * batch_size samples, which can be slightly fewer
# than no_train, so the label arrays are truncated to match the extracted features.
train_labels = np.array([0] * int(no_train / 2) + [1] * int(no_train / 2))[:len(train_bottleneck_features)]
test_labels = np.array([0] * int(no_test / 2) + [1] * int(no_test / 2))[:len(test_bottleneck_features)]
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=train_bottleneck_features.shape[1:]))
model.add(tf.keras.layers.Dense(1024, activation='relu'))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss=tf.keras.losses.binary_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
model.fit(
train_bottleneck_features,
train_labels,
batch_size=batch_size,
epochs=epochs,
validation_data=(test_bottleneck_features, test_labels))
================================================
FILE: Chapter02/8_cat_vs_dog_fine_tune.py
================================================
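# Fine-tunes VGG16 on cats vs. dogs: a small dense classifier (expected to have been trained
# separately and saved at top_model_weights_path) is stacked on the convolutional base, and only
# the last convolutional block plus the classifier are retrained.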
import tensorflow as tf
import os
work_dir = ''
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'
image_height, image_width = 150, 150
train_dir = os.path.join(work_dir, 'train')
test_dir = os.path.join(work_dir, 'test')
no_classes = 2
no_validation = 800
epochs = 50
batch_size = 32
no_train = 2000
no_test = 800
input_shape = (image_height, image_width, 3)
epoch_steps = no_train // batch_size
test_steps = no_test // batch_size
vgg_base = tf.keras.applications.VGG16(include_top=False, input_shape=input_shape)
model_fine_tune = tf.keras.models.Sequential()
model_fine_tune.add(tf.keras.layers.Flatten(input_shape=vgg_base.output_shape[1:]))
model_fine_tune.add(tf.keras.layers.Dense(256, activation='relu'))
model_fine_tune.add(tf.keras.layers.Dropout(0.5))
model_fine_tune.add(tf.keras.layers.Dense(no_classes, activation='softmax'))
model_fine_tune.load_weights(top_model_weights_path)
# Stack the pre-trained convolutional base and the small classifier into one model.
model = tf.keras.models.Model(inputs=vgg_base.input,
                              outputs=model_fine_tune(vgg_base.output))
# Freeze everything up to the last convolutional block (block5) of VGG16.
for vgg_layer in model.layers[:15]:
    vgg_layer.trainable = False
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
generator_train = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1. / 255,
horizontal_flip=True,
zoom_range=0.3,
shear_range=0.3
)
generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
generator_train = generator_train.flow_from_directory(
train_dir,
batch_size=batch_size,
target_size=(image_width, image_height)
)
generator_test = generator_test.flow_from_directory(
test_dir,
batch_size=batch_size,
target_size=(image_width, image_height)
)
model.fit_generator(
generator_train,
steps_per_epoch=epoch_steps,
epochs=epochs,
validation_data=generator_test,
validation_steps=test_steps
)
================================================
FILE: Chapter03/1_embedding_vis.py
================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
import numpy as np
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 100
x_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1], name='input_reshape')
convolution_layer_1 = tf.layers.conv2d(
inputs=x_input_reshape,
filters=64,
kernel_size=[3, 3],
activation=tf.nn.relu,
)
add_variable_summary(convolution_layer_1, 'convolution1')
pooling_layer_1 = tf.layers.max_pooling2d(
inputs=convolution_layer_1,
pool_size=[2, 2],
strides=2
)
add_variable_summary(pooling_layer_1, 'pooling1')
convolution_layer_2 = tf.layers.conv2d(
inputs=pooling_layer_1,
filters=128,
kernel_size=[3, 3],
activation=tf.nn.relu,
)
add_variable_summary(convolution_layer_2, 'convolution2')
pooling_layer_2 = tf.layers.max_pooling2d(
inputs=convolution_layer_2,
pool_size=[2, 2],
strides=2
)
add_variable_summary(pooling_layer_2, 'pool2')
flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128], name='flattened_pool')
dense_layer = tf.layers.dense(
inputs=flattened_pool,
units=1024,
activation=tf.nn.relu,
name='dense'
)
add_variable_summary(dense_layer, 'dense')
dropout_bool = tf.placeholder(tf.bool)
dropout_layer = tf.layers.dropout(
inputs=dense_layer,
rate=0.4,
training=dropout_bool,
name='dropout'
)
logits = tf.layers.dense(inputs=dropout_layer, units=no_classes, name='logits')
add_variable_summary(logits, 'logits')
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_input,
logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
with tf.name_scope('accuracy'):
accuracy_operation = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
tf.summary.scalar('accuracy', accuracy_operation)
session = tf.Session()
# Adding the variable in between creating the session and initialising the graph
no_embedding_data = 1000
embedding_variable = tf.Variable(tf.stack(
mnist.test.images[:no_embedding_data], axis=0), trainable=False)
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_images, test_labels = mnist.test.images, mnist.test.labels
for batch_no in range(total_batches):
image_batch = mnist.train.next_batch(100)
_, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={
x_input: image_batch[0],
y_input: image_batch[1],
dropout_bool: True
})
train_summary_writer.add_summary(merged_summary, batch_no)
work_dir = '' # change path
metadata_path = '/tmp/train/metadata.tsv'
with open(metadata_path, 'w') as metadata_file:
for i in range(no_embedding_data):
        metadata_file.write('{}\n'.format(np.argmax(mnist.test.labels[i])))
from tensorflow.contrib.tensorboard.plugins import projector
projector_config = projector.ProjectorConfig()
embedding_projection = projector_config.embeddings.add()
embedding_projection.tensor_name = embedding_variable.name
embedding_projection.metadata_path = metadata_path
embedding_projection.sprite.image_path = os.path.join(work_dir + '/mnist_10k_sprite.png')
embedding_projection.sprite.single_image_dim.extend([28, 28])
projector.visualize_embeddings(train_summary_writer, projector_config)
tf.train.Saver().save(session, '/tmp/train/model.ckpt', global_step=1)
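# Run tensorboard --logdir=/tmp/train and open the Projector tab to explore the embedding;
# the sprite image mnist_10k_sprite.png is expected at the configured work_dir.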
================================================
FILE: Chapter03/2_guided_back_prop.py
================================================
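# Visualises block5_conv1 filters of VGG16 by gradient ascent on the input image (activation maximisation).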
from scipy.misc import imsave
import numpy as np
import tensorflow as tf
image_width, image_height = 128, 128
vgg_model = tf.keras.applications.vgg16.VGG16(include_top=False)
input_image = vgg_model.input
vgg_layer_dict = dict([(vgg_layer.name, vgg_layer) for vgg_layer in vgg_model.layers[1:]])
vgg_layer_output = vgg_layer_dict['block5_conv1'].output
filters = []
for filter_idx in range(20):
loss = tf.keras.backend.mean(vgg_layer_output[:, :, :, filter_idx])
gradients = tf.keras.backend.gradients(loss, input_image)[0]
gradient_mean_square = tf.keras.backend.mean(tf.keras.backend.square(gradients))
gradients /= (tf.keras.backend.sqrt(gradient_mean_square) + 1e-5)
evaluator = tf.keras.backend.function([input_image], [loss, gradients])
gradient_ascent_step = 1.
input_image_data = np.random.random((1, image_width, image_height, 3))
input_image_data = (input_image_data - 0.5) * 20 + 128
for i in range(20):
loss_value, gradient_values = evaluator([input_image_data])
input_image_data += gradient_values * gradient_ascent_step
# print('Loss :', loss_value)
if loss_value <= 0.:
break
if loss_value > 0:
filter = input_image_data[0]
filter -= filter.mean()
filter /= (filter.std() + 1e-5)
filter *= 0.1
filter += 0.5
filter = np.clip(filter, 0, 1)
filter *= 255
filter = np.clip(filter, 0, 255).astype('uint8')
filters.append((filter, loss_value))
# For visualisation, not in book
n = 3
filters.sort(key=lambda x: x[1], reverse=True)
filters = filters[:n * n]
margin = 5
width = n * image_width + (n - 1) * margin
height = n * image_height + (n - 1) * margin
stitched_filters = np.zeros((width, height, 3))
for i in range(n):
for j in range(n):
img, loss = filters[i * n + j]
stitched_filters[(image_width + margin) * i: (image_width + margin) * i + image_width,
(image_height + margin) * j: (image_height + margin) * j + image_height, :] = img
imsave('stitched_filters_%dx%d.png' % (n, n), stitched_filters)
================================================
FILE: Chapter03/3_deep_dream.py
================================================
import os
import numpy as np
import PIL.Image
import urllib.request
from tensorflow.python.platform import gfile
import zipfile
import tensorflow as tf
work_dir = ''
model_url = 'https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip'
file_name = model_url.split('/')[-1]
file_path = os.path.join(work_dir, file_name)
if not os.path.exists(file_path):
file_path, _ = urllib.request.urlretrieve(model_url, file_path)
zip_handle = zipfile.ZipFile(file_path, 'r')
zip_handle.extractall(work_dir)
zip_handle.close()
graph = tf.Graph()
session = tf.InteractiveSession(graph=graph)
model_path = os.path.join(work_dir, 'tensorflow_inception_graph.pb')
with gfile.FastGFile(model_path, 'rb') as f:
    graph_definition = tf.GraphDef()
    graph_definition.ParseFromString(f.read())
input_placeholder = tf.placeholder(np.float32, name='input')
imagenet_mean_value = 117.0
preprocessed_input = tf.expand_dims(input_placeholder-imagenet_mean_value, 0)
tf.import_graph_def(graph_definition, {'input': preprocessed_input})
def resize_image(image, size):
resize_placeholder = tf.placeholder(tf.float32)
resize_placeholder_expanded = tf.expand_dims(resize_placeholder, 0)
resized_image = tf.image.resize_bilinear(resize_placeholder_expanded, size)[0, :, :, :]
return session.run(resized_image, feed_dict={resize_placeholder: image})
image_name = 'mountain.jpg'
image = PIL.Image.open(image_name)
image = np.float32(image)
objective_fn = tf.square(graph.get_tensor_by_name("import/mixed4c:0"))
no_octave = 4
scale = 1.4
window_size = 51
score = tf.reduce_mean(objective_fn)
gradients = tf.gradients(score, input_placeholder)[0]
octave_images = []
for i in range(no_octave - 1):
image_height_width = image.shape[:2]
scaled_image = resize_image(image, np.int32(np.float32(image_height_width) / scale))
image_difference = image - resize_image(scaled_image, image_height_width)
image = scaled_image
octave_images.append(image_difference)
for octave_idx in range(no_octave):
if octave_idx > 0:
image_difference = octave_images[-octave_idx]
image = resize_image(image, image_difference.shape[:2]) + image_difference
for i in range(10):
        image_height, image_width = image.shape[:2]
        sx, sy = np.random.randint(window_size, size=2)
        shifted_image = np.roll(np.roll(image, sx, 1), sy, 0)
        gradient_values = np.zeros_like(image)
        for y in range(0, max(image_height - window_size // 2, window_size), window_size):
for x in range(0, max(image_width - window_size // 2, window_size), window_size):
sub = shifted_image[y:y + window_size, x:x + window_size]
gradient_windows = session.run(gradients, {input_placeholder: sub})
gradient_values[y:y + window_size, x:x + window_size] = gradient_windows
gradient_windows = np.roll(np.roll(gradient_values, -sx, 1), -sy, 0)
image += gradient_windows * (1.5 / (np.abs(gradient_windows).mean() + 1e-7))
image /= 255.0
image = np.uint8(np.clip(image, 0, 1) * 255)
PIL.Image.fromarray(image).save('dream_' + image_name, 'jpeg')
================================================
FILE: Chapter03/4_export_model.py
================================================
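# Trains a softmax MNIST classifier and exports it as a SavedModel under work_dir/<model_version> for TensorFlow Serving.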
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
work_dir = '/tmp'
model_version = 9
training_iteration = 1000
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 200
tf_example = tf.parse_example(tf.placeholder(tf.string, name='tf_example'),
{'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32), })
x_input = tf.identity(tf_example['x'], name='x')
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
weights = tf.Variable(tf.random_normal([input_size, no_classes]))
bias = tf.Variable(tf.random_normal([no_classes]))
logits = tf.matmul(x_input, weights) + bias
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_input, logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy)
optimiser = tf.train.GradientDescentOptimizer(0.5).minimize(loss_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
for batch_no in range(total_batches):
mnist_batch = mnist.train.next_batch(batch_size)
_, loss_value = session.run([optimiser, loss_operation], feed_dict={
x_input: mnist_batch[0],
y_input: mnist_batch[1]
})
print(loss_value)
# Serve the model's prediction (softmax over the logits) rather than the label placeholder.
y_output = tf.nn.softmax(logits, name='y')
signature_def = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'x': tf.saved_model.utils.build_tensor_info(x_input)},
        outputs={'y': tf.saved_model.utils.build_tensor_info(y_output)},
        method_name="tensorflow/serving/predict"))
model_path = os.path.join(work_dir, str(model_version))
saved_model_builder = tf.saved_model.builder.SavedModelBuilder(model_path)
saved_model_builder.add_meta_graph_and_variables(
session, [tf.saved_model.tag_constants.SERVING],
signature_def_map={
'prediction': signature_def
},
legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op'))
saved_model_builder.save()
================================================
FILE: Chapter03/5_serving_client.py
================================================
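# gRPC client for the exported MNIST model: sends one test image to a TensorFlow Serving instance
# at host:port and prints the predicted digit from the asynchronous callback.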
from grpc.beta import implementations
import numpy
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
concurrency = 1
num_tests = 100
host = ''
port = 8000
work_dir = '/tmp'
def _create_rpc_callback():
def _callback(result):
response = numpy.array(
result.result().outputs['y'].float_val)
prediction = numpy.argmax(response)
print(prediction)
return _callback
test_data_set = mnist.test
test_image = mnist.test.images[0]
predict_request = predict_pb2.PredictRequest()
predict_request.model_spec.name = 'mnist'
predict_request.model_spec.signature_name = 'prediction'
predict_channel = implementations.insecure_channel(host, int(port))
predict_stub = prediction_service_pb2.beta_create_PredictionService_stub(predict_channel)
predict_request.inputs['x'].CopyFrom(
tf.contrib.util.make_tensor_proto(test_image, shape=[1, test_image.size]))
result = predict_stub.Predict.future(predict_request, 3.0)
result.add_done_callback(
_create_rpc_callback())
================================================
FILE: Chapter03/6_bottleneck_features.py
================================================
import tensorflow as tf
import os
import urllib.request
from tensorflow.python.platform import gfile
import tarfile
import numpy as np
work_dir = ''
model_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
file_name = model_url.split('/')[-1]
file_path = os.path.join(work_dir, file_name)
if not os.path.exists(file_path):
file_path, _ = urllib.request.urlretrieve(model_url, file_path)
tarfile.open(file_path, 'r:gz').extractall(work_dir)
model_path = os.path.join(work_dir, 'classify_image_graph_def.pb')
with gfile.FastGFile(model_path, 'rb') as f:
    graph_definition = tf.GraphDef()
    graph_definition.ParseFromString(f.read())
bottleneck, image, resized_input = (
    tf.import_graph_def(
        graph_definition,
        name='',
        return_elements=['pool_3/_reshape:0',
                         'DecodeJpeg/contents:0',
                         'ResizeBilinear:0'])
)
query_image_path = os.path.join(work_dir, 'cat.1000.jpg')
query_image = gfile.FastGFile(query_image_path, 'rb').read()
target_image_path = os.path.join(work_dir, 'cat.1001.jpg')
target_image = gfile.FastGFile(target_image_path, 'rb').read()
def get_bottleneck_data(session, image_data):
bottleneck_data = session.run(bottleneck, {image: image_data})
bottleneck_data = np.squeeze(bottleneck_data)
return bottleneck_data
session = tf.Session()
query_feature = get_bottleneck_data(session, query_image)
print(query_feature)
target_feature = get_bottleneck_data(session, target_image)
print(target_feature)
dist = np.linalg.norm(np.asarray(query_feature) - np.asarray(target_feature))
print(dist)
================================================
FILE: Chapter03/7_annoy.py
================================================
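# Approximate nearest-neighbour search with Annoy; target_features and query_feature are placeholders
# to be filled with image features (e.g. the bottleneck features from 6_bottleneck_features.py).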
import os
from annoy import AnnoyIndex
work_dir = ''
layer_dimension = 256
target_features = []
query_feature = []
def create_annoy(target_features):
t = AnnoyIndex(layer_dimension)
for idx, target_feature in enumerate(target_features):
t.add_item(idx, target_feature)
t.build(10)
t.save(os.path.join(work_dir, 'annoy.ann'))
create_annoy(target_features)
annoy_index = AnnoyIndex(layer_dimension)
annoy_index.load(os.path.join(work_dir, 'annoy.ann'))
matches = annoy_index.get_nns_by_vector(query_feature, 20)
================================================
FILE: Chapter03/8_auto_encoder.py
================================================
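# Convolutional autoencoder skeleton: strided 3x3 convolutions encode a 128x128x3 image down to a
# 16-unit bottleneck, and a stack of transposed convolutions decodes it back.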
import tensorflow as tf
def fully_connected_layer(input_layer, units):
return tf.layers.dense(
input_layer,
units=units,
activation=tf.nn.relu
)
def convolution_layer(input_layer, filter_size):
return tf.layers.conv2d(
input_layer,
filters=filter_size,
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(),
kernel_size=3,
strides=2
)
def deconvolution_layer(input_layer, filter_size, activation=tf.nn.relu):
return tf.layers.conv2d_transpose(
input_layer,
filters=filter_size,
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(),
kernel_size=3,
activation=activation,
strides=2
)
input_layer = tf.placeholder(tf.float32, [None, 128, 128, 3])
convolution_layer_1 = convolution_layer(input_layer, 1024)
convolution_layer_2 = convolution_layer(convolution_layer_1, 512)
convolution_layer_3 = convolution_layer(convolution_layer_2, 256)
convolution_layer_4 = convolution_layer(convolution_layer_3, 128)
convolution_layer_5 = convolution_layer(convolution_layer_4, 32)
convolution_layer_5_flattened = tf.layers.flatten(convolution_layer_5)
bottleneck_layer = fully_connected_layer(convolution_layer_5_flattened, 16)
c5_shape = convolution_layer_5.get_shape().as_list()
c5f_flat_shape = convolution_layer_5_flattened.get_shape().as_list()[1]
fully_connected = fully_connected_layer(bottleneck_layer, c5f_flat_shape)
fully_connected = tf.reshape(fully_connected,
[-1, c5_shape[1], c5_shape[2], c5_shape[3]])
deconvolution_layer_1 = deconvolution_layer(fully_connected, 128)
deconvolution_layer_2 = deconvolution_layer(deconvolution_layer_1, 256)
deconvolution_layer_3 = deconvolution_layer(deconvolution_layer_2, 512)
deconvolution_layer_4 = deconvolution_layer(deconvolution_layer_3, 1024)
deconvolution_layer_5 = deconvolution_layer(deconvolution_layer_4, 3,
activation=tf.nn.tanh)
================================================
FILE: Chapter03/9_denoising.py
================================================
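# Denoising autoencoder on MNIST: Gaussian noise is added to the inputs and the network is trained to
# reconstruct the clean images; noisy, original and reconstructed images are logged to TensorBoard.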
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 2000
x_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, input_size])
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def dense_layer(input_layer, units, activation=tf.nn.tanh):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
layer_1 = dense_layer(x_input, 500)
layer_2 = dense_layer(layer_1, 250)
layer_3 = dense_layer(layer_2, 50)
layer_4 = dense_layer(layer_3, 250)
layer_5 = dense_layer(layer_4, 500)
layer_6 = dense_layer(layer_5, 784)
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
labels=y_input, logits=layer_6)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
x_input_reshaped = tf.reshape(x_input, [-1, 28, 28, 1])
tf.summary.image("noisy_images", x_input_reshaped)
y_input_reshaped = tf.reshape(y_input, [-1, 28, 28, 1])
tf.summary.image("original_images", y_input_reshaped)
layer_6_reshaped = tf.reshape(layer_6, [-1, 28, 28, 1])
tf.summary.image("reconstructed_images", layer_6_reshaped)
session = tf.Session()
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, _ = mnist_batch[0], mnist_batch[1]
train_images_noise = train_images + 0.2 * np.random.normal(size=train_images.shape)
train_images_noise = np.clip(train_images_noise, 0., 1.)
_, merged_summary = session.run([optimiser, merged_summary_operation],
feed_dict={
x_input: train_images_noise,
y_input: train_images,
})
train_summary_writer.add_summary(merged_summary, batch_no)
================================================
FILE: Chapter04/1_iou.py
================================================
import tensorflow as tf
def calculate_iou(gt_bb, pred_bb):
'''
:param gt_bb: ground truth bounding box
:param pred_bb: predicted bounding box
'''
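    # Both tensors are [batch, cell, cell, boxes_per_cell, 4] with (x_center, y_center, width, height) coordinates.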
gt_bb = tf.stack([
gt_bb[:, :, :, :, 0] - gt_bb[:, :, :, :, 2] / 2.0,
gt_bb[:, :, :, :, 1] - gt_bb[:, :, :, :, 3] / 2.0,
gt_bb[:, :, :, :, 0] + gt_bb[:, :, :, :, 2] / 2.0,
gt_bb[:, :, :, :, 1] + gt_bb[:, :, :, :, 3] / 2.0])
gt_bb = tf.transpose(gt_bb, [1, 2, 3, 4, 0])
pred_bb = tf.stack([
pred_bb[:, :, :, :, 0] - pred_bb[:, :, :, :, 2] / 2.0,
pred_bb[:, :, :, :, 1] - pred_bb[:, :, :, :, 3] / 2.0,
pred_bb[:, :, :, :, 0] + pred_bb[:, :, :, :, 2] / 2.0,
pred_bb[:, :, :, :, 1] + pred_bb[:, :, :, :, 3] / 2.0])
pred_bb = tf.transpose(pred_bb, [1, 2, 3, 4, 0])
area = tf.maximum(
0.0,
tf.minimum(gt_bb[:, :, :, :, 2:], pred_bb[:, :, :, :, 2:]) -
tf.maximum(gt_bb[:, :, :, :, :2], pred_bb[:, :, :, :, :2]))
    intersection_area = area[:, :, :, :, 0] * area[:, :, :, :, 1]
gt_bb_area = (gt_bb[:, :, :, :, 2] - gt_bb[:, :, :, :, 0]) * \
(gt_bb[:, :, :, :, 3] - gt_bb[:, :, :, :, 1])
pred_bb_area = (pred_bb[:, :, :, :, 2] - pred_bb[:, :, :, :, 0]) * \
(pred_bb[:, :, :, :, 3] - pred_bb[:, :, :, :, 1])
union_area = tf.maximum(gt_bb_area + pred_bb_area - intersection_area, 1e-10)
iou = tf.clip_by_value(intersection_area / union_area, 0.0, 1.0)
return iou
================================================
FILE: Chapter04/2_overfeat.py
================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 300
x_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1],
name='input_reshape')
def convolution_layer(input_layer, filters, kernel_size=[3, 3],
activation=tf.nn.relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
convolution_layer_1 = convolution_layer(x_input_reshape, 64)
pooling_layer_1 = pooling_layer(convolution_layer_1)
convolution_layer_2 = convolution_layer(pooling_layer_1, 128)
pooling_layer_2 = pooling_layer(convolution_layer_2)
dense_layer_bottleneck = convolution_layer(pooling_layer_2, 1024, [5, 5])
logits = convolution_layer(dense_layer_bottleneck, no_classes, [1, 1])
logits = tf.reshape(logits, [-1, 10])
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=y_input, logits=logits)
print(softmax_cross_entropy)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
print(loss_operation)
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
with tf.name_scope('accuracy'):
accuracy_operation = tf.reduce_mean(
tf.cast(correct_predictions, tf.float32))
tf.summary.scalar('accuracy', accuracy_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')
test_images, test_labels = mnist_data.test.images, mnist_data.test.labels
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, train_labels = mnist_batch[0], mnist_batch[1]
_, merged_summary = session.run([optimiser, merged_summary_operation],
feed_dict={
x_input: train_images,
y_input: train_labels,
})
train_summary_writer.add_summary(merged_summary, batch_no)
if batch_no % 10 == 0:
merged_summary, _ = session.run([merged_summary_operation,
accuracy_operation], feed_dict={
x_input: test_images,
y_input: test_labels,
})
test_summary_writer.add_summary(merged_summary, batch_no)
================================================
FILE: Chapter04/3_object_detection_api.py
================================================
================================================
FILE: Chapter04/4_yolo.py
================================================
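# YOLO v1-style detector and loss for PASCAL VOC: 448x448 inputs, a 7x7 grid with 2 boxes per cell and 20 classes.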
import tensorflow as tf
import numpy as np
import os
from pascal_voc import pascal_voc
def calculate_iou(gt_bb, pred_bb):
'''
:param gt_bb: ground truth bounding box
:param pred_bb: predicted bounding box
'''
gt_bb = tf.stack([
gt_bb[:, :, :, :, 0] - gt_bb[:, :, :, :, 2] / 2.0,
gt_bb[:, :, :, :, 1] - gt_bb[:, :, :, :, 3] / 2.0,
gt_bb[:, :, :, :, 0] + gt_bb[:, :, :, :, 2] / 2.0,
gt_bb[:, :, :, :, 1] + gt_bb[:, :, :, :, 3] / 2.0])
gt_bb = tf.transpose(gt_bb, [1, 2, 3, 4, 0])
pred_bb = tf.stack([
pred_bb[:, :, :, :, 0] - pred_bb[:, :, :, :, 2] / 2.0,
pred_bb[:, :, :, :, 1] - pred_bb[:, :, :, :, 3] / 2.0,
pred_bb[:, :, :, :, 0] + pred_bb[:, :, :, :, 2] / 2.0,
pred_bb[:, :, :, :, 1] + pred_bb[:, :, :, :, 3] / 2.0])
pred_bb = tf.transpose(pred_bb, [1, 2, 3, 4, 0])
area = tf.maximum(
0.0,
tf.minimum(gt_bb[:, :, :, :, 2:], pred_bb[:, :, :, :, 2:]) -
tf.maximum(gt_bb[:, :, :, :, :2], pred_bb[:, :, :, :, :2]))
    intersection_area = area[:, :, :, :, 0] * area[:, :, :, :, 1]
gt_bb_area = (gt_bb[:, :, :, :, 2] - gt_bb[:, :, :, :, 0]) * \
(gt_bb[:, :, :, :, 3] - gt_bb[:, :, :, :, 1])
pred_bb_area = (pred_bb[:, :, :, :, 2] - pred_bb[:, :, :, :, 0]) * \
(pred_bb[:, :, :, :, 3] - pred_bb[:, :, :, :, 1])
union_area = tf.maximum(gt_bb_area + pred_bb_area - intersection_area, 1e-10)
iou = tf.clip_by_value(intersection_area / union_area, 0.0, 1.0)
return iou
DATA_PATH = 'data'
PASCAL_PATH = os.path.join(DATA_PATH, 'pascal_voc')
CACHE_PATH = os.path.join(PASCAL_PATH, 'cache')
OUTPUT_DIR = os.path.join(PASCAL_PATH, 'output')
WEIGHTS_DIR = os.path.join(PASCAL_PATH, 'weight')
FLIPPED = True
DISP_CONSOLE = False
GPU = ''
THRESHOLD = 0.2
IOU_THRESHOLD = 0.5
classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
'motorbike', 'person', 'pottedplant', 'sheep', 'sofa',
'train', 'tvmonitor']
num_class = len(classes)
image_size = 448
cell_size = 7
boxes_per_cell = 2
output_size = (cell_size * cell_size) * (num_class + boxes_per_cell * 5)
scale = 1.0 * image_size / cell_size
boundary1 = cell_size * cell_size * num_class
boundary2 = boundary1 + cell_size * cell_size * boxes_per_cell
object_scale = 1.0
noobject_scale = 1.0
class_scale = 2.0
coord_scale = 5.0
learning_rate = 0.0001
batch_size = 45
alpha = 0.1
weights_file = None
max_iter = 15000
initial_learning_rate = 0.0001
decay_steps = 30000
decay_rate = 0.1
staircase = True
offset = np.transpose(np.reshape(np.array(
[np.arange(cell_size)] * cell_size * boxes_per_cell),
(boxes_per_cell, cell_size, cell_size)), (1, 2, 0))
images = tf.placeholder(tf.float32, [None, image_size, image_size, 3], name='images')
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def pooling_layer(input_layer, pool_size=[2, 2], strides=2, padding='valid'):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides,
padding=padding
)
add_variable_summary(layer, 'pooling')
return layer
def convolution_layer(input_layer, filters, kernel_size=[3, 3], strides=1,
                      padding='same', activation=tf.nn.leaky_relu):
    layer = tf.layers.conv2d(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        activation=activation,
        padding=padding,
        kernel_initializer=tf.truncated_normal_initializer(0.0, 0.01),
        kernel_regularizer=tf.contrib.layers.l2_regularizer(0.0005)
    )
    add_variable_summary(layer, 'convolution')
    return layer
def dense_layer(input_layer, units, activation=tf.nn.leaky_relu):
    layer = tf.layers.dense(
        inputs=input_layer,
        units=units,
        activation=activation,
        kernel_initializer=tf.truncated_normal_initializer(0.0, 0.01),
        kernel_regularizer=tf.contrib.layers.l2_regularizer(0.0005)
    )
    add_variable_summary(layer, 'dense')
    return layer
yolo = tf.pad(images, np.array([[0, 0], [3, 3], [3, 3], [0, 0]]), name='pad_1')
yolo = convolution_layer(yolo, 64, 7, 2, padding='valid')
yolo = pooling_layer(yolo, [2, 2], 2, 'same')
yolo = convolution_layer(yolo, 192, 3)
yolo = pooling_layer(yolo, 2, 2, 'same')
yolo = convolution_layer(yolo, 128, 1)
yolo = convolution_layer(yolo, 256, 3)
yolo = convolution_layer(yolo, 256, 1)
yolo = convolution_layer(yolo, 512, 3)
yolo = pooling_layer(yolo, 2, 2, 'same')
yolo = convolution_layer(yolo, 256, 1)
yolo = convolution_layer(yolo, 512, 3)
yolo = convolution_layer(yolo, 256, 1)
yolo = convolution_layer(yolo, 512, 3)
yolo = convolution_layer(yolo, 256, 1)
yolo = convolution_layer(yolo, 512, 3)
yolo = convolution_layer(yolo, 256, 1)
yolo = convolution_layer(yolo, 512, 3)
yolo = convolution_layer(yolo, 512, 1)
yolo = convolution_layer(yolo, 1024, 3)
yolo = pooling_layer(yolo, 2)
yolo = convolution_layer(yolo, 512, 1)
yolo = convolution_layer(yolo, 1024, 3)
yolo = convolution_layer(yolo, 512, 1)
yolo = convolution_layer(yolo, 1024, 3)
yolo = convolution_layer(yolo, 1024, 3)
yolo = tf.pad(yolo, np.array([[0, 0], [1, 1], [1, 1], [0, 0]]))
yolo = convolution_layer(yolo, 1024, 3, 2, padding='valid')
yolo = convolution_layer(yolo, 1024, 3)
yolo = convolution_layer(yolo, 1024, 3)
yolo = tf.transpose(yolo, [0, 3, 1, 2])
yolo = tf.layers.flatten(yolo)
yolo = dense_layer(yolo, 512)
yolo = dense_layer(yolo, 4096)
dropout_bool = tf.placeholder(tf.bool)
yolo = tf.layers.dropout(
inputs=yolo,
rate=0.4,
training=dropout_bool
)
yolo = dense_layer(yolo, output_size, None)
predicts = yolo
labels = tf.placeholder(tf.float32, [None, cell_size, cell_size, 5 + num_class])
predict_classes = tf.reshape(predicts[:, :boundary1], [batch_size, cell_size, cell_size, num_class])
predict_scales = tf.reshape(predicts[:, boundary1:boundary2], [batch_size, cell_size, cell_size, boxes_per_cell])
predict_boxes = tf.reshape(predicts[:, boundary2:], [batch_size, cell_size, cell_size, boxes_per_cell, 4])
response = tf.reshape(labels[:, :, :, 0], [batch_size, cell_size, cell_size, 1])
boxes = tf.reshape(labels[:, :, :, 1:5], [batch_size, cell_size, cell_size, 1, 4])
boxes = tf.tile(boxes, [1, 1, 1, boxes_per_cell, 1]) / image_size
classes = labels[:, :, :, 5:]
offset = tf.constant(offset, dtype=tf.float32)
offset = tf.reshape(offset, [1, cell_size, cell_size, boxes_per_cell])
offset = tf.tile(offset, [batch_size, 1, 1, 1])
predict_boxes_tran = tf.stack([(predict_boxes[:, :, :, :, 0] + offset) / cell_size,
(predict_boxes[:, :, :, :, 1] + tf.transpose(offset, (0, 2, 1, 3))) / cell_size,
tf.square(predict_boxes[:, :, :, :, 2]),
tf.square(predict_boxes[:, :, :, :, 3])])
predict_boxes_tran = tf.transpose(predict_boxes_tran, [1, 2, 3, 4, 0])
iou_predict_truth = calculate_iou(predict_boxes_tran, boxes)
object_mask = tf.reduce_max(iou_predict_truth, 3, keep_dims=True)
object_mask = tf.cast((iou_predict_truth >= object_mask), tf.float32) * response
noobject_mask = tf.ones_like(object_mask, dtype=tf.float32) - object_mask
boxes_tran = tf.stack([boxes[:, :, :, :, 0] * cell_size - offset,
boxes[:, :, :, :, 1] * cell_size - tf.transpose(offset, (0, 2, 1, 3)),
tf.sqrt(boxes[:, :, :, :, 2]),
tf.sqrt(boxes[:, :, :, :, 3])])
boxes_tran = tf.transpose(boxes_tran, [1, 2, 3, 4, 0])
class_delta = response * (predict_classes - classes)
class_loss = tf.reduce_mean(tf.reduce_sum(tf.square(class_delta), axis=[1, 2, 3]), name='class_loss') * class_scale
object_delta = object_mask * (predict_scales - iou_predict_truth)
object_loss = tf.reduce_mean(tf.reduce_sum(tf.square(object_delta), axis=[1, 2, 3]), name='object_loss') * object_scale
noobject_delta = noobject_mask * predict_scales
noobject_loss = tf.reduce_mean(tf.reduce_sum(tf.square(noobject_delta), axis=[1, 2, 3]), name='noobject_loss') * noobject_scale
coord_mask = tf.expand_dims(object_mask, 4)
boxes_delta = coord_mask * (predict_boxes - boxes_tran)
coord_loss = tf.reduce_mean(tf.reduce_sum(tf.square(boxes_delta), axis=[1, 2, 3, 4]), name='coord_loss') * coord_scale
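# The four YOLO loss terms (classification, object confidence, no-object confidence, box coordinates)
# are registered below and summed into the total loss, each weighted by its own scale.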
tf.losses.add_loss(class_loss)
tf.losses.add_loss(object_loss)
tf.losses.add_loss(noobject_loss)
tf.losses.add_loss(coord_loss)
total_loss = tf.losses.get_total_loss()
data = pascal_voc('train')
global_step = tf.get_variable(
    'global_step', [], initializer=tf.constant_initializer(0), trainable=False)
learning_rate = tf.train.exponential_decay(
    initial_learning_rate, global_step, decay_steps,
    decay_rate, staircase, name='learning_rate')
optimizer = tf.train.GradientDescentOptimizer(
    learning_rate=learning_rate).minimize(
    total_loss, global_step=global_step)
ema = tf.train.ExponentialMovingAverage(decay=0.9999)
averages_op = ema.apply(tf.trainable_variables())
with tf.control_dependencies([optimizer]):
    train_op = tf.group(averages_op)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(1, max_iter + 1):
    image_batch, label_batch = data.get()
    feed_dict = {images: image_batch, labels: label_batch, dropout_bool: True}
    sess.run(train_op, feed_dict=feed_dict)
================================================
FILE: Chapter04/pascal_voc.py
================================================
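# PASCAL VOC 2007 loader; expects the VOCdevkit under cfg.PASCAL_PATH and a yolo/config.py module
# supplying the cfg constants (not included in this repository).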
import os
import xml.etree.ElementTree as ET
import numpy as np
import cv2
import pickle
import copy
import yolo.config as cfg
class pascal_voc(object):
def __init__(self, phase, rebuild=False):
self.devkil_path = os.path.join(cfg.PASCAL_PATH, 'VOCdevkit')
self.data_path = os.path.join(self.devkil_path, 'VOC2007')
self.cache_path = cfg.CACHE_PATH
self.batch_size = cfg.BATCH_SIZE
self.image_size = cfg.IMAGE_SIZE
self.cell_size = cfg.CELL_SIZE
self.classes = cfg.CLASSES
        self.class_to_ind = dict(zip(self.classes, range(len(self.classes))))
self.flipped = cfg.FLIPPED
self.phase = phase
self.rebuild = rebuild
self.cursor = 0
self.epoch = 1
self.gt_labels = None
self.prepare()
def get(self):
images = np.zeros((self.batch_size, self.image_size, self.image_size, 3))
labels = np.zeros((self.batch_size, self.cell_size, self.cell_size, 25))
count = 0
while count < self.batch_size:
imname = self.gt_labels[self.cursor]['imname']
flipped = self.gt_labels[self.cursor]['flipped']
images[count, :, :, :] = self.image_read(imname, flipped)
labels[count, :, :, :] = self.gt_labels[self.cursor]['label']
count += 1
self.cursor += 1
if self.cursor >= len(self.gt_labels):
np.random.shuffle(self.gt_labels)
self.cursor = 0
self.epoch += 1
return images, labels
def image_read(self, imname, flipped=False):
image = cv2.imread(imname)
image = cv2.resize(image, (self.image_size, self.image_size))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
image = (image / 255.0) * 2.0 - 1.0
if flipped:
image = image[:, ::-1, :]
return image
def prepare(self):
gt_labels = self.load_labels()
if self.flipped:
print('Appending horizontally-flipped training examples ...')
gt_labels_cp = copy.deepcopy(gt_labels)
for idx in range(len(gt_labels_cp)):
gt_labels_cp[idx]['flipped'] = True
gt_labels_cp[idx]['label'] = gt_labels_cp[idx]['label'][:, ::-1, :]
                for i in range(self.cell_size):
                    for j in range(self.cell_size):
if gt_labels_cp[idx]['label'][i, j, 0] == 1:
gt_labels_cp[idx]['label'][i, j, 1] = self.image_size - 1 - gt_labels_cp[idx]['label'][i, j, 1]
gt_labels += gt_labels_cp
np.random.shuffle(gt_labels)
self.gt_labels = gt_labels
return gt_labels
def load_labels(self):
cache_file = os.path.join(self.cache_path, 'pascal_' + self.phase + '_gt_labels.pkl')
if os.path.isfile(cache_file) and not self.rebuild:
print('Loading gt_labels from: ' + cache_file)
with open(cache_file, 'rb') as f:
                gt_labels = pickle.load(f)
return gt_labels
print('Processing gt_labels from: ' + self.data_path)
if not os.path.exists(self.cache_path):
os.makedirs(self.cache_path)
if self.phase == 'train':
txtname = os.path.join(self.data_path, 'ImageSets', 'Main',
'trainval.txt')
else:
txtname = os.path.join(self.data_path, 'ImageSets', 'Main',
'test.txt')
with open(txtname, 'r') as f:
self.image_index = [x.strip() for x in f.readlines()]
gt_labels = []
for index in self.image_index:
label, num = self.load_pascal_annotation(index)
if num == 0:
continue
imname = os.path.join(self.data_path, 'JPEGImages', index + '.jpg')
gt_labels.append({'imname': imname, 'label': label, 'flipped': False})
print('Saving gt_labels to: ' + cache_file)
with open(cache_file, 'wb') as f:
cPickle.dump(gt_labels, f)
return gt_labels
def load_pascal_annotation(self, index):
"""
Load image and bounding boxes info from XML file in the PASCAL VOC
format.
"""
imname = os.path.join(self.data_path, 'JPEGImages', index + '.jpg')
im = cv2.imread(imname)
h_ratio = 1.0 * self.image_size / im.shape[0]
w_ratio = 1.0 * self.image_size / im.shape[1]
# im = cv2.resize(im, [self.image_size, self.image_size])
label = np.zeros((self.cell_size, self.cell_size, 25))
filename = os.path.join(self.data_path, 'Annotations', index + '.xml')
tree = ET.parse(filename)
objs = tree.findall('object')
for obj in objs:
bbox = obj.find('bndbox')
# Make pixel indexes 0-based
x1 = max(min((float(bbox.find('xmin').text) - 1) * w_ratio, self.image_size - 1), 0)
y1 = max(min((float(bbox.find('ymin').text) - 1) * h_ratio, self.image_size - 1), 0)
x2 = max(min((float(bbox.find('xmax').text) - 1) * w_ratio, self.image_size - 1), 0)
y2 = max(min((float(bbox.find('ymax').text) - 1) * h_ratio, self.image_size - 1), 0)
cls_ind = self.class_to_ind[obj.find('name').text.lower().strip()]
boxes = [(x2 + x1) / 2.0, (y2 + y1) / 2.0, x2 - x1, y2 - y1]
x_ind = int(boxes[0] * self.cell_size / self.image_size)
y_ind = int(boxes[1] * self.cell_size / self.image_size)
if label[y_ind, x_ind, 0] == 1:
continue
label[y_ind, x_ind, 0] = 1
label[y_ind, x_ind, 1:5] = boxes
label[y_ind, x_ind, 5 + cls_ind] = 1
return label, len(objs)
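# Each grid cell stores 25 values: index 0 is the objectness flag, indices 1-4
# hold the box centre and size in resized-image pixels, and indices 5-24 are a
# one-hot encoding of the 20 PASCAL VOC classes.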
================================================
FILE: Chapter05/1_segnet.py
================================================
import tensorflow as tf
input_height = 360
input_width = 480
kernel = 3
filter_size = 64
pad = 1
pool_size = 2
nClasses = 12  # assumed number of segmentation classes (e.g. CamVid); set to match your dataset
model = tf.keras.models.Sequential()
# channels-last input (height, width, 3), the ordering tf.keras uses by default
model.add(tf.keras.layers.InputLayer(input_shape=(input_height, input_width, 3)))
# encoder
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(filter_size, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(128, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(256, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(512, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Activation('relu'))
# decoder
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(512, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(256, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(128, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad)))
model.add(tf.keras.layers.Conv2D(filter_size, (kernel, kernel), padding='valid'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(nClasses, (1, 1), padding='valid'))
model.outputHeight = model.output_shape[1]
model.outputWidth = model.output_shape[2]
# flatten the spatial grid so a per-pixel softmax over the classes can be applied
model.add(tf.keras.layers.Reshape((model.outputHeight * model.outputWidth, nClasses)))
model.add(tf.keras.layers.Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
================================================
FILE: Chapter05/2_nerve_segmentation.py
================================================
import os
from skimage.transform import resize
from skimage.io import imsave
import numpy as np
import tensorflow as tf
from data import load_train_data, load_test_data
image_height, image_width = 96, 96
smoothness = 1.0
work_dir = ''
def dice_coefficient(y1, y2):
    y1 = tf.keras.backend.flatten(y1)
    y2 = tf.keras.backend.flatten(y2)
    return (2. * tf.keras.backend.sum(y1 * y2) + smoothness) / (
        tf.keras.backend.sum(y1) + tf.keras.backend.sum(y2) + smoothness)
def dice_coefficient_loss(y1, y2):
return -dice_coefficient(y1, y2)
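# The Dice coefficient, 2 * |A ∩ B| / (|A| + |B|), measures the overlap between
# the predicted and ground-truth masks; `smoothness` keeps the ratio defined on
# empty masks. Minimising its negative therefore maximises mask overlap.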
def preprocess(imgs):
imgs_p = np.ndarray((imgs.shape[0], image_height, image_width), dtype=np.uint8)
for i in range(imgs.shape[0]):
imgs_p[i] = resize(imgs[i], (image_width, image_height), preserve_range=True)
imgs_p = imgs_p[..., np.newaxis]
return imgs_p
def convolution_layer(filters, kernel=(3,3), activation='relu', input_shape=None):
    # 'same' padding keeps the spatial size, so the encoder and decoder
    # resolutions line up and the output mask matches the 96x96 input.
    if input_shape is None:
        return tf.keras.layers.Conv2D(
            filters=filters,
            kernel_size=kernel,
            activation=activation,
            padding='same')
    else:
        return tf.keras.layers.Conv2D(
            filters=filters,
            kernel_size=kernel,
            activation=activation,
            padding='same',
            input_shape=input_shape)
def concatenated_de_convolution_layer(filters):
    # A true U-Net would concatenate the matching encoder feature map here,
    # which needs the functional API; within this Sequential sketch the
    # transposed convolution simply upsamples by a factor of two.
    return tf.keras.layers.Conv2DTranspose(
        filters=filters,
        kernel_size=(2, 2),
        strides=(2, 2),
        padding='same')
def pooling_layer():
return tf.keras.layers.MaxPooling2D(pool_size=(2, 2))
unet = tf.keras.models.Sequential()
inputs = tf.keras.layers.Input((image_height, image_width, 1))
input_shape = (image_height, image_width, 1)
unet.add(convolution_layer(32, input_shape=input_shape))
unet.add(convolution_layer(32))
unet.add(pooling_layer())
unet.add(convolution_layer(64))
unet.add(convolution_layer(64))
unet.add(pooling_layer())
unet.add(convolution_layer(128))
unet.add(convolution_layer(128))
unet.add(pooling_layer())
unet.add(convolution_layer(256))
unet.add(convolution_layer(256))
unet.add(pooling_layer())
unet.add(convolution_layer(512))
unet.add(convolution_layer(512))
unet.add(concatenated_de_convolution_layer(256))
unet.add(convolution_layer(256))
unet.add(convolution_layer(256))
unet.add(concatenated_de_convolution_layer(128))
unet.add(convolution_layer(128))
unet.add(convolution_layer(128))
unet.add(concatenated_de_convolution_layer(64))
unet.add(convolution_layer(64))
unet.add(convolution_layer(64))
unet.add(concatenated_de_convolution_layer(32))
unet.add(convolution_layer(32))
unet.add(convolution_layer(32))
unet.add(convolution_layer(1, kernel=(1, 1), activation='sigmoid'))
unet.compile(optimizer=tf.keras.optimizers.Adam(lr=1e-5),
loss=dice_coefficient_loss,
metrics=[dice_coefficient])
x_train, y_train_mask = load_train_data()
x_train = preprocess(x_train)
y_train_mask = preprocess(y_train_mask)
x_train = x_train.astype('float32')
mean = np.mean(x_train)
std = np.std(x_train)
x_train -= mean
x_train /= std
y_train_mask = y_train_mask.astype('float32')
y_train_mask /= 255.
unet.fit(x_train, y_train_mask, batch_size=32, epochs=20, verbose=1, shuffle=True,
validation_split=0.2)
x_test, test_image_ids = load_test_data()
x_test = preprocess(x_test)
x_test = x_test.astype('float32')
x_test -= mean
x_test /= std
y_test_pred = unet.predict(x_test, verbose=1)
for image, image_id in zip(y_test_pred, test_image_ids):
image = (image[:, :, 0] * 255.).astype(np.uint8)
imsave(os.path.join(work_dir, str(image_id) + '.png'), image)
================================================
FILE: Chapter05/3_satellite.py
================================================
import tensorflow as tf
from .resnet50 import ResNet50
nb_labels = 6
input_shape = [28, 28, 3]  # tile height, width and colour channels
img_height, img_width, _ = input_shape
input_tensor = tf.keras.layers.Input(shape=input_shape)
weights = 'imagenet'
resnet50_model = ResNet50(
include_top=False, weights='imagenet', input_tensor=input_tensor)
final_32 = resnet50_model.get_layer('final_32').output
final_16 = resnet50_model.get_layer('final_16').output
final_x8 = resnet50_model.get_layer('final_x8').output
c32 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_32)
c16 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_16)
c8 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_x8)
def resize_bilinear(images):
return tf.image.resize_bilinear(images, [img_height, img_width])
r32 = tf.keras.layers.Lambda(resize_bilinear)(c32)
r16 = tf.keras.layers.Lambda(resize_bilinear)(c16)
r8 = tf.keras.layers.Lambda(resize_bilinear)(c8)
m = tf.keras.layers.Add()([r32, r16, r8])
x = tf.keras.layers.Reshape((img_height * img_width, nb_labels))(m)
x = tf.keras.layers.Activation('softmax')(x)
x = tf.keras.layers.Reshape((img_height, img_width, nb_labels))(x)
fcn_model = tf.keras.models.Model(inputs=input_tensor, outputs=x)
================================================
FILE: Chapter05/data.py
================================================
from __future__ import print_function
import os
import numpy as np
from skimage.io import imsave, imread
data_path = 'raw/'
image_rows = 420
image_cols = 580
def create_train_data():
train_data_path = os.path.join(data_path, 'train')
images = os.listdir(train_data_path)
total = int(len(images) / 2)
imgs = np.ndarray((total, image_rows, image_cols), dtype=np.uint8)
imgs_mask = np.ndarray((total, image_rows, image_cols), dtype=np.uint8)
i = 0
print('-'*30)
print('Creating training images...')
print('-'*30)
for image_name in images:
if 'mask' in image_name:
continue
image_mask_name = image_name.split('.')[0] + '_mask.tif'
img = imread(os.path.join(train_data_path, image_name), as_grey=True)
img_mask = imread(os.path.join(train_data_path, image_mask_name), as_grey=True)
img = np.array([img])
img_mask = np.array([img_mask])
imgs[i] = img
imgs_mask[i] = img_mask
if i % 100 == 0:
print('Done: {0}/{1} images'.format(i, total))
i += 1
print('Loading done.')
np.save('imgs_train.npy', imgs)
np.save('imgs_mask_train.npy', imgs_mask)
print('Saving to .npy files done.')
def load_train_data():
imgs_train = np.load('imgs_train.npy')
imgs_mask_train = np.load('imgs_mask_train.npy')
return imgs_train, imgs_mask_train
def create_test_data():
train_data_path = os.path.join(data_path, 'test')
images = os.listdir(train_data_path)
total = len(images)
imgs = np.ndarray((total, image_rows, image_cols), dtype=np.uint8)
imgs_id = np.ndarray((total, ), dtype=np.int32)
i = 0
print('-'*30)
print('Creating test images...')
print('-'*30)
for image_name in images:
img_id = int(image_name.split('.')[0])
img = imread(os.path.join(train_data_path, image_name), as_grey=True)
img = np.array([img])
imgs[i] = img
imgs_id[i] = img_id
if i % 100 == 0:
print('Done: {0}/{1} images'.format(i, total))
i += 1
print('Loading done.')
np.save('imgs_test.npy', imgs)
np.save('imgs_id_test.npy', imgs_id)
print('Saving to .npy files done.')
def load_test_data():
imgs_test = np.load('imgs_test.npy')
imgs_id = np.load('imgs_id_test.npy')
return imgs_test, imgs_id
if __name__ == '__main__':
create_train_data()
create_test_data()
================================================
FILE: Chapter06/1_contrastive_loss.py
================================================
import tensorflow as tf
def contrastive_loss(model_1, model_2, label, margin=0.1):
distance = tf.reduce_sum(tf.square(model_1 - model_2), 1)
loss = label * tf.square(
tf.maximum(0., margin - tf.sqrt(distance))) + (1 - label) * distance
loss = 0.5 * tf.reduce_mean(loss)
return loss
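# Minimal usage sketch with hypothetical placeholder names. As written, the term
# weighted by `label` enforces at least `margin` separation between the two
# embeddings, while the `(1 - label)` term pulls them together:
# left = tf.placeholder(tf.float32, [None, 128])
# right = tf.placeholder(tf.float32, [None, 128])
# pair_label = tf.placeholder(tf.float32, [None])
# loss = contrastive_loss(left, right, pair_label, margin=0.5)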
================================================
FILE: Chapter06/2_siamese_network.py
================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 300
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer, filters, kernel_size=[3, 3],
activation=tf.nn.relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer, units, activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_model(input_):
input_reshape = tf.reshape(input_, [-1, 28, 28, 1],
name='input_reshape')
convolution_layer_1 = convolution_layer(input_reshape, 64)
pooling_layer_1 = pooling_layer(convolution_layer_1)
convolution_layer_2 = convolution_layer(pooling_layer_1, 128)
pooling_layer_2 = pooling_layer(convolution_layer_2)
flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128],
name='flattened_pool')
dense_layer_bottleneck = dense_layer(flattened_pool, 1024)
return dense_layer_bottleneck
left_input = tf.placeholder(tf.float32, shape=[None, input_size])
right_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
left_bottleneck = get_model(left_input)
right_bottleneck = get_model(right_input)
dense_layer_bottleneck = tf.concat([left_bottleneck, right_bottleneck], 1)
dropout_bool = tf.placeholder(tf.bool)
dropout_layer = tf.layers.dropout(
inputs=dense_layer_bottleneck,
rate=0.4,
training=dropout_bool
)
logits = dense_layer(dropout_layer, no_classes)
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=y_input, logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
with tf.name_scope('accuracy'):
accuracy_operation = tf.reduce_mean(
tf.cast(correct_predictions, tf.float32))
tf.summary.scalar('accuracy', accuracy_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')
test_images, test_labels = mnist_data.test.images, mnist_data.test.labels
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, train_labels = mnist_batch[0], mnist_batch[1]
_, merged_summary = session.run([optimiser, merged_summary_operation],
feed_dict={
left_input: train_images,
right_input: train_images,
y_input: train_labels,
dropout_bool: True
})
train_summary_writer.add_summary(merged_summary, batch_no)
if batch_no % 10 == 0:
merged_summary, _ = session.run([merged_summary_operation,
accuracy_operation], feed_dict={
left_input: test_images,
right_input: test_images,
y_input: test_labels,
dropout_bool: False
})
test_summary_writer.add_summary(merged_summary, batch_no)
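# Note: for brevity this listing feeds the same batch to both towers and trains
# with a plain classification loss; a full siamese setup would pair different
# images, share weights between the two towers (for example via variable scopes
# with reuse), and optimise the contrastive loss from 1_contrastive_loss.py.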
================================================
FILE: Chapter06/3_triplet_loss.py
================================================
import tensorflow as tf
def triplet_loss(anchor_face, positive_face, negative_face, margin):
def get_distance(x, y):
return tf.reduce_sum(tf.square(tf.subtract(x, y)), 1)
positive_distance = get_distance(anchor_face, positive_face)
negative_distance = get_distance(anchor_face, negative_face)
total_distance = tf.add(tf.subtract(positive_distance, negative_distance), margin)
return tf.reduce_mean(tf.maximum(total_distance, 0.0), 0)
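# Minimal usage sketch with hypothetical placeholder names: the loss reaches zero
# once every anchor is closer to its positive than to its negative by `margin`.
# anchor = tf.placeholder(tf.float32, [None, 128])
# positive = tf.placeholder(tf.float32, [None, 128])
# negative = tf.placeholder(tf.float32, [None, 128])
# loss = triplet_loss(anchor, positive, negative, margin=0.2)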
================================================
FILE: Chapter06/4_triplet_mining.py
================================================
from scipy.spatial.distance import cdist
import numpy as np
def mine_triplets(anchor, targets, negative_samples):
    # Pairwise cosine distances between the anchor and target embeddings.
    QnT_dists = cdist(anchor, targets, 'cosine')
    distances = QnT_dists.tolist()
    # For every query, collect the target indices whose distance equals the
    # query's own distance, i.e. duplicates that must not be mined as negatives.
    QnQ_duplicated = [
        [target_index for target_index, dist in enumerate(QnQ_dist)
         if dist == QnQ_dist[query_index]]
        for query_index, QnQ_dist in enumerate(distances)]
    # Exclude the duplicates by pushing their distance to infinity.
    for i, QnT_dist in enumerate(QnT_dists):
        for j in QnQ_duplicated[i]:
            QnT_dist[j] = np.inf
    # Keep the closest (hardest) negatives and prepend the query index.
    QnT_dists_topk = QnT_dists.argsort(axis=1)[:, :negative_samples]
    top_k_index = np.array([np.insert(QnT_dist, 0, i)
                            for i, QnT_dist in enumerate(QnT_dists_topk)])
    return top_k_index
================================================
FILE: Chapter06/5_fiducial_points.py
================================================
import fiducial_data
import tensorflow as tf
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer, filters, kernel_size=[3, 3],
activation=tf.nn.tanh):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer, units, activation=tf.nn.tanh):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
image_size = 40
no_landmark = 10
no_gender_classes = 2
no_smile_classes = 2
no_glasses_classes = 2
no_headpose_classes = 5
batch_size = 100
total_batches = 300
image_input = tf.placeholder(tf.float32, shape=[None, image_size, image_size])
landmark_input = tf.placeholder(tf.float32, shape=[None, no_landmark])
gender_input = tf.placeholder(tf.float32, shape=[None, no_gender_classes])
smile_input = tf.placeholder(tf.float32, shape=[None, no_smile_classes])
glasses_input = tf.placeholder(tf.float32, shape=[None, no_glasses_classes])
headpose_input = tf.placeholder(tf.float32, shape=[None, no_headpose_classes])
image_input_reshape = tf.reshape(image_input, [-1, image_size, image_size, 1],
name='input_reshape')
convolution_layer_1 = convolution_layer(image_input_reshape, 16)
pooling_layer_1 = pooling_layer(convolution_layer_1)
convolution_layer_2 = convolution_layer(pooling_layer_1, 48)
pooling_layer_2 = pooling_layer(convolution_layer_2)
convolution_layer_3 = convolution_layer(pooling_layer_2, 64)
pooling_layer_3 = pooling_layer(convolution_layer_3)
convolution_layer_4 = convolution_layer(pooling_layer_3, 64)
flattened_pool = tf.reshape(convolution_layer_4, [-1, 5 * 5 * 64],
name='flattened_pool')
dense_layer_bottleneck = dense_layer(flattened_pool, 1024)
dropout_bool = tf.placeholder(tf.bool)
dropout_layer = tf.layers.dropout(
inputs=dense_layer_bottleneck,
rate=0.4,
training=dropout_bool
)
landmark_logits = dense_layer(dropout_layer, 10)
smile_logits = dense_layer(dropout_layer, 2)
glass_logits = dense_layer(dropout_layer, 2)
gender_logits = dense_layer(dropout_layer, 2)
headpose_logits = dense_layer(dropout_layer, 5)
landmark_loss = 0.5 * tf.reduce_mean(
    tf.square(landmark_input - landmark_logits))
gender_loss = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(
labels=gender_input, logits=gender_logits))
smile_loss = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(
labels=smile_input, logits=smile_logits))
glass_loss = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(
labels=glasses_input, logits=glass_logits))
headpose_loss = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(
labels=headpose_input, logits=headpose_logits))
loss_operation = landmark_loss + gender_loss + \
smile_loss + glass_loss + headpose_loss
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
fiducial_test_data = fiducial_data.test
for batch_no in range(total_batches):
fiducial_data_batch = fiducial_data.train.next_batch(batch_size)
loss, _landmark_loss, _ = session.run(
[loss_operation, landmark_loss, optimiser],
feed_dict={
image_input: fiducial_data_batch.images,
landmark_input: fiducial_data_batch.landmarks,
gender_input: fiducial_data_batch.gender,
smile_input: fiducial_data_batch.smile,
glasses_input: fiducial_data_batch.glasses,
headpose_input: fiducial_data_batch.pose,
dropout_bool: True
})
if batch_no % 10 == 0:
        loss, _landmark_loss = session.run(
            [loss_operation, landmark_loss],
feed_dict={
image_input: fiducial_test_data.images,
landmark_input: fiducial_test_data.landmarks,
gender_input: fiducial_test_data.gender,
smile_input: fiducial_test_data.smile,
glasses_input: fiducial_test_data.glasses,
headpose_input: fiducial_test_data.pose,
dropout_bool: False
})
================================================
FILE: Chapter06/6_extract_features.py
================================================
from scipy import misc
import tensorflow as tf
import numpy as np
import os
import facenet
print(facenet)
from facenet import load_model, prewhiten
import align.detect_face
def load_and_align_data(image_paths,
image_size=160,
margin=44,
gpu_memory_fraction=1.0):
minsize = 20
threshold = [0.6, 0.7, 0.7]
factor = 0.709
print('Creating networks and loading parameters')
with tf.Graph().as_default():
gpu_options = tf.GPUOptions(
per_process_gpu_memory_fraction=gpu_memory_fraction)
sess = tf.Session(config=tf.ConfigProto(
gpu_options=gpu_options, log_device_placement=False))
with sess.as_default():
pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
nrof_samples = len(image_paths)
img_list = [None] * nrof_samples
for i in range(nrof_samples):
img = misc.imread(os.path.expanduser(image_paths[i]), mode='RGB')
img_size = np.asarray(img.shape)[0:2]
bounding_boxes, _ = align.detect_face.detect_face(
img, minsize, pnet, rnet, onet, threshold, factor)
det = np.squeeze(bounding_boxes[0, 0:4])
bb = np.zeros(4, dtype=np.int32)
bb[0] = np.maximum(det[0] - margin / 2, 0)
bb[1] = np.maximum(det[1] - margin / 2, 0)
bb[2] = np.minimum(det[2] + margin / 2, img_size[1])
bb[3] = np.minimum(det[3] + margin / 2, img_size[0])
cropped = img[bb[1]:bb[3], bb[0]:bb[2], :]
aligned = misc.imresize(
cropped, (image_size, image_size), interp='bilinear')
prewhitened = prewhiten(aligned)
img_list[i] = prewhitened
images = np.stack(img_list)
return images
def get_face_embeddings(image_paths,
model=''):
images = load_and_align_data(image_paths)
with tf.Graph().as_default():
with tf.Session() as sess:
load_model(model)
images_placeholder = tf.get_default_graph().get_tensor_by_name(
"input:0")
embeddings = tf.get_default_graph().get_tensor_by_name(
"embeddings:0")
phase_train_placeholder = tf.get_default_graph().get_tensor_by_name(
"phase_train:0")
feed_dict = {images_placeholder: images,
phase_train_placeholder: False}
emb = sess.run(embeddings, feed_dict=feed_dict)
return emb
def compute_distance(embedding_1, embedding_2):
dist = np.sqrt(np.sum(np.square(np.subtract(embedding_1, embedding_2))))
return dist
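# Minimal usage sketch with hypothetical paths and model location: embeddings of
# the same person should lie closer together than embeddings of different people.
# emb = get_face_embeddings(['person_a_1.jpg', 'person_a_2.jpg', 'person_b.jpg'],
#                           model='path/to/facenet_model')
# print(compute_distance(emb[0], emb[1]), compute_distance(emb[0], emb[2]))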
================================================
FILE: Chapter07/1_caption_attention.py
================================================
import tensorflow as tf
from keras.layers.recurrent import *
training = True
sequence_length = 0
vocabulary_size = 0
input_tensor = 0
input_shape = 0
embedding_dimension = 0
dropout_prob = 0.3
previous_words = 0
height = 0
shape = 0
cnn_features = 0
depth = 0
vgg_model = tf.keras.applications.vgg16.VGG16(weights='imagenet',
include_top=False,
input_tensor=input_tensor,
input_shape=input_shape)
word_embedding = tf.keras.layers.Embedding(
vocabulary_size, embedding_dimension, input_length=sequence_length)
embedding = word_embedding(previous_words)
embedding = tf.keras.layers.Activation('relu')(embedding)
embedding = tf.keras.layers.Dropout(dropout_prob)(embedding)
cnn_features_flattened = tf.keras.layers.Reshape((height * height, shape))(cnn_features)
net = tf.keras.layers.GlobalAveragePooling1D()(cnn_features_flattened)
net = tf.keras.layers.Dense(embedding_dimension, activation='relu')(net)
net = tf.keras.layers.Dropout(dropout_prob)(net)
net = tf.keras.layers.RepeatVector(sequence_length)(net)
net = tf.keras.layers.Concatenate()([net, embedding])
net = tf.keras.layers.Dropout(dropout_prob)(net)
class LSTM_sent(Recurrent):
def __init__(self, output_dim,
init='glorot_uniform', inner_init='orthogonal',
forget_bias_init='one', activation='tanh',
inner_activation='hard_sigmoid',
W_regularizer=None, U_regularizer=None, b_regularizer=None,
dropout_W=0., dropout_U=0., sentinel=True, **kwargs):
self.output_dim = output_dim
self.init = initializations.get(init)
self.inner_init = initializations.get(inner_init)
self.forget_bias_init = initializations.get(forget_bias_init)
self.activation = activations.get(activation)
self.inner_activation = activations.get(inner_activation)
self.W_regularizer = regularizers.get(W_regularizer)
self.U_regularizer = regularizers.get(U_regularizer)
self.b_regularizer = regularizers.get(b_regularizer)
self.dropout_W, self.dropout_U = dropout_W, dropout_U
self.sentinel = sentinel
if self.dropout_W or self.dropout_U:
self.uses_learning_phase = True
super(LSTM_sent, self).__init__(**kwargs)
def build(self, input_shape):
self.input_spec = [InputSpec(shape=input_shape)]
input_dim = input_shape[2]
self.input_dim = input_dim
if self.stateful:
self.reset_states()
else:
if self.sentinel:
self.states = [None, None]
else:
self.states = [None]
self.W_i = self.init((input_dim, self.output_dim),
name='{}_W_i'.format(self.name))
self.U_i = self.inner_init((self.output_dim, self.output_dim),
name='{}_U_i'.format(self.name))
self.b_i = K.zeros((self.output_dim,), name='{}_b_i'.format(self.name))
self.W_f = self.init((input_dim, self.output_dim),
name='{}_W_f'.format(self.name))
self.U_f = self.inner_init((self.output_dim, self.output_dim),
name='{}_U_f'.format(self.name))
self.b_f = self.forget_bias_init((self.output_dim,),
name='{}_b_f'.format(self.name))
self.W_c = self.init((input_dim, self.output_dim),
name='{}_W_c'.format(self.name))
self.U_c = self.inner_init((self.output_dim, self.output_dim),
name='{}_U_c'.format(self.name))
self.b_c = K.zeros((self.output_dim,), name='{}_b_c'.format(self.name))
self.W_o = self.init((input_dim, self.output_dim),
name='{}_W_o'.format(self.name))
self.U_o = self.inner_init((self.output_dim, self.output_dim),
name='{}_U_o'.format(self.name))
self.b_o = K.zeros((self.output_dim,), name='{}_b_o'.format(self.name))
if self.sentinel:
# sentinel gate
self.W_g = self.init((input_dim, self.output_dim),
name='{}_W_g'.format(self.name))
self.U_g = self.inner_init((self.output_dim, self.output_dim),
name='{}_U_g'.format(self.name))
self.b_g = K.zeros((self.output_dim,), name='{}_b_g'.format(self.name))
self.trainable_weights = [self.W_i, self.U_i, self.b_i,
self.W_c, self.U_c, self.b_c,
self.W_f, self.U_f, self.b_f,
self.W_o, self.U_o, self.b_o,
self.W_g, self.U_g, self.b_g]
else:
self.trainable_weights = [self.W_i, self.U_i, self.b_i,
self.W_c, self.U_c, self.b_c,
self.W_f, self.U_f, self.b_f,
self.W_o, self.U_o, self.b_o]
if self.initial_weights is not None:
self.set_weights(self.initial_weights)
del self.initial_weights
def reset_states(self):
assert self.stateful, 'Layer must be stateful.'
input_shape = self.input_spec[0].shape
if not input_shape[0]:
raise Exception('If a RNN is stateful, a complete ' +
'input_shape must be provided (including batch size).')
if hasattr(self, 'states'):
K.set_value(self.states[0],
np.zeros((input_shape[0], self.output_dim)))
K.set_value(self.states[1],
np.zeros((input_shape[0], self.output_dim)))
else:
self.states = [K.zeros((input_shape[0], self.output_dim)),
K.zeros((input_shape[0], self.output_dim))]
def preprocess_input(self, x, train=False):
if self.consume_less == 'cpu':
if train and (0 < self.dropout_W < 1):
dropout = self.dropout_W
else:
dropout = 0
input_shape = self.input_spec[0].shape
input_dim = input_shape[2]
timesteps = input_shape[1]
x_i = time_distributed_dense(x, self.W_i, self.b_i, dropout,
input_dim, self.output_dim, timesteps)
x_f = time_distributed_dense(x, self.W_f, self.b_f, dropout,
input_dim, self.output_dim, timesteps)
x_c = time_distributed_dense(x, self.W_c, self.b_c, dropout,
input_dim, self.output_dim, timesteps)
x_o = time_distributed_dense(x, self.W_o, self.b_o, dropout,
input_dim, self.output_dim, timesteps)
if self.sentinel:
x_g = time_distributed_dense(x, self.W_g, self.b_g, dropout,
input_dim, self.output_dim, timesteps)
return K.concatenate([x_i, x_f, x_c, x_o,x_g], axis=2)
else:
return K.concatenate([x_i, x_f, x_c, x_o], axis=2)
else:
return x
def step(self, x, states):
h_tm1 = states[0]
c_tm1 = states[1]
B_U = states[2]
B_W = states[3]
if self.consume_less == 'cpu':
x_i = x[:, :self.output_dim]
x_f = x[:, self.output_dim: 2 * self.output_dim]
x_c = x[:, 2 * self.output_dim: 3 * self.output_dim]
x_o = x[:, 3 * self.output_dim: 4 * self.output_dim]
if self.sentinel:
x_g = x[:, 4 * self.output_dim:]
else:
x_i = K.dot(x, self.W_i) + self.b_i
x_f = K.dot(x * B_W[1], self.W_f) + self.b_f
x_c = K.dot(x * B_W[2], self.W_c) + self.b_c
x_o = K.dot(x * B_W[3], self.W_o) + self.b_o
if self.sentinel:
x_g = K.dot(x * B_W[4], self.W_g) + self.b_g
i = self.inner_activation(x_i + K.dot(h_tm1 * B_U[0], self.U_i))
f = self.inner_activation(x_f + K.dot(h_tm1 * B_U[1], self.U_f))
c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1 * B_U[2], self.U_c))
o = self.inner_activation(x_o + K.dot(h_tm1 * B_U[3], self.U_o))
h = o * self.activation(c)
if self.sentinel:
g = self.inner_activation(x_g + K.dot(h_tm1 * B_U[4], self.U_g))
s = g * self.activation(c)
return [h,s], [h, c]
else:
return h, [h, c]
def get_constants(self, x):
constants = []
if self.sentinel:
Ngate = 5
else:
Ngate = 4
if 0 < self.dropout_U < 1:
ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
ones = K.concatenate([ones] * self.output_dim, 1)
B_U = [K.dropout(ones, self.dropout_U) for _ in range(Ngate)]
constants.append(B_U)
else:
constants.append([K.cast_to_floatx(1.) for _ in range(Ngate)])
if self.consume_less == 'cpu' and 0 < self.dropout_W < 1:
input_shape = self.input_spec[0].shape
input_dim = input_shape[-1]
ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1)))
ones = K.concatenate([ones] * input_dim, 1)
B_W = [K.dropout(ones, self.dropout_W) for _ in range(Ngate)]
constants.append(B_W)
else:
constants.append([K.cast_to_floatx(1.) for _ in range(Ngate)])
return constants
def get_output_shape_for(self, input_shape):
if isinstance(input_shape, list) and len(input_shape) > 1:
input_shape = input_shape[0]
if self.return_sequences:
output_shape = (input_shape[0], input_shape[1], self.output_dim)
else:
output_shape = (input_shape[0], self.output_dim)
#the hidden state and the sentinel have the same shape
if self.sentinel:
return [output_shape, output_shape]
else:
return output_shape
def compute_mask(self, input, mask):
if self.return_sequences:
if self.sentinel:
return [mask, mask]
else :
return mask
else:
if self.sentinel:
return [None, None]
else:
return None
def call(self, x, mask=None):
input_shape = self.input_spec[0].shape
if K._BACKEND == 'tensorflow':
if not input_shape[1]:
raise Exception('When using TensorFlow, you should define '
'explicitly the number of timesteps of '
'your sequences.\n'
'If your first layer is an Embedding, '
'make sure to pass it an "input_length" '
'argument. Otherwise, make sure '
'the first layer has '
'an "input_shape" or "batch_input_shape" '
'argument, including the time axis. '
'Found input shape at layer ' + self.name +
': ' + str(input_shape))
if self.stateful:
initial_states = self.states
else:
initial_states = self.get_initial_states(x)
constants = self.get_constants(x)
preprocessed_input = self.preprocess_input(x)
last_output, outputs, states = K.rnn(self.step, preprocessed_input,
initial_states,
go_backwards=self.go_backwards,
mask=mask,
constants=constants,
unroll=self.unroll,
input_length=input_shape[1])
if self.stateful:
self.updates = []
for i in range(len(states)):
self.updates.append((self.states[i], states[i]))
if self.sentinel:
outputs = K.permute_dimensions(outputs, [0,2,1,3])
if self.return_sequences:
return [outputs[0], outputs[1]]
else:
return [last_output[0],last_output[1]]
else:
if self.return_sequences:
return outputs
else:
return last_output
def get_config(self):
config = {"output_dim": self.output_dim,
"init": self.init.__name__,
"inner_init": self.inner_init.__name__,
"forget_bias_init": self.forget_bias_init.__name__,
"activation": self.activation.__name__,
"inner_activation": self.inner_activation.__name__,
"W_regularizer": self.W_regularizer.get_config() if self.W_regularizer else None,
"U_regularizer": self.U_regularizer.get_config() if self.U_regularizer else None,
"b_regularizer": self.b_regularizer.get_config() if self.b_regularizer else None,
"dropout_W": self.dropout_W,
"dropout_U": self.dropout_U,
"sentinel": self.sentinel}
base_config = super(LSTM_sent, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
lstm_ = LSTM_sent(output_dim = args_dict.lstm_dim,
return_sequences=True,stateful=True,
dropout_W = dropout_prob,
dropout_U = dropout_prob,
sentinel=True,name='hs')
h, s = lstm_(x)
num_vfeats = wh * wh
num_vfeats = num_vfeats + 1
h_out_linear = tf.keras.layers.Convolution1D(
depth, 1, activation='tanh', border_mode='same')(h)
h_out_linear = tf.keras.layers.Dropout(
dropout_prob)(h_out_linear)
h_out_embed = tf.keras.layers.Convolution1D(
embedding_dimension, 1, border_mode='same')(h_out_linear)
z_h_embed = tf.keras.layers.TimeDistributed(
tf.keras.layers.RepeatVector(num_vfeats))(h_out_embed)
Vi = tf.keras.layers.Convolution1D(
depth, 1, border_mode='same', activation='relu')(V)
Vi = tf.keras.layers.Dropout(dropout_prob)(Vi)
Vi_emb = tf.keras.layers.Convolution1D(
embedding_dimension, 1, border_mode='same', activation='relu')(Vi)
z_v_linear = tf.keras.layers.TimeDistributed(
tf.keras.layers.RepeatVector(sequence_length))(Vi)
z_v_embed = tf.keras.layers.TimeDistributed(
tf.keras.layers.RepeatVector(sequence_length))(Vi_emb)
z_v_linear = tf.keras.layers.Permute((2, 1, 3))(z_v_linear)
z_v_embed = tf.keras.layers.Permute((2, 1, 3))(z_v_embed)
fake_feat = tf.keras.layers.Convolution1D(
depth, 1, activation='relu', border_mode='same')(s)
fake_feat = tf.keras.layers.Dropout(dropout_prob)(fake_feat)
fake_feat_embed = tf.keras.layers.Convolution1D(
embedding_dimension, 1, border_mode='same')(fake_feat)
z_s_linear = tf.keras.layers.Reshape((sequence_length, 1, depth))(fake_feat)
z_s_embed = tf.keras.layers.Reshape(
(sequence_length, 1, embedding_dimension))(fake_feat_embed)
z_v_linear = tf.keras.layers.Concatenate(axis=-2)([z_v_linear, z_s_linear])
z_v_embed = tf.keras.layers.Concatenate(axis=-2)([z_v_embed, z_s_embed])
z = tf.keras.layers.Add()([z_h_embed, z_v_embed])
z = tf.keras.layers.Dropout(dropout_prob)(z)
z = tf.keras.layers.TimeDistributed(
tf.keras.layers.Activation('tanh'))(z)
attention= tf.keras.layers.TimeDistributed(
tf.keras.layers.Convolution1D(1, 1, border_mode='same'))(z)
attention = tf.keras.layers.Reshape((sequence_length, num_vfeats))(attention)
attention = tf.keras.layers.TimeDistributed(
tf.keras.layers.Activation('softmax'))(attention)
attention = tf.keras.layers.TimeDistributed(
tf.keras.layers.RepeatVector(depth))(attention)
attention = tf.keras.layers.Permute((1,3,2))(attention)
w_Vi = tf.keras.layers.Add()([attention,z_v_linear])
sumpool = tf.keras.layers.Lambda(lambda x: K.sum(x, axis=-2),
output_shape=(depth,))
c_vec = tf.keras.layers.TimeDistributed(sumpool)(w_Vi)
atten_out = tf.keras.layers.Add()([h_out_linear, c_vec])
h = tf.keras.layers.TimeDistributed(
tf.keras.layers.Dense(embedding_dimension,activation='tanh'))(atten_out)
h = tf.keras.layers.Dropout(dropout_prob)(h)
predictions = tf.keras.layers.TimeDistributed(
tf.keras.layers.Dense(vocabulary_size, activation='softmax'))(h)
model = tf.keras.models.Model(input=[cnn_features, prev_words], output=predictions)
opt = get_opt(args_dict)
in_im = tf.keras.layers.Input(
batch_shape=(args_dict.bs, args_dict.imsize, args_dict.imsize, 3), name='image')
wh = vgg_model.output_shape[1]
dim = vgg_model.output_shape[3]
if not args_dict.cnn_train:
for i,layer in enumerate(convnet.layers):
if i > args_dict.finetune_start_layer:
layer.trainable = False
imfeats = vgg_model(in_im)
cnn_features = tf.keras.layers.Input(batch_shape=(args_dict.bs, wh, wh, dim))
prev_words = tf.keras.layers.Input(batch_shape=(args_dict.bs, sequence_length))
lang_model = language_model(args_dict, wh, dim, cnn_features, prev_words)
out = lang_model([imfeats,prev_words])
model = tf.keras.models.Model(input=[in_im, prev_words], output=out)
================================================
FILE: Chapter08/1_style_transfer.py
================================================
import numpy as np
from PIL import Image
from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imsave
from vgg16_avg import VGG16_Avg
from keras import metrics
from keras.models import Model
from keras import backend as K
import tensorflow as tf
work_dir = ''
content_image = Image.open(work_dir + 'bird_orig.png')
imagenet_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)
def subtract_imagenet_mean(image):
return (image - imagenet_mean)[:, :, :, ::-1]
def add_imagenet_mean(image, s):
return np.clip(image.reshape(s)[:, :, :, ::-1] + imagenet_mean, 0, 255)
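# VGG16 was trained on BGR images with the ImageNet channel means removed, so the
# helpers above subtract the means and reverse the channel order, and undo both
# (plus clipping to [0, 255]) when converting a generated image back for saving.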
vgg_model = VGG16_Avg(include_top=False)
content_layer = vgg_model.get_layer('block5_conv1').output
content_model = Model(vgg_model.input, content_layer)
content_image_array = subtract_imagenet_mean(np.expand_dims(np.array(content_image), 0))
content_image_shape = content_image_array.shape
target = K.variable(content_model.predict(content_image_array))
class ConvexOptimiser(object):
def __init__(self, cost_function, tensor_shape):
self.cost_function = cost_function
self.tensor_shape = tensor_shape
self.gradient_values = None
def loss(self, point):
loss_value, self.gradient_values = self.cost_function([point.reshape(self.tensor_shape)])
return loss_value.astype(np.float64)
def gradients(self, point):
return self.gradient_values.flatten().astype(np.float64)
mse_loss = metrics.mean_squared_error(content_layer, target)
grads = K.gradients(mse_loss, vgg_model.input)
cost_function = K.function([vgg_model.input], [mse_loss]+grads)
optimiser = ConvexOptimiser(cost_function, content_image_shape)
def optimise(optimiser, iterations, point, tensor_shape, file_name):
for i in range(iterations):
point, min_val, info = fmin_l_bfgs_b(optimiser.loss, point.flatten(),
fprime=optimiser.gradients, maxfun=20)
point = np.clip(point, -127, 127)
print('Loss:', min_val)
        imsave(work_dir + 'gen_' + file_name + '_{}.png'.format(i),
               add_imagenet_mean(point.copy(), tensor_shape)[0])
return point
def generate_rand_img(shape):
return np.random.uniform(-2.5, 2.5, shape)/1
generated_image = generate_rand_img(content_image_shape)
iterations = 2
generated_image = optimise(optimiser, iterations, generated_image, content_image_shape, 'content')
# Style transfer
style_image = Image.open(work_dir + 'starry_night.png')
style_image = style_image.resize(np.divide(style_image.size, 3.5).astype('int32'))
style_image_array = subtract_imagenet_mean(np.expand_dims(style_image, 0)[:, :, :, :3])
style_image_shape = style_image_array.shape
vgg_model = VGG16_Avg(include_top=False, input_shape=style_image_shape[1:])
style_layers = {layer.name: layer.output for layer in vgg_model.layers}
style_features = [style_layers['block{}_conv1'.format(o)] for o in range(1,3)]
layers_model = Model(vgg_model.input, style_features)
style_targets = [K.variable(feature) for feature in layers_model.predict(style_image_array)]
def grammian_matrix(matrix):
flattened_matrix = K.batch_flatten(K.permute_dimensions(matrix, (2, 0, 1)))
matrix_transpose_dot = K.dot(flattened_matrix, K.transpose(flattened_matrix))
element_count = matrix.get_shape().num_elements()
return matrix_transpose_dot / element_count
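# The Gram matrix records which filter activations co-occur across the image;
# matching Gram matrices between two images matches their textures while
# discarding spatial layout, which is what makes it a useful style representation.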
def style_mse_loss(x, y):
return metrics.mse(grammian_matrix(x), grammian_matrix(y))
style_loss = sum(style_mse_loss(l1[0], l2[0]) for l1, l2 in zip(style_features, style_targets))
grads = K.gradients(style_loss, vgg_model.input)
style_fn = K.function([vgg_model.input], [style_loss]+grads)
optimiser = ConvexOptimiser(style_fn, style_image_shape)
generated_image = generate_rand_img(style_image_shape)
generated_image = optimise(optimiser, iterations, generated_image, style_image_shape, 'style')
w, h = style_image.size
src = content_image_array[:, :h, :w]
outputs = {l.name: l.output for l in vgg_model.layers}
style_layers = [outputs['block{}_conv2'.format(o)] for o in range(1,6)]
content_name = 'block4_conv2'
content_layer = outputs[content_name]
style_model = Model(vgg_model.input, style_layers)
style_targs = [K.variable(o) for o in style_model.predict(style_image_array)]
content_model = Model(vgg_model.input, content_layer)
content_targ = K.variable(content_model.predict(src))
style_wgts = [0.05, 0.2, 0.2, 0.25, 0.3]
loss = sum(style_mse_loss(l1[0], l2[0]) * w
           for l1, l2, w in zip(style_layers, style_targs, style_wgts))
loss += metrics.mse(content_layer, content_targ) / 10
grads = tf.keras.backend.gradients(loss, vgg_model.input)
transfer_fn = tf.keras.backend.function([vgg_model.input], [loss] + grads)
evaluator = ConvexOptimiser(transfer_fn, style_image_shape)
generated_image = generate_rand_img(style_image_shape)
generated_image = optimise(evaluator, iterations, generated_image, style_image_shape, 'transfer')
# Content plus style transfer
style_width, style_height = style_image.size
content_image_array = content_image_array[:, :style_height, :style_width]
style_layers_2 = [style_layers['block{}_conv2'.format(block_no)] for block_no in range(1,6)]
content_layer = style_layers['block4_conv2']
style_model = Model(vgg_model.input, style_layers_2)
style_targets = [K.variable(o) for o in style_model.predict(style_image_array)]
content_model = Model(vgg_model.input, content_layer)
content_target = K.variable(content_model.predict(content_image_array))
style_weights = [0.05, 0.2, 0.2, 0.25, 0.3]
style_loss = sum(style_mse_loss(l1[0], l2[0]) * w
                 for l1, l2, w in zip(style_layers_2, style_targets, style_weights))
content_loss = metrics.mse(content_layer, content_target)/10
loss = style_loss + content_loss
gradients = K.gradients(loss, vgg_model.input)
transfer_fn = K.function([vgg_model.input], [loss]+gradients)
optimiser = ConvexOptimiser(transfer_fn, style_image_shape)
generated_image = generate_rand_img(style_image_shape)
generated_image = optimise(optimiser, iterations, generated_image, style_image_shape, 'content_style')
================================================
FILE: Chapter08/2_vanilla_gan.py
================================================
import tensorflow as tf
batch_size = 32
input_dimension = [227, 227]
real_images = None
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.leaky_relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def transpose_convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.relu,
strides=2):
layer = tf.layers.conv2d_transpose(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
strides=strides,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer,
pool_size=[2, 2],
strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer,
units,
activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_generator(input_noise, is_training=True):
generator = dense_layer(input_noise, 1024)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = dense_layer(generator, 7 * 7 * 256)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = tf.reshape(generator, [-1, 7, 7, 256])
generator = transpose_convolution_layer(generator, 64)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = transpose_convolution_layer(generator, 32)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
print(generator)
return generator
def get_discriminator(image, is_training=True):
x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
name='input_reshape')
discriminator = convolution_layer(x_input_reshape, 64)
discriminator = convolution_layer(discriminator, 128)
discriminator = tf.layers.flatten(discriminator)
discriminator = dense_layer(discriminator, 1024)
discriminator = tf.layers.batch_normalization(discriminator, training=is_training)
discriminator = dense_layer(discriminator, 2)
return discriminator
noise_dimension = 100  # assumed latent vector size; the generator consumes a flat noise vector
input_noise = tf.random_normal([batch_size, noise_dimension])
gan = tf.contrib.gan.gan_model(
get_generator,
get_discriminator,
real_images,
input_noise)
tf.contrib.gan.gan_train(
    tf.contrib.gan.gan_train_ops(
        gan,
        tf.contrib.gan.gan_loss(gan),
        tf.train.AdamOptimizer(0.001),
        tf.train.AdamOptimizer(0.0001)),
    logdir='/tmp/gan_log')  # placeholder directory for checkpoints and summaries
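# tf.contrib.gan wires everything together: gan_model builds the generator and
# discriminator graphs, gan_loss attaches the adversarial objectives, and
# gan_train alternates the two Adam optimisers (generator first, discriminator
# second) inside a managed training session.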
================================================
FILE: Chapter08/3_conditional_gan.py
================================================
import tensorflow as tf
batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.leaky_relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def transpose_convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.relu,
strides=2):
layer = tf.layers.conv2d_transpose(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
strides=strides,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer,
pool_size=[2, 2],
strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer,
units,
activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_generator(input_noise, is_training=True):
generator = dense_layer(input_noise, 1024)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = dense_layer(generator, 7 * 7 * 256)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = tf.reshape(generator, [-1, 7, 7, 256])
generator = transpose_convolution_layer(generator, 64)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = transpose_convolution_layer(generator, 32)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
print(generator)
return generator
def get_discriminator(image, is_training=True):
x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
name='input_reshape')
discriminator = convolution_layer(x_input_reshape, 64)
discriminator = convolution_layer(discriminator, 128)
discriminator = tf.layers.flatten(discriminator)
discriminator = dense_layer(discriminator, 1024)
discriminator = tf.layers.batch_normalization(discriminator, training=is_training)
discriminator = dense_layer(discriminator, 2)
return discriminator
noise_dimension = 100  # assumed latent vector size; the generator consumes a flat noise vector
input_noise = tf.random_normal([batch_size, noise_dimension])
gan = tf.contrib.gan.gan_model(
get_generator,
get_discriminator,
real_images,
(input_noise, labels))
================================================
FILE: Chapter08/4_adverserial_loss.py
================================================
import tensorflow as tf
batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.leaky_relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def transpose_convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.relu,
strides=2):
layer = tf.layers.conv2d_transpose(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
strides=strides,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer,
pool_size=[2, 2],
strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer,
units,
activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_generator(input_noise, is_training=True):
generator = dense_layer(input_noise, 1024)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = dense_layer(generator, 7 * 7 * 256)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = tf.reshape(generator, [-1, 7, 7, 256])
generator = transpose_convolution_layer(generator, 64)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = transpose_convolution_layer(generator, 32)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
print(generator)
return generator
def get_discriminator(image, is_training=True):
x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
name='input_reshape')
discriminator = convolution_layer(x_input_reshape, 64)
discriminator = convolution_layer(discriminator, 128)
discriminator = tf.layers.flatten(discriminator)
discriminator = dense_layer(discriminator, 1024)
discriminator = tf.layers.batch_normalization(discriminator, training=is_training)
discriminator = dense_layer(discriminator, 2)
return discriminator
def fully_connected_layer(input_layer, units):
return tf.layers.dense(
input_layer,
units=units,
activation=tf.nn.relu
)
def convolution_layer(input_layer, filter_size):
return tf.layers.conv2d(
input_layer,
filters=filter_size,
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(),
kernel_size=3,
strides=2
)
def deconvolution_layer(input_layer, filter_size, activation=tf.nn.relu):
return tf.layers.conv2d_transpose(
input_layer,
filters=filter_size,
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(),
kernel_size=3,
activation=activation,
strides=2
)
def get_autoencoder(input_layer):
    # TFGAN calls this with the generator inputs (the real images here),
    # expected to be shaped [None, 128, 128, 3].
convolution_layer_1 = convolution_layer(input_layer, 1024)
convolution_layer_2 = convolution_layer(convolution_layer_1, 512)
convolution_layer_3 = convolution_layer(convolution_layer_2, 256)
convolution_layer_4 = convolution_layer(convolution_layer_3, 128)
convolution_layer_5 = convolution_layer(convolution_layer_4, 32)
convolution_layer_5_flattened = tf.layers.flatten(convolution_layer_5)
bottleneck_layer = fully_connected_layer(convolution_layer_5_flattened, 16)
c5_shape = convolution_layer_5.get_shape().as_list()
c5f_flat_shape = convolution_layer_5_flattened.get_shape().as_list()[1]
fully_connected = fully_connected_layer(bottleneck_layer, c5f_flat_shape)
fully_connected = tf.reshape(fully_connected,
[-1, c5_shape[1], c5_shape[2], c5_shape[3]])
deconvolution_layer_1 = deconvolution_layer(fully_connected, 128)
deconvolution_layer_2 = deconvolution_layer(deconvolution_layer_1, 256)
deconvolution_layer_3 = deconvolution_layer(deconvolution_layer_2, 512)
deconvolution_layer_4 = deconvolution_layer(deconvolution_layer_3, 1024)
deconvolution_layer_5 = deconvolution_layer(deconvolution_layer_4, 3,
activation=tf.nn.tanh)
return deconvolution_layer_5
gan = tf.contrib.gan.gan_model(
get_autoencoder,
get_discriminator,
real_images,
real_images)
loss = tf.contrib.gan.gan_loss(
gan, gradient_penalty=1.0)
l1_pixel_loss = tf.norm(gan.real_data - gan.generated_data, ord=1)
loss = tf.contrib.gan.losses.combine_adversarial_loss(
loss, gan, l1_pixel_loss, weight_factor=1)
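# combine_adversarial_loss blends the adversarial objective with the L1 pixel
# loss, so the autoencoder output is penalised both for looking unrealistic to
# the discriminator and for drifting away from the input image, the recipe used
# by pix2pix-style image translation models.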
================================================
FILE: Chapter08/5_image_translation.py
================================================
import tensorflow as tf
batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
input_images = None
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.leaky_relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def transpose_convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.relu,
strides=2):
layer = tf.layers.conv2d_transpose(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
strides=strides,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer,
pool_size=[2, 2],
strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer,
units,
activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_generator(input_noise, is_training=True):
generator = dense_layer(input_noise, 1024)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = dense_layer(generator, 7 * 7 * 256)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = tf.reshape(generator, [-1, 7, 7, 256])
generator = transpose_convolution_layer(generator, 64)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = transpose_convolution_layer(generator, 32)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
print(generator)
return generator
def get_discriminator(image, is_training=True):
x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
name='input_reshape')
discriminator = convolution_layer(x_input_reshape, 64)
discriminator = convolution_layer(discriminator, 128)
discriminator = tf.layers.flatten(discriminator)
discriminator = dense_layer(discriminator, 1024)
discriminator = tf.layers.batch_normalization(discriminator, training=is_training)
discriminator = dense_layer(discriminator, 2)
return discriminator
gan = tf.contrib.gan.gan_model(
get_generator,
get_discriminator,
real_images,
input_images)
loss = tf.contrib.gan.gan_loss(
gan,
tf.contrib.gan.losses.least_squares_generator_loss,
tf.contrib.gan.losses.least_squares_discriminator_loss)
l1_loss = tf.norm(gan.real_data - gan.generated_data, ord=1)
gan_loss = tf.contrib.gan.losses.combine_adversarial_loss(
loss, gan, l1_loss, weight_factor=1)
================================================
FILE: Chapter08/6_infogan.py
================================================
import tensorflow as tf
batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
unstructured_input = None
structured_input = None
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.leaky_relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def transpose_convolution_layer(input_layer,
filters,
kernel_size=[4, 4],
activation=tf.nn.relu,
strides=2):
layer = tf.layers.conv2d_transpose(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation,
strides=strides,
kernel_regularizer=tf.nn.l2_loss,
bias_regularizer=tf.nn.l2_loss,
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer,
pool_size=[2, 2],
strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer,
units,
activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_generator(input_noise, is_training=True):
generator = dense_layer(input_noise, 1024)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = dense_layer(generator, 7 * 7 * 256)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = tf.reshape(generator, [-1, 7, 7, 256])
generator = transpose_convolution_layer(generator, 64)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = transpose_convolution_layer(generator, 32)
generator = tf.layers.batch_normalization(generator, training=is_training)
generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
print(generator)
return generator
def get_discriminator(image, is_training=True):
x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
name='input_reshape')
discriminator = convolution_layer(x_input_reshape, 64)
discriminator = convolution_layer(discriminator, 128)
discriminator = tf.layers.flatten(discriminator)
discriminator = dense_layer(discriminator, 1024)
discriminator = tf.layers.batch_normalization(discriminator, training=is_training)
discriminator = dense_layer(discriminator, 2)
return discriminator
info_gan = tf.contrib.gan.infogan_model(
get_generator,
get_discriminator,
real_images,
unstructured_input,
structured_input)
loss = tf.contrib.gan.gan_loss(
info_gan,
gradient_penalty_weight=1,
gradient_penalty_epsilon=1e-10,
mutual_information_penalty_weight=1)
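# Hypothetical latent inputs that could stand in for the None placeholders above;
# infogan_model expects lists of tensors, and the sizes here (62-dim noise, one
# 10-way categorical code) are illustrative assumptions only.
unstructured_input = [tf.random_normal([batch_size, 62])]
structured_input = [tf.one_hot(
    tf.random_uniform([batch_size], maxval=10, dtype=tf.int32), depth=10)]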
================================================
FILE: Chapter08/utils.py
================================================
import math, keras, datetime, pandas as pd, numpy as np, keras.backend as K, threading, json, re, collections
import tarfile, tensorflow as tf, matplotlib.pyplot as plt, xgboost, operator, random, pickle, glob, os, bcolz
import shutil, sklearn, functools, itertools, scipy
from PIL import Image
from concurrent.futures import ProcessPoolExecutor, as_completed, ThreadPoolExecutor
import matplotlib.patheffects as PathEffects
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.neighbors import NearestNeighbors, LSHForest
import IPython
from IPython.display import display, Audio
from numpy.random import normal
from gensim.models import word2vec
from keras.preprocessing.text import Tokenizer
from nltk.tokenize import ToktokTokenizer, StanfordTokenizer
from functools import reduce
from itertools import chain
from tensorflow.python.framework import ops
#from tensorflow.contrib import rnn, legacy_seq2seq as seq2seq
from keras_tqdm import TQDMNotebookCallback
from keras import initializations
from keras.applications.resnet50 import ResNet50, decode_predictions, conv_block, identity_block
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.preprocessing.sequence import pad_sequences
from keras.models import Model, Sequential
from keras.layers import *
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.utils.data_utils import get_file
from keras.utils.layer_utils import layer_from_config  # used by copy_layer/insert_layer below
from keras.applications.imagenet_utils import decode_predictions, preprocess_input
np.set_printoptions(threshold=50, edgeitems=20)
def beep(): return Audio(filename='/home/jhoward/beep.mp3', autoplay=True)
def dump(obj, fname): pickle.dump(obj, open(fname, 'wb'))
def load(fname): return pickle.load(open(fname, 'rb'))
def limit_mem():
K.get_session().close()
cfg = K.tf.ConfigProto()
cfg.gpu_options.allow_growth = True
K.set_session(K.tf.Session(config=cfg))
def autolabel(plt, fmt='%.2f'):
rects = plt.patches
ax = rects[0].axes
y_bottom, y_top = ax.get_ylim()
y_height = y_top - y_bottom
for rect in rects:
height = rect.get_height()
if height / y_height > 0.95:
label_position = height - (y_height * 0.06)
else:
label_position = height + (y_height * 0.01)
txt = ax.text(rect.get_x() + rect.get_width()/2., label_position,
fmt % height, ha='center', va='bottom')
txt.set_path_effects([PathEffects.withStroke(linewidth=3, foreground='w')])
def column_chart(lbls, vals, val_lbls='%.2f'):
n = len(lbls)
p = plt.bar(np.arange(n), vals)
plt.xticks(np.arange(n), lbls)
if val_lbls: autolabel(p, val_lbls)
def save_array(fname, arr):
c=bcolz.carray(arr, rootdir=fname, mode='w')
c.flush()
def load_array(fname): return bcolz.open(fname)[:]
def load_glove(loc):
return (load_array(loc+'.dat'),
pickle.load(open(loc+'_words.pkl','rb'), encoding='latin1'),
pickle.load(open(loc+'_idx.pkl','rb'), encoding='latin1'))
def plot_multi(im, dim=(4,4), figsize=(6,6), **kwargs ):
plt.figure(figsize=figsize)
for i,img in enumerate(im):
plt.subplot(*((dim)+(i+1,)))
plt.imshow(img, **kwargs)
plt.axis('off')
plt.tight_layout()
def plot_train(hist):
h = hist.history
if 'acc' in h:
meas='acc'
loc='lower right'
else:
meas='loss'
loc='upper right'
plt.plot(hist.history[meas])
plt.plot(hist.history['val_'+meas])
plt.title('model '+meas)
plt.ylabel(meas)
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc=loc)
def fit_gen(gen, fn, eval_fn, nb_iter):
for i in range(nb_iter):
fn(*next(gen))
if i % (nb_iter//10) == 0: eval_fn()
def wrap_config(layer):
return {'class_name': layer.__class__.__name__, 'config': layer.get_config()}
def copy_layer(layer): return layer_from_config(wrap_config(layer))
def copy_layers(layers): return [copy_layer(layer) for layer in layers]
def copy_weights(from_layers, to_layers):
for from_layer,to_layer in zip(from_layers, to_layers):
to_layer.set_weights(from_layer.get_weights())
def copy_model(m):
res = Sequential(copy_layers(m.layers))
copy_weights(m.layers, res.layers)
return res
def insert_layer(model, new_layer, index):
res = Sequential()
for i,layer in enumerate(model.layers):
if i==index: res.add(new_layer)
copied = layer_from_config(wrap_config(layer))
res.add(copied)
copied.set_weights(layer.get_weights())
return res
================================================
FILE: Chapter08/vgg16_avg.py
================================================
from __future__ import print_function
from __future__ import absolute_import
import warnings
from keras.models import Model
from keras.layers import Flatten, Dense, Input
from keras.layers import Convolution2D, AveragePooling2D
from keras.engine.topology import get_source_inputs
from keras.utils.layer_utils import convert_all_kernels_in_model
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import _obtain_input_shape
TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG16_Avg(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, classes=1000):
if weights not in {'imagenet', None}:
raise ValueError('The `weights` argument should be either '
'`None` (random initialization) or `imagenet` '
'(pre-training on ImageNet).')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as imagenet with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = _obtain_input_shape(input_shape,
default_size=224,
min_size=48,
dim_ordering=K.image_dim_ordering(),
include_top=include_top)
if input_tensor is None:
img_input = Input(shape=input_shape)
else:
if not K.is_keras_tensor(input_tensor):
img_input = Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
# Block 1
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
x = AveragePooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
x = AveragePooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
x = AveragePooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
x = AveragePooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
x = AveragePooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
if include_top:
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = Model(inputs, x, name='vgg16')
# load weights
if weights == 'imagenet':
if K.image_dim_ordering() == 'th':
if include_top:
weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',
TH_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',
TH_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'tensorflow':
warnings.warn('You are using the TensorFlow backend, yet you '
'are using the Theano '
'image dimension ordering convention '
'(`image_dim_ordering="th"`). '
'For best performance, set '
'`image_dim_ordering="tf"` in '
'your Keras config '
'at ~/.keras/keras.json.')
convert_all_kernels_in_model(model)
else:
if include_top:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
TF_WEIGHTS_PATH,
cache_subdir='models')
else:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
TF_WEIGHTS_PATH_NO_TOP,
cache_subdir='models')
model.load_weights(weights_path)
if K.backend() == 'theano':
convert_all_kernels_in_model(model)
return model
================================================
FILE: Chapter09/1_video_to_frames_1.py
================================================
import cv2
video_path = '/Users/i335713/Desktop/epat/lecture recordings and live lectures/batch35epat (batch 35) Lecture Recordings Live Lec Additional Lecture on Machine Learning (.mp4'
video_handle = cv2.VideoCapture(video_path)
frame_no = 0
while True:
    # read() returns (frame_available, frame); it reports False once the video ends
    frame_available, frame = video_handle.read()
    if not frame_available:
        break
    cv2.imwrite("frame%d.jpg" % frame_no, frame)
    frame_no += 1
video_handle.release()
================================================
FILE: Chapter09/2_parallel_stream.py
================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 300
def add_variable_summary(tf_variable, summary_name):
with tf.name_scope(summary_name + '_summary'):
mean = tf.reduce_mean(tf_variable)
tf.summary.scalar('Mean', mean)
with tf.name_scope('standard_deviation'):
standard_deviation = tf.sqrt(tf.reduce_mean(
tf.square(tf_variable - mean)))
tf.summary.scalar('StandardDeviation', standard_deviation)
tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
tf.summary.histogram('Histogram', tf_variable)
def convolution_layer(input_layer, filters, kernel_size=[3, 3],
activation=tf.nn.relu):
layer = tf.layers.conv2d(
inputs=input_layer,
filters=filters,
kernel_size=kernel_size,
activation=activation
)
add_variable_summary(layer, 'convolution')
return layer
def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
layer = tf.layers.max_pooling2d(
inputs=input_layer,
pool_size=pool_size,
strides=strides
)
add_variable_summary(layer, 'pooling')
return layer
def dense_layer(input_layer, units, activation=tf.nn.relu):
layer = tf.layers.dense(
inputs=input_layer,
units=units,
activation=activation
)
add_variable_summary(layer, 'dense')
return layer
def get_model(input_):
input_reshape = tf.reshape(input_, [-1, 28, 28, 1],
name='input_reshape')
convolution_layer_1 = convolution_layer(input_reshape, 64)
pooling_layer_1 = pooling_layer(convolution_layer_1)
convolution_layer_2 = convolution_layer(pooling_layer_1, 128)
pooling_layer_2 = pooling_layer(convolution_layer_2)
flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128],
name='flattened_pool')
return flattened_pool
high_resolution_input = tf.placeholder(tf.float32, shape=[None, input_size])
low_resolution_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
high_resolution_cnn = get_model(high_resolution_input)
low_resolution_cnn = get_model(low_resolution_input)
# Each call to get_model creates its own convolution weights, so the two streams do not
# share parameters; their flattened features are concatenated before the dense layers.
dense_layer_1 = tf.concat([high_resolution_cnn, low_resolution_cnn], 1)
dense_layer_bottleneck = dense_layer(dense_layer_1, 1024)
logits = dense_layer(dense_layer_bottleneck, no_classes)
with tf.name_scope('loss'):
softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=y_input, logits=logits)
loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
tf.summary.scalar('loss', loss_operation)
with tf.name_scope('optimiser'):
optimiser = tf.train.AdamOptimizer().minimize(loss_operation)
with tf.name_scope('accuracy'):
with tf.name_scope('correct_prediction'):
predictions = tf.argmax(logits, 1)
correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
with tf.name_scope('accuracy'):
accuracy_operation = tf.reduce_mean(
tf.cast(correct_predictions, tf.float32))
tf.summary.scalar('accuracy', accuracy_operation)
session = tf.Session()
session.run(tf.global_variables_initializer())
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')
test_images, test_labels = mnist_data.test.images, mnist_data.test.labels
for batch_no in range(total_batches):
mnist_batch = mnist_data.train.next_batch(batch_size)
train_images, train_labels = mnist_batch[0], mnist_batch[1]
_, merged_summary = session.run([optimiser, merged_summary_operation],
feed_dict={
high_resolution_input: train_images,
low_resolution_input: train_images,
y_input: train_labels
})
train_summary_writer.add_summary(merged_summary, batch_no)
if batch_no % 10 == 0:
merged_summary, _ = session.run([merged_summary_operation,
accuracy_operation], feed_dict={
high_resolution_input: test_images,
low_resolution_input: test_images,
y_input: test_labels
})
test_summary_writer.add_summary(merged_summary, batch_no)
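# An illustrative final evaluation on the test set, feeding the same images to both
# streams as in the training loop above.
test_accuracy = session.run(accuracy_operation, feed_dict={
    high_resolution_input: test_images,
    low_resolution_input: test_images,
    y_input: test_labels
})
print('Test accuracy:', test_accuracy)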
================================================
FILE: Chapter09/3_lstm_after_cnn.py
================================================
import tensorflow as tf
input_shape = [500,500]
no_classes = 2
net = tf.keras.models.Sequential()
net.add(tf.keras.layers.LSTM(2048,
return_sequences=False,
input_shape=input_shape,
dropout=0.5))
net.add(tf.keras.layers.Dense(512, activation='relu'))
net.add(tf.keras.layers.Dropout(0.5))
net.add(tf.keras.layers.Dense(no_classes, activation='softmax'))
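# An illustrative compile step for the sketch above; the loss/optimizer choices and the
# reading of input_shape as (timesteps, feature size) are assumptions.
net.compile(loss=tf.keras.losses.categorical_crossentropy,
            optimizer=tf.keras.optimizers.Adam(),
            metrics=['accuracy'])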
================================================
FILE: Chapter09/4_3d_convolution.py
================================================
import tensorflow as tf
input_shape = 227, 227, 200, 3
no_classes = 2
net = tf.keras.models.Sequential()
net.add(tf.keras.layers.Conv3D(32,
kernel_size=(3, 3, 3),
input_shape=(input_shape)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.Conv3D(32, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.MaxPooling3D())
net.add(tf.keras.layers.Dropout(0.25))
net.add(tf.keras.layers.Conv3D(64, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.Conv3D(64, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.MaxPool3D())
net.add(tf.keras.layers.Dropout(0.25))
net.add(tf.keras.layers.Flatten())
net.add(tf.keras.layers.Dense(512, activation='sigmoid'))
net.add(tf.keras.layers.Dropout(0.5))
net.add(tf.keras.layers.Dense(no_classes, activation='softmax'))
net.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
================================================
FILE: Chapter10/1_ios.py
================================================
import tfcoreml as tf_converter
# Convert a frozen TensorFlow graph (.pb) into a Core ML model for use on iOS.
# The output tensor name and the input tensor name/shape must match the exported graph.
tf_converter.convert(tf_model_path='tf_model_path.pb',
                     mlmodel_path='mlmodel_path.mlmodel',
                     output_feature_names=['softmax:0'],
                     input_name_shape_dict={'input:0': [1, 227, 227, 3]})
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2018 Packt
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# Deep-Learning-for-Computer-Vision
Code repository for Deep Learning for Computer Vision, by Packt
This is the code repository for [Deep Learning for Computer Vision](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-computer-vision?utm_source=github&utm_medium=repository&utm_campaign=9781788295628), published by [Packt](https://www.packtpub.com/?utm_source=github). It contains all the supporting project files necessary to work through the book from start to finish.
## About the Book
Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has enormous applications in robotics, automation, and many other areas. This book shows you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning.
## Instructions and Navigation
All of the code is organized into folders, one per chapter; each folder is named Chapter followed by the chapter number. For example, Chapter02.
The code will look like the following:
```
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train' , session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test' )
```
The examples covered in this book can be run on Windows, Ubuntu, or macOS. All the installation instructions are covered in the book. Basic knowledge of Python and machine learning is required. It is preferable that the reader has GPU hardware, but it is not necessary.
## Related Products
* [Deep Learning with Keras](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-keras?utm_source=github&utm_medium=repository&utm_campaign=9781787128422)
* [TensorFlow 1.x Deep Learning Cookbook](https://www.packtpub.com/big-data-and-business-intelligence/tensorflow-1x-deep-learning-cookbook?utm_source=github&utm_medium=repository&utm_campaign=9781788293594)
* [Deep Learning with TensorFlow](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-tensorflow?utm_source=github&utm_medium=repository&utm_campaign=9781786469786)
### Download a free PDF
If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.