Repository: PacktPublishing/Deep-Learning-for-Computer-Vision Branch: master Commit: b8dfcf3f4860 Files: 52 Total size: 146.6 KB Directory structure: gitextract_77peer6f/ ├── .gitignore ├── Chapter01/ │ ├── 1_hello_tensorflow.py │ ├── 2_add.py │ └── 3_add_tensorboard.py ├── Chapter02/ │ ├── 1_mnist_tf_perceptron.py │ ├── 2_mnist_cnn.py │ ├── 3_mnist_keras.py │ ├── 4_cat_vs_dog_data_prep.py │ ├── 5_cat_vs_dog_cnn.py │ ├── 6_cat_vs_dog_augmentation.py │ ├── 7_cat_vs_dog_bottleneck.py │ └── 8_cat_vs_dog_fine_tune.py ├── Chapter03/ │ ├── 1_embedding_vis.py │ ├── 2_guided_back_prop.py │ ├── 3_deep_dream.py │ ├── 4_export_model.py │ ├── 5_serving_client.py │ ├── 6_bottleneck_features.py │ ├── 7_annoy.py │ ├── 8_auto_encoder.py │ └── 9_denoising.py ├── Chapter04/ │ ├── 1_iou.py │ ├── 2_overfeat.py │ ├── 3_object_detection_api.py │ ├── 4_yolo.py │ └── pascal_voc.py ├── Chapter05/ │ ├── 1_segnet.py │ ├── 2_nerve_segmentation.py │ ├── 3_satellite.py │ └── data.py ├── Chapter06/ │ ├── 1_contrastive_loss.py │ ├── 2_siamese_network.py │ ├── 3_triplet_loss.py │ ├── 4_triplet_mining.py │ ├── 5_fiducial_points.py │ └── 6_extract_features.py ├── Chapter07/ │ └── 1_caption_attention.py ├── Chapter08/ │ ├── 1_style_transfer.py │ ├── 2_vanilla_gan.py │ ├── 3_conditional_gan.py │ ├── 4_adverserial_loss.py │ ├── 5_image_translation.py │ ├── 6_infogan.py │ ├── utils.py │ └── vgg16_avg.py ├── Chapter09/ │ ├── 1_video_to_frames_1.py │ ├── 2_parallel_stream.py │ ├── 3_lstm_after_cnn.py │ └── 4_3d_convolution.py ├── Chapter10/ │ └── 1_ios.py ├── LICENSE └── README.md ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ *MNIST_data* Chapter02/test/* Chapter02/train/* Chapter03/inception5h.zip Chapter03/classify_image_graph_def.pb Chapter03/imagenet_2012_challenge_label_map_proto.pbtxt Chapter03/imagenet_comp_graph_label_strings.txt Chapter03/imagenet_synset_to_human_label_map.txt Chapter03/inception-2015-12-05.tgz Chapter03/LICENSE Chapter03/tensorflow_inception_graph.pb Chapter03/stitched_filters_3x3.png Chapter03/cropped_panda.jpg Chapter08/gen_* .idea/* # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python env/ build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ *.egg-info/ .installed.cfg *.egg # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover .hypothesis/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder target/ # Jupyter Notebook .ipynb_checkpoints # pyenv .python-version # celery beat schedule file celerybeat-schedule # SageMath parsed files *.sage.py # dotenv .env # virtualenv .venv venv/ ENV/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ ================================================ FILE: Chapter01/1_hello_tensorflow.py ================================================ import tensorflow as tf hello = tf.constant('Hello, TensorFlow!') session = tf.Session() print(session.run(hello)) ================================================ FILE: Chapter01/2_add.py ================================================ import tensorflow as tf x = tf.placeholder(tf.float32) y = tf.placeholder(tf.float32) z = x + y session = tf.Session() values = {x: 5.0, y: 4.0} result = session.run([z], values) print(result) ================================================ FILE: Chapter01/3_add_tensorboard.py ================================================ import tensorflow as tf x = tf.placeholder(tf.float32, name='x') y = tf.placeholder(tf.float32, name='y') z = tf.add(x, y, name='sum') session = tf.Session() summary_writer = tf.summary.FileWriter('/tmp/1', session.graph) summary_writer.flush() ================================================ FILE: Chapter02/1_mnist_tf_perceptron.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 200 x_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) weights = tf.Variable(tf.random_normal([input_size, no_classes])) bias = tf.Variable(tf.random_normal([no_classes])) logits = tf.matmul(x_input, weights) + bias softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_input, logits=logits) loss_operation = tf.reduce_mean(softmax_cross_entropy) optimiser = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss_operation) session = tf.Session() session.run(tf.global_variables_initializer()) for batch_no in range(total_batches): mnist_batch = mnist_data.train.next_batch(batch_size) train_images, train_labels = mnist_batch[0], mnist_batch[1] _, loss_value = session.run([optimiser, loss_operation], feed_dict={x_input: train_images, y_input: train_labels}) print(loss_value) predictions = tf.argmax(logits, 1) correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1)) accuracy_operation = tf.reduce_mean(tf.cast(correct_predictions, tf.float32)) test_images, test_labels = mnist_data.test.images, mnist_data.test.labels accuracy_value = session.run(accuracy_operation, feed_dict={x_input: test_images, y_input: test_labels}) print('Accuracy : ', accuracy_value) session.close() ================================================ FILE: Chapter02/2_mnist_cnn.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist_data = 
input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 200 x_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1], name='input_reshape') def convolution_layer(input_layer, filters, kernel_size=[3, 3], activation=tf.nn.relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer convolution_layer_1 = convolution_layer(x_input_reshape, 64) pooling_layer_1 = pooling_layer(convolution_layer_1) convolution_layer_2 = convolution_layer(pooling_layer_1, 128) pooling_layer_2 = pooling_layer(convolution_layer_2) flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128], name='flattened_pool') dense_layer_bottleneck = dense_layer(flattened_pool, 1024) dropout_bool = tf.placeholder(tf.bool) dropout_layer = tf.layers.dropout( inputs=dense_layer_bottleneck, rate=0.4, training=dropout_bool ) logits = dense_layer(dropout_layer, no_classes) with tf.name_scope('loss'): softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits( labels=y_input, logits=logits) loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss') tf.summary.scalar('loss', loss_operation) with tf.name_scope('optimiser'): optimiser = tf.train.AdamOptimizer().minimize(loss_operation) with tf.name_scope('accuracy'): with tf.name_scope('correct_prediction'): predictions = tf.argmax(logits, 1) correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1)) with tf.name_scope('accuracy'): accuracy_operation = tf.reduce_mean( tf.cast(correct_predictions, tf.float32)) tf.summary.scalar('accuracy', accuracy_operation) session = tf.Session() session.run(tf.global_variables_initializer()) merged_summary_operation = tf.summary.merge_all() train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph) test_summary_writer = tf.summary.FileWriter('/tmp/test') test_images, test_labels = mnist_data.test.images, mnist_data.test.labels for batch_no in range(total_batches): mnist_batch = mnist_data.train.next_batch(batch_size) train_images, train_labels = mnist_batch[0], mnist_batch[1] _, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={ x_input: train_images, y_input: train_labels, dropout_bool: True }) train_summary_writer.add_summary(merged_summary, batch_no) if batch_no % 10 == 0: merged_summary, _ = session.run([merged_summary_operation, accuracy_operation], feed_dict={ x_input: test_images, y_input: 
test_labels, dropout_bool: False }) test_summary_writer.add_summary(merged_summary, batch_no) ================================================ FILE: Chapter02/3_mnist_keras.py ================================================ import tensorflow as tf batch_size = 128 no_classes = 10 epochs = 50 image_height, image_width = 28, 28 (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data() x_train = x_train.reshape(x_train.shape[0], image_height, image_width, 1) x_test = x_test.reshape(x_test.shape[0], image_height, image_width, 1) input_shape = (image_height, image_width, 1) x_train = x_train.astype('float32') x_test = x_test.astype('float32') x_train /= 255 x_test /= 255 y_train = tf.keras.utils.to_categorical(y_train, no_classes) y_test = tf.keras.utils.to_categorical(y_test, no_classes) def simple_cnn(input_shape): model = tf.keras.models.Sequential() model.add(tf.keras.layers.Conv2D( filters=64, kernel_size=(3, 3), activation='relu', input_shape=input_shape )) model.add(tf.keras.layers.Conv2D( filters=128, kernel_size=(3, 3), activation='relu' )) model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2))) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Flatten()) model.add(tf.keras.layers.Dense(units=1024, activation='relu')) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax')) model.compile(loss=tf.keras.losses.categorical_crossentropy, optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy']) return model simple_cnn_model = simple_cnn(input_shape) simple_cnn_model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(x_test, y_test)) train_loss, train_accuracy = simple_cnn_model.evaluate( x_train, y_train, verbose=0) print('Train data loss:', train_loss) print('Train data accuracy:', train_accuracy) test_loss, test_accuracy = simple_cnn_model.evaluate( x_test, y_test, verbose=0) print('Test data loss:', test_loss) print('Test data accuracy:', test_accuracy) ================================================ FILE: Chapter02/4_cat_vs_dog_data_prep.py ================================================ import os import shutil work_dir = '' image_names = sorted(os.listdir(os.path.join(work_dir, 'train'))) def copy_files(prefix_str, range_start, range_end, target_dir): image_paths = [os.path.join(work_dir, 'train', prefix_str + '.'
+ str(i) + '.jpg') for i in range(range_start, range_end)] dest_dir = os.path.join(work_dir, 'data', target_dir, prefix_str) os.makedirs(dest_dir) for image_path in image_paths: shutil.copy(image_path, dest_dir) copy_files('dog', 0, 1000, 'train') copy_files('cat', 0, 1000, 'train') copy_files('dog', 1000, 1400, 'test') copy_files('cat', 1000, 1400, 'test') ================================================ FILE: Chapter02/5_cat_vs_dog_cnn.py ================================================ import numpy as np import os import tensorflow as tf work_dir = '' image_height, image_width = 150, 150 train_dir = os.path.join(work_dir, 'train') test_dir = os.path.join(work_dir, 'test') no_classes = 2 no_validation = 800 epochs = 2 batch_size = 200 no_train = 2000 no_test = 800 input_shape = (image_height, image_width, 3) epoch_steps = no_train // batch_size test_steps = no_test // batch_size def simple_cnn(input_shape): model = tf.keras.models.Sequential() model.add(tf.keras.layers.Conv2D( filters=64, kernel_size=(3, 3), activation='relu', input_shape=input_shape )) model.add(tf.keras.layers.Conv2D( filters=128, kernel_size=(3, 3), activation='relu' )) model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2))) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Flatten()) model.add(tf.keras.layers.Dense(units=1024, activation='relu')) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax')) model.compile(loss=tf.keras.losses.categorical_crossentropy, optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy']) return model simple_cnn_model = simple_cnn(input_shape) generator_train = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255) generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. 
/ 255) train_images = generator_train.flow_from_directory( train_dir, batch_size=batch_size, target_size=(image_width, image_height)) test_images = generator_test.flow_from_directory( test_dir, batch_size=batch_size, target_size=(image_width, image_height)) simple_cnn_model.fit_generator( train_images, steps_per_epoch=epoch_steps, epochs=epochs, validation_data=test_images, validation_steps=test_steps) ================================================ FILE: Chapter02/6_cat_vs_dog_augmentation.py ================================================ import tensorflow as tf import os work_dir = '' image_height, image_width = 150, 150 train_dir = os.path.join(work_dir, 'train') test_dir = os.path.join(work_dir, 'test') no_classes = 2 no_validation = 800 epochs = 50 batch_size = 32 no_train = 2000 no_test = 800 input_shape = (image_height, image_width, 3) epoch_steps = no_train // batch_size test_steps = no_test // batch_size def simple_cnn(input_shape): model = tf.keras.models.Sequential() model.add(tf.keras.layers.Conv2D( filters=64, kernel_size=(3, 3), activation='relu', input_shape=input_shape )) model.add(tf.keras.layers.Conv2D( filters=128, kernel_size=(3, 3), activation='relu' )) model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2))) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Flatten()) model.add(tf.keras.layers.Dense(units=1024, activation='relu')) model.add(tf.keras.layers.Dropout(rate=0.3)) model.add(tf.keras.layers.Dense(units=no_classes, activation='softmax')) model.compile(loss=tf.keras.losses.categorical_crossentropy, optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy']) return model simple_cnn_model = simple_cnn(input_shape) generator_train = tf.keras.preprocessing.image.ImageDataGenerator( rescale=1. / 255, horizontal_flip=True, zoom_range=0.3, shear_range=0.3,) generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255) train_images = generator_train.flow_from_directory( train_dir, batch_size=batch_size, target_size=(image_width, image_height)) test_images = generator_test.flow_from_directory( test_dir, batch_size=batch_size, target_size=(image_width, image_height)) simple_cnn_model.fit_generator( train_images, steps_per_epoch=epoch_steps, epochs=epochs, validation_data=test_images, validation_steps=test_steps) ================================================ FILE: Chapter02/7_cat_vs_dog_bottleneck.py ================================================ import numpy as np import os import tensorflow as tf work_dir = '' image_height, image_width = 150, 150 train_dir = os.path.join(work_dir, 'train') test_dir = os.path.join(work_dir, 'test') no_classes = 2 no_validation = 800 epochs = 50 batch_size = 32 no_train = 2000 no_test = 800 input_shape = (image_height, image_width, 3) epoch_steps = no_train // batch_size test_steps = no_test // batch_size generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. 
/ 255) model = tf.keras.applications.VGG16(include_top=False) train_images = generator.flow_from_directory( train_dir, batch_size=batch_size, target_size=(image_width, image_height), class_mode=None, shuffle=False ) train_bottleneck_features = model.predict_generator(train_images, epoch_steps) test_images = generator.flow_from_directory( test_dir, batch_size=batch_size, target_size=(image_width, image_height), class_mode=None, shuffle=False ) test_bottleneck_features = model.predict_generator(test_images, test_steps)
# Note: batch_size should divide no_train and no_test evenly, otherwise the generated features will not line up with the labels built below.
train_labels = np.array([0] * int(no_train / 2) + [1] * int(no_train / 2)) test_labels = np.array([0] * int(no_test / 2) + [1] * int(no_test / 2)) model = tf.keras.models.Sequential() model.add(tf.keras.layers.Flatten(input_shape=train_bottleneck_features.shape[1:])) model.add(tf.keras.layers.Dense(1024, activation='relu')) model.add(tf.keras.layers.Dropout(0.3)) model.add(tf.keras.layers.Dense(1, activation='sigmoid')) model.compile(loss=tf.keras.losses.binary_crossentropy, optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy']) model.fit( train_bottleneck_features, train_labels, batch_size=batch_size, epochs=epochs, validation_data=(test_bottleneck_features, test_labels)) ================================================ FILE: Chapter02/8_cat_vs_dog_fine_tune.py ================================================ import tensorflow as tf import os work_dir = '' weights_path = '../keras/examples/vgg16_weights.h5' top_model_weights_path = 'fc_model.h5' image_height, image_width = 150, 150 train_dir = os.path.join(work_dir, 'train') test_dir = os.path.join(work_dir, 'test') no_classes = 2 no_validation = 800 epochs = 50 batch_size = 32 no_train = 2000 no_test = 800 input_shape = (image_height, image_width, 3) epoch_steps = no_train // batch_size test_steps = no_test // batch_size vgg_model = tf.keras.applications.VGG16(include_top=False, input_shape=input_shape) model_fine_tune = tf.keras.models.Sequential() model_fine_tune.add(tf.keras.layers.Flatten(input_shape=vgg_model.output_shape[1:])) model_fine_tune.add(tf.keras.layers.Dense(256, activation='relu')) model_fine_tune.add(tf.keras.layers.Dropout(0.5)) model_fine_tune.add(tf.keras.layers.Dense(no_classes, activation='softmax')) model_fine_tune.load_weights(top_model_weights_path)
# Freeze everything up to the last convolution block, then attach the classifier; VGG16 is a functional model, so the top is joined with the functional API rather than add().
for vgg_layer in vgg_model.layers[:15]: vgg_layer.trainable = False
model = tf.keras.models.Model(inputs=vgg_model.input, outputs=model_fine_tune(vgg_model.output)) model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.SGD(lr=1e-4, momentum=0.9), metrics=['accuracy']) generator_train = tf.keras.preprocessing.image.ImageDataGenerator( rescale=1. / 255, horizontal_flip=True, zoom_range=0.3, shear_range=0.3 ) generator_test = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.
/ 255) generator_train = generator_train.flow_from_directory( train_dir, batch_size=batch_size, target_size=(image_width, image_height) ) generator_test = generator_test.flow_from_directory( test_dir, batch_size=batch_size, target_size=(image_width, image_height) ) model.fit_generator( generator_train, steps_per_epoch=epoch_steps, epochs=epochs, validation_data=generator_test, validation_steps=test_steps ) ================================================ FILE: Chapter03/1_embedding_vis.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data import os import numpy as np mnist = input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 100 x_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean(tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1], name='input_reshape') convolution_layer_1 = tf.layers.conv2d( inputs=x_input_reshape, filters=64, kernel_size=[3, 3], activation=tf.nn.relu, ) add_variable_summary(convolution_layer_1, 'convolution1') pooling_layer_1 = tf.layers.max_pooling2d( inputs=convolution_layer_1, pool_size=[2, 2], strides=2 ) add_variable_summary(pooling_layer_1, 'pooling1') convolution_layer_2 = tf.layers.conv2d( inputs=pooling_layer_1, filters=128, kernel_size=[3, 3], activation=tf.nn.relu, ) add_variable_summary(convolution_layer_2, 'convolution2') pooling_layer_2 = tf.layers.max_pooling2d( inputs=convolution_layer_2, pool_size=[2, 2], strides=2 ) add_variable_summary(pooling_layer_2, 'pool2') flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128], name='flattened_pool') dense_layer = tf.layers.dense( inputs=flattened_pool, units=1024, activation=tf.nn.relu, name='dense' ) add_variable_summary(dense_layer, 'dense') dropout_bool = tf.placeholder(tf.bool) dropout_layer = tf.layers.dropout( inputs=dense_layer, rate=0.4, training=dropout_bool, name='dropout' ) logits = tf.layers.dense(inputs=dropout_layer, units=no_classes, name='logits') add_variable_summary(logits, 'logits') with tf.name_scope('loss'): softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_input, logits=logits) loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss') tf.summary.scalar('loss', loss_operation) with tf.name_scope('optimiser'): optimiser = tf.train.AdamOptimizer().minimize(loss_operation) with tf.name_scope('accuracy'): with tf.name_scope('correct_prediction'): predictions = tf.argmax(logits, 1) correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1)) with tf.name_scope('accuracy'): accuracy_operation = tf.reduce_mean(tf.cast(correct_predictions, tf.float32)) tf.summary.scalar('accuracy', accuracy_operation) session = tf.Session() # Adding the variable in between creating the session and initialising the graph no_embedding_data = 1000 embedding_variable = tf.Variable(tf.stack( mnist.test.images[:no_embedding_data], axis=0), trainable=False) 
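# The embedding variable has to exist before global_variables_initializer() runs, so that it is initialised along with the rest of the graph and ends up in the checkpoint that the TensorBoard projector reads.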
session.run(tf.global_variables_initializer()) merged_summary_operation = tf.summary.merge_all() train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph) test_images, test_labels = mnist.test.images, mnist.test.labels for batch_no in range(total_batches): image_batch = mnist.train.next_batch(100) _, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={ x_input: image_batch[0], y_input: image_batch[1], dropout_bool: True }) train_summary_writer.add_summary(merged_summary, batch_no) work_dir = '' # change path metadata_path = '/tmp/train/metadata.tsv' with open(metadata_path, 'w') as metadata_file: for i in range(no_embedding_data): metadata_file.write('{}\n'.format( np.nonzero(mnist.test.labels[::1])[1:][0][i])) from tensorflow.contrib.tensorboard.plugins import projector projector_config = projector.ProjectorConfig() embedding_projection = projector_config.embeddings.add() embedding_projection.tensor_name = embedding_variable.name embedding_projection.metadata_path = metadata_path embedding_projection.sprite.image_path = os.path.join(work_dir + '/mnist_10k_sprite.png') embedding_projection.sprite.single_image_dim.extend([28, 28]) projector.visualize_embeddings(train_summary_writer, projector_config) tf.train.Saver().save(session, '/tmp/train/model.ckpt', global_step=1) ================================================ FILE: Chapter03/2_guided_back_prop.py ================================================ from scipy.misc import imsave import numpy as np import tensorflow as tf image_width, image_height = 128, 128 vgg_model = tf.keras.applications.vgg16.VGG16(include_top=False) input_image = vgg_model.input vgg_layer_dict = dict([(vgg_layer.name, vgg_layer) for vgg_layer in vgg_model.layers[1:]]) vgg_layer_output = vgg_layer_dict['block5_conv1'].output filters = [] for filter_idx in range(20): loss = tf.keras.backend.mean(vgg_layer_output[:, :, :, filter_idx]) gradients = tf.keras.backend.gradients(loss, input_image)[0] gradient_mean_square = tf.keras.backend.mean(tf.keras.backend.square(gradients)) gradients /= (tf.keras.backend.sqrt(gradient_mean_square) + 1e-5) evaluator = tf.keras.backend.function([input_image], [loss, gradients]) gradient_ascent_step = 1. 
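# Gradient ascent on the input image: each step below nudges the image in the direction that increases the chosen filter's mean activation; the gradient was normalised by its RMS above so that step sizes stay comparable across filters.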
input_image_data = np.random.random((1, image_width, image_height, 3)) input_image_data = (input_image_data - 0.5) * 20 + 128 for i in range(20): loss_value, gradient_values = evaluator([input_image_data]) input_image_data += gradient_values * gradient_ascent_step # print('Loss :', loss_value) if loss_value <= 0.: break if loss_value > 0: filter = input_image_data[0] filter -= filter.mean() filter /= (filter.std() + 1e-5) filter *= 0.1 filter += 0.5 filter = np.clip(filter, 0, 1) filter *= 255 filter = np.clip(filter, 0, 255).astype('uint8') filters.append((filter, loss_value)) # For visualisation, not in book n = 3 filters.sort(key=lambda x: x[1], reverse=True) filters = filters[:n * n] margin = 5 width = n * image_width + (n - 1) * margin height = n * image_height + (n - 1) * margin stitched_filters = np.zeros((width, height, 3)) for i in range(n): for j in range(n): img, loss = filters[i * n + j] stitched_filters[(image_width + margin) * i: (image_width + margin) * i + image_width, (image_height + margin) * j: (image_height + margin) * j + image_height, :] = img imsave('stitched_filters_%dx%d.png' % (n, n), stitched_filters) ================================================ FILE: Chapter03/3_deep_dream.py ================================================ import os import numpy as np import PIL.Image import urllib.request from tensorflow.python.platform import gfile import zipfile import tensorflow as tf work_dir = '' model_url = 'https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip' file_name = model_url.split('/')[-1] file_path = os.path.join(work_dir, file_name) if not os.path.exists(file_path): file_path, _ = urllib.request.urlretrieve(model_url, file_path) zip_handle = zipfile.ZipFile(file_path, 'r') zip_handle.extractall(work_dir) zip_handle.close() graph = tf.Graph() session = tf.InteractiveSession(graph=graph) model_path = os.path.join(work_dir, 'tensorflow_inception_graph.pb') with gfile.FastGFile(model_path, 'rb') as f: graph_defnition = tf.GraphDef() graph_defnition.ParseFromString(f.read()) input_placeholder = tf.placeholder(np.float32, name='input') imagenet_mean_value = 117.0 preprocessed_input = tf.expand_dims(input_placeholder-imagenet_mean_value, 0) tf.import_graph_def(graph_defnition, {'input': preprocessed_input}) def resize_image(image, size): resize_placeholder = tf.placeholder(tf.float32) resize_placeholder_expanded = tf.expand_dims(resize_placeholder, 0) resized_image = tf.image.resize_bilinear(resize_placeholder_expanded, size)[0, :, :, :] return session.run(resized_image, feed_dict={resize_placeholder: image}) image_name = 'mountain.jpg' image = PIL.Image.open(image_name) image = np.float32(image) objective_fn = tf.square(graph.get_tensor_by_name("import/mixed4c:0")) no_octave = 4 scale = 1.4 window_size = 51 score = tf.reduce_mean(objective_fn) gradients = tf.gradients(score, input_placeholder)[0] octave_images = [] for i in range(no_octave - 1): image_height_width = image.shape[:2] scaled_image = resize_image(image, np.int32(np.float32(image_height_width) / scale)) image_difference = image - resize_image(scaled_image, image_height_width) image = scaled_image octave_images.append(image_difference) for octave_idx in range(no_octave): if octave_idx > 0: image_difference = octave_images[-octave_idx] image = resize_image(image, image_difference.shape[:2]) + image_difference for i in range(10): image_heigth, image_width = image.shape[:2] sx, sy = np.random.randint(window_size, size=2) shifted_image = np.roll(np.roll(image, sx, 1), sy, 0) 
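# Rolling the image by a random shift ('jitter') before computing gradients tile-by-tile hides seams at tile boundaries; the same shift is rolled back once the gradient windows have been gathered.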
gradient_values = np.zeros_like(image) for y in range(0, max(image_heigth - window_size // 2, window_size), window_size): for x in range(0, max(image_width - window_size // 2, window_size), window_size): sub = shifted_image[y:y + window_size, x:x + window_size] gradient_windows = session.run(gradients, {input_placeholder: sub}) gradient_values[y:y + window_size, x:x + window_size] = gradient_windows gradient_windows = np.roll(np.roll(gradient_values, -sx, 1), -sy, 0) image += gradient_windows * (1.5 / (np.abs(gradient_windows).mean() + 1e-7)) image /= 255.0 image = np.uint8(np.clip(image, 0, 1) * 255) PIL.Image.fromarray(image).save('dream_' + image_name, 'jpeg') ================================================ FILE: Chapter03/4_export_model.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data import os work_dir = '/tmp' model_version = 9 training_iteration = 1000 input_size = 784 no_classes = 10 batch_size = 100 total_batches = 200 tf_example = tf.parse_example(tf.placeholder(tf.string, name='tf_example'), {'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32), }) x_input = tf.identity(tf_example['x'], name='x') y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) weights = tf.Variable(tf.random_normal([input_size, no_classes])) bias = tf.Variable(tf.random_normal([no_classes])) logits = tf.matmul(x_input, weights) + bias softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_input, logits=logits) loss_operation = tf.reduce_mean(softmax_cross_entropy) optimiser = tf.train.GradientDescentOptimizer(0.5).minimize(loss_operation) session = tf.Session() session.run(tf.global_variables_initializer()) mnist = input_data.read_data_sets('MNIST_data', one_hot=True) for batch_no in range(total_batches): mnist_batch = mnist.train.next_batch(batch_size) _, loss_value = session.run([optimiser, loss_operation], feed_dict={ x_input: mnist_batch[0], y_input: mnist_batch[1] }) print(loss_value) signature_def = ( tf.saved_model.signature_def_utils.build_signature_def( inputs={'x': tf.saved_model.utils.build_tensor_info(x_input)}, outputs={'y': tf.saved_model.utils.build_tensor_info(y_input)}, method_name="tensorflow/serving/predict")) model_path = os.path.join(work_dir, str(model_version)) saved_model_builder = tf.saved_model.builder.SavedModelBuilder(model_path) saved_model_builder.add_meta_graph_and_variables( session, [tf.saved_model.tag_constants.SERVING], signature_def_map={ 'prediction': signature_def }, legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op')) saved_model_builder.save() ================================================ FILE: Chapter03/5_serving_client.py ================================================ from grpc.beta import implementations import numpy import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data from tensorflow_serving.apis import predict_pb2 from tensorflow_serving.apis import prediction_service_pb2 mnist = input_data.read_data_sets('MNIST_data', one_hot=True) concurrency = 1 num_tests = 100 host = '' port = 8000 work_dir = '/tmp' def _create_rpc_callback(): def _callback(result): response = numpy.array( result.result().outputs['y'].float_val) prediction = numpy.argmax(response) print(prediction) return _callback test_data_set = mnist.test test_image = mnist.test.images[0] predict_request = predict_pb2.PredictRequest() predict_request.model_spec.name = 'mnist' predict_request.model_spec.signature_name = 'prediction' 
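# The client below opens an insecure gRPC channel to the TensorFlow Serving host and issues an asynchronous Predict call; the registered callback fires once the returned future resolves with the response.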
predict_channel = implementations.insecure_channel(host, int(port)) predict_stub = prediction_service_pb2.beta_create_PredictionService_stub(predict_channel) predict_request.inputs['x'].CopyFrom( tf.contrib.util.make_tensor_proto(test_image, shape=[1, test_image.size])) result = predict_stub.Predict.future(predict_request, 3.0) result.add_done_callback( _create_rpc_callback()) ================================================ FILE: Chapter03/6_bottleneck_features.py ================================================ import tensorflow as tf import os import urllib.request from tensorflow.python.platform import gfile import tarfile import numpy as np work_dir = '' model_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz' file_name = model_url.split('/')[-1] file_path = os.path.join(work_dir, file_name) if not os.path.exists(file_path): file_path, _ = urllib.request.urlretrieve(model_url, file_path) tarfile.open(file_path, 'r:gz').extractall(work_dir) model_path = os.path.join(work_dir, 'classify_image_graph_def.pb') with gfile.FastGFile(model_path, 'rb') as f: graph_definition = tf.GraphDef() graph_definition.ParseFromString(f.read()) bottleneck, image, resized_input = ( tf.import_graph_def( graph_definition, name='', return_elements=['pool_3/_reshape:0', 'DecodeJpeg/contents:0', 'ResizeBilinear:0']) ) query_image_path = os.path.join(work_dir, 'cat.1000.jpg') query_image = gfile.FastGFile(query_image_path, 'rb').read() target_image_path = os.path.join(work_dir, 'cat.1001.jpg') target_image = gfile.FastGFile(target_image_path, 'rb').read() def get_bottleneck_data(session, image_data): bottleneck_data = session.run(bottleneck, {image: image_data}) bottleneck_data = np.squeeze(bottleneck_data) return bottleneck_data session = tf.Session() query_feature = get_bottleneck_data(session, query_image) print(query_feature) target_feature = get_bottleneck_data(session, target_image) print(target_feature) dist = np.linalg.norm(np.asarray(query_feature) - np.asarray(target_feature)) print(dist) ================================================ FILE: Chapter03/7_annoy.py ================================================ import os from annoy import AnnoyIndex work_dir = '' layer_dimension = 256 target_features = [] query_feature = [] def create_annoy(target_features): t = AnnoyIndex(layer_dimension) for idx, target_feature in enumerate(target_features): t.add_item(idx, target_feature) t.build(10) t.save(os.path.join(work_dir, 'annoy.ann')) create_annoy(target_features)
# The index must be loaded with the same dimensionality it was built with.
annoy_index = AnnoyIndex(layer_dimension) annoy_index.load(os.path.join(work_dir, 'annoy.ann')) matches = annoy_index.get_nns_by_vector(query_feature, 20) ================================================ FILE: Chapter03/8_auto_encoder.py ================================================ import tensorflow as tf def fully_connected_layer(input_layer, units): return tf.layers.dense( input_layer, units=units, activation=tf.nn.relu ) def convolution_layer(input_layer, filter_size): return tf.layers.conv2d( input_layer, filters=filter_size, kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(), kernel_size=3, strides=2, padding='same' ) def deconvolution_layer(input_layer, filter_size, activation=tf.nn.relu): return tf.layers.conv2d_transpose( input_layer, filters=filter_size, kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(), kernel_size=3, activation=activation, strides=2, padding='same' ) input_layer = tf.placeholder(tf.float32, [None, 128, 128, 3]) convolution_layer_1 = convolution_layer(input_layer, 1024) convolution_layer_2 =
convolution_layer(convolution_layer_1, 512) convolution_layer_3 = convolution_layer(convolution_layer_2, 256) convolution_layer_4 = convolution_layer(convolution_layer_3, 128) convolution_layer_5 = convolution_layer(convolution_layer_4, 32) convolution_layer_5_flattened = tf.layers.flatten(convolution_layer_5) bottleneck_layer = fully_connected_layer(convolution_layer_5_flattened, 16) c5_shape = convolution_layer_5.get_shape().as_list() c5f_flat_shape = convolution_layer_5_flattened.get_shape().as_list()[1] fully_connected = fully_connected_layer(bottleneck_layer, c5f_flat_shape) fully_connected = tf.reshape(fully_connected, [-1, c5_shape[1], c5_shape[2], c5_shape[3]]) deconvolution_layer_1 = deconvolution_layer(fully_connected, 128) deconvolution_layer_2 = deconvolution_layer(deconvolution_layer_1, 256) deconvolution_layer_3 = deconvolution_layer(deconvolution_layer_2, 512) deconvolution_layer_4 = deconvolution_layer(deconvolution_layer_3, 1024) deconvolution_layer_5 = deconvolution_layer(deconvolution_layer_4, 3, activation=tf.nn.tanh) ================================================ FILE: Chapter03/9_denoising.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data import numpy as np mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 2000 x_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, input_size]) def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def dense_layer(input_layer, units, activation=tf.nn.tanh): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer layer_1 = dense_layer(x_input, 500) layer_2 = dense_layer(layer_1, 250) layer_3 = dense_layer(layer_2, 50) layer_4 = dense_layer(layer_3, 250) layer_5 = dense_layer(layer_4, 500)
# The final layer must stay linear: sigmoid_cross_entropy_with_logits expects raw logits, not tanh activations.
layer_6 = dense_layer(layer_5, 784, activation=None)
with tf.name_scope('loss'): sigmoid_cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits( labels=y_input, logits=layer_6) loss_operation = tf.reduce_mean(sigmoid_cross_entropy, name='loss') tf.summary.scalar('loss', loss_operation) with tf.name_scope('optimiser'): optimiser = tf.train.AdamOptimizer().minimize(loss_operation) x_input_reshaped = tf.reshape(x_input, [-1, 28, 28, 1]) tf.summary.image("noisy_images", x_input_reshaped) y_input_reshaped = tf.reshape(y_input, [-1, 28, 28, 1]) tf.summary.image("original_images", y_input_reshaped) layer_6_reshaped = tf.reshape(tf.nn.sigmoid(layer_6), [-1, 28, 28, 1]) tf.summary.image("reconstructed_images", layer_6_reshaped) session = tf.Session() session.run(tf.global_variables_initializer()) merged_summary_operation = tf.summary.merge_all() train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph) for batch_no in range(total_batches): mnist_batch = mnist_data.train.next_batch(batch_size) train_images, _ = mnist_batch[0], mnist_batch[1] train_images_noise = train_images + 0.2 * np.random.normal(size=train_images.shape)
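# Only the inputs are corrupted with Gaussian noise while the targets stay clean, which is what makes this a denoising autoencoder; the clip keeps the noisy pixels inside the valid [0, 1] range.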
train_images_noise = np.clip(train_images_noise, 0., 1.) _, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={ x_input: train_images_noise, y_input: train_images, }) train_summary_writer.add_summary(merged_summary, batch_no) ================================================ FILE: Chapter04/1_iou.py ================================================ import tensorflow as tf def calculate_iou(gt_bb, pred_bb): ''' :param gt_bb: ground truth bounding box :param pred_bb: predicted bounding box ''' gt_bb = tf.stack([ gt_bb[:, :, :, :, 0] - gt_bb[:, :, :, :, 2] / 2.0, gt_bb[:, :, :, :, 1] - gt_bb[:, :, :, :, 3] / 2.0, gt_bb[:, :, :, :, 0] + gt_bb[:, :, :, :, 2] / 2.0, gt_bb[:, :, :, :, 1] + gt_bb[:, :, :, :, 3] / 2.0]) gt_bb = tf.transpose(gt_bb, [1, 2, 3, 4, 0]) pred_bb = tf.stack([ pred_bb[:, :, :, :, 0] - pred_bb[:, :, :, :, 2] / 2.0, pred_bb[:, :, :, :, 1] - pred_bb[:, :, :, :, 3] / 2.0, pred_bb[:, :, :, :, 0] + pred_bb[:, :, :, :, 2] / 2.0, pred_bb[:, :, :, :, 1] + pred_bb[:, :, :, :, 3] / 2.0]) pred_bb = tf.transpose(pred_bb, [1, 2, 3, 4, 0]) area = tf.maximum( 0.0, tf.minimum(gt_bb[:, :, :, :, 2:], pred_bb[:, :, :, :, 2:]) - tf.maximum(gt_bb[:, :, :, :, :2], pred_bb[:, :, :, :, :2])) intersection_area= area[:, :, :, :, 0] * area[:, :, :, :, 1] gt_bb_area = (gt_bb[:, :, :, :, 2] - gt_bb[:, :, :, :, 0]) * \ (gt_bb[:, :, :, :, 3] - gt_bb[:, :, :, :, 1]) pred_bb_area = (pred_bb[:, :, :, :, 2] - pred_bb[:, :, :, :, 0]) * \ (pred_bb[:, :, :, :, 3] - pred_bb[:, :, :, :, 1]) union_area = tf.maximum(gt_bb_area + pred_bb_area - intersection_area, 1e-10) iou = tf.clip_by_value(intersection_area / union_area, 0.0, 1.0) return iou ================================================ FILE: Chapter04/2_overfeat.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 300 x_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) x_input_reshape = tf.reshape(x_input, [-1, 28, 28, 1], name='input_reshape') def convolution_layer(input_layer, filters, kernel_size=[3, 3], activation=tf.nn.relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer convolution_layer_1 = convolution_layer(x_input_reshape, 64) pooling_layer_1 = pooling_layer(convolution_layer_1) convolution_layer_2 = convolution_layer(pooling_layer_1, 128) pooling_layer_2 = pooling_layer(convolution_layer_2) dense_layer_bottleneck = convolution_layer(pooling_layer_2, 1024, [5, 5]) logits = convolution_layer(dense_layer_bottleneck, no_classes, 
[1, 1]) logits = tf.reshape(logits, [-1, 10]) with tf.name_scope('loss'): softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits( labels=y_input, logits=logits) print(softmax_cross_entropy) loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss') print(loss_operation) tf.summary.scalar('loss', loss_operation) with tf.name_scope('optimiser'): optimiser = tf.train.AdamOptimizer().minimize(loss_operation) with tf.name_scope('accuracy'): with tf.name_scope('correct_prediction'): predictions = tf.argmax(logits, 1) correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1)) with tf.name_scope('accuracy'): accuracy_operation = tf.reduce_mean( tf.cast(correct_predictions, tf.float32)) tf.summary.scalar('accuracy', accuracy_operation) session = tf.Session() session.run(tf.global_variables_initializer()) merged_summary_operation = tf.summary.merge_all() train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph) test_summary_writer = tf.summary.FileWriter('/tmp/test') test_images, test_labels = mnist_data.test.images, mnist_data.test.labels for batch_no in range(total_batches): mnist_batch = mnist_data.train.next_batch(batch_size) train_images, train_labels = mnist_batch[0], mnist_batch[1] _, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={ x_input: train_images, y_input: train_labels, }) train_summary_writer.add_summary(merged_summary, batch_no) if batch_no % 10 == 0: merged_summary, _ = session.run([merged_summary_operation, accuracy_operation], feed_dict={ x_input: test_images, y_input: test_labels, }) test_summary_writer.add_summary(merged_summary, batch_no) ================================================ FILE: Chapter04/3_object_detection_api.py ================================================ ================================================ FILE: Chapter04/4_yolo.py ================================================ import tensorflow as tf import numpy as np import os from pascal_voc import pascal_voc def calculate_iou(gt_bb, pred_bb): ''' :param gt_bb: ground truth bounding box :param pred_bb: predicted bounding box ''' gt_bb = tf.stack([ gt_bb[:, :, :, :, 0] - gt_bb[:, :, :, :, 2] / 2.0, gt_bb[:, :, :, :, 1] - gt_bb[:, :, :, :, 3] / 2.0, gt_bb[:, :, :, :, 0] + gt_bb[:, :, :, :, 2] / 2.0, gt_bb[:, :, :, :, 1] + gt_bb[:, :, :, :, 3] / 2.0]) gt_bb = tf.transpose(gt_bb, [1, 2, 3, 4, 0]) pred_bb = tf.stack([ pred_bb[:, :, :, :, 0] - pred_bb[:, :, :, :, 2] / 2.0, pred_bb[:, :, :, :, 1] - pred_bb[:, :, :, :, 3] / 2.0, pred_bb[:, :, :, :, 0] + pred_bb[:, :, :, :, 2] / 2.0, pred_bb[:, :, :, :, 1] + pred_bb[:, :, :, :, 3] / 2.0]) pred_bb = tf.transpose(pred_bb, [1, 2, 3, 4, 0]) area = tf.maximum( 0.0, tf.minimum(gt_bb[:, :, :, :, 2:], pred_bb[:, :, :, :, 2:]) - tf.maximum(gt_bb[:, :, :, :, :2], pred_bb[:, :, :, :, :2])) intersection_area= area[:, :, :, :, 0] * area[:, :, :, :, 1] gt_bb_area = (gt_bb[:, :, :, :, 2] - gt_bb[:, :, :, :, 0]) * \ (gt_bb[:, :, :, :, 3] - gt_bb[:, :, :, :, 1]) pred_bb_area = (pred_bb[:, :, :, :, 2] - pred_bb[:, :, :, :, 0]) * \ (pred_bb[:, :, :, :, 3] - pred_bb[:, :, :, :, 1]) union_area = tf.maximum(gt_bb_area + pred_bb_area - intersection_area, 1e-10) iou = tf.clip_by_value(intersection_area / union_area, 0.0, 1.0) return iou DATA_PATH = 'data' PASCAL_PATH = os.path.join(DATA_PATH, 'pascal_voc') CACHE_PATH = os.path.join(PASCAL_PATH, 'cache') OUTPUT_DIR = os.path.join(PASCAL_PATH, 'output') WEIGHTS_DIR = os.path.join(PASCAL_PATH, 'weight') FLIPPED = True DISP_CONSOLE = False GPU = '' THRESHOLD = 0.2 
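# YOLO configuration: the 448x448 input is divided into a 7x7 grid, and each cell predicts 2 boxes (5 values each) plus 20 Pascal VOC class scores, giving (7 * 7) * (20 + 2 * 5) output values. THRESHOLD is the confidence cut-off for reporting detections, and IOU_THRESHOLD (below) the overlap cut-off used when filtering duplicate boxes at test time.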
IOU_THRESHOLD = 0.5 classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'] num_class = len(classes) image_size = 448 cell_size = 7 boxes_per_cell = 2 output_size = (cell_size * cell_size) * (num_class + boxes_per_cell * 5) scale = 1.0 * image_size / cell_size boundary1 = cell_size * cell_size * num_class boundary2 = boundary1 + cell_size * cell_size * boxes_per_cell object_scale = 1.0 noobject_scale = 1.0 class_scale = 2.0 coord_scale = 5.0 learning_rate = 0.0001 batch_size = 45 alpha = 0.1 weights_file = None max_iter = 15000 initial_learning_rate = 0.0001 decay_steps = 30000 decay_rate = 0.1 staircase = True offset = np.transpose(np.reshape(np.array( [np.arange(cell_size)] * cell_size * boxes_per_cell), (boxes_per_cell, cell_size, cell_size)), (1, 2, 0)) images = tf.placeholder(tf.float32, [None, image_size, image_size, 3], name='images') def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def pooling_layer(input_layer, pool_size=[2, 2], strides=2, padding='valid'): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides, padding=padding ) add_variable_summary(layer, 'pooling') return layer def convolution_layer(input_layer, filters, kernel_size=[3, 3], strides=1, padding='valid', activation=tf.nn.leaky_relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, strides=strides, activation=activation, padding=padding, kernel_initializer=tf.truncated_normal_initializer(0.0, 0.01), kernel_regularizer=tf.contrib.layers.l2_regularizer(0.0005) ) add_variable_summary(layer, 'convolution') return layer def dense_layer(input_layer, units, activation=tf.nn.leaky_relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation, kernel_initializer=tf.truncated_normal_initializer(0.0, 0.01), kernel_regularizer=tf.contrib.layers.l2_regularizer(0.0005) ) add_variable_summary(layer, 'dense') return layer yolo = tf.pad(images, np.array([[0, 0], [3, 3], [3, 3], [0, 0]]), name='pad_1') yolo = convolution_layer(yolo, 64, 7, 2) yolo = pooling_layer(yolo, [2, 2], 2, 'same') yolo = convolution_layer(yolo, 192, 3) yolo = pooling_layer(yolo, 2, 2, 'same') yolo = convolution_layer(yolo, 128, 1) yolo = convolution_layer(yolo, 256, 3) yolo = convolution_layer(yolo, 256, 1) yolo = convolution_layer(yolo, 512, 3) yolo = pooling_layer(yolo, 2, 2, 'same') yolo = convolution_layer(yolo, 256, 1) yolo = convolution_layer(yolo, 512, 3) yolo = convolution_layer(yolo, 256, 1) yolo = convolution_layer(yolo, 512, 3) yolo = convolution_layer(yolo, 256, 1) yolo = convolution_layer(yolo, 512, 3) yolo = convolution_layer(yolo, 256, 1) yolo = convolution_layer(yolo, 512, 3) yolo = convolution_layer(yolo, 512, 1) yolo = convolution_layer(yolo, 1024, 3) yolo = pooling_layer(yolo, 2) yolo = convolution_layer(yolo, 512, 1) yolo = convolution_layer(yolo, 1024, 3) yolo = convolution_layer(yolo, 512, 1) yolo = convolution_layer(yolo, 1024, 3) yolo = convolution_layer(yolo, 1024, 3) yolo = tf.pad(yolo, np.array([[0, 0], [1, 1], [1, 1], [0, 0]])) yolo = convolution_layer(yolo, 1024, 3, 2) yolo = convolution_layer(yolo, 1024, 3) yolo = convolution_layer(yolo, 1024, 3) yolo = tf.transpose(yolo, [0, 3, 1, 2]) yolo = tf.layers.flatten(yolo) yolo = dense_layer(yolo, 512) yolo = dense_layer(yolo, 4096) dropout_bool = tf.placeholder(tf.bool) yolo = tf.layers.dropout( inputs=yolo, rate=0.4, training=dropout_bool ) yolo = dense_layer(yolo, output_size, None) predicts = yolo labels = tf.placeholder(tf.float32, [None, cell_size, cell_size, 5 + num_class]) predict_classes = tf.reshape(predicts[:, :boundary1], [batch_size, cell_size, cell_size, num_class]) predict_scales = tf.reshape(predicts[:, boundary1:boundary2], [batch_size, cell_size, cell_size, boxes_per_cell]) predict_boxes = tf.reshape(predicts[:, boundary2:], [batch_size, cell_size, cell_size, boxes_per_cell, 4]) response = tf.reshape(labels[:, :, :, 0], [batch_size, cell_size, cell_size, 1]) boxes = tf.reshape(labels[:, :, :, 1:5], [batch_size, cell_size, cell_size, 1, 4]) boxes = tf.tile(boxes, [1, 1, 1, boxes_per_cell, 1]) / image_size classes = labels[:, :, :, 5:] offset = tf.constant(offset, dtype=tf.float32) offset = tf.reshape(offset, [1, cell_size, cell_size, boxes_per_cell]) offset = tf.tile(offset, [batch_size, 1, 1, 1]) predict_boxes_tran = tf.stack([(predict_boxes[:, :, :, :, 0] + offset) / cell_size, (predict_boxes[:, :, :, :, 1] + tf.transpose(offset, (0, 2, 1, 3))) / cell_size, tf.square(predict_boxes[:, :, :, :, 2]), tf.square(predict_boxes[:, :, :, :, 3])]) predict_boxes_tran = tf.transpose(predict_boxes_tran, [1, 2, 3, 4, 0]) iou_predict_truth = calculate_iou(predict_boxes_tran, boxes) object_mask = tf.reduce_max(iou_predict_truth, 3, keep_dims=True) object_mask = tf.cast((iou_predict_truth >= object_mask), tf.float32) * response noobject_mask = tf.ones_like(object_mask, dtype=tf.float32) - object_mask boxes_tran = tf.stack([boxes[:, :, :, :, 0] * cell_size - offset, boxes[:, :, :, :, 1] * cell_size - tf.transpose(offset, (0, 2, 1, 3)), tf.sqrt(boxes[:, :, :, :, 2]), tf.sqrt(boxes[:, :, :, :, 3])]) boxes_tran = tf.transpose(boxes_tran, [1, 2, 3, 4, 0]) class_delta = response * (predict_classes - classes) class_loss = tf.reduce_mean(tf.reduce_sum(tf.square(class_delta), axis=[1, 2, 3]), name='class_loss') * class_scale object_delta = object_mask * (predict_scales - iou_predict_truth) object_loss = tf.reduce_mean(tf.reduce_sum(tf.square(object_delta), axis=[1, 2, 3]), name='object_loss') * object_scale noobject_delta = noobject_mask * predict_scales noobject_loss = tf.reduce_mean(tf.reduce_sum(tf.square(noobject_delta), axis=[1, 2, 3]), name='noobject_loss') * noobject_scale coord_mask = tf.expand_dims(object_mask, 4) boxes_delta = coord_mask * (predict_boxes - boxes_tran) coord_loss = tf.reduce_mean(tf.reduce_sum(tf.square(boxes_delta), axis=[1, 2, 3, 4]), name='coord_loss') * coord_scale tf.losses.add_loss(class_loss) tf.losses.add_loss(object_loss) tf.losses.add_loss(noobject_loss) tf.losses.add_loss(coord_loss) total_loss = tf.losses.get_total_loss() data = pascal_voc('train') global_step = tf.get_variable( 'global_step', [], initializer=tf.constant_initializer(0), trainable=False) learning_rate = tf.train.exponential_decay( initial_learning_rate, global_step, decay_steps, decay_rate, staircase, name='learning_rate') optimizer = tf.train.GradientDescentOptimizer( learning_rate=learning_rate).minimize( total_loss, global_step=global_step) ema = tf.train.ExponentialMovingAverage(decay=0.9999) averages_op = ema.apply(tf.trainable_variables()) with tf.control_dependencies([optimizer]): train_op = tf.group(averages_op) sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(1, max_iter + 1): batch_images, batch_labels = data.get() feed_dict = {images: batch_images, labels: batch_labels} sess.run(train_op, feed_dict=feed_dict) ================================================ FILE: Chapter04/pascal_voc.py ================================================ import os import xml.etree.ElementTree as ET import numpy as np import cv2 import pickle import copy import yolo.config as cfg class pascal_voc(object): def __init__(self, phase, rebuild=False): self.devkil_path = os.path.join(cfg.PASCAL_PATH, 'VOCdevkit') self.data_path = os.path.join(self.devkil_path, 'VOC2007') self.cache_path = cfg.CACHE_PATH self.batch_size = cfg.BATCH_SIZE self.image_size = cfg.IMAGE_SIZE self.cell_size = cfg.CELL_SIZE self.classes = cfg.CLASSES self.class_to_ind = dict(zip(self.classes, range(len(self.classes)))) self.flipped = cfg.FLIPPED self.phase = phase self.rebuild = rebuild self.cursor = 0 self.epoch = 1 self.gt_labels = None self.prepare() def get(self): images = np.zeros((self.batch_size, self.image_size, self.image_size, 3)) labels = np.zeros((self.batch_size, self.cell_size, self.cell_size, 25)) count = 0 while count < self.batch_size: imname = self.gt_labels[self.cursor]['imname'] flipped = self.gt_labels[self.cursor]['flipped'] images[count, :, :, :] = self.image_read(imname, flipped) labels[count, :, :, :] = self.gt_labels[self.cursor]['label'] count += 1 self.cursor += 1 if self.cursor >= len(self.gt_labels): np.random.shuffle(self.gt_labels) self.cursor = 0 self.epoch += 1 return images, labels def image_read(self, imname, flipped=False): image = cv2.imread(imname) image = cv2.resize(image, (self.image_size, self.image_size)) image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32) image = (image / 255.0) * 2.0 - 1.0 if flipped: image = image[:, ::-1, :] return image def prepare(self): gt_labels = self.load_labels() if self.flipped: print('Appending horizontally-flipped training examples ...') gt_labels_cp = copy.deepcopy(gt_labels) for idx in range(len(gt_labels_cp)): gt_labels_cp[idx]['flipped'] = True gt_labels_cp[idx]['label'] = gt_labels_cp[idx]['label'][:, ::-1, :] for i in range(self.cell_size): for j in range(self.cell_size): if gt_labels_cp[idx]['label'][i, j, 0] == 1: gt_labels_cp[idx]['label'][i, j, 1] = self.image_size - 1 - gt_labels_cp[idx]['label'][i, j, 1] gt_labels += gt_labels_cp np.random.shuffle(gt_labels) self.gt_labels = gt_labels return gt_labels def load_labels(self): cache_file = os.path.join(self.cache_path, 'pascal_' + self.phase + '_gt_labels.pkl') if os.path.isfile(cache_file) and not self.rebuild: print('Loading gt_labels from: ' + cache_file) with open(cache_file, 'rb') as f: gt_labels = pickle.load(f) return gt_labels print('Processing gt_labels from: ' + self.data_path) if not os.path.exists(self.cache_path): os.makedirs(self.cache_path) if self.phase == 'train': txtname = os.path.join(self.data_path, 'ImageSets', 'Main', 'trainval.txt') else: txtname = os.path.join(self.data_path, 'ImageSets', 'Main', 'test.txt') with open(txtname, 'r') as f: self.image_index = [x.strip() for x in f.readlines()] gt_labels = [] for index in self.image_index: label, num = self.load_pascal_annotation(index) if num == 0: continue imname = os.path.join(self.data_path, 'JPEGImages', index + '.jpg') gt_labels.append({'imname': imname, 'label': label, 'flipped': False})
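# The parsed annotations are cached to disk below, so subsequent runs can skip the slow per-image XML parsing unless a rebuild is requested.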
================================================ FILE: Chapter04/pascal_voc.py ================================================
# NOTE: this file was written for Python 2 (cPickle, xrange); pickle and range
# are used below so that it also runs under Python 3.
import os import xml.etree.ElementTree as ET import numpy as np import cv2 import pickle import copy import yolo.config as cfg class pascal_voc(object): def __init__(self, phase, rebuild=False): self.devkil_path = os.path.join(cfg.PASCAL_PATH, 'VOCdevkit') self.data_path = os.path.join(self.devkil_path, 'VOC2007') self.cache_path = cfg.CACHE_PATH self.batch_size = cfg.BATCH_SIZE self.image_size = cfg.IMAGE_SIZE self.cell_size = cfg.CELL_SIZE self.classes = cfg.CLASSES self.class_to_ind = dict(zip(self.classes, range(len(self.classes)))) self.flipped = cfg.FLIPPED self.phase = phase self.rebuild = rebuild self.cursor = 0 self.epoch = 1 self.gt_labels = None self.prepare() def get(self): images = np.zeros((self.batch_size, self.image_size, self.image_size, 3)) labels = np.zeros((self.batch_size, self.cell_size, self.cell_size, 25)) count = 0 while count < self.batch_size: imname = self.gt_labels[self.cursor]['imname'] flipped = self.gt_labels[self.cursor]['flipped'] images[count, :, :, :] = self.image_read(imname, flipped) labels[count, :, :, :] = self.gt_labels[self.cursor]['label'] count += 1 self.cursor += 1 if self.cursor >= len(self.gt_labels): np.random.shuffle(self.gt_labels) self.cursor = 0 self.epoch += 1 return images, labels def image_read(self, imname, flipped=False): image = cv2.imread(imname) image = cv2.resize(image, (self.image_size, self.image_size)) image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32) image = (image / 255.0) * 2.0 - 1.0 if flipped: image = image[:, ::-1, :] return image def prepare(self): gt_labels = self.load_labels() if self.flipped: print('Appending horizontally-flipped training examples ...') gt_labels_cp = copy.deepcopy(gt_labels) for idx in range(len(gt_labels_cp)): gt_labels_cp[idx]['flipped'] = True gt_labels_cp[idx]['label'] = gt_labels_cp[idx]['label'][:, ::-1, :] for i in range(self.cell_size): for j in range(self.cell_size): if gt_labels_cp[idx]['label'][i, j, 0] == 1: gt_labels_cp[idx]['label'][i, j, 1] = self.image_size - 1 - gt_labels_cp[idx]['label'][i, j, 1] gt_labels += gt_labels_cp np.random.shuffle(gt_labels) self.gt_labels = gt_labels return gt_labels def load_labels(self): cache_file = os.path.join(self.cache_path, 'pascal_' + self.phase + '_gt_labels.pkl') if os.path.isfile(cache_file) and not self.rebuild: print('Loading gt_labels from: ' + cache_file) with open(cache_file, 'rb') as f: gt_labels = pickle.load(f) return gt_labels print('Processing gt_labels from: ' + self.data_path) if not os.path.exists(self.cache_path): os.makedirs(self.cache_path) if self.phase == 'train': txtname = os.path.join(self.data_path, 'ImageSets', 'Main', 'trainval.txt') else: txtname = os.path.join(self.data_path, 'ImageSets', 'Main', 'test.txt') with open(txtname, 'r') as f: self.image_index = [x.strip() for x in f.readlines()] gt_labels = [] for index in self.image_index: label, num = self.load_pascal_annotation(index) if num == 0: continue imname = os.path.join(self.data_path, 'JPEGImages', index + '.jpg') gt_labels.append({'imname': imname, 'label': label, 'flipped': False}) print('Saving gt_labels to: ' + cache_file) with open(cache_file, 'wb') as f: pickle.dump(gt_labels, f) return gt_labels def load_pascal_annotation(self, index): """ Load image and bounding boxes info from XML file in the PASCAL VOC format. """ imname = os.path.join(self.data_path, 'JPEGImages', index + '.jpg') im = cv2.imread(imname) h_ratio = 1.0 * self.image_size / im.shape[0] w_ratio = 1.0 * self.image_size / im.shape[1]
# im = cv2.resize(im, [self.image_size, self.image_size])
label = np.zeros((self.cell_size, self.cell_size, 25)) filename = os.path.join(self.data_path, 'Annotations', index + '.xml') tree = ET.parse(filename) objs = tree.findall('object') for obj in objs: bbox = obj.find('bndbox')
# Make pixel indexes 0-based
x1 = max(min((float(bbox.find('xmin').text) - 1) * w_ratio, self.image_size - 1), 0) y1 = max(min((float(bbox.find('ymin').text) - 1) * h_ratio, self.image_size - 1), 0) x2 = max(min((float(bbox.find('xmax').text) - 1) * w_ratio, self.image_size - 1), 0) y2 = max(min((float(bbox.find('ymax').text) - 1) * h_ratio, self.image_size - 1), 0) cls_ind = self.class_to_ind[obj.find('name').text.lower().strip()] boxes = [(x2 + x1) / 2.0, (y2 + y1) / 2.0, x2 - x1, y2 - y1] x_ind = int(boxes[0] * self.cell_size / self.image_size) y_ind = int(boxes[1] * self.cell_size / self.image_size) if label[y_ind, x_ind, 0] == 1: continue label[y_ind, x_ind, 0] = 1 label[y_ind, x_ind, 1:5] = boxes label[y_ind, x_ind, 5 + cls_ind] = 1 return label, len(objs) ================================================ FILE: Chapter05/1_segnet.py ================================================ import tensorflow as tf input_height = 360 input_width = 480 kernel = 3 filter_size = 64 pad = 1 pool_size = 2
# NOTE: nClasses (the number of segmentation classes) was never defined in the
# original file; 12, the usual CamVid class count, is assumed here. The Conv2D
# calls below used the Keras 1 positional signature (filters, rows, cols,
# border_mode=...); they are updated to the tf.keras keyword form.
nClasses = 12
model = tf.keras.models.Sequential() model.add(tf.keras.layers.Layer(input_shape=(3, input_height, input_width)))
# encoder
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(filter_size, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.Activation('relu')) model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size))) model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(128, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.Activation('relu')) model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size))) model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(256, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.Activation('relu')) model.add(tf.keras.layers.MaxPooling2D(pool_size=(pool_size, pool_size))) model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(512, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.Activation('relu'))
# decoder
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(512, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size))) model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(256, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size)))
model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(128, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.UpSampling2D(size=(pool_size, pool_size))) model.add(tf.keras.layers.ZeroPadding2D(padding=(pad, pad))) model.add(tf.keras.layers.Conv2D(filter_size, (kernel, kernel), padding='valid')) model.add(tf.keras.layers.BatchNormalization()) model.add(tf.keras.layers.Conv2D(nClasses, (1, 1), padding='valid')) model.outputHeight = model.output_shape[-2] model.outputWidth = model.output_shape[-1] model.add(tf.keras.layers.Reshape((nClasses, model.output_shape[-2] * model.output_shape[-1]), input_shape=(nClasses, model.output_shape[-2], model.output_shape[-1]))) model.add(tf.keras.layers.Permute((2, 1))) model.add(tf.keras.layers.Activation('softmax')) model.compile(loss="categorical_crossentropy", optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy']) ================================================ FILE: Chapter05/2_nerve_segmentation.py ================================================ import os from skimage.transform import resize from skimage.io import imsave import numpy as np import tensorflow as tf from data import load_train_data, load_test_data image_height, image_width = 96, 96 smoothness = 1.0 work_dir = ''
# NOTE: the original dice helpers called tf.flatten and tf.sum, which do not
# exist; the Keras backend equivalents are used instead.
def dice_coefficient(y1, y2): y1 = tf.keras.backend.flatten(y1) y2 = tf.keras.backend.flatten(y2) return (2. * tf.keras.backend.sum(y1 * y2) + smoothness) / (tf.keras.backend.sum(y1) + tf.keras.backend.sum(y2) + smoothness) def dice_coefficient_loss(y1, y2): return -dice_coefficient(y1, y2) def preprocess(imgs): imgs_p = np.ndarray((imgs.shape[0], image_height, image_width), dtype=np.uint8) for i in range(imgs.shape[0]): imgs_p[i] = resize(imgs[i], (image_width, image_height), preserve_range=True) imgs_p = imgs_p[..., np.newaxis] return imgs_p def convolution_layer(filters, kernel=(3,3), activation='relu', input_shape=None): if input_shape is None: return tf.keras.layers.Conv2D( filters=filters, kernel_size=kernel, activation=activation) else: return tf.keras.layers.Conv2D( filters=filters, kernel_size=kernel, activation=activation, input_shape=input_shape)
# NOTE: the original wrapped a single Conv2DTranspose in
# tf.keras.layers.concatenate, which fails (concatenate needs at least two
# tensors) and cannot express U-Net skip connections inside a Sequential model
# anyway; the transposed convolution is returned directly, and the invalid
# "kernel" keyword is corrected to "kernel_size". For true skip connections,
# rebuild the model with the functional API.
def concatenated_de_convolution_layer(filters): return tf.keras.layers.Conv2DTranspose( filters=filters, kernel_size=(2, 2), strides=(2, 2), padding='same') def pooling_layer(): return tf.keras.layers.MaxPooling2D(pool_size=(2, 2)) unet = tf.keras.models.Sequential() inputs = tf.keras.layers.Input((image_height, image_width, 1)) input_shape = (image_height, image_width, 1) unet.add(convolution_layer(32, input_shape=input_shape)) unet.add(convolution_layer(32)) unet.add(pooling_layer()) unet.add(convolution_layer(64)) unet.add(convolution_layer(64)) unet.add(pooling_layer()) unet.add(convolution_layer(128)) unet.add(convolution_layer(128)) unet.add(pooling_layer()) unet.add(convolution_layer(256)) unet.add(convolution_layer(256)) unet.add(pooling_layer()) unet.add(convolution_layer(512)) unet.add(convolution_layer(512)) unet.add(concatenated_de_convolution_layer(256)) unet.add(convolution_layer(256)) unet.add(convolution_layer(256)) unet.add(concatenated_de_convolution_layer(128)) unet.add(convolution_layer(128)) unet.add(convolution_layer(128)) unet.add(concatenated_de_convolution_layer(64)) unet.add(convolution_layer(64)) unet.add(convolution_layer(64)) unet.add(concatenated_de_convolution_layer(32)) unet.add(convolution_layer(32)) unet.add(convolution_layer(32)) unet.add(convolution_layer(1, kernel=(1, 1), activation='sigmoid'))
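Before compiling, it can help to see what the dice_coefficient above rewards. A quick NumPy sanity check of the same formula (an editor's sketch, not repository code):

import numpy as np

def dice_numpy(y1, y2, smoothness=1.0):
    y1, y2 = y1.ravel(), y2.ravel()
    return (2. * np.sum(y1 * y2) + smoothness) / (np.sum(y1) + np.sum(y2) + smoothness)

mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
print(dice_numpy(mask, mask))                 # 1.0: identical masks
print(dice_numpy(mask, np.zeros_like(mask)))  # 0.2: no overlap, kept above 0 by the smoothing term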
unet.compile(optimizer=tf.keras.optimizers.Adam(lr=1e-5), loss=dice_coefficient_loss, metrics=[dice_coefficient]) x_train, y_train_mask = load_train_data() x_train = preprocess(x_train) y_train_mask = preprocess(y_train_mask) x_train = x_train.astype('float32') mean = np.mean(x_train) std = np.std(x_train) x_train -= mean x_train /= std y_train_mask = y_train_mask.astype('float32') y_train_mask /= 255. unet.fit(x_train, y_train_mask, batch_size=32, epochs=20, verbose=1, shuffle=True, validation_split=0.2) x_test, y_test_mask = load_test_data() x_test = preprocess(x_test) x_test = x_test.astype('float32') x_test -= mean x_test /= std y_test_pred = unet.predict(x_test, verbose=1) for image, image_id in zip(y_test_pred, y_test_mask): image = (image[:, :, 0] * 255.).astype(np.uint8) imsave(os.path.join(work_dir, str(image_id) + '.png'), image) ================================================ FILE: Chapter05/3_satellite.py ================================================ import tensorflow as tf from .resnet50 import ResNet50 nb_labels = 6
# NOTE: the original set input_shape = [28, 28] and then unpacked it into three
# values, which raises a ValueError; a three-channel (RGB) shape is assumed.
input_shape = [28, 28, 3]
img_height, img_width, _ = input_shape input_tensor = tf.keras.layers.Input(shape=input_shape) weights = 'imagenet' resnet50_model = ResNet50( include_top=False, weights='imagenet', input_tensor=input_tensor) final_32 = resnet50_model.get_layer('final_32').output final_16 = resnet50_model.get_layer('final_16').output final_x8 = resnet50_model.get_layer('final_x8').output c32 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_32) c16 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_16) c8 = tf.keras.layers.Conv2D(nb_labels, (1, 1))(final_x8) def resize_bilinear(images): return tf.image.resize_bilinear(images, [img_height, img_width]) r32 = tf.keras.layers.Lambda(resize_bilinear)(c32) r16 = tf.keras.layers.Lambda(resize_bilinear)(c16) r8 = tf.keras.layers.Lambda(resize_bilinear)(c8) m = tf.keras.layers.Add()([r32, r16, r8])
# NOTE: "ayers" and the activation string 'img_height' in the original were
# clearly paste errors; a per-pixel softmax over the labels is assumed.
x = tf.keras.layers.Reshape((img_height * img_width, nb_labels))(m) x = tf.keras.layers.Activation('softmax')(x) x = tf.keras.layers.Reshape((img_height, img_width, nb_labels))(x) fcn_model = tf.keras.models.Model(inputs=input_tensor, outputs=x) ================================================ FILE: Chapter05/data.py ================================================ from __future__ import print_function import os import numpy as np from skimage.io import imsave, imread data_path = 'raw/' image_rows = 420 image_cols = 580 def create_train_data(): train_data_path = os.path.join(data_path, 'train') images = os.listdir(train_data_path) total = int(len(images) / 2) imgs = np.ndarray((total, image_rows, image_cols), dtype=np.uint8) imgs_mask = np.ndarray((total, image_rows, image_cols), dtype=np.uint8) i = 0 print('-'*30) print('Creating training images...') print('-'*30) for image_name in images: if 'mask' in image_name: continue image_mask_name = image_name.split('.')[0] + '_mask.tif' img = imread(os.path.join(train_data_path, image_name), as_grey=True) img_mask = imread(os.path.join(train_data_path, image_mask_name), as_grey=True) img = np.array([img]) img_mask = np.array([img_mask]) imgs[i] = img imgs_mask[i] = img_mask if i % 100 == 0: print('Done: {0}/{1} images'.format(i, total)) i += 1 print('Loading done.') np.save('imgs_train.npy', imgs) np.save('imgs_mask_train.npy', imgs_mask) print('Saving to .npy files done.') def load_train_data(): imgs_train = np.load('imgs_train.npy') imgs_mask_train = np.load('imgs_mask_train.npy') return imgs_train, imgs_mask_train def create_test_data(): train_data_path =
os.path.join(data_path, 'test') images = os.listdir(train_data_path) total = len(images) imgs = np.ndarray((total, image_rows, image_cols), dtype=np.uint8) imgs_id = np.ndarray((total, ), dtype=np.int32) i = 0 print('-'*30) print('Creating test images...') print('-'*30) for image_name in images: img_id = int(image_name.split('.')[0]) img = imread(os.path.join(train_data_path, image_name), as_grey=True) img = np.array([img]) imgs[i] = img imgs_id[i] = img_id if i % 100 == 0: print('Done: {0}/{1} images'.format(i, total)) i += 1 print('Loading done.') np.save('imgs_test.npy', imgs) np.save('imgs_id_test.npy', imgs_id) print('Saving to .npy files done.') def load_test_data(): imgs_test = np.load('imgs_test.npy') imgs_id = np.load('imgs_id_test.npy') return imgs_test, imgs_id if __name__ == '__main__': create_train_data() create_test_data() ================================================ FILE: Chapter06/1_contrastive_loss.py ================================================ import tensorflow as tf def contrastive_loss(model_1, model_2, label, margin=0.1): distance = tf.reduce_sum(tf.square(model_1 - model_2), 1) loss = label * tf.square( tf.maximum(0., margin - tf.sqrt(distance))) + (1 - label) * distance loss = 0.5 * tf.reduce_mean(loss) return loss
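A NumPy sketch of how contrastive_loss above behaves (an editor's addition, not repository code): with this sign convention, pairs labelled 1 are penalised only while their distance is still inside the margin, and pairs labelled 0 are penalised by their squared distance.

import numpy as np

def contrastive_numpy(model_1, model_2, label, margin=0.1):
    distance = np.sum((model_1 - model_2) ** 2, axis=1)
    loss = label * np.maximum(0., margin - np.sqrt(distance)) ** 2 + (1 - label) * distance
    return 0.5 * np.mean(loss)

a = np.array([[0.0, 0.0]])
b = np.array([[0.2, 0.0]])
print(contrastive_numpy(a, b, label=np.array([1.])))  # 0.0: distance already beyond the margin
print(contrastive_numpy(a, b, label=np.array([0.])))  # ~0.02: penalised by squared distance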
================================================ FILE: Chapter06/2_siamese_network.py ================================================ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True) input_size = 784 no_classes = 10 batch_size = 100 total_batches = 300 def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[3, 3], activation=tf.nn.relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer def get_model(input_): input_reshape = tf.reshape(input_, [-1, 28, 28, 1], name='input_reshape') convolution_layer_1 = convolution_layer(input_reshape, 64) pooling_layer_1 = pooling_layer(convolution_layer_1) convolution_layer_2 = convolution_layer(pooling_layer_1, 128) pooling_layer_2 = pooling_layer(convolution_layer_2) flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128], name='flattened_pool') dense_layer_bottleneck = dense_layer(flattened_pool, 1024) return dense_layer_bottleneck left_input = tf.placeholder(tf.float32, shape=[None, input_size]) right_input = tf.placeholder(tf.float32, shape=[None, input_size]) y_input = tf.placeholder(tf.float32, shape=[None, no_classes]) left_bottleneck = get_model(left_input) right_bottleneck = get_model(right_input)
# NOTE: as written, the two towers do not share weights (each get_model call
# creates fresh tf.layers variables); a true siamese network would wrap the
# tower in a tf.variable_scope with reuse.
dense_layer_bottleneck = tf.concat([left_bottleneck, right_bottleneck], 1) dropout_bool = tf.placeholder(tf.bool) dropout_layer = tf.layers.dropout( inputs=dense_layer_bottleneck, rate=0.4, training=dropout_bool ) logits = dense_layer(dropout_layer, no_classes) with tf.name_scope('loss'): softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits( labels=y_input, logits=logits) loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss') tf.summary.scalar('loss', loss_operation) with tf.name_scope('optimiser'): optimiser = tf.train.AdamOptimizer().minimize(loss_operation) with tf.name_scope('accuracy'): with tf.name_scope('correct_prediction'): predictions = tf.argmax(logits, 1) correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1)) with tf.name_scope('accuracy'): accuracy_operation = tf.reduce_mean( tf.cast(correct_predictions, tf.float32)) tf.summary.scalar('accuracy', accuracy_operation) session = tf.Session() session.run(tf.global_variables_initializer()) merged_summary_operation = tf.summary.merge_all() train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph) test_summary_writer = tf.summary.FileWriter('/tmp/test') test_images, test_labels = mnist_data.test.images, mnist_data.test.labels for batch_no in range(total_batches): mnist_batch = mnist_data.train.next_batch(batch_size) train_images, train_labels = mnist_batch[0], mnist_batch[1] _, merged_summary = session.run([optimiser, merged_summary_operation], feed_dict={ left_input: train_images, right_input: train_images, y_input: train_labels, dropout_bool: True }) train_summary_writer.add_summary(merged_summary, batch_no) if batch_no % 10 == 0: merged_summary, _ = session.run([merged_summary_operation, accuracy_operation], feed_dict={ left_input: test_images, right_input: test_images, y_input: test_labels, dropout_bool: False }) test_summary_writer.add_summary(merged_summary, batch_no) ================================================ FILE: Chapter06/3_triplet_loss.py ================================================ import tensorflow as tf def triplet_loss(anchor_face, positive_face, negative_face, margin): def get_distance(x, y): return tf.reduce_sum(tf.square(tf.subtract(x, y)), 1) positive_distance = get_distance(anchor_face, positive_face) negative_distance = get_distance(anchor_face, negative_face) total_distance = tf.add(tf.subtract(positive_distance, negative_distance), margin) return tf.reduce_mean(tf.maximum(total_distance, 0.0), 0) ================================================ FILE: Chapter06/4_triplet_mining.py ================================================ from scipy.spatial.distance import cdist import numpy as np def mine_triplets(anchor, targets, negative_samples):
# NOTE: the original computed cdist twice and then iterated over an undefined
# name "QnT_dists"; a single anchor-to-target distance matrix is computed once
# and reused below.
QnT_dists = cdist(anchor, targets, 'cosine') distances = QnT_dists.tolist() QnQ_duplicated = [ [target_index for target_index, dist in enumerate(QnQ_dist) if dist == QnQ_dist[query_index]] for query_index, QnQ_dist in enumerate(distances)] for i, QnT_dist in enumerate(QnT_dists): for j in QnQ_duplicated[i]: QnT_dist.itemset(j, np.inf) QnT_dists_topk = QnT_dists.argsort(axis=1)[:, :negative_samples] top_k_index = np.array([np.insert(QnT_dist, 0, i) for i, QnT_dist in enumerate(QnT_dists_topk)]) return top_k_index
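The margin behaviour of triplet_loss above, reproduced with NumPy on toy embeddings (an editor's sketch): the loss is zero once the negative sits farther from the anchor than the positive by at least the margin, measured in squared-distance units, which is what the hard-negative mining above is trying to violate on purpose.

import numpy as np

def triplet_numpy(anchor, positive, negative, margin):
    positive_distance = np.sum((anchor - positive) ** 2, axis=1)
    negative_distance = np.sum((anchor - negative) ** 2, axis=1)
    return np.mean(np.maximum(positive_distance - negative_distance + margin, 0.0))

anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.1, 0.0]])  # squared distance 0.01
negative = np.array([[1.0, 0.0]])  # squared distance 1.0
print(triplet_numpy(anchor, positive, negative, margin=0.2))  # 0.0: already separated
print(triplet_numpy(anchor, positive, negative, margin=1.5))  # ~0.51: margin violated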
================================================ FILE: Chapter06/5_fiducial_points.py ================================================ import fiducial_data import tensorflow as tf def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[3, 3], activation=tf.nn.tanh): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.tanh): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer image_size = 40 no_landmark = 10 no_gender_classes = 2 no_smile_classes = 2 no_glasses_classes = 2 no_headpose_classes = 5 batch_size = 100 total_batches = 300 image_input = tf.placeholder(tf.float32, shape=[None, image_size, image_size]) landmark_input = tf.placeholder(tf.float32, shape=[None, no_landmark]) gender_input = tf.placeholder(tf.float32, shape=[None, no_gender_classes]) smile_input = tf.placeholder(tf.float32, shape=[None, no_smile_classes]) glasses_input = tf.placeholder(tf.float32, shape=[None, no_glasses_classes]) headpose_input = tf.placeholder(tf.float32, shape=[None, no_headpose_classes]) image_input_reshape = tf.reshape(image_input, [-1, image_size, image_size, 1], name='input_reshape') convolution_layer_1 = convolution_layer(image_input_reshape, 16) pooling_layer_1 = pooling_layer(convolution_layer_1) convolution_layer_2 = convolution_layer(pooling_layer_1, 48) pooling_layer_2 = pooling_layer(convolution_layer_2) convolution_layer_3 = convolution_layer(pooling_layer_2, 64) pooling_layer_3 = pooling_layer(convolution_layer_3) convolution_layer_4 = convolution_layer(pooling_layer_3, 64) flattened_pool = tf.reshape(convolution_layer_4, [-1, 5 * 5 * 64], name='flattened_pool') dense_layer_bottleneck = dense_layer(flattened_pool, 1024) dropout_bool = tf.placeholder(tf.bool) dropout_layer = tf.layers.dropout( inputs=dense_layer_bottleneck, rate=0.4, training=dropout_bool ) landmark_logits = dense_layer(dropout_layer, 10) smile_logits = dense_layer(dropout_layer, 2) glass_logits = dense_layer(dropout_layer, 2) gender_logits = dense_layer(dropout_layer, 2) headpose_logits = dense_layer(dropout_layer, 5)
# tf.square takes a single tensor, so the difference is squared (the original
# passed the two tensors as separate arguments).
landmark_loss = 0.5 * tf.reduce_mean( tf.square(landmark_input - landmark_logits)) gender_loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( labels=gender_input, logits=gender_logits)) smile_loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( labels=smile_input, logits=smile_logits)) glass_loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( labels=glasses_input, logits=glass_logits)) headpose_loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( labels=headpose_input, logits=headpose_logits)) loss_operation = landmark_loss + gender_loss + smile_loss + glass_loss + headpose_loss optimiser = tf.train.AdamOptimizer().minimize(loss_operation) session = tf.Session() session.run(tf.global_variables_initializer()) fiducial_test_data = fiducial_data.test
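A quick NumPy check (editor's sketch) of the corrected landmark term: it is half the mean squared error between the predicted and ground-truth landmark coordinates.

import numpy as np

landmarks_true = np.array([[0.2, 0.4, 0.6, 0.8]])
landmarks_pred = np.array([[0.2, 0.5, 0.6, 0.8]])
# 0.5 * mean((y - y_hat)^2), matching landmark_loss above
print(0.5 * np.mean((landmarks_true - landmarks_pred) ** 2))  # ~0.00125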
for batch_no in range(total_batches): fiducial_data_batch = fiducial_data.train.next_batch(batch_size) loss, _landmark_loss, _ = session.run( [loss_operation, landmark_loss, optimiser], feed_dict={ image_input: fiducial_data_batch.images, landmark_input: fiducial_data_batch.landmarks, gender_input: fiducial_data_batch.gender, smile_input: fiducial_data_batch.smile, glasses_input: fiducial_data_batch.glasses, headpose_input: fiducial_data_batch.pose, dropout_bool: True }) if batch_no % 10 == 0: loss, _landmark_loss = session.run( [loss_operation, landmark_loss], feed_dict={ image_input: fiducial_test_data.images, landmark_input: fiducial_test_data.landmarks, gender_input: fiducial_test_data.gender, smile_input: fiducial_test_data.smile, glasses_input: fiducial_test_data.glasses, headpose_input: fiducial_test_data.pose, dropout_bool: False }) ================================================ FILE: Chapter06/6_extract_features.py ================================================ from scipy import misc import tensorflow as tf import numpy as np import os import facenet from facenet import load_model, prewhiten import align.detect_face def load_and_align_data(image_paths, image_size=160, margin=44, gpu_memory_fraction=1.0): minsize = 20 threshold = [0.6, 0.7, 0.7] factor = 0.709 print('Creating networks and loading parameters') with tf.Graph().as_default(): gpu_options = tf.GPUOptions( per_process_gpu_memory_fraction=gpu_memory_fraction) sess = tf.Session(config=tf.ConfigProto( gpu_options=gpu_options, log_device_placement=False)) with sess.as_default(): pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) nrof_samples = len(image_paths) img_list = [None] * nrof_samples for i in range(nrof_samples): img = misc.imread(os.path.expanduser(image_paths[i]), mode='RGB') img_size = np.asarray(img.shape)[0:2] bounding_boxes, _ = align.detect_face.detect_face( img, minsize, pnet, rnet, onet, threshold, factor) det = np.squeeze(bounding_boxes[0, 0:4]) bb = np.zeros(4, dtype=np.int32) bb[0] = np.maximum(det[0] - margin / 2, 0) bb[1] = np.maximum(det[1] - margin / 2, 0) bb[2] = np.minimum(det[2] + margin / 2, img_size[1]) bb[3] = np.minimum(det[3] + margin / 2, img_size[0]) cropped = img[bb[1]:bb[3], bb[0]:bb[2], :] aligned = misc.imresize( cropped, (image_size, image_size), interp='bilinear') prewhitened = prewhiten(aligned) img_list[i] = prewhitened images = np.stack(img_list) return images def get_face_embeddings(image_paths, model=''): images = load_and_align_data(image_paths) with tf.Graph().as_default(): with tf.Session() as sess: load_model(model) images_placeholder = tf.get_default_graph().get_tensor_by_name( "input:0") embeddings = tf.get_default_graph().get_tensor_by_name( "embeddings:0") phase_train_placeholder = tf.get_default_graph().get_tensor_by_name( "phase_train:0") feed_dict = {images_placeholder: images, phase_train_placeholder: False} emb = sess.run(embeddings, feed_dict=feed_dict) return emb def compute_distance(embedding_1, embedding_2): dist = np.sqrt(np.sum(np.square(np.subtract(embedding_1, embedding_2)))) return dist
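get_face_embeddings and compute_distance above combine naturally into a verification check: embed two face crops and threshold the L2 distance between them. A usage sketch (editor's addition; the image paths, model path and the 0.99 threshold are illustrative assumptions, not repository values):

embeddings = get_face_embeddings(['person_a.jpg', 'person_b.jpg'],
                                 model='/path/to/facenet_model')
distance = compute_distance(embeddings[0], embeddings[1])
same_person = distance < 0.99  # tune the threshold on a validation set
print(distance, same_person)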
================================================ FILE: Chapter07/1_caption_attention.py ================================================ import tensorflow as tf from keras.layers.recurrent import *
# NOTE: the zero-valued globals below are placeholders from the book text and
# must be set to real values before the model can be built.
training = True sequence_length = 0 vocabulary_size = 0 input_tensor = 0 input_shape = 0 embedding_dimension = 0 dropout_prob = 0.3 previous_words = 0 height = 0 shape = 0 cnn_features = 0 depth = 0 vgg_model = tf.keras.applications.vgg16.VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor, input_shape=input_shape) word_embedding = tf.keras.layers.Embedding( vocabulary_size, embedding_dimension, input_length=sequence_length) embedding = word_embedding(previous_words) embedding = tf.keras.layers.Activation('relu')(embedding) embedding = tf.keras.layers.Dropout(dropout_prob)(embedding) cnn_features_flattened = tf.keras.layers.Reshape((height * height, shape))(cnn_features) net = tf.keras.layers.GlobalAveragePooling1D()(cnn_features_flattened) net = tf.keras.layers.Dense(embedding_dimension, activation='relu')(net) net = tf.keras.layers.Dropout(dropout_prob)(net) net = tf.keras.layers.RepeatVector(sequence_length)(net) net = tf.keras.layers.concatenate([net, embedding]) net = tf.keras.layers.Dropout(dropout_prob)(net) class LSTM_sent(Recurrent): def __init__(self, output_dim, init='glorot_uniform', inner_init='orthogonal', forget_bias_init='one', activation='tanh', inner_activation='hard_sigmoid', W_regularizer=None, U_regularizer=None, b_regularizer=None, dropout_W=0., dropout_U=0., sentinel=True, **kwargs): self.output_dim = output_dim self.init = initializations.get(init) self.inner_init = initializations.get(inner_init) self.forget_bias_init = initializations.get(forget_bias_init) self.activation = activations.get(activation) self.inner_activation = activations.get(inner_activation) self.W_regularizer = regularizers.get(W_regularizer) self.U_regularizer = regularizers.get(U_regularizer) self.b_regularizer = regularizers.get(b_regularizer) self.dropout_W, self.dropout_U = dropout_W, dropout_U self.sentinel = sentinel if self.dropout_W or self.dropout_U: self.uses_learning_phase = True super(LSTM_sent, self).__init__(**kwargs) def build(self, input_shape): self.input_spec = [InputSpec(shape=input_shape)] input_dim = input_shape[2] self.input_dim = input_dim if self.stateful: self.reset_states() else: if self.sentinel: self.states = [None, None] else: self.states = [None] self.W_i = self.init((input_dim, self.output_dim), name='{}_W_i'.format(self.name)) self.U_i = self.inner_init((self.output_dim, self.output_dim), name='{}_U_i'.format(self.name)) self.b_i = K.zeros((self.output_dim,), name='{}_b_i'.format(self.name)) self.W_f = self.init((input_dim, self.output_dim), name='{}_W_f'.format(self.name)) self.U_f = self.inner_init((self.output_dim, self.output_dim), name='{}_U_f'.format(self.name)) self.b_f = self.forget_bias_init((self.output_dim,), name='{}_b_f'.format(self.name)) self.W_c = self.init((input_dim, self.output_dim), name='{}_W_c'.format(self.name)) self.U_c = self.inner_init((self.output_dim, self.output_dim), name='{}_U_c'.format(self.name)) self.b_c = K.zeros((self.output_dim,), name='{}_b_c'.format(self.name)) self.W_o = self.init((input_dim, self.output_dim), name='{}_W_o'.format(self.name)) self.U_o = self.inner_init((self.output_dim, self.output_dim), name='{}_U_o'.format(self.name)) self.b_o = K.zeros((self.output_dim,), name='{}_b_o'.format(self.name)) if self.sentinel:
# sentinel gate
self.W_g = self.init((input_dim, self.output_dim), name='{}_W_g'.format(self.name)) self.U_g = self.inner_init((self.output_dim, self.output_dim), name='{}_U_g'.format(self.name)) self.b_g = K.zeros((self.output_dim,), name='{}_b_g'.format(self.name)) self.trainable_weights = [self.W_i, self.U_i, self.b_i, self.W_c, self.U_c, self.b_c, self.W_f, self.U_f, self.b_f, self.W_o, self.U_o, self.b_o, self.W_g, self.U_g, self.b_g] else: self.trainable_weights = [self.W_i,
self.U_i, self.b_i, self.W_c, self.U_c, self.b_c, self.W_f, self.U_f, self.b_f, self.W_o, self.U_o, self.b_o] if self.initial_weights is not None: self.set_weights(self.initial_weights) del self.initial_weights def reset_states(self): assert self.stateful, 'Layer must be stateful.' input_shape = self.input_spec[0].shape if not input_shape[0]: raise Exception('If a RNN is stateful, a complete ' + 'input_shape must be provided (including batch size).') if hasattr(self, 'states'): K.set_value(self.states[0], np.zeros((input_shape[0], self.output_dim))) K.set_value(self.states[1], np.zeros((input_shape[0], self.output_dim))) else: self.states = [K.zeros((input_shape[0], self.output_dim)), K.zeros((input_shape[0], self.output_dim))] def preprocess_input(self, x, train=False): if self.consume_less == 'cpu': if train and (0 < self.dropout_W < 1): dropout = self.dropout_W else: dropout = 0 input_shape = self.input_spec[0].shape input_dim = input_shape[2] timesteps = input_shape[1] x_i = time_distributed_dense(x, self.W_i, self.b_i, dropout, input_dim, self.output_dim, timesteps) x_f = time_distributed_dense(x, self.W_f, self.b_f, dropout, input_dim, self.output_dim, timesteps) x_c = time_distributed_dense(x, self.W_c, self.b_c, dropout, input_dim, self.output_dim, timesteps) x_o = time_distributed_dense(x, self.W_o, self.b_o, dropout, input_dim, self.output_dim, timesteps) if self.sentinel: x_g = time_distributed_dense(x, self.W_g, self.b_g, dropout, input_dim, self.output_dim, timesteps) return K.concatenate([x_i, x_f, x_c, x_o,x_g], axis=2) else: return K.concatenate([x_i, x_f, x_c, x_o], axis=2) else: return x def step(self, x, states): h_tm1 = states[0] c_tm1 = states[1] B_U = states[2] B_W = states[3] if self.consume_less == 'cpu': x_i = x[:, :self.output_dim] x_f = x[:, self.output_dim: 2 * self.output_dim] x_c = x[:, 2 * self.output_dim: 3 * self.output_dim] x_o = x[:, 3 * self.output_dim: 4 * self.output_dim] if self.sentinel: x_g = x[:, 4 * self.output_dim:] else: x_i = K.dot(x, self.W_i) + self.b_i x_f = K.dot(x * B_W[1], self.W_f) + self.b_f x_c = K.dot(x * B_W[2], self.W_c) + self.b_c x_o = K.dot(x * B_W[3], self.W_o) + self.b_o if self.sentinel: x_g = K.dot(x * B_W[4], self.W_g) + self.b_g i = self.inner_activation(x_i + K.dot(h_tm1 * B_U[0], self.U_i)) f = self.inner_activation(x_f + K.dot(h_tm1 * B_U[1], self.U_f)) c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1 * B_U[2], self.U_c)) o = self.inner_activation(x_o + K.dot(h_tm1 * B_U[3], self.U_o)) h = o * self.activation(c) if self.sentinel: g = self.inner_activation(x_g + K.dot(h_tm1 * B_U[4], self.U_g)) s = g * self.activation(c) return [h,s], [h, c] else: return h, [h, c] def get_constants(self, x): constants = [] if self.sentinel: Ngate = 5 else: Ngate = 4 if 0 < self.dropout_U < 1: ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1))) ones = K.concatenate([ones] * self.output_dim, 1) B_U = [K.dropout(ones, self.dropout_U) for _ in range(Ngate)] constants.append(B_U) else: constants.append([K.cast_to_floatx(1.) for _ in range(Ngate)]) if self.consume_less == 'cpu' and 0 < self.dropout_W < 1: input_shape = self.input_spec[0].shape input_dim = input_shape[-1] ones = K.ones_like(K.reshape(x[:, 0, 0], (-1, 1))) ones = K.concatenate([ones] * input_dim, 1) B_W = [K.dropout(ones, self.dropout_W) for _ in range(Ngate)] constants.append(B_W) else: constants.append([K.cast_to_floatx(1.) 
for _ in range(Ngate)]) return constants def get_output_shape_for(self, input_shape): if isinstance(input_shape, list) and len(input_shape) > 1: input_shape = input_shape[0] if self.return_sequences: output_shape = (input_shape[0], input_shape[1], self.output_dim) else: output_shape = (input_shape[0], self.output_dim) #the hidden state and the sentinel have the same shape if self.sentinel: return [output_shape, output_shape] else: return output_shape def compute_mask(self, input, mask): if self.return_sequences: if self.sentinel: return [mask, mask] else : return mask else: if self.sentinel: return [None, None] else: return None def call(self, x, mask=None): input_shape = self.input_spec[0].shape if K._BACKEND == 'tensorflow': if not input_shape[1]: raise Exception('When using TensorFlow, you should define ' 'explicitly the number of timesteps of ' 'your sequences.\n' 'If your first layer is an Embedding, ' 'make sure to pass it an "input_length" ' 'argument. Otherwise, make sure ' 'the first layer has ' 'an "input_shape" or "batch_input_shape" ' 'argument, including the time axis. ' 'Found input shape at layer ' + self.name + ': ' + str(input_shape)) if self.stateful: initial_states = self.states else: initial_states = self.get_initial_states(x) constants = self.get_constants(x) preprocessed_input = self.preprocess_input(x) last_output, outputs, states = K.rnn(self.step, preprocessed_input, initial_states, go_backwards=self.go_backwards, mask=mask, constants=constants, unroll=self.unroll, input_length=input_shape[1]) if self.stateful: self.updates = [] for i in range(len(states)): self.updates.append((self.states[i], states[i])) if self.sentinel: outputs = K.permute_dimensions(outputs, [0,2,1,3]) if self.return_sequences: return [outputs[0], outputs[1]] else: return [last_output[0],last_output[1]] else: if self.return_sequences: return outputs else: return last_output def get_config(self): config = {"output_dim": self.output_dim, "init": self.init.__name__, "inner_init": self.inner_init.__name__, "forget_bias_init": self.forget_bias_init.__name__, "activation": self.activation.__name__, "inner_activation": self.inner_activation.__name__, "W_regularizer": self.W_regularizer.get_config() if self.W_regularizer else None, "U_regularizer": self.U_regularizer.get_config() if self.U_regularizer else None, "b_regularizer": self.b_regularizer.get_config() if self.b_regularizer else None, "dropout_W": self.dropout_W, "dropout_U": self.dropout_U, "sentinel": self.sentinel} base_config = super(LSTM_sent, self).get_config() return dict(list(base_config.items()) + list(config.items())) lstm_ = LSTM_sent(output_dim = args_dict.lstm_dim, return_sequences=True,stateful=True, dropout_W = dropout_prob, dropout_U = dropout_prob, sentinel=True,name='hs') h, s = lstm_(x) num_vfeats = wh * wh num_vfeats = num_vfeats + 1 h_out_linear = tf.keras.layers.Convolution1D( depth, 1, activation='tanh', border_mode='same')(h) h_out_linear = tf.keras.layers.Dropout( dropout_prob)(h_out_linear) h_out_embed = tf.keras.layers.Convolution1D( embedding_dimension, 1, border_mode='same')(h_out_linear) z_h_embed = tf.keras.layers.TimeDistributed( tf.keras.layers.RepeatVector(num_vfeats))(h_out_embed) Vi = tf.keras.layers.Convolution1D( depth, 1, border_mode='same', activation='relu')(V) Vi = tf.keras.layers.Dropout(dropout_prob)(Vi) Vi_emb = tf.keras.layers.Convolution1D( embedding_dimension, 1, border_mode='same', activation='relu')(Vi) z_v_linear = tf.keras.layers.TimeDistributed( 
tf.keras.layers.RepeatVector(sequence_length))(Vi) z_v_embed = tf.keras.layers.TimeDistributed( tf.keras.layers.RepeatVector(sequence_length))(Vi_emb) z_v_linear = tf.keras.layers.Permute((2, 1, 3))(z_v_linear) z_v_embed = tf.keras.layers.Permute((2, 1, 3))(z_v_embed) fake_feat = tf.keras.layers.Convolution1D( depth, 1, activation='relu', border_mode='same')(s) fake_feat = tf.keras.layers.Dropout(dropout_prob)(fake_feat) fake_feat_embed = tf.keras.layers.Convolution1D( embedding_dimension, 1, border_mode='same')(fake_feat) z_s_linear = tf.keras.layers.Reshape((sequence_length, 1, depth))(fake_feat) z_s_embed = tf.keras.layers.Reshape( (sequence_length, 1, embedding_dimension))(fake_feat_embed)
# NOTE: concatenate takes the tensor list directly (with axis as a keyword),
# and tf.keras has no Merge layer; element-wise sums use tf.keras.layers.add.
z_v_linear = tf.keras.layers.concatenate([z_v_linear, z_s_linear], axis=-2) z_v_embed = tf.keras.layers.concatenate([z_v_embed, z_s_embed], axis=-2) z = tf.keras.layers.add([z_h_embed, z_v_embed]) z = tf.keras.layers.Dropout(dropout_prob)(z) z = tf.keras.layers.TimeDistributed( tf.keras.layers.Activation('tanh'))(z) attention = tf.keras.layers.TimeDistributed( tf.keras.layers.Convolution1D(1, 1, border_mode='same'))(z) attention = tf.keras.layers.Reshape((sequence_length, num_vfeats))(attention) attention = tf.keras.layers.TimeDistributed( tf.keras.layers.Activation('softmax'))(attention) attention = tf.keras.layers.TimeDistributed( tf.keras.layers.RepeatVector(depth))(attention) attention = tf.keras.layers.Permute((1,3,2))(attention) w_Vi = tf.keras.layers.Add()([attention,z_v_linear]) sumpool = tf.keras.layers.Lambda(lambda x: K.sum(x, axis=-2), output_shape=(depth,)) c_vec = tf.keras.layers.TimeDistributed(sumpool)(w_Vi) atten_out = tf.keras.layers.add([h_out_linear, c_vec]) h = tf.keras.layers.TimeDistributed( tf.keras.layers.Dense(embedding_dimension,activation='tanh'))(atten_out) h = tf.keras.layers.Dropout(dropout_prob)(h) predictions = tf.keras.layers.TimeDistributed( tf.keras.layers.Dense(vocabulary_size, activation='softmax'))(h) model = tf.keras.models.Model(inputs=[cnn_features, prev_words], outputs=predictions) opt = get_opt(args_dict) in_im = tf.keras.layers.Input( batch_shape=(args_dict.bs, args_dict.imsize, args_dict.imsize, 3), name='image') wh = vgg_model.output_shape[1] dim = vgg_model.output_shape[3] if not args_dict.cnn_train: for i,layer in enumerate(convnet.layers): if i > args_dict.finetune_start_layer: layer.trainable = False imfeats = vgg_model(in_im) cnn_features = tf.keras.layers.Input(batch_shape=(args_dict.bs, wh, wh, dim)) prev_words = tf.keras.layers.Input(batch_shape=(args_dict.bs, sequence_length)) lang_model = language_model(args_dict, wh, dim, cnn_features, prev_words) out = lang_model([imfeats,prev_words]) model = tf.keras.models.Model(inputs=[in_im, prev_words], outputs=out) ================================================ FILE: Chapter08/1_style_transfer.py ================================================ import numpy as np from PIL import Image from scipy.optimize import fmin_l_bfgs_b from scipy.misc import imsave from vgg16_avg import VGG16_Avg from keras import metrics from keras.models import Model from keras import backend as K import tensorflow as tf work_dir = '' content_image = Image.open(work_dir + 'bird_orig.png') imagenet_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32) def subtract_imagenet_mean(image): return (image - imagenet_mean)[:, :, :, ::-1] def add_imagenet_mean(image, s): return np.clip(image.reshape(s)[:, :, :, ::-1] + imagenet_mean, 0, 255) vgg_model = VGG16_Avg(include_top=False) content_layer =
vgg_model.get_layer('block5_conv1').output content_model = Model(vgg_model.input, content_layer) content_image_array = subtract_imagenet_mean(np.expand_dims(np.array(content_image), 0)) content_image_shape = content_image_array.shape target = K.variable(content_model.predict(content_image_array)) class ConvexOptimiser(object): def __init__(self, cost_function, tensor_shape): self.cost_function = cost_function self.tensor_shape = tensor_shape self.gradient_values = None def loss(self, point): loss_value, self.gradient_values = self.cost_function([point.reshape(self.tensor_shape)]) return loss_value.astype(np.float64) def gradients(self, point): return self.gradient_values.flatten().astype(np.float64) mse_loss = metrics.mean_squared_error(content_layer, target) grads = K.gradients(mse_loss, vgg_model.input) cost_function = K.function([vgg_model.input], [mse_loss]+grads) optimiser = ConvexOptimiser(cost_function, content_image_shape) def optimise(optimiser, iterations, point, tensor_shape, file_name): for i in range(iterations): point, min_val, info = fmin_l_bfgs_b(optimiser.loss, point.flatten(), fprime=optimiser.gradients, maxfun=20) point = np.clip(point, -127, 127) print('Loss:', min_val) imsave(work_dir + 'gen_'+file_name+'_{i}.png', add_imagenet_mean(point.copy(), tensor_shape)[0]) return point def generate_rand_img(shape): return np.random.uniform(-2.5, 2.5, shape)/1 generated_image = generate_rand_img(content_image_shape) iterations = 2 generated_image = optimise(optimiser, iterations, generated_image, content_image_shape, 'content') # Style transfer style_image = Image.open(work_dir + 'starry_night.png') style_image = style_image.resize(np.divide(style_image.size, 3.5).astype('int32')) style_image_array = subtract_imagenet_mean(np.expand_dims(style_image, 0)[:, :, :, :3]) style_image_shape = style_image_array.shape vgg_model = VGG16_Avg(include_top=False, input_shape=style_image_shape[1:]) style_layers = {layer.name: layer.output for layer in vgg_model.layers} style_features = [style_layers['block{}_conv1'.format(o)] for o in range(1,3)] layers_model = Model(vgg_model.input, style_features) style_targets = [K.variable(feature) for feature in layers_model.predict(style_image_array)] def grammian_matrix(matrix): flattened_matrix = K.batch_flatten(K.permute_dimensions(matrix, (2, 0, 1))) matrix_transpose_dot = K.dot(flattened_matrix, K.transpose(flattened_matrix)) element_count = matrix.get_shape().num_elements() return matrix_transpose_dot / element_count def style_mse_loss(x, y): return metrics.mse(grammian_matrix(x), grammian_matrix(y)) style_loss = sum(style_mse_loss(l1[0], l2[0]) for l1, l2 in zip(style_features, style_targets)) grads = K.gradients(style_loss, vgg_model.input) style_fn = K.function([vgg_model.input], [style_loss]+grads) optimiser = ConvexOptimiser(style_fn, style_image_shape) generated_image = generate_rand_img(style_image_shape) generated_image = optimise(optimiser, iterations, generated_image, style_image_shape, 'style') w, h = style_image.size src = content_image_array[:, :h, :w] outputs = {l.name: l.output for l in vgg_model.layers} style_layers = [outputs['block{}_conv2'.format(o)] for o in range(1,6)] content_name = 'block4_conv2' content_layer = outputs[content_name] style_model = Model(vgg_model.input, style_layers) style_targs = [K.variable(o) for o in style_model.predict(style_image_array)] content_model = Model(vgg_model.input, content_layer) content_targ = K.variable(content_model.predict(src)) style_wgts = [0.05, 0.2, 0.2, 0.25, 0.3] loss = 
sum(style_mse_loss(l1[0], l2[0])*w for l1, l2, w in zip(style_layers, style_targs, style_wgts)) loss += metrics.mse(content_layer, content_targ)/10
# NOTE: the original called the scalar "style_loss" as a function here and
# below, and mixed tf.keras.backend into a plain-Keras graph; style_mse_loss
# and the K backend imported above are used consistently instead.
grads = K.gradients(loss, vgg_model.input) transfer_fn = K.function([vgg_model.input], [loss]+grads) evaluator = ConvexOptimiser(transfer_fn, style_image_shape) generated_image = generate_rand_img(style_image_shape)
# NOTE: the original passed the unrelated "optimiser" and dropped the mandatory
# file_name argument of optimise(); the freshly built evaluator is used and
# 'mixed' is an assumed output-file prefix.
generated_image = optimise(evaluator, iterations, generated_image, style_image_shape, 'mixed')
# Content plus style transfer
style_width, style_height = style_image.size content_image_array = content_image_array[:, :style_height, :style_width]
# NOTE: "style_layers" was rebound to a list above, so the original
# string-keyed lookups are redirected to the "outputs" dict.
style_layers_2 = [outputs['block{}_conv2'.format(block_no)] for block_no in range(1,6)] content_layer = outputs['block4_conv2'] style_model = Model(vgg_model.input, style_layers_2) style_targets = [K.variable(o) for o in style_model.predict(style_image_array)] content_model = Model(vgg_model.input, content_layer) content_target = K.variable(content_model.predict(content_image_array)) style_weights = [0.05, 0.2, 0.2, 0.25, 0.3] style_loss = sum(style_mse_loss(l1[0], l2[0])*w for l1, l2, w in zip(style_layers_2, style_targets, style_weights)) content_loss = metrics.mse(content_layer, content_target)/10 loss = style_loss + content_loss gradients = K.gradients(loss, vgg_model.input) transfer_fn = K.function([vgg_model.input], [loss]+gradients) optimiser = ConvexOptimiser(transfer_fn, style_image_shape) generated_image = generate_rand_img(style_image_shape)
# 'content_style' is likewise an assumed prefix for the missing file_name argument.
generated_image = optimise(optimiser, iterations, generated_image, style_image_shape, 'content_style')
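grammian_matrix above is what makes this a style loss: it throws away spatial layout and keeps only channel co-occurrence statistics. A NumPy sketch of the same computation (an editor's illustration, not repository code), showing that a feature map and a spatially shuffled copy of it share one Gram matrix:

import numpy as np

def gram_numpy(features):
    # (height, width, channels) -> (channels, channels), normalised by size
    channels_first = np.transpose(features, (2, 0, 1))
    flat = channels_first.reshape(features.shape[2], -1)
    return flat @ flat.T / features.size

features = np.random.rand(8, 8, 3)
shuffled = features.reshape(-1, 3).copy()
np.random.shuffle(shuffled)  # permute spatial positions, keep channel triples
shuffled = shuffled.reshape(8, 8, 3)
print(np.allclose(gram_numpy(features), gram_numpy(shuffled)))  # True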
================================================ FILE: Chapter08/2_vanilla_gan.py ================================================ import tensorflow as tf batch_size = 32 input_dimension = [227, 227] real_images = None def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.leaky_relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def transpose_convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.relu, strides=2): layer = tf.layers.conv2d_transpose( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, strides=strides, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer def get_generator(input_noise, is_training=True): generator = dense_layer(input_noise, 1024) generator = tf.layers.batch_normalization(generator, training=is_training) generator = dense_layer(generator, 7 * 7 * 256) generator = tf.layers.batch_normalization(generator, training=is_training) generator = tf.reshape(generator, [-1, 7, 7, 256]) generator = transpose_convolution_layer(generator, 64) generator = tf.layers.batch_normalization(generator, training=is_training) generator = transpose_convolution_layer(generator, 32) generator = tf.layers.batch_normalization(generator, training=is_training) generator = convolution_layer(generator, 3) generator = convolution_layer(generator, 1, activation=tf.nn.tanh) print(generator) return generator def get_discriminator(image, is_training=True): x_input_reshape = tf.reshape(image, [-1, 28, 28, 1], name='input_reshape') discriminator = convolution_layer(x_input_reshape, 64) discriminator = convolution_layer(discriminator, 128) discriminator = tf.layers.flatten(discriminator) discriminator = dense_layer(discriminator, 1024) discriminator = tf.layers.batch_normalization(discriminator, training=is_training) discriminator = dense_layer(discriminator, 2) return discriminator
# NOTE: the original sampled tf.random_normal([batch_size, input_dimension]),
# passing the [227, 227] list as a single shape entry, which is invalid; the
# generator expects a flat noise vector, so an assumed noise size is used.
noise_dimension = 100
input_noise = tf.random_normal([batch_size, noise_dimension]) gan = tf.contrib.gan.gan_model( get_generator, get_discriminator, real_images, input_noise) tf.contrib.gan.gan_train( tf.contrib.gan.gan_train_ops( gan, tf.contrib.gan.gan_loss(gan), tf.train.AdamOptimizer(0.001), tf.train.AdamOptimizer(0.0001))) ================================================ FILE: Chapter08/3_conditional_gan.py ================================================ import tensorflow as tf batch_size = 32 input_dimension = [227, 227] real_images = None labels = None def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.leaky_relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def transpose_convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.relu, strides=2): layer = tf.layers.conv2d_transpose( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, strides=strides, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer def get_generator(input_noise, is_training=True): generator = dense_layer(input_noise, 1024) generator = tf.layers.batch_normalization(generator, training=is_training) generator = dense_layer(generator, 7 * 7 * 256) generator = tf.layers.batch_normalization(generator, training=is_training) generator = tf.reshape(generator,
[-1, 7, 7, 256]) generator = transpose_convolution_layer(generator, 64) generator = tf.layers.batch_normalization(generator, training=is_training) generator = transpose_convolution_layer(generator, 32) generator = tf.layers.batch_normalization(generator, training=is_training) generator = convolution_layer(generator, 3) generator = convolution_layer(generator, 1, activation=tf.nn.tanh) print(generator) return generator def get_discriminator(image, is_training=True): x_input_reshape = tf.reshape(image, [-1, 28, 28, 1], name='input_reshape') discriminator = convolution_layer(x_input_reshape, 64) discriminator = convolution_layer(discriminator, 128) discriminator = tf.layers.flatten(discriminator) discriminator = dense_layer(discriminator, 1024) discriminator = tf.layers.batch_normalization(discriminator, training=is_training) discriminator = dense_layer(discriminator, 2) return discriminator
# NOTE: as in 2_vanilla_gan.py, the original passed the [227, 227] list as one
# shape entry of tf.random_normal, which is invalid; a flat noise vector of an
# assumed size is sampled instead.
noise_dimension = 100
input_noise = tf.random_normal([batch_size, noise_dimension]) gan = tf.contrib.gan.gan_model( get_generator, get_discriminator, real_images, (input_noise, labels)) ================================================ FILE: Chapter08/4_adverserial_loss.py ================================================ import tensorflow as tf batch_size = 32 input_dimension = [227, 227] real_images = None labels = None def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.leaky_relu): layer = tf.layers.conv2d( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def transpose_convolution_layer(input_layer, filters, kernel_size=[4, 4], activation=tf.nn.relu, strides=2): layer = tf.layers.conv2d_transpose( inputs=input_layer, filters=filters, kernel_size=kernel_size, activation=activation, strides=strides, kernel_regularizer=tf.nn.l2_loss, bias_regularizer=tf.nn.l2_loss, ) add_variable_summary(layer, 'convolution') return layer def pooling_layer(input_layer, pool_size=[2, 2], strides=2): layer = tf.layers.max_pooling2d( inputs=input_layer, pool_size=pool_size, strides=strides ) add_variable_summary(layer, 'pooling') return layer def dense_layer(input_layer, units, activation=tf.nn.relu): layer = tf.layers.dense( inputs=input_layer, units=units, activation=activation ) add_variable_summary(layer, 'dense') return layer def get_generator(input_noise, is_training=True): generator = dense_layer(input_noise, 1024) generator = tf.layers.batch_normalization(generator, training=is_training) generator = dense_layer(generator, 7 * 7 * 256) generator = tf.layers.batch_normalization(generator, training=is_training) generator = tf.reshape(generator, [-1, 7, 7, 256]) generator = transpose_convolution_layer(generator, 64) generator = tf.layers.batch_normalization(generator, training=is_training) generator = transpose_convolution_layer(generator, 32) generator = tf.layers.batch_normalization(generator, training=is_training) generator = convolution_layer(generator, 3)
generator = convolution_layer(generator, 1, activation=tf.nn.tanh) print(generator) return generator def get_discriminator(image, is_training=True): x_input_reshape = tf.reshape(image, [-1, 28, 28, 1], name='input_reshape') discriminator = convolution_layer(x_input_reshape, 64) discriminator = convolution_layer(discriminator, 128) discriminator = tf.layers.flatten(discriminator) discriminator = dense_layer(discriminator, 1024) discriminator = tf.layers.batch_normalization(discriminator, training=is_training) discriminator = dense_layer(discriminator, 2) return discriminator def fully_connected_layer(input_layer, units): return tf.layers.dense( input_layer, units=units, activation=tf.nn.relu ) def convolution_layer(input_layer, filter_size): return tf.layers.conv2d( input_layer, filters=filter_size, kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(), kernel_size=3, strides=2 ) def deconvolution_layer(input_layer, filter_size, activation=tf.nn.relu): return tf.layers.conv2d_transpose( input_layer, filters=filter_size, kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(), kernel_size=3, activation=activation, strides=2 ) def get_autoencoder(): input_layer = tf.placeholder(tf.float32, [None, 128, 128, 3]) convolution_layer_1 = convolution_layer(input_layer, 1024) convolution_layer_2 = convolution_layer(convolution_layer_1, 512) convolution_layer_3 = convolution_layer(convolution_layer_2, 256) convolution_layer_4 = convolution_layer(convolution_layer_3, 128) convolution_layer_5 = convolution_layer(convolution_layer_4, 32) convolution_layer_5_flattened = tf.layers.flatten(convolution_layer_5) bottleneck_layer = fully_connected_layer(convolution_layer_5_flattened, 16) c5_shape = convolution_layer_5.get_shape().as_list() c5f_flat_shape = convolution_layer_5_flattened.get_shape().as_list()[1] fully_connected = fully_connected_layer(bottleneck_layer, c5f_flat_shape) fully_connected = tf.reshape(fully_connected, [-1, c5_shape[1], c5_shape[2], c5_shape[3]]) deconvolution_layer_1 = deconvolution_layer(fully_connected, 128) deconvolution_layer_2 = deconvolution_layer(deconvolution_layer_1, 256) deconvolution_layer_3 = deconvolution_layer(deconvolution_layer_2, 512) deconvolution_layer_4 = deconvolution_layer(deconvolution_layer_3, 1024) deconvolution_layer_5 = deconvolution_layer(deconvolution_layer_4, 3, activation=tf.nn.tanh) return deconvolution_layer_5 gan = tf.contrib.gan.gan_model( get_autoencoder, get_discriminator, real_images, real_images) loss = tf.contrib.gan.gan_loss( gan, gradient_penalty=1.0) l1_pixel_loss = tf.norm(gan.real_data - gan.generated_data, ord=1) loss = tf.contrib.gan.losses.combine_adversarial_loss( loss, gan, l1_pixel_loss, weight_factor=1) ================================================ FILE: Chapter08/5_image_translation.py ================================================ import tensorflow as tf batch_size = 32 input_dimension = [227, 227] real_images = None labels = None input_images = None def add_variable_summary(tf_variable, summary_name): with tf.name_scope(summary_name + '_summary'): mean = tf.reduce_mean(tf_variable) tf.summary.scalar('Mean', mean) with tf.name_scope('standard_deviation'): standard_deviation = tf.sqrt(tf.reduce_mean( tf.square(tf_variable - mean))) tf.summary.scalar('StandardDeviation', standard_deviation) tf.summary.scalar('Maximum', tf.reduce_max(tf_variable)) tf.summary.scalar('Minimum', tf.reduce_min(tf_variable)) tf.summary.histogram('Histogram', tf_variable) def convolution_layer(input_layer, filters, kernel_size=[4, 
================================================
FILE: Chapter08/5_image_translation.py
================================================
import tensorflow as tf

batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
input_images = None


def add_variable_summary(tf_variable, summary_name):
    with tf.name_scope(summary_name + '_summary'):
        mean = tf.reduce_mean(tf_variable)
        tf.summary.scalar('Mean', mean)
        with tf.name_scope('standard_deviation'):
            standard_deviation = tf.sqrt(tf.reduce_mean(
                tf.square(tf_variable - mean)))
            tf.summary.scalar('StandardDeviation', standard_deviation)
        tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
        tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
        tf.summary.histogram('Histogram', tf_variable)


def convolution_layer(input_layer, filters, kernel_size=[4, 4],
                      activation=tf.nn.leaky_relu):
    layer = tf.layers.conv2d(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        activation=activation,
        kernel_regularizer=tf.nn.l2_loss,
        bias_regularizer=tf.nn.l2_loss,
    )
    add_variable_summary(layer, 'convolution')
    return layer


def transpose_convolution_layer(input_layer, filters, kernel_size=[4, 4],
                                activation=tf.nn.relu, strides=2):
    layer = tf.layers.conv2d_transpose(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        activation=activation,
        strides=strides,
        kernel_regularizer=tf.nn.l2_loss,
        bias_regularizer=tf.nn.l2_loss,
    )
    add_variable_summary(layer, 'convolution')
    return layer


def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
    layer = tf.layers.max_pooling2d(
        inputs=input_layer,
        pool_size=pool_size,
        strides=strides
    )
    add_variable_summary(layer, 'pooling')
    return layer


def dense_layer(input_layer, units, activation=tf.nn.relu):
    layer = tf.layers.dense(
        inputs=input_layer,
        units=units,
        activation=activation
    )
    add_variable_summary(layer, 'dense')
    return layer


def get_generator(input_noise, is_training=True):
    generator = dense_layer(input_noise, 1024)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = dense_layer(generator, 7 * 7 * 256)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = tf.reshape(generator, [-1, 7, 7, 256])
    generator = transpose_convolution_layer(generator, 64)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = transpose_convolution_layer(generator, 32)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = convolution_layer(generator, 3)
    generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
    print(generator)  # quick sanity check of the final tensor shape
    return generator


def get_discriminator(image, unused_conditioning=None, is_training=True):
    # TF-GAN calls the discriminator with (data, generator_inputs); the
    # second argument is unused here, so it is accepted and ignored.
    x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
                                 name='input_reshape')
    discriminator = convolution_layer(x_input_reshape, 64)
    discriminator = convolution_layer(discriminator, 128)
    discriminator = tf.layers.flatten(discriminator)
    discriminator = dense_layer(discriminator, 1024)
    discriminator = tf.layers.batch_normalization(discriminator,
                                                  training=is_training)
    discriminator = dense_layer(discriminator, 2)
    return discriminator


# Conditional translation: the generator maps input_images to images that
# should match real_images. A least-squares GAN loss is combined with an
# L1 pixel loss between real and generated images.
gan = tf.contrib.gan.gan_model(
    get_generator,
    get_discriminator,
    real_images,
    input_images)
loss = tf.contrib.gan.gan_loss(
    gan,
    tf.contrib.gan.losses.least_squares_generator_loss,
    tf.contrib.gan.losses.least_squares_discriminator_loss)
l1_loss = tf.norm(gan.real_data - gan.generated_data, ord=1)
gan_loss = tf.contrib.gan.losses.combine_adversarial_loss(
    loss, gan, l1_loss, weight_factor=1)


================================================
FILE: Chapter08/6_infogan.py
================================================
import tensorflow as tf

batch_size = 32
input_dimension = [227, 227]
real_images = None
labels = None
unstructured_input = None
structured_input = None


def add_variable_summary(tf_variable, summary_name):
    with tf.name_scope(summary_name + '_summary'):
        mean = tf.reduce_mean(tf_variable)
        tf.summary.scalar('Mean', mean)
        with tf.name_scope('standard_deviation'):
            standard_deviation = tf.sqrt(tf.reduce_mean(
                tf.square(tf_variable - mean)))
            tf.summary.scalar('StandardDeviation', standard_deviation)
        tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
        tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
        tf.summary.histogram('Histogram', tf_variable)


def convolution_layer(input_layer, filters, kernel_size=[4, 4],
                      activation=tf.nn.leaky_relu):
    layer = tf.layers.conv2d(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        activation=activation,
        kernel_regularizer=tf.nn.l2_loss,
        bias_regularizer=tf.nn.l2_loss,
    )
    add_variable_summary(layer, 'convolution')
    return layer


def transpose_convolution_layer(input_layer, filters, kernel_size=[4, 4],
                                activation=tf.nn.relu, strides=2):
    layer = tf.layers.conv2d_transpose(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        activation=activation,
        strides=strides,
        kernel_regularizer=tf.nn.l2_loss,
        bias_regularizer=tf.nn.l2_loss,
    )
    add_variable_summary(layer, 'convolution')
    return layer


def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
    layer = tf.layers.max_pooling2d(
        inputs=input_layer,
        pool_size=pool_size,
        strides=strides
    )
    add_variable_summary(layer, 'pooling')
    return layer


def dense_layer(input_layer, units, activation=tf.nn.relu):
    layer = tf.layers.dense(
        inputs=input_layer,
        units=units,
        activation=activation
    )
    add_variable_summary(layer, 'dense')
    return layer


def get_generator(input_noise, is_training=True):
    generator = dense_layer(input_noise, 1024)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = dense_layer(generator, 7 * 7 * 256)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = tf.reshape(generator, [-1, 7, 7, 256])
    generator = transpose_convolution_layer(generator, 64)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = transpose_convolution_layer(generator, 32)
    generator = tf.layers.batch_normalization(generator,
                                              training=is_training)
    generator = convolution_layer(generator, 3)
    generator = convolution_layer(generator, 1, activation=tf.nn.tanh)
    print(generator)  # quick sanity check of the final tensor shape
    return generator


def get_discriminator(image, unused_conditioning=None, is_training=True):
    # Note: TF-GAN's infogan_model expects the discriminator to also return
    # distributions over the structured latents, which the mutual-information
    # penalty below is computed from; this simplified version returns only
    # the logits.
    x_input_reshape = tf.reshape(image, [-1, 28, 28, 1],
                                 name='input_reshape')
    discriminator = convolution_layer(x_input_reshape, 64)
    discriminator = convolution_layer(discriminator, 128)
    discriminator = tf.layers.flatten(discriminator)
    discriminator = dense_layer(discriminator, 1024)
    discriminator = tf.layers.batch_normalization(discriminator,
                                                  training=is_training)
    discriminator = dense_layer(discriminator, 2)
    return discriminator


info_gan = tf.contrib.gan.infogan_model(
    get_generator,
    get_discriminator,
    real_images,
    unstructured_input,
    structured_input)
loss = tf.contrib.gan.gan_loss(
    info_gan,
    gradient_penalty_weight=1,
    gradient_penalty_epsilon=1e-10,
    mutual_information_penalty_weight=1)
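# --- Added sketch (not part of the original script) ---
# Illustrative latent inputs for the InfoGAN above: a 62-d unstructured noise
# vector plus one 2-d structured continuous code. The sizes are assumptions;
# in a complete script these would replace the None placeholders at the top.
# unstructured_input = tf.random_normal([batch_size, 62])
# structured_input = [tf.random_uniform([batch_size, 2],
#                                       minval=-1., maxval=1.)]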
================================================
FILE: Chapter08/utils.py
================================================
import math, keras, datetime, pandas as pd, numpy as np, keras.backend as K, threading, json, re, collections
import tarfile, tensorflow as tf, matplotlib.pyplot as plt, xgboost, operator, random, pickle, glob, os, bcolz
import shutil, sklearn, functools, itertools, scipy
from PIL import Image
from concurrent.futures import ProcessPoolExecutor, as_completed, ThreadPoolExecutor
import matplotlib.patheffects as PathEffects
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.neighbors import NearestNeighbors, LSHForest
import IPython
from IPython.display import display, Audio
from numpy.random import normal
from gensim.models import word2vec
from keras.preprocessing.text import Tokenizer
from nltk.tokenize import ToktokTokenizer, StanfordTokenizer
from functools import reduce
from itertools import chain

from tensorflow.python.framework import ops
#from tensorflow.contrib import rnn, legacy_seq2seq as seq2seq

from keras_tqdm import TQDMNotebookCallback
from keras import initializations
from keras.applications.resnet50 import ResNet50, decode_predictions, conv_block, identity_block
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.preprocessing.sequence import pad_sequences
from keras.models import Model, Sequential
from keras.layers import *
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import decode_predictions, preprocess_input

np.set_printoptions(threshold=50, edgeitems=20)


def beep(): return Audio(filename='/home/jhoward/beep.mp3', autoplay=True)
def dump(obj, fname): pickle.dump(obj, open(fname, 'wb'))
def load(fname): return pickle.load(open(fname, 'rb'))


def limit_mem():
    K.get_session().close()
    cfg = K.tf.ConfigProto()
    cfg.gpu_options.allow_growth = True
    K.set_session(K.tf.Session(config=cfg))


def autolabel(plt, fmt='%.2f'):
    rects = plt.patches
    ax = rects[0].axes
    y_bottom, y_top = ax.get_ylim()
    y_height = y_top - y_bottom
    for rect in rects:
        height = rect.get_height()
        if height / y_height > 0.95:
            label_position = height - (y_height * 0.06)
        else:
            label_position = height + (y_height * 0.01)
        txt = ax.text(rect.get_x() + rect.get_width() / 2., label_position,
                      fmt % height, ha='center', va='bottom')
        txt.set_path_effects([PathEffects.withStroke(linewidth=3,
                                                     foreground='w')])


def column_chart(lbls, vals, val_lbls='%.2f'):
    n = len(lbls)
    p = plt.bar(np.arange(n), vals)
    plt.xticks(np.arange(n), lbls)
    if val_lbls: autolabel(p, val_lbls)


def save_array(fname, arr):
    c = bcolz.carray(arr, rootdir=fname, mode='w')
    c.flush()


def load_array(fname): return bcolz.open(fname)[:]


def load_glove(loc):
    return (load_array(loc + '.dat'),
            pickle.load(open(loc + '_words.pkl', 'rb'), encoding='latin1'),
            pickle.load(open(loc + '_idx.pkl', 'rb'), encoding='latin1'))


def plot_multi(im, dim=(4, 4), figsize=(6, 6), **kwargs):
    plt.figure(figsize=figsize)
    for i, img in enumerate(im):
        plt.subplot(*((dim) + (i + 1,)))
        plt.imshow(img, **kwargs)
        plt.axis('off')
    plt.tight_layout()


def plot_train(hist):
    h = hist.history
    if 'acc' in h:
        meas = 'acc'
        loc = 'lower right'
    else:
        meas = 'loss'
        loc = 'upper right'
    plt.plot(hist.history[meas])
    plt.plot(hist.history['val_' + meas])
    plt.title('model ' + meas)
    plt.ylabel(meas)
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc=loc)


def fit_gen(gen, fn, eval_fn, nb_iter):
    for i in range(nb_iter):
        fn(*next(gen))
        if i % (nb_iter // 10) == 0: eval_fn()


def wrap_config(layer):
    return {'class_name': layer.__class__.__name__,
            'config': layer.get_config()}


def copy_layer(layer): return layer_from_config(wrap_config(layer))


def copy_layers(layers): return [copy_layer(layer) for layer in layers]


def copy_weights(from_layers, to_layers):
    for from_layer, to_layer in zip(from_layers, to_layers):
        to_layer.set_weights(from_layer.get_weights())


def copy_model(m):
    res = Sequential(copy_layers(m.layers))
    copy_weights(m.layers, res.layers)
    return res


def insert_layer(model, new_layer, index):
    res = Sequential()
    for i, layer in enumerate(model.layers):
        if i == index: res.add(new_layer)
        copied = layer_from_config(wrap_config(layer))
        res.add(copied)
        copied.set_weights(layer.get_weights())
    return res
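# --- Added usage note (not part of the original module) ---
# The bcolz helpers above are typically used to cache large feature arrays
# (for example, bottleneck features) between runs, e.g.:
# save_array('conv_features.bc', features)
# features = load_array('conv_features.bc')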
================================================
FILE: Chapter08/vgg16_avg.py
================================================
from __future__ import print_function
from __future__ import absolute_import

import warnings

from keras.models import Model
from keras.layers import Flatten, Dense, Input
from keras.layers import Convolution2D, AveragePooling2D
from keras.engine.topology import get_source_inputs
from keras.utils.layer_utils import convert_all_kernels_in_model
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import _obtain_input_shape

TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'


def VGG16_Avg(include_top=True, weights='imagenet',
              input_tensor=None, input_shape=None,
              classes=1000):
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')

    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,
                                      dim_ordering=K.image_dim_ordering(),
                                      include_top=include_top)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
    # Block 1
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
    x = AveragePooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
    x = AveragePooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
    x = AveragePooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
    x = AveragePooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
    x = AveragePooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
    else:
        inputs = img_input
    # Create model.
    model = Model(inputs, x, name='vgg16')

    # load weights
    if weights == 'imagenet':
        if K.image_dim_ordering() == 'th':
            if include_top:
                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',
                                        TH_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',
                                        TH_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image dimension ordering convention '
                              '(`image_dim_ordering="th"`). '
                              'For best performance, set '
                              '`image_dim_ordering="tf"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
                convert_all_kernels_in_model(model)
        else:
            if include_top:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                        TF_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                        TF_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'theano':
                convert_all_kernels_in_model(model)
    return model
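# --- Added usage sketch (not part of the original module) ---
# For style transfer, the network is typically built without the classifier
# head; the input shape here is an illustrative assumption.
# model = VGG16_Avg(include_top=False, input_shape=(224, 224, 3))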
================================================
FILE: Chapter09/1_video_to_frames_1.py
================================================
import cv2

video_path = '/Users/i335713/Desktop/epat/lecture recordings and live lectures/batch35epat (batch 35) Lecture Recordings Live Lec Additional Lecture on Machine Learning (.mp4'
video_handle = cv2.VideoCapture(video_path)
frame_no = 0
while True:
    # read() returns (frame_grabbed, frame); frame_grabbed is False once the
    # video is exhausted.
    frame_grabbed, frame = video_handle.read()
    if not frame_grabbed:
        break
    cv2.imwrite("frame%d.jpg" % frame_no, frame)
    frame_no += 1
video_handle.release()
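# --- Added sketch (not part of the original script) ---
# To keep roughly one frame per second instead of every frame, the frame
# rate can be queried before the loop and used to subsample:
# fps = video_handle.get(cv2.CAP_PROP_FPS)
# keep_this_frame = frame_no % max(int(fps), 1) == 0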
================================================
FILE: Chapter09/2_parallel_stream.py
================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 300


def add_variable_summary(tf_variable, summary_name):
    with tf.name_scope(summary_name + '_summary'):
        mean = tf.reduce_mean(tf_variable)
        tf.summary.scalar('Mean', mean)
        with tf.name_scope('standard_deviation'):
            standard_deviation = tf.sqrt(tf.reduce_mean(
                tf.square(tf_variable - mean)))
            tf.summary.scalar('StandardDeviation', standard_deviation)
        tf.summary.scalar('Maximum', tf.reduce_max(tf_variable))
        tf.summary.scalar('Minimum', tf.reduce_min(tf_variable))
        tf.summary.histogram('Histogram', tf_variable)


def convolution_layer(input_layer, filters, kernel_size=[3, 3],
                      activation=tf.nn.relu):
    layer = tf.layers.conv2d(
        inputs=input_layer,
        filters=filters,
        kernel_size=kernel_size,
        activation=activation
    )
    add_variable_summary(layer, 'convolution')
    return layer


def pooling_layer(input_layer, pool_size=[2, 2], strides=2):
    layer = tf.layers.max_pooling2d(
        inputs=input_layer,
        pool_size=pool_size,
        strides=strides
    )
    add_variable_summary(layer, 'pooling')
    return layer


def dense_layer(input_layer, units, activation=tf.nn.relu):
    layer = tf.layers.dense(
        inputs=input_layer,
        units=units,
        activation=activation
    )
    add_variable_summary(layer, 'dense')
    return layer


def get_model(input_):
    input_reshape = tf.reshape(input_, [-1, 28, 28, 1],
                               name='input_reshape')
    convolution_layer_1 = convolution_layer(input_reshape, 64)
    pooling_layer_1 = pooling_layer(convolution_layer_1)
    convolution_layer_2 = convolution_layer(pooling_layer_1, 128)
    pooling_layer_2 = pooling_layer(convolution_layer_2)
    flattened_pool = tf.reshape(pooling_layer_2, [-1, 5 * 5 * 128],
                                name='flattened_pool')
    return flattened_pool


high_resolution_input = tf.placeholder(tf.float32, shape=[None, input_size])
low_resolution_input = tf.placeholder(tf.float32, shape=[None, input_size])
y_input = tf.placeholder(tf.float32, shape=[None, no_classes])
# Two separate CNN towers (weights are not shared); their features are
# concatenated before classification.
high_resolution_cnn = get_model(high_resolution_input)
low_resolution_cnn = get_model(low_resolution_input)
dense_layer_1 = tf.concat([high_resolution_cnn, low_resolution_cnn], 1)
dense_layer_bottleneck = dense_layer(dense_layer_1, 1024)
logits = dense_layer(dense_layer_bottleneck, no_classes)

with tf.name_scope('loss'):
    softmax_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=y_input, logits=logits)
    loss_operation = tf.reduce_mean(softmax_cross_entropy, name='loss')
    tf.summary.scalar('loss', loss_operation)

with tf.name_scope('optimiser'):
    optimiser = tf.train.AdamOptimizer().minimize(loss_operation)

with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
        predictions = tf.argmax(logits, 1)
        correct_predictions = tf.equal(predictions, tf.argmax(y_input, 1))
    with tf.name_scope('accuracy'):
        accuracy_operation = tf.reduce_mean(
            tf.cast(correct_predictions, tf.float32))
    tf.summary.scalar('accuracy', accuracy_operation)

session = tf.Session()
session.run(tf.global_variables_initializer())

merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')

test_images, test_labels = mnist_data.test.images, mnist_data.test.labels

for batch_no in range(total_batches):
    mnist_batch = mnist_data.train.next_batch(batch_size)
    train_images, train_labels = mnist_batch[0], mnist_batch[1]
    # The same MNIST batch stands in for both the high- and low-resolution
    # streams in this example.
    _, merged_summary = session.run([optimiser, merged_summary_operation],
                                    feed_dict={
                                        high_resolution_input: train_images,
                                        low_resolution_input: train_images,
                                        y_input: train_labels
                                    })
    train_summary_writer.add_summary(merged_summary, batch_no)
    if batch_no % 10 == 0:
        merged_summary, _ = session.run(
            [merged_summary_operation, accuracy_operation],
            feed_dict={
                high_resolution_input: test_images,
                low_resolution_input: test_images,
                y_input: test_labels
            })
        test_summary_writer.add_summary(merged_summary, batch_no)


================================================
FILE: Chapter09/3_lstm_after_cnn.py
================================================
import tensorflow as tf

input_shape = [500, 500]
no_classes = 2

# Classifies a sequence of 500 CNN feature vectors, each of length 500.
net = tf.keras.models.Sequential()
net.add(tf.keras.layers.LSTM(2048,
                             return_sequences=False,
                             input_shape=input_shape,
                             dropout=0.5))
net.add(tf.keras.layers.Dense(512, activation='relu'))
net.add(tf.keras.layers.Dropout(0.5))
net.add(tf.keras.layers.Dense(no_classes, activation='softmax'))
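# --- Added sketch (not part of the original script) ---
# Compiling the model, mirroring the settings used in 4_3d_convolution.py:
net.compile(loss=tf.keras.losses.categorical_crossentropy,
            optimizer=tf.keras.optimizers.Adam(),
            metrics=['accuracy'])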
================================================
FILE: Chapter09/4_3d_convolution.py
================================================
import tensorflow as tf

input_shape = 227, 227, 200, 3
no_classes = 2

net = tf.keras.models.Sequential()
net.add(tf.keras.layers.Conv3D(32,
                               kernel_size=(3, 3, 3),
                               input_shape=input_shape))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.Conv3D(32, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.MaxPooling3D())
net.add(tf.keras.layers.Dropout(0.25))
net.add(tf.keras.layers.Conv3D(64, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.Conv3D(64, (3, 3, 3)))
net.add(tf.keras.layers.Activation('relu'))
net.add(tf.keras.layers.MaxPool3D())
net.add(tf.keras.layers.Dropout(0.25))
net.add(tf.keras.layers.Flatten())
net.add(tf.keras.layers.Dense(512, activation='sigmoid'))
net.add(tf.keras.layers.Dropout(0.5))
net.add(tf.keras.layers.Dense(no_classes, activation='softmax'))
net.compile(loss=tf.keras.losses.categorical_crossentropy,
            optimizer=tf.keras.optimizers.Adam(),
            metrics=['accuracy'])


================================================
FILE: Chapter10/1_ios.py
================================================
import tfcoreml as tf_converter

tf_converter.convert(tf_model_path='tf_model_path.pb',
                     mlmodel_path='mlmodel_path.mlmodel',
                     output_feature_names=['softmax:0'],
                     input_name_shape_dict={'input:0': [1, 227, 227, 3]})
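# --- Added sketch (not part of the original script) ---
# The converted model can be loaded back with coremltools to inspect its
# inputs and outputs; the file name matches the conversion call above.
# import coremltools
# mlmodel = coremltools.models.MLModel('mlmodel_path.mlmodel')
# print(mlmodel.get_spec().description)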
================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Packt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# Deep-Learning-for-Computer-Vision
Code repository for Deep Learning for Computer Vision, by Packt

This is the code repository for [Deep Learning for Computer Vision](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-computer-vision?utm_source=github&utm_medium=repository&utm_campaign=9781788295628), published by [Packt](https://www.packtpub.com/?utm_source=github). It contains all the supporting project files necessary to work through the book from start to finish.

## About the Book
Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it finds enormous applications in robotics, automation, and similar areas. This book shows you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning.

## Instructions and Navigation
All of the code is organized into chapter folders, for example, Chapter02. Within each folder, the file names start with a number followed by the application name.

The code will look like the following:
```
merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')
```

The examples covered in this book can be run on Windows, Ubuntu, or Mac, and all the installation instructions are covered. Basic knowledge of Python and machine learning is required. GPU hardware is preferable, but not necessary.

## Related Products
* [Deep Learning with Keras](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-keras?utm_source=github&utm_medium=repository&utm_campaign=9781787128422)
* [TensorFlow 1.x Deep Learning Cookbook](https://www.packtpub.com/big-data-and-business-intelligence/tensorflow-1x-deep-learning-cookbook?utm_source=github&utm_medium=repository&utm_campaign=9781788293594)
* [Deep Learning with TensorFlow](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-tensorflow?utm_source=github&utm_medium=repository&utm_campaign=9781786469786)

### Download a free PDF
If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781788295628