Repository: fchollet/deep-learning-with-python-notebooks
Branch: master
Commit: fbf7f1bf2041
Files: 59
Total size: 10.0 MB
Directory structure:
gitextract_laoqm0uq/
├── LICENSE
├── README.md
├── chapter02_mathematical-building-blocks.ipynb
├── chapter03_introduction-to-ml-frameworks.ipynb
├── chapter04_classification-and-regression.ipynb
├── chapter05_fundamentals-of-ml.ipynb
├── chapter07_deep-dive-keras.ipynb
├── chapter08_image-classification.ipynb
├── chapter09_convnet-architecture-patterns.ipynb
├── chapter10_interpreting-what-convnets-learn.ipynb
├── chapter11_image-segmentation.ipynb
├── chapter12_object-detection.ipynb
├── chapter13_timeseries-forecasting.ipynb
├── chapter14_text-classification.ipynb
├── chapter15_language-models-and-the-transformer.ipynb
├── chapter16_text-generation.ipynb
├── chapter17_image-generation.ipynb
├── chapter18_best-practices-for-the-real-world.ipynb
├── first_edition/
│ ├── 2.1-a-first-look-at-a-neural-network.ipynb
│ ├── 3.5-classifying-movie-reviews.ipynb
│ ├── 3.6-classifying-newswires.ipynb
│ ├── 3.7-predicting-house-prices.ipynb
│ ├── 4.4-overfitting-and-underfitting.ipynb
│ ├── 5.1-introduction-to-convnets.ipynb
│ ├── 5.2-using-convnets-with-small-datasets.ipynb
│ ├── 5.3-using-a-pretrained-convnet.ipynb
│ ├── 5.4-visualizing-what-convnets-learn.ipynb
│ ├── 6.1-one-hot-encoding-of-words-or-characters.ipynb
│ ├── 6.1-using-word-embeddings.ipynb
│ ├── 6.2-understanding-recurrent-neural-networks.ipynb
│ ├── 6.3-advanced-usage-of-recurrent-neural-networks.ipynb
│ ├── 6.4-sequence-processing-with-convnets.ipynb
│ ├── 8.1-text-generation-with-lstm.ipynb
│ ├── 8.2-deep-dream.ipynb
│ ├── 8.3-neural-style-transfer.ipynb
│ ├── 8.4-generating-images-with-vaes.ipynb
│ └── 8.5-introduction-to-gans.ipynb
└── second_edition/
  ├── README.md
  ├── chapter02_mathematical-building-blocks.ipynb
  ├── chapter03_introduction-to-keras-and-tf.ipynb
  ├── chapter04_getting-started-with-neural-networks.ipynb
  ├── chapter05_fundamentals-of-ml.ipynb
  ├── chapter07_working-with-keras.ipynb
  ├── chapter08_intro-to-dl-for-computer-vision.ipynb
  ├── chapter09_part01_image-segmentation.ipynb
  ├── chapter09_part02_modern-convnet-architecture-patterns.ipynb
  ├── chapter09_part03_interpreting-what-convnets-learn.ipynb
  ├── chapter10_dl-for-timeseries.ipynb
  ├── chapter11_part01_introduction.ipynb
  ├── chapter11_part02_sequence-models.ipynb
  ├── chapter11_part03_transformer.ipynb
  ├── chapter11_part04_sequence-to-sequence-learning.ipynb
  ├── chapter12_part01_text-generation.ipynb
  ├── chapter12_part02_deep-dream.ipynb
  ├── chapter12_part03_neural-style-transfer.ipynb
  ├── chapter12_part04_variational-autoencoders.ipynb
  ├── chapter12_part05_gans.ipynb
  ├── chapter13_best-practices-for-the-real-world.ipynb
  └── chapter14_conclusions.ipynb
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2017-present François Chollet
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# Companion notebooks for Deep Learning with Python
This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, third edition (2025)](https://www.manning.com/books/deep-learning-with-python-third-edition?a_aid=keras&a_bid=76564dff)
by Francois Chollet and Matthew Watson. In addition, you will also find the legacy notebooks for the [second edition (2021)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff)
and the [first edition (2017)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff).
For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode.
**If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.**
## Running the code
We recommend running these notebooks on [Colab](https://colab.google), which
provides a hosted runtime with all the dependencies you will need. You can also
run these notebooks locally, either by setting up your own Jupyter environment
or by following Colab's instructions for
[running locally](https://research.google.com/colaboratory/local-runtimes.html).
By default, all notebooks will run on Colab's free-tier GPU runtime, which
is sufficient to run all code in this book. Chapters 8-18 will benefit
from a faster GPU if you have a Colab Pro subscription. You can change your
runtime type using **Runtime -> Change runtime type** in Colab's dropdown menus.
## Choosing a backend
The code for the third edition is written using Keras 3. As such, it can be run with
JAX, TensorFlow or PyTorch as a backend. To set the backend, update the value
in the cell at the top of the Colab notebook that looks like this:
```python
import os
os.environ["KERAS_BACKEND"] = "jax"
```
This must be done only once per session, before importing Keras. If you are
in the middle of running a notebook, you will need to restart the notebook session
and rerun all relevant notebook cells. This can be done using
**Runtime -> Restart Session** in Colab's dropdown menus.
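As a rough illustration of the "set it before importing Keras" rule, here is a minimal sketch of a guard you could put at the top of a notebook. The helper name `select_keras_backend`, the `SUPPORTED_BACKENDS` tuple, and the `loaded_modules` parameter are all hypothetical and not part of Keras or of these notebooks. The sketch assumes the backend identifiers `"jax"`, `"tensorflow"`, and `"torch"` (the value Keras 3 uses for PyTorch).

```python
import os
import sys

# Assumption: the three backends named in this README, with "torch" for PyTorch.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_keras_backend(name, loaded_modules=None):
    """Hypothetical helper: set KERAS_BACKEND, refusing if Keras is already imported."""
    # loaded_modules defaults to sys.modules; it is a parameter only to ease testing.
    loaded_modules = sys.modules if loaded_modules is None else loaded_modules
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    if "keras" in loaded_modules:
        # Changing the variable after import has no effect on the running session.
        raise RuntimeError("Keras is already imported; restart the session first.")
    os.environ["KERAS_BACKEND"] = name
    return name
```

Calling `select_keras_backend("jax")` and then running `import keras` would have the same effect as the cell shown above. The only addition is an explicit error when the session needs a restart.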
## Using Kaggle data
This book uses datasets and model weights provided by Kaggle, an online Machine
Learning community and platform. You will need to create a Kaggle login to run
Kaggle code in this book; instructions are given in Chapter 8.
For chapters that need Kaggle data, you can log in to Kaggle once per session
when you hit the notebook cell with `kagglehub.login()`. Alternatively,
you can set up your Kaggle login information once as Colab secrets:
* Go to https://www.kaggle.com/ and sign in.
* Go to https://www.kaggle.com/settings and generate a Kaggle API key.
* Open the secrets tab in Colab by clicking the key icon on the left.
* Add two secrets, `KAGGLE_USERNAME` and `KAGGLE_KEY`, with the username and key
  you just created.
Following this approach, you will only need to copy your Kaggle secret key once,
though you will need to allow each notebook to access your secrets when running
the relevant Kaggle code.
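If you run the notebooks outside Colab, secrets are not available. One option, offered here as an assumption rather than something these notebooks do, is to export the same two names as environment variables and check for them before deciding whether to call `kagglehub.login()`. The helper `missing_kaggle_credentials` below is hypothetical and only inspects a mapping; it never contacts Kaggle.

```python
import os

# The two names used for the Colab secrets described above.
KAGGLE_VARIABLES = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_kaggle_credentials(env=None):
    """Hypothetical helper: list the Kaggle variables that are unset or empty."""
    # env defaults to the process environment; any dict-like mapping also works.
    env = os.environ if env is None else env
    return [name for name in KAGGLE_VARIABLES if not env.get(name)]
```

An empty list means both values are present. Otherwise the returned names tell you what still has to be set, or that you should fall back to the interactive login cell.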
## Table of contents
* [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb)
* [Chapter 3: Introduction to TensorFlow, PyTorch, JAX, and Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-ml-frameworks.ipynb)
* [Chapter 4: Classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_classification-and-regression.ipynb)
* [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb)
* [Chapter 7: A deep dive on Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_deep-dive-keras.ipynb)
* [Chapter 8: Image Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_image-classification.ipynb)
* [Chapter 9: Convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_convnet-architecture-patterns.ipynb)
* [Chapter 10: Interpreting what ConvNets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_interpreting-what-convnets-learn.ipynb)
* [Chapter 11: Image Segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_image-segmentation.ipynb)
* [Chapter 12: Object Detection](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_object-detection.ipynb)
* [Chapter 13: Timeseries Forecasting](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_timeseries-forecasting.ipynb)
* [Chapter 14: Text Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_text-classification.ipynb)
* [Chapter 15: Language Models and the Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter15_language-models-and-the-transformer.ipynb)
* [Chapter 16: Text Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter16_text-generation.ipynb)
* [Chapter 17: Image Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter17_image-generation.ipynb)
* [Chapter 18: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter18_best-practices-for-the-real-world.ipynb)
================================================
FILE: chapter02_mathematical-building-blocks.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"tensorflow\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
"    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
"    if current == required:\n",
"        get_ipython().run_cell(cell)\n",
"    else:\n",
"        print(\n",
"            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
"            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
"        )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## The mathematical building blocks of neural networks"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### A first look at a neural network"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"len(train_labels)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_labels"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_images.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"len(test_labels)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_labels"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"model = keras.Sequential(\n",
"    [\n",
"        layers.Dense(512, activation=\"relu\"),\n",
"        layers.Dense(10, activation=\"softmax\"),\n",
"    ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
"    optimizer=\"adam\",\n",
"    loss=\"sparse_categorical_crossentropy\",\n",
"    metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"test_images = test_images.reshape((10000, 28 * 28))\n",
"test_images = test_images.astype(\"float32\") / 255"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.fit(train_images, train_labels, epochs=5, batch_size=128)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_digits = test_images[0:10]\n",
"predictions = model.predict(test_digits)\n",
"predictions[0]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions[0].argmax()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions[0][7]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_labels[0]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
"print(f\"test_acc: {test_acc}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Data representations for neural networks"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Scalars (rank-0 tensors)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"x = np.array(12)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Vectors (rank-1 tensors)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.array([12, 3, 6, 14, 7])\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Matrices (rank-2 tensors)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.array([[5, 78, 2, 34, 0],\n",
"              [6, 79, 3, 35, 1],\n",
"              [7, 80, 4, 36, 2]])\n",
"x.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Rank-3 tensors and higher-rank tensors"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.array([[[5, 78, 2, 34, 0],\n",
"               [6, 79, 3, 35, 1],\n",
"               [7, 80, 4, 36, 2]],\n",
"              [[5, 78, 2, 34, 0],\n",
"               [6, 79, 3, 35, 1],\n",
"               [7, 80, 4, 36, 2]],\n",
"              [[5, 78, 2, 34, 0],\n",
"               [6, 79, 3, 35, 1],\n",
"               [7, 80, 4, 36, 2]]])\n",
"x.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Key attributes"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images.ndim"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images.dtype"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"digit = train_images[4]\n",
"plt.imshow(digit, cmap=plt.cm.binary)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_labels[4]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Manipulating tensors in NumPy"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_slice = train_images[10:100]\n",
"my_slice.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_slice = train_images[10:100, :, :]\n",
"my_slice.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_slice = train_images[10:100, 0:28, 0:28]\n",
"my_slice.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_slice = train_images[:, 14:, 14:]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_slice = train_images[:, 7:-7, 7:-7]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The notion of data batches"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"batch = train_images[:128]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"batch = train_images[128:256]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"n = 3\n",
"batch = train_images[128 * n : 128 * (n + 1)]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Real-world examples of data tensors"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Vector data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Timeseries data or sequence data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Image data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Video data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### The gears of neural networks: Tensor operations"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Element-wise operations"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_relu(x):\n",
"    assert len(x.shape) == 2\n",
"    x = x.copy()\n",
"    for i in range(x.shape[0]):\n",
"        for j in range(x.shape[1]):\n",
"            x[i, j] = max(x[i, j], 0)\n",
"    return x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_add(x, y):\n",
"    assert len(x.shape) == 2\n",
"    assert x.shape == y.shape\n",
"    x = x.copy()\n",
"    for i in range(x.shape[0]):\n",
"        for j in range(x.shape[1]):\n",
"            x[i, j] += y[i, j]\n",
"    return x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import time\n",
"\n",
"x = np.random.random((20, 100))\n",
"y = np.random.random((20, 100))\n",
"\n",
"t0 = time.time()\n",
"for _ in range(1000):\n",
"    z = x + y\n",
"    z = np.maximum(z, 0.0)\n",
"print(\"Took: {0:.2f} s\".format(time.time() - t0))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"t0 = time.time()\n",
"for _ in range(1000):\n",
"    z = naive_add(x, y)\n",
"    z = naive_relu(z)\n",
"print(\"Took: {0:.2f} s\".format(time.time() - t0))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Broadcasting"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"X = np.random.random((32, 10))\n",
"y = np.random.random((10,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"y = np.expand_dims(y, axis=0)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"Y = np.tile(y, (32, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_add_matrix_and_vector(x, y):\n",
"    assert len(x.shape) == 2\n",
"    assert len(y.shape) == 1\n",
"    assert x.shape[1] == y.shape[0]\n",
"    x = x.copy()\n",
"    for i in range(x.shape[0]):\n",
"        for j in range(x.shape[1]):\n",
"            x[i, j] += y[j]\n",
"    return x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"x = np.random.random((64, 3, 32, 10))\n",
"y = np.random.random((32, 10))\n",
"z = np.maximum(x, y)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Tensor product"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.random.random((32,))\n",
"y = np.random.random((32,))\n",
"\n",
"z = np.matmul(x, y)\n",
"z = x @ y"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_vector_product(x, y):\n",
"    assert len(x.shape) == 1\n",
"    assert len(y.shape) == 1\n",
"    assert x.shape[0] == y.shape[0]\n",
"    z = 0.0\n",
"    for i in range(x.shape[0]):\n",
"        z += x[i] * y[i]\n",
"    return z"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_matrix_vector_product(x, y):\n",
"    assert len(x.shape) == 2\n",
"    assert len(y.shape) == 1\n",
"    assert x.shape[1] == y.shape[0]\n",
"    z = np.zeros(x.shape[0])\n",
"    for i in range(x.shape[0]):\n",
"        for j in range(x.shape[1]):\n",
"            z[i] += x[i, j] * y[j]\n",
"    return z"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_matrix_vector_product(x, y):\n",
"    z = np.zeros(x.shape[0])\n",
"    for i in range(x.shape[0]):\n",
"        z[i] = naive_vector_product(x[i, :], y)\n",
"    return z"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def naive_matrix_product(x, y):\n",
"    assert len(x.shape) == 2\n",
"    assert len(y.shape) == 2\n",
"    assert x.shape[1] == y.shape[0]\n",
"    z = np.zeros((x.shape[0], y.shape[1]))\n",
"    for i in range(x.shape[0]):\n",
"        for j in range(y.shape[1]):\n",
"            row_x = x[i, :]\n",
"            column_y = y[:, j]\n",
"            z[i, j] = naive_vector_product(row_x, column_y)\n",
"    return z"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Tensor reshaping"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_images = train_images.reshape((60000, 28 * 28))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.array([[0., 1.],\n",
"              [2., 3.],\n",
"              [4., 5.]])\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = x.reshape((6, 1))\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = x.reshape((2, 3))\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.zeros((300, 20))\n",
"x = np.transpose(x)\n",
"x.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Geometric interpretation of tensor operations"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### A geometric interpretation of deep learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### The engine of neural networks: Gradient-based optimization"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### What's a derivative?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Derivative of a tensor operation: The gradient"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Stochastic gradient descent"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Chaining derivatives: The Backpropagation algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The chain rule"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Automatic differentiation with computation graphs"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Looking back at our first example"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"test_images = test_images.reshape((10000, 28 * 28))\n",
"test_images = test_images.astype(\"float32\") / 255"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
"    [\n",
"        layers.Dense(512, activation=\"relu\"),\n",
"        layers.Dense(10, activation=\"softmax\"),\n",
"    ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
"    optimizer=\"adam\",\n",
"    loss=\"sparse_categorical_crossentropy\",\n",
"    metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.fit(\n",
"    train_images,\n",
"    train_labels,\n",
"    epochs=5,\n",
"    batch_size=128,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Reimplementing our first example from scratch"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A simple Dense class"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import ops\n",
"\n",
"class NaiveDense:\n",
"    def __init__(self, input_size, output_size, activation=None):\n",
"        self.activation = activation\n",
"        self.W = keras.Variable(\n",
"            shape=(input_size, output_size), initializer=\"uniform\"\n",
"        )\n",
"        self.b = keras.Variable(shape=(output_size,), initializer=\"zeros\")\n",
"\n",
"    def __call__(self, inputs):\n",
"        x = ops.matmul(inputs, self.W)\n",
"        x = x + self.b\n",
"        if self.activation is not None:\n",
"            x = self.activation(x)\n",
"        return x\n",
"\n",
"    @property\n",
"    def weights(self):\n",
"        return [self.W, self.b]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A simple Sequential class"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"class NaiveSequential:\n",
"    def __init__(self, layers):\n",
"        self.layers = layers\n",
"\n",
"    def __call__(self, inputs):\n",
"        x = inputs\n",
"        for layer in self.layers:\n",
"            x = layer(x)\n",
"        return x\n",
"\n",
"    @property\n",
"    def weights(self):\n",
"        weights = []\n",
"        for layer in self.layers:\n",
"            weights += layer.weights\n",
"        return weights"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = NaiveSequential(\n",
"    [\n",
"        NaiveDense(input_size=28 * 28, output_size=512, activation=ops.relu),\n",
"        NaiveDense(input_size=512, output_size=10, activation=ops.softmax),\n",
"    ]\n",
")\n",
"assert len(model.weights) == 4"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A batch generator"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import math\n",
"\n",
"class BatchGenerator:\n",
"    def __init__(self, images, labels, batch_size=128):\n",
"        assert len(images) == len(labels)\n",
"        self.index = 0\n",
"        self.images = images\n",
"        self.labels = labels\n",
"        self.batch_size = batch_size\n",
"        self.num_batches = math.ceil(len(images) / batch_size)\n",
"\n",
"    def next(self):\n",
"        images = self.images[self.index : self.index + self.batch_size]\n",
"        labels = self.labels[self.index : self.index + self.batch_size]\n",
"        self.index += self.batch_size\n",
"        return images, labels"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Running one training step"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The weight update step"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"learning_rate = 1e-3\n",
"\n",
"def update_weights(gradients, weights):\n",
"    for g, w in zip(gradients, weights):\n",
"        w.assign(w - g * learning_rate)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import optimizers\n",
"\n",
"optimizer = optimizers.SGD(learning_rate=1e-3)\n",
"\n",
"def update_weights(gradients, weights):\n",
"    optimizer.apply_gradients(zip(gradients, weights))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Gradient computation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
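},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"import tensorflow as tf\n",
"\n",
"# Use a Variable: the tape tracks trainable variables automatically,\n",
"# whereas a plain tensor would need an explicit tape.watch(x).\n",
"x = tf.Variable(0.0)\n",
"with tf.GradientTape() as tape:\n",
"    y = 2 * x + 3\n",
"grad_of_y_wrt_x = tape.gradient(y, x)"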
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"def one_training_step(model, images_batch, labels_batch):\n",
"    with tf.GradientTape() as tape:\n",
"        predictions = model(images_batch)\n",
"        loss = ops.sparse_categorical_crossentropy(labels_batch, predictions)\n",
"        average_loss = ops.mean(loss)\n",
"    gradients = tape.gradient(average_loss, model.weights)\n",
"    update_weights(gradients, model.weights)\n",
"    return average_loss"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The full training loop"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"def fit(model, images, labels, epochs, batch_size=128):\n",
"    for epoch_counter in range(epochs):\n",
"        print(f\"Epoch {epoch_counter}\")\n",
"        # Pass batch_size through so the argument to fit() takes effect.\n",
"        batch_generator = BatchGenerator(images, labels, batch_size)\n",
"        for batch_counter in range(batch_generator.num_batches):\n",
"            images_batch, labels_batch = batch_generator.next()\n",
"            loss = one_training_step(model, images_batch, labels_batch)\n",
"            if batch_counter % 100 == 0:\n",
"                print(f\"loss at batch {batch_counter}: {loss:.2f}\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"from keras.datasets import mnist\n",
"\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
"\n",
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"test_images = test_images.reshape((10000, 28 * 28))\n",
"test_images = test_images.astype(\"float32\") / 255\n",
"\n",
"fit(model, train_images, train_labels, epochs=10, batch_size=128)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Evaluating the model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"predictions = model(test_images)\n",
"predicted_labels = ops.argmax(predictions, axis=1)\n",
"matches = predicted_labels == test_labels\n",
"f\"accuracy: {ops.mean(matches):.2f}\""
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "chapter02_mathematical-building-blocks",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: chapter03_introduction-to-ml-frameworks.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
"    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
"    if current == required:\n",
"        get_ipython().run_cell(cell)\n",
"    else:\n",
"        print(\n",
"            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
"            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
"        )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Introduction to TensorFlow, PyTorch, JAX, and Keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### A brief history of deep learning frameworks"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### How these frameworks relate to each other"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Introduction to TensorFlow"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### First steps with TensorFlow"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensors and variables in TensorFlow"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Constant tensors"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"tf.ones(shape=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"tf.zeros(shape=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"tf.constant([1, 2, 3], dtype=\"float32\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Random tensors"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)\n",
"print(x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Tensor assignment and the Variable class"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"x = np.ones(shape=(2, 2))\n",
"x[0, 0] = 0.0"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))\n",
"print(v)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"v.assign(tf.ones((3, 1)))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"v[0, 0].assign(3.)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"v.assign_add(tf.ones((3, 1)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensor operations: Doing math in TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"a = tf.ones((2, 2))\n",
"b = tf.square(a)\n",
"c = tf.sqrt(a)\n",
"d = b + c\n",
"e = tf.matmul(a, b)\n",
"f = tf.concat((a, b), axis=0)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def dense(inputs, W, b):\n",
"    return tf.nn.relu(tf.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Gradients in TensorFlow: A second look at the GradientTape API"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_var = tf.Variable(initial_value=3.0)\n",
"with tf.GradientTape() as tape:\n",
"    result = tf.square(input_var)\n",
"gradient = tape.gradient(result, input_var)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_const = tf.constant(3.0)\n",
"with tf.GradientTape() as tape:\n",
"    tape.watch(input_const)\n",
"    result = tf.square(input_const)\n",
"gradient = tape.gradient(result, input_const)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"time = tf.Variable(0.0)\n",
"with tf.GradientTape() as outer_tape:\n",
"    with tf.GradientTape() as inner_tape:\n",
"        position = 4.9 * time**2\n",
"    speed = inner_tape.gradient(position, time)\n",
"acceleration = outer_tape.gradient(speed, time)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Making TensorFlow functions fast using compilation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"@tf.function\n",
"def dense(inputs, W, b):\n",
"    return tf.nn.relu(tf.matmul(inputs, W) + b)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"@tf.function(jit_compile=True)\n",
"def dense(inputs, W, b):\n",
"    return tf.nn.relu(tf.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### An end-to-end example: A linear classifier in pure TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"num_samples_per_class = 1000\n",
"negative_samples = np.random.multivariate_normal(\n",
"    mean=[0, 3], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n",
")\n",
"positive_samples = np.random.multivariate_normal(\n",
"    mean=[3, 0], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"targets = np.vstack(\n",
"    (\n",
"        np.zeros((num_samples_per_class, 1), dtype=\"float32\"),\n",
"        np.ones((num_samples_per_class, 1), dtype=\"float32\"),\n",
"    )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_dim = 2\n",
"output_dim = 1\n",
"W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))\n",
"b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def model(inputs, W, b):\n",
"    return tf.matmul(inputs, W) + b"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def mean_squared_error(targets, predictions):\n",
"    per_sample_losses = tf.square(targets - predictions)\n",
"    return tf.reduce_mean(per_sample_losses)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"learning_rate = 0.1\n",
"\n",
"@tf.function(jit_compile=True)\n",
"def training_step(inputs, targets, W, b):\n",
"    with tf.GradientTape() as tape:\n",
"        predictions = model(inputs, W, b)\n",
"        # Argument order matches the signature: (targets, predictions).\n",
"        loss = mean_squared_error(targets, predictions)\n",
"    grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])\n",
"    W.assign_sub(grad_loss_wrt_W * learning_rate)\n",
"    b.assign_sub(grad_loss_wrt_b * learning_rate)\n",
"    return loss"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"for step in range(40):\n",
"    loss = training_step(inputs, targets, W, b)\n",
"    print(f\"Loss at step {step}: {loss:.4f}\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions = model(inputs, W, b)\n",
"plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = np.linspace(-1, 4, 100)\n",
"y = -W[0] / W[1] * x + (0.5 - b) / W[1]\n",
"plt.plot(x, y, \"-r\")\n",
"plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### What makes the TensorFlow approach unique"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Introduction to PyTorch"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### First steps with PyTorch"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensors and parameters in PyTorch"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Constant tensors"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import torch\n",
"torch.ones(size=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"torch.zeros(size=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"torch.tensor([1, 2, 3], dtype=torch.float32)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Random tensors"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"torch.normal(\n",
"    mean=torch.zeros(size=(3, 1)),\n",
"    std=torch.ones(size=(3, 1)),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"torch.rand(3, 1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Tensor assignment and the Parameter class"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = torch.zeros(size=(2, 1))\n",
"x[0, 0] = 1.\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = torch.zeros(size=(2, 1))\n",
"p = torch.nn.parameter.Parameter(data=x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensor operations: Doing math in PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"a = torch.ones((2, 2))\n",
"b = torch.square(a)\n",
"c = torch.sqrt(a)\n",
"d = b + c\n",
"e = torch.matmul(a, b)\n",
"f = torch.cat((a, b), dim=0)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def dense(inputs, W, b):\n",
"    # ReLU is exposed as torch.relu (or torch.nn.functional.relu); torch.nn.relu does not exist.\n",
"    return torch.relu(torch.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Computing gradients with PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_var = torch.tensor(3.0, requires_grad=True)\n",
"result = torch.square(input_var)\n",
"result.backward()\n",
"gradient = input_var.grad\n",
"gradient"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"result = torch.square(input_var)\n",
"result.backward()\n",
"input_var.grad"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_var.grad = None"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### An end-to-end example: A linear classifier in pure PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_dim = 2\n",
"output_dim = 1\n",
"\n",
"W = torch.rand(input_dim, output_dim, requires_grad=True)\n",
"b = torch.zeros(output_dim, requires_grad=True)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def model(inputs, W, b):\n",
"    return torch.matmul(inputs, W) + b"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def mean_squared_error(targets, predictions):\n",
"    per_sample_losses = torch.square(targets - predictions)\n",
"    return torch.mean(per_sample_losses)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"learning_rate = 0.1\n",
"\n",
"def training_step(inputs, targets, W, b):\n",
"    # The functional model defined above takes the weights explicitly.\n",
"    predictions = model(inputs, W, b)\n",
"    loss = mean_squared_error(targets, predictions)\n",
"    loss.backward()\n",
"    grad_loss_wrt_W, grad_loss_wrt_b = W.grad, b.grad\n",
"    # Update in place without recording the update in the autograd graph.\n",
"    with torch.no_grad():\n",
"        W -= grad_loss_wrt_W * learning_rate\n",
"        b -= grad_loss_wrt_b * learning_rate\n",
"    # Reset gradients so they do not accumulate across steps.\n",
"    W.grad = None\n",
"    b.grad = None\n",
"    return loss"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Packaging state and computation with the Module class"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"class LinearModel(torch.nn.Module):\n",
"    def __init__(self):\n",
"        super().__init__()\n",
"        self.W = torch.nn.Parameter(torch.rand(input_dim, output_dim))\n",
"        self.b = torch.nn.Parameter(torch.zeros(output_dim))\n",
"\n",
"    def forward(self, inputs):\n",
"        return torch.matmul(inputs, self.W) + self.b"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = LinearModel()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"torch_inputs = torch.tensor(inputs)\n",
"output = model(torch_inputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def training_step(inputs, targets):\n",
"    predictions = model(inputs)\n",
"    loss = mean_squared_error(targets, predictions)\n",
"    loss.backward()\n",
"    optimizer.step()\n",
"    model.zero_grad()\n",
"    return loss"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Making PyTorch modules fast using compilation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"compiled_model = torch.compile(model)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"@torch.compile\n",
"def dense(inputs, W, b):\n",
"    # torch.nn.relu does not exist; use torch.relu instead.\n",
"    return torch.relu(torch.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### What makes the PyTorch approach unique"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Introduction to JAX"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### First steps with JAX"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Tensors in JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from jax import numpy as jnp\n",
"jnp.ones(shape=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"jnp.zeros(shape=(2, 1))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"jnp.array([1, 2, 3], dtype=\"float32\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Random number generation in JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"np.random.normal(size=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"np.random.normal(size=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def apply_noise(x, seed):\n",
"    np.random.seed(seed)\n",
"    # Pass the shape via size=; a positional (3,) would be read as the mean (loc).\n",
"    x = x * np.random.normal(size=(3,))\n",
"    return x\n",
"\n",
"seed = 1337\n",
"y = apply_noise(x, seed)\n",
"seed += 1\n",
"z = apply_noise(x, seed)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import jax\n",
"\n",
"seed_key = jax.random.key(1337)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"seed_key = jax.random.key(0)\n",
"jax.random.normal(seed_key, shape=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"seed_key = jax.random.key(123)\n",
"jax.random.normal(seed_key, shape=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"jax.random.normal(seed_key, shape=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"seed_key = jax.random.key(123)\n",
"jax.random.normal(seed_key, shape=(3,))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"new_seed_key = jax.random.split(seed_key, num=1)[0]\n",
"jax.random.normal(new_seed_key, shape=(3,))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensor assignment"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x = jnp.array([1, 2, 3], dtype=\"float32\")\n",
"new_x = x.at[0].set(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Tensor operations: Doing math in JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"a = jnp.ones((2, 2))\n",
"b = jnp.square(a)\n",
"c = jnp.sqrt(a)\n",
"d = b + c\n",
"e = jnp.matmul(a, b)\n",
"e *= d"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def dense(inputs, W, b):\n",
"    return jax.nn.relu(jnp.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Computing gradients with JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def compute_loss(input_var):\n",
"    return jnp.square(input_var)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"grad_fn = jax.grad(compute_loss)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_var = jnp.array(3.0)\n",
"grad_of_loss_wrt_input_var = grad_fn(input_var)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### JAX gradient-computation best practices"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Returning the loss value"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"grad_fn = jax.value_and_grad(compute_loss)\n",
"output, grad_of_loss_wrt_input_var = grad_fn(input_var)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Getting gradients for a complex function"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Returning auxiliary outputs"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Making JAX functions fast with @jax.jit"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"@jax.jit\n",
"def dense(inputs, W, b):\n",
"    return jax.nn.relu(jnp.matmul(inputs, W) + b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### An end-to-end example: A linear classifier in pure JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def model(inputs, W, b):\n",
"    return jnp.matmul(inputs, W) + b\n",
"\n",
"def mean_squared_error(targets, predictions):\n",
"    per_sample_losses = jnp.square(targets - predictions)\n",
"    return jnp.mean(per_sample_losses)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def compute_loss(state, inputs, targets):\n",
"    W, b = state\n",
"    predictions = model(inputs, W, b)\n",
"    loss = mean_squared_error(targets, predictions)\n",
"    return loss"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"grad_fn = jax.value_and_grad(compute_loss)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"learning_rate = 0.1\n",
"\n",
"@jax.jit\n",
"def training_step(inputs, targets, W, b):\n",
"    loss, grads = grad_fn((W, b), inputs, targets)\n",
"    grad_wrt_W, grad_wrt_b = grads\n",
"    W = W - grad_wrt_W * learning_rate\n",
"    b = b - grad_wrt_b * learning_rate\n",
"    return loss, W, b"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"input_dim = 2\n",
"output_dim = 1\n",
"\n",
"W = jax.numpy.array(np.random.uniform(size=(input_dim, output_dim)))\n",
"b = jax.numpy.array(np.zeros(shape=(output_dim,)))\n",
"state = (W, b)\n",
"for step in range(40):\n",
"    loss, W, b = training_step(inputs, targets, W, b)\n",
"    print(f\"Loss at step {step}: {loss:.4f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### What makes the JAX approach unique"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Introduction to Keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### First steps with Keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Picking a backend framework"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\"\n",
"\n",
"import keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Layers: The building blocks of deep learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The base `Layer` class in Keras"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"\n",
"class SimpleDense(keras.Layer):\n",
"    def __init__(self, units, activation=None):\n",
"        super().__init__()\n",
"        self.units = units\n",
"        self.activation = activation\n",
"\n",
"    def build(self, input_shape):\n",
"        batch_dim, input_dim = input_shape\n",
"        self.W = self.add_weight(\n",
"            shape=(input_dim, self.units), initializer=\"random_normal\"\n",
"        )\n",
"        self.b = self.add_weight(shape=(self.units,), initializer=\"zeros\")\n",
"\n",
"    def call(self, inputs):\n",
"        y = keras.ops.matmul(inputs, self.W) + self.b\n",
"        if self.activation is not None:\n",
"            y = self.activation(y)\n",
"        return y"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"my_dense = SimpleDense(units=32, activation=keras.ops.relu)\n",
"input_tensor = keras.ops.ones(shape=(2, 784))\n",
"output_tensor = my_dense(input_tensor)\n",
"print(output_tensor.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Automatic shape inference: Building layers on the fly"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import layers\n",
"\n",
"layer = layers.Dense(32, activation=\"relu\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import models\n",
"from keras import layers\n",
"\n",
"model = models.Sequential(\n",
"    [\n",
"        layers.Dense(32, activation=\"relu\"),\n",
"        layers.Dense(32),\n",
"    ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"# SimpleDense calls its activation directly, so pass callables rather than string names.\n",
"model = keras.Sequential(\n",
"    [\n",
"        SimpleDense(32, activation=keras.ops.relu),\n",
"        SimpleDense(64, activation=keras.ops.relu),\n",
"        SimpleDense(32, activation=keras.ops.relu),\n",
"        SimpleDense(10, activation=keras.ops.softmax),\n",
"    ]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### From layers to models"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The \"compile\" step: Configuring the learning process"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential([keras.layers.Dense(1)])\n",
"model.compile(\n",
"    optimizer=\"rmsprop\",\n",
"    loss=\"mean_squared_error\",\n",
"    metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
"    optimizer=keras.optimizers.RMSprop(),\n",
"    loss=keras.losses.MeanSquaredError(),\n",
"    metrics=[keras.metrics.BinaryAccuracy()],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Picking a loss function"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Understanding the fit method"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history = model.fit(\n",
"    inputs,\n",
"    targets,\n",
"    epochs=5,\n",
"    batch_size=128,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history.history"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Monitoring loss and metrics on validation data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential([keras.layers.Dense(1)])\n",
"model.compile(\n",
"    optimizer=keras.optimizers.RMSprop(learning_rate=0.1),\n",
"    loss=keras.losses.MeanSquaredError(),\n",
"    metrics=[keras.metrics.BinaryAccuracy()],\n",
")\n",
"\n",
"indices_permutation = np.random.permutation(len(inputs))\n",
"shuffled_inputs = inputs[indices_permutation]\n",
"shuffled_targets = targets[indices_permutation]\n",
"\n",
"num_validation_samples = int(0.3 * len(inputs))\n",
"val_inputs = shuffled_inputs[:num_validation_samples]\n",
"val_targets = shuffled_targets[:num_validation_samples]\n",
"training_inputs = shuffled_inputs[num_validation_samples:]\n",
"training_targets = shuffled_targets[num_validation_samples:]\n",
"model.fit(\n",
"    training_inputs,\n",
"    training_targets,\n",
"    epochs=5,\n",
"    batch_size=16,\n",
"    validation_data=(val_inputs, val_targets),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Inference: Using a model after training"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions = model.predict(val_inputs, batch_size=128)\n",
"print(predictions[:10])"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "chapter03_introduction-to-ml-frameworks",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: chapter04_classification-and-regression.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
"    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
"    if current == required:\n",
"        get_ipython().run_cell(cell)\n",
"    else:\n",
"        print(\n",
"            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
"            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
"        )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Classification and regression"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Classifying movie reviews: A binary classification example"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The IMDb dataset"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import imdb\n",
"\n",
"(train_data, train_labels), (test_data, test_labels) = imdb.load_data(\n",
" num_words=10000\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_data[0]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_labels[0]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"max([max(sequence) for sequence in train_data])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"word_index = imdb.get_word_index()\n",
"reverse_word_index = {value: key for key, value in word_index.items()}\n",
"# Indices are offset by 3: 0, 1, and 2 are reserved for\n",
"# \"padding\", \"start of sequence\", and \"unknown\".\n",
"decoded_review = \" \".join(\n",
"    [reverse_word_index.get(i - 3, \"?\") for i in train_data[0]]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"decoded_review[:100]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Preparing the data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def multi_hot_encode(sequences, num_classes):\n",
"    results = np.zeros((len(sequences), num_classes))\n",
"    for i, sequence in enumerate(sequences):\n",
"        results[i][sequence] = 1.0\n",
"    return results\n",
"\n",
"x_train = multi_hot_encode(train_data, num_classes=10000)\n",
"x_test = multi_hot_encode(test_data, num_classes=10000)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x_train[0]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"y_train = train_labels.astype(\"float32\")\n",
"y_test = test_labels.astype(\"float32\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Building your model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Validating your approach"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x_val = x_train[:10000]\n",
"partial_x_train = x_train[10000:]\n",
"y_val = y_train[:10000]\n",
"partial_y_train = y_train[10000:]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history = model.fit(\n",
" partial_x_train,\n",
" partial_y_train,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_data=(x_val, y_val),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history = model.fit(\n",
" x_train,\n",
" y_train,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.2,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history_dict = history.history\n",
"history_dict.keys()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"history_dict = history.history\n",
"loss_values = history_dict[\"loss\"]\n",
"val_loss_values = history_dict[\"val_loss\"]\n",
"epochs = range(1, len(loss_values) + 1)\n",
"plt.plot(epochs, loss_values, \"r--\", label=\"Training loss\")\n",
"plt.plot(epochs, val_loss_values, \"b\", label=\"Validation loss\")\n",
"plt.title(\"[IMDB] Training and validation loss\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"plt.clf()\n",
"acc = history_dict[\"accuracy\"]\n",
"val_acc = history_dict[\"val_accuracy\"]\n",
"plt.plot(epochs, acc, \"r--\", label=\"Training acc\")\n",
"plt.plot(epochs, val_acc, \"b\", label=\"Validation acc\")\n",
"plt.title(\"[IMDB] Training and validation accuracy\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Accuracy\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(x_train, y_train, epochs=4, batch_size=512)\n",
"results = model.evaluate(x_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"results"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using a trained model to generate predictions on new data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.predict(x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Further experiments"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Wrapping up"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Classifying newswires: A multiclass classification example"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The Reuters dataset"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import reuters\n",
"\n",
"(train_data, train_labels), (test_data, test_labels) = reuters.load_data(\n",
" num_words=10000\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"len(train_data)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"len(test_data)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_data[10]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"word_index = reuters.get_word_index()\n",
"reverse_word_index = {value: key for key, value in word_index.items()}\n",
"# Indices are offset by 3: 0, 1, and 2 are reserved for\n",
"# \"padding\", \"start of sequence\", and \"unknown\".\n",
"decoded_newswire = \" \".join(\n",
"    [reverse_word_index.get(i - 3, \"?\") for i in train_data[10]]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_labels[10]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Preparing the data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x_train = multi_hot_encode(train_data, num_classes=10000)\n",
"x_test = multi_hot_encode(test_data, num_classes=10000)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def one_hot_encode(labels, num_classes=46):\n",
"    results = np.zeros((len(labels), num_classes))\n",
"    for i, label in enumerate(labels):\n",
"        results[i, label] = 1.0\n",
"    return results\n",
"\n",
"y_train = one_hot_encode(train_labels)\n",
"y_test = one_hot_encode(test_labels)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.utils import to_categorical\n",
"\n",
"y_train = to_categorical(train_labels)\n",
"y_test = to_categorical(test_labels)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Building your model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(46, activation=\"softmax\"),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"top_3_accuracy = keras.metrics.TopKCategoricalAccuracy(\n",
" k=3, name=\"top_3_accuracy\"\n",
")\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"categorical_crossentropy\",\n",
" metrics=[\"accuracy\", top_3_accuracy],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Validating your approach"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"x_val = x_train[:1000]\n",
"partial_x_train = x_train[1000:]\n",
"y_val = y_train[:1000]\n",
"partial_y_train = y_train[1000:]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"history = model.fit(\n",
" partial_x_train,\n",
" partial_y_train,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_data=(x_val, y_val),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"loss = history.history[\"loss\"]\n",
"val_loss = history.history[\"val_loss\"]\n",
"epochs = range(1, len(loss) + 1)\n",
"plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
"plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
"plt.title(\"Training and validation loss\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"plt.clf()\n",
"acc = history.history[\"accuracy\"]\n",
"val_acc = history.history[\"val_accuracy\"]\n",
"plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n",
"plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n",
"plt.title(\"Training and validation accuracy\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Accuracy\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"plt.clf()\n",
"acc = history.history[\"top_3_accuracy\"]\n",
"val_acc = history.history[\"val_top_3_accuracy\"]\n",
"plt.plot(epochs, acc, \"r--\", label=\"Training top-3 accuracy\")\n",
"plt.plot(epochs, val_acc, \"b\", label=\"Validation top-3 accuracy\")\n",
"plt.title(\"Training and validation top-3 accuracy\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Top-3 accuracy\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(46, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" x_train,\n",
" y_train,\n",
" epochs=9,\n",
" batch_size=512,\n",
")\n",
"results = model.evaluate(x_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"results"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import copy\n",
"test_labels_copy = copy.copy(test_labels)\n",
"np.random.shuffle(test_labels_copy)\n",
"hits_array = np.array(test_labels == test_labels_copy)\n",
"hits_array.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Generating predictions on new data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions = model.predict(x_test)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions[0].shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"np.sum(predictions[0])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"np.argmax(predictions[0])"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### A different way to handle the labels and the loss"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"y_train = train_labels\n",
"y_test = test_labels"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The importance of having sufficiently large intermediate layers"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(4, activation=\"relu\"),\n",
" layers.Dense(46, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" partial_x_train,\n",
" partial_y_train,\n",
" epochs=20,\n",
" batch_size=128,\n",
" validation_data=(x_val, y_val),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Further experiments"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Wrapping up"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Predicting house prices: A regression example"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The California Housing Price dataset"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import california_housing\n",
"\n",
"(train_data, train_targets), (test_data, test_targets) = (\n",
" california_housing.load_data(version=\"small\")\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_data.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_data.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_targets"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Preparing the data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"mean = train_data.mean(axis=0)\n",
"std = train_data.std(axis=0)\n",
"x_train = (train_data - mean) / std\n",
"x_test = (test_data - mean) / std"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"y_train = train_targets / 100000\n",
"y_test = test_targets / 100000"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Building your model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def get_model():\n",
"    model = keras.Sequential(\n",
"        [\n",
"            layers.Dense(64, activation=\"relu\"),\n",
"            layers.Dense(64, activation=\"relu\"),\n",
"            layers.Dense(1),\n",
"        ]\n",
"    )\n",
"    model.compile(\n",
"        optimizer=\"adam\",\n",
"        loss=\"mean_squared_error\",\n",
"        metrics=[\"mean_absolute_error\"],\n",
"    )\n",
"    return model"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Validating your approach using K-fold validation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"k = 4\n",
"num_val_samples = len(x_train) // k\n",
"num_epochs = 50\n",
"all_scores = []\n",
"for i in range(k):\n",
"    print(f\"Processing fold #{i + 1}\")\n",
"    fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
"    fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
"    fold_x_train = np.concatenate(\n",
"        [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n",
"        axis=0,\n",
"    )\n",
"    fold_y_train = np.concatenate(\n",
"        [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n",
"        axis=0,\n",
"    )\n",
"    model = get_model()\n",
"    model.fit(\n",
"        fold_x_train,\n",
"        fold_y_train,\n",
"        epochs=num_epochs,\n",
"        batch_size=16,\n",
"        verbose=0,\n",
"    )\n",
"    scores = model.evaluate(fold_x_val, fold_y_val, verbose=0)\n",
"    val_loss, val_mae = scores\n",
"    all_scores.append(val_mae)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"[round(value, 3) for value in all_scores]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"round(np.mean(all_scores), 3)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"k = 4\n",
"num_val_samples = len(x_train) // k\n",
"num_epochs = 200\n",
"all_mae_histories = []\n",
"for i in range(k):\n",
"    print(f\"Processing fold #{i + 1}\")\n",
"    fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
"    fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
"    fold_x_train = np.concatenate(\n",
"        [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n",
"        axis=0,\n",
"    )\n",
"    fold_y_train = np.concatenate(\n",
"        [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n",
"        axis=0,\n",
"    )\n",
"    model = get_model()\n",
"    history = model.fit(\n",
"        fold_x_train,\n",
"        fold_y_train,\n",
"        validation_data=(fold_x_val, fold_y_val),\n",
"        epochs=num_epochs,\n",
"        batch_size=16,\n",
"        verbose=0,\n",
"    )\n",
"    mae_history = history.history[\"val_mean_absolute_error\"]\n",
"    all_mae_histories.append(mae_history)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"average_mae_history = [\n",
" np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"epochs = range(1, len(average_mae_history) + 1)\n",
"plt.plot(epochs, average_mae_history)\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Validation MAE\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"# Drop the first 10 epochs; the remaining values start at epoch 11.\n",
"truncated_mae_history = average_mae_history[10:]\n",
"epochs = range(11, len(truncated_mae_history) + 11)\n",
"plt.plot(epochs, truncated_mae_history)\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Validation MAE\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = get_model()\n",
"model.fit(x_train, y_train, epochs=130, batch_size=16, verbose=0)\n",
"test_mean_squared_error, test_mean_absolute_error = model.evaluate(\n",
" x_test, y_test\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"round(test_mean_absolute_error, 3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Generating predictions on new data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"predictions = model.predict(x_test)\n",
"predictions[0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Wrapping up"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "chapter04_classification-and-regression",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: chapter05_fundamentals-of-ml.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
" current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
" if current == required:\n",
" get_ipython().run_cell(cell)\n",
" else:\n",
" print(\n",
" f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
" f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Fundamentals of machine learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Generalization: The goal of machine learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Underfitting and overfitting"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Noisy training data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Ambiguous features"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Rare features and spurious correlations"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"import numpy as np\n",
"\n",
"(train_images, train_labels), _ = mnist.load_data()\n",
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"\n",
"train_images_with_noise_channels = np.concatenate(\n",
" [train_images, np.random.random((len(train_images), 784))], axis=1\n",
")\n",
"\n",
"train_images_with_zeros_channels = np.concatenate(\n",
" [train_images, np.zeros((len(train_images), 784))], axis=1\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"def get_model():\n",
" model = keras.Sequential(\n",
" [\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
" )\n",
" model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
" )\n",
" return model\n",
"\n",
"model = get_model()\n",
"history_noise = model.fit(\n",
" train_images_with_noise_channels,\n",
" train_labels,\n",
" epochs=10,\n",
" batch_size=128,\n",
" validation_split=0.2,\n",
")\n",
"\n",
"model = get_model()\n",
"history_zeros = model.fit(\n",
" train_images_with_zeros_channels,\n",
" train_labels,\n",
" epochs=10,\n",
" batch_size=128,\n",
" validation_split=0.2,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"val_acc_noise = history_noise.history[\"val_accuracy\"]\n",
"val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n",
"epochs = range(1, 11)\n",
"plt.plot(\n",
" epochs,\n",
" val_acc_noise,\n",
" \"b-\",\n",
" label=\"Validation accuracy with noise channels\",\n",
")\n",
"plt.plot(\n",
" epochs,\n",
" val_acc_zeros,\n",
" \"r--\",\n",
" label=\"Validation accuracy with zeros channels\",\n",
")\n",
"plt.title(\"Effect of noise channels on validation accuracy\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.xticks(epochs)\n",
"plt.ylabel(\"Accuracy\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The nature of generalization in deep learning"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"(train_images, train_labels), _ = mnist.load_data()\n",
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"\n",
"random_train_labels = train_labels[:]\n",
"np.random.shuffle(random_train_labels)\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images,\n",
" random_train_labels,\n",
" epochs=100,\n",
" batch_size=128,\n",
" validation_split=0.2,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The manifold hypothesis"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Interpolation as a source of generalization"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Why deep learning works"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Training data is paramount"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Evaluating machine-learning models"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Training, validation, and test sets"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Simple hold-out validation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### K-fold validation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Iterated K-fold validation with shuffling"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Beating a common-sense baseline"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Things to keep in mind about model evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Improving model fit"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Tuning key gradient descent parameters"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"(train_images, train_labels), _ = mnist.load_data()\n",
"train_images = train_images.reshape((60000, 28 * 28))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=keras.optimizers.RMSprop(learning_rate=1.0),\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=keras.optimizers.RMSprop(learning_rate=1e-2),\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using better architecture priors"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Increasing model capacity"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_small_model = model.fit(\n",
" train_images, train_labels, epochs=20, batch_size=128, validation_split=0.2\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"val_loss = history_small_model.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
"plt.title(\"Validation loss for a model with insufficient capacity\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(128, activation=\"relu\"),\n",
" layers.Dense(128, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_large_model = model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=128,\n",
" validation_split=0.2,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"val_loss = history_large_model.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
"plt.title(\"Validation loss for a model with appropriate capacity\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(2048, activation=\"relu\"),\n",
" layers.Dense(2048, activation=\"relu\"),\n",
" layers.Dense(2048, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_very_large_model = model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=32,\n",
" validation_split=0.2,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"val_loss = history_very_large_model.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
"plt.title(\"Validation loss for a model with too much capacity\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Improving generalization"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Dataset curation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Feature engineering"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using early stopping"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Regularizing your model"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Reducing the network's size"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import imdb\n",
"\n",
"(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n",
"\n",
"def vectorize_sequences(sequences, dimension=10000):\n",
"    results = np.zeros((len(sequences), dimension))\n",
"    for i, sequence in enumerate(sequences):\n",
"        results[i, sequence] = 1.0\n",
"    return results\n",
"\n",
"train_data = vectorize_sequences(train_data)\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_original = model.fit(\n",
" train_data,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(4, activation=\"relu\"),\n",
" layers.Dense(4, activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_smaller_model = model.fit(\n",
" train_data,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"original_val_loss = history_original.history[\"val_loss\"]\n",
"smaller_model_val_loss = history_smaller_model.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(\n",
" epochs,\n",
" original_val_loss,\n",
" \"r--\",\n",
" label=\"Validation loss of original model\",\n",
")\n",
"plt.plot(\n",
" epochs,\n",
" smaller_model_val_loss,\n",
" \"b-\",\n",
" label=\"Validation loss of smaller model\",\n",
")\n",
"plt.title(\"Original model vs. smaller model (IMDB review classification)\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.xticks(epochs)\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(512, activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_larger_model = model.fit(\n",
" train_data,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"original_val_loss = history_original.history[\"val_loss\"]\n",
"larger_model_val_loss = history_larger_model.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(\n",
" epochs,\n",
" original_val_loss,\n",
" \"r--\",\n",
" label=\"Validation loss of original model\",\n",
")\n",
"plt.plot(\n",
" epochs,\n",
" larger_model_val_loss,\n",
" \"b-\",\n",
" label=\"Validation loss of larger model\",\n",
")\n",
"plt.title(\"Original model vs. larger model (IMDB review classification)\")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.xticks(epochs)\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Adding weight regularization"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.regularizers import l2\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n",
" layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_l2_reg = model.fit(\n",
" train_data,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"original_val_loss = history_original.history[\"val_loss\"]\n",
"l2_val_loss = history_l2_reg.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(\n",
" epochs,\n",
" original_val_loss,\n",
" \"r--\",\n",
" label=\"Validation loss of original model\",\n",
")\n",
"plt.plot(\n",
" epochs,\n",
" l2_val_loss,\n",
" \"b-\",\n",
" label=\"Validation loss of L2-regularized model\",\n",
")\n",
"plt.title(\n",
" \"Original model vs. L2-regularized model (IMDB review classification)\"\n",
")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.xticks(epochs)\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import regularizers\n",
"\n",
"regularizers.l1(0.001)\n",
"regularizers.l1_l2(l1=0.001, l2=0.001)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Adding dropout"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dropout(0.5),\n",
" layers.Dense(16, activation=\"relu\"),\n",
" layers.Dropout(0.5),\n",
" layers.Dense(1, activation=\"sigmoid\"),\n",
" ]\n",
")\n",
"model.compile(\n",
" optimizer=\"rmsprop\",\n",
" loss=\"binary_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"history_dropout = model.fit(\n",
" train_data,\n",
" train_labels,\n",
" epochs=20,\n",
" batch_size=512,\n",
" validation_split=0.4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"original_val_loss = history_original.history[\"val_loss\"]\n",
"dropout_val_loss = history_dropout.history[\"val_loss\"]\n",
"epochs = range(1, 21)\n",
"plt.plot(\n",
" epochs,\n",
" original_val_loss,\n",
" \"r--\",\n",
" label=\"Validation loss of original model\",\n",
")\n",
"plt.plot(\n",
" epochs,\n",
" dropout_val_loss,\n",
" \"b-\",\n",
" label=\"Validation loss of dropout-regularized model\",\n",
")\n",
"plt.title(\n",
" \"Original model vs. dropout-regularized model (IMDB review classification)\"\n",
")\n",
"plt.xlabel(\"Epochs\")\n",
"plt.ylabel(\"Loss\")\n",
"plt.xticks(epochs)\n",
"plt.legend()\n",
"plt.show()"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "chapter05_fundamentals-of-ml",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: chapter07_deep-dive-keras.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
"    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
"    if current == required:\n",
"        get_ipython().run_cell(cell)\n",
"    else:\n",
"        print(\n",
"            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
"            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
"        )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## A deep dive on Keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### A spectrum of workflows"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Different ways to build Keras models"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The Sequential model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"model = keras.Sequential(\n",
" [\n",
" layers.Dense(64, activation=\"relu\"),\n",
" layers.Dense(10, activation=\"softmax\"),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential()\n",
"model.add(layers.Dense(64, activation=\"relu\"))\n",
"model.add(layers.Dense(10, activation=\"softmax\"))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.weights"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.build(input_shape=(None, 3))\n",
"model.weights"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.summary(line_length=80)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential(name=\"my_example_model\")\n",
"model.add(layers.Dense(64, activation=\"relu\", name=\"my_first_layer\"))\n",
"model.add(layers.Dense(10, activation=\"softmax\", name=\"my_last_layer\"))\n",
"model.build((None, 3))\n",
"model.summary(line_length=80)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential()\n",
"model.add(keras.Input(shape=(3,)))\n",
"model.add(layers.Dense(64, activation=\"relu\"))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.summary(line_length=80)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.add(layers.Dense(10, activation=\"softmax\"))\n",
"model.summary(line_length=80)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The Functional API"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A simple example"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(3,), name=\"my_input\")\n",
"features = layers.Dense(64, activation=\"relu\")(inputs)\n",
"outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
"model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(3,), name=\"my_input\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs.dtype"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"features = layers.Dense(64, activation=\"relu\")(inputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"features.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
"model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.summary(line_length=80)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Multi-input, multi-output models"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"vocabulary_size = 10000\n",
"num_tags = 100\n",
"num_departments = 4\n",
"\n",
"title = keras.Input(shape=(vocabulary_size,), name=\"title\")\n",
"text_body = keras.Input(shape=(vocabulary_size,), name=\"text_body\")\n",
"tags = keras.Input(shape=(num_tags,), name=\"tags\")\n",
"\n",
"features = layers.Concatenate()([title, text_body, tags])\n",
"features = layers.Dense(64, activation=\"relu\", name=\"dense_features\")(features)\n",
"\n",
"priority = layers.Dense(1, activation=\"sigmoid\", name=\"priority\")(features)\n",
"department = layers.Dense(\n",
" num_departments, activation=\"softmax\", name=\"department\"\n",
")(features)\n",
"\n",
"model = keras.Model(\n",
" inputs=[title, text_body, tags],\n",
" outputs=[priority, department],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Training a multi-input, multi-output model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"num_samples = 1280\n",
"\n",
"title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n",
"text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n",
"tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))\n",
"\n",
"priority_data = np.random.random(size=(num_samples, 1))\n",
"department_data = np.random.randint(0, num_departments, size=(num_samples, 1))\n",
"\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n",
" metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n",
")\n",
"model.fit(\n",
" [title_data, text_body_data, tags_data],\n",
" [priority_data, department_data],\n",
" epochs=1,\n",
")\n",
"model.evaluate(\n",
" [title_data, text_body_data, tags_data], [priority_data, department_data]\n",
")\n",
"priority_preds, department_preds = model.predict(\n",
" [title_data, text_body_data, tags_data]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss={\n",
" \"priority\": \"mean_squared_error\",\n",
" \"department\": \"sparse_categorical_crossentropy\",\n",
" },\n",
" metrics={\n",
" \"priority\": [\"mean_absolute_error\"],\n",
" \"department\": [\"accuracy\"],\n",
" },\n",
")\n",
"model.fit(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
" {\"priority\": priority_data, \"department\": department_data},\n",
" epochs=1,\n",
")\n",
"model.evaluate(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
" {\"priority\": priority_data, \"department\": department_data},\n",
")\n",
"priority_preds, department_preds = model.predict(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The power of the Functional API: Access to layer connectivity"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Plotting layer connectivity"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"keras.utils.plot_model(model, \"ticket_classifier.png\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"keras.utils.plot_model(\n",
" model,\n",
" \"ticket_classifier_with_shape_info.png\",\n",
" show_shapes=True,\n",
" show_layer_names=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"###### Feature extraction with a Functional model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.layers"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.layers[3].input"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.layers[3].output"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"features = model.layers[4].output\n",
"difficulty = layers.Dense(3, activation=\"softmax\", name=\"difficulty\")(features)\n",
"\n",
"new_model = keras.Model(\n",
" inputs=[title, text_body, tags], outputs=[priority, department, difficulty]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"keras.utils.plot_model(\n",
" new_model,\n",
" \"updated_ticket_classifier.png\",\n",
" show_shapes=True,\n",
" show_layer_names=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Subclassing the Model class"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Rewriting our previous example as a subclassed model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"class CustomerTicketModel(keras.Model):\n",
"    def __init__(self, num_departments):\n",
"        super().__init__()\n",
"        self.concat_layer = layers.Concatenate()\n",
"        self.mixing_layer = layers.Dense(64, activation=\"relu\")\n",
"        self.priority_scorer = layers.Dense(1, activation=\"sigmoid\")\n",
"        self.department_classifier = layers.Dense(\n",
"            num_departments, activation=\"softmax\"\n",
"        )\n",
"\n",
"    def call(self, inputs):\n",
"        title = inputs[\"title\"]\n",
"        text_body = inputs[\"text_body\"]\n",
"        tags = inputs[\"tags\"]\n",
"\n",
"        features = self.concat_layer([title, text_body, tags])\n",
"        features = self.mixing_layer(features)\n",
"        priority = self.priority_scorer(features)\n",
"        department = self.department_classifier(features)\n",
"        return priority, department"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = CustomerTicketModel(num_departments=4)\n",
"\n",
"priority, department = model(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n",
" metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n",
")\n",
"model.fit(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
" [priority_data, department_data],\n",
" epochs=1,\n",
")\n",
"model.evaluate(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
" [priority_data, department_data],\n",
")\n",
"priority_preds, department_preds = model.predict(\n",
" {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Beware: What subclassed models don't support"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Mixing and matching different components"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"class Classifier(keras.Model):\n",
"    def __init__(self, num_classes=2):\n",
"        super().__init__()\n",
"        if num_classes == 2:\n",
"            num_units = 1\n",
"            activation = \"sigmoid\"\n",
"        else:\n",
"            num_units = num_classes\n",
"            activation = \"softmax\"\n",
"        self.dense = layers.Dense(num_units, activation=activation)\n",
"\n",
"    def call(self, inputs):\n",
"        return self.dense(inputs)\n",
"\n",
"inputs = keras.Input(shape=(3,))\n",
"features = layers.Dense(64, activation=\"relu\")(inputs)\n",
"outputs = Classifier(num_classes=10)(features)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(64,))\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(inputs)\n",
"binary_classifier = keras.Model(inputs=inputs, outputs=outputs)\n",
"\n",
"class MyModel(keras.Model):\n",
"    def __init__(self, num_classes=2):\n",
"        super().__init__()\n",
"        self.dense = layers.Dense(64, activation=\"relu\")\n",
"        self.classifier = binary_classifier\n",
"\n",
"    def call(self, inputs):\n",
"        features = self.dense(inputs)\n",
"        return self.classifier(features)\n",
"\n",
"model = MyModel()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Remember: Use the right tool for the job"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Using built-in training and evaluation loops"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"\n",
"def get_mnist_model():\n",
"    inputs = keras.Input(shape=(28 * 28,))\n",
"    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
"    features = layers.Dropout(0.5)(features)\n",
"    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
"    model = keras.Model(inputs, outputs)\n",
"    return model\n",
"\n",
"(images, labels), (test_images, test_labels) = mnist.load_data()\n",
"images = images.reshape((60000, 28 * 28)).astype(\"float32\") / 255\n",
"test_images = test_images.reshape((10000, 28 * 28)).astype(\"float32\") / 255\n",
"train_images, val_images = images[10000:], images[:10000]\n",
"train_labels, val_labels = labels[10000:], labels[:10000]\n",
"\n",
"model = get_mnist_model()\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=3,\n",
" validation_data=(val_images, val_labels),\n",
")\n",
"test_metrics = model.evaluate(test_images, test_labels)\n",
"predictions = model.predict(test_images)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Writing your own metrics"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import ops\n",
"\n",
"class RootMeanSquaredError(keras.metrics.Metric):\n",
"    def __init__(self, name=\"rmse\", **kwargs):\n",
"        super().__init__(name=name, **kwargs)\n",
"        self.mse_sum = self.add_weight(name=\"mse_sum\", initializer=\"zeros\")\n",
"        self.total_samples = self.add_weight(\n",
"            name=\"total_samples\", initializer=\"zeros\"\n",
"        )\n",
"\n",
"    def update_state(self, y_true, y_pred, sample_weight=None):\n",
"        y_true = ops.one_hot(y_true, num_classes=ops.shape(y_pred)[1])\n",
"        mse = ops.sum(ops.square(y_true - y_pred))\n",
"        self.mse_sum.assign_add(mse)\n",
"        num_samples = ops.shape(y_pred)[0]\n",
"        self.total_samples.assign_add(num_samples)\n",
"\n",
"    def result(self):\n",
"        return ops.sqrt(self.mse_sum / self.total_samples)\n",
"\n",
"    def reset_state(self):\n",
"        self.mse_sum.assign(0.)\n",
"        self.total_samples.assign(0.)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = get_mnist_model()\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\", RootMeanSquaredError()],\n",
")\n",
"model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=3,\n",
" validation_data=(val_images, val_labels),\n",
")\n",
"test_metrics = model.evaluate(test_images, test_labels)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using callbacks"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### The EarlyStopping and ModelCheckpoint callbacks"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"callbacks_list = [\n",
" keras.callbacks.EarlyStopping(\n",
" monitor=\"val_accuracy\",\n",
" patience=1,\n",
" ),\n",
" keras.callbacks.ModelCheckpoint(\n",
" filepath=\"checkpoint_path.keras\",\n",
" monitor=\"val_loss\",\n",
" save_best_only=True,\n",
" ),\n",
"]\n",
"model = get_mnist_model()\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=10,\n",
" callbacks=callbacks_list,\n",
" validation_data=(val_images, val_labels),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.models.load_model(\"checkpoint_path.keras\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Writing your own callbacks"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from matplotlib import pyplot as plt\n",
"\n",
"class LossHistory(keras.callbacks.Callback):\n",
"    def on_train_begin(self, logs):\n",
"        self.per_batch_losses = []\n",
"\n",
"    def on_batch_end(self, batch, logs):\n",
"        self.per_batch_losses.append(logs.get(\"loss\"))\n",
"\n",
"    def on_epoch_end(self, epoch, logs):\n",
"        plt.clf()\n",
"        plt.plot(\n",
"            range(len(self.per_batch_losses)),\n",
"            self.per_batch_losses,\n",
"            label=\"Training loss for each batch\",\n",
"        )\n",
"        plt.xlabel(f\"Batch (epoch {epoch})\")\n",
"        plt.ylabel(\"Loss\")\n",
"        plt.legend()\n",
"        plt.savefig(f\"plot_at_epoch_{epoch}\", dpi=300)\n",
"        self.per_batch_losses = []"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = get_mnist_model()\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=10,\n",
" callbacks=[LossHistory()],\n",
" validation_data=(val_images, val_labels),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Monitoring and visualization with TensorBoard"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = get_mnist_model()\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"tensorboard = keras.callbacks.TensorBoard(\n",
" log_dir=\"/full_path_to_your_log_dir\",\n",
")\n",
"model.fit(\n",
" train_images,\n",
" train_labels,\n",
" epochs=10,\n",
" validation_data=(val_images, val_labels),\n",
" callbacks=[tensorboard],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%load_ext tensorboard\n",
"%tensorboard --logdir /full_path_to_your_log_dir"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Writing your own training and evaluation loops"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Training vs. inference"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Writing custom training step functions"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A TensorFlow training step function"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"import tensorflow as tf\n",
"\n",
"model = get_mnist_model()\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"optimizer = keras.optimizers.Adam()\n",
"\n",
"def train_step(inputs, targets):\n",
"    with tf.GradientTape() as tape:\n",
"        predictions = model(inputs, training=True)\n",
"        loss = loss_fn(targets, predictions)\n",
"    gradients = tape.gradient(loss, model.trainable_weights)\n",
"    optimizer.apply(gradients, model.trainable_weights)\n",
"    return loss"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"batch_size = 32\n",
"inputs = train_images[:batch_size]\n",
"targets = train_labels[:batch_size]\n",
"loss = train_step(inputs, targets)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A PyTorch training step function"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"import torch\n",
"\n",
"model = get_mnist_model()\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"optimizer = keras.optimizers.Adam()\n",
"\n",
"def train_step(inputs, targets):\n",
"    predictions = model(inputs, training=True)\n",
"    loss = loss_fn(targets, predictions)\n",
"    loss.backward()\n",
"    gradients = [weight.value.grad for weight in model.trainable_weights]\n",
"    with torch.no_grad():\n",
"        optimizer.apply(gradients, model.trainable_weights)\n",
"    model.zero_grad()\n",
"    return loss"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"batch_size = 32\n",
"inputs = train_images[:batch_size]\n",
"targets = train_labels[:batch_size]\n",
"loss = train_step(inputs, targets)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### A JAX training step function"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"model = get_mnist_model()\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"\n",
"def compute_loss_and_updates(\n",
" trainable_variables, non_trainable_variables, inputs, targets\n",
"):\n",
" outputs, non_trainable_variables = model.stateless_call(\n",
" trainable_variables, non_trainable_variables, inputs, training=True\n",
" )\n",
" loss = loss_fn(targets, outputs)\n",
" return loss, non_trainable_variables"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"import jax\n",
"\n",
"grad_fn = jax.value_and_grad(compute_loss_and_updates, has_aux=True)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"optimizer = keras.optimizers.Adam()\n",
"optimizer.build(model.trainable_variables)\n",
"\n",
"def train_step(state, inputs, targets):\n",
" (trainable_variables, non_trainable_variables, optimizer_variables) = state\n",
" (loss, non_trainable_variables), grads = grad_fn(\n",
" trainable_variables, non_trainable_variables, inputs, targets\n",
" )\n",
" trainable_variables, optimizer_variables = optimizer.stateless_apply(\n",
" optimizer_variables, grads, trainable_variables\n",
" )\n",
" return loss, (\n",
" trainable_variables,\n",
" non_trainable_variables,\n",
" optimizer_variables,\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"batch_size = 32\n",
"inputs = train_images[:batch_size]\n",
"targets = train_labels[:batch_size]\n",
"\n",
"trainable_variables = [v.value for v in model.trainable_variables]\n",
"non_trainable_variables = [v.value for v in model.non_trainable_variables]\n",
"optimizer_variables = [v.value for v in optimizer.variables]\n",
"\n",
"state = (trainable_variables, non_trainable_variables, optimizer_variables)\n",
"loss, state = train_step(state, inputs, targets)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Low-level usage of metrics"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras import ops\n",
"\n",
"metric = keras.metrics.SparseCategoricalAccuracy()\n",
"targets = ops.array([0, 1, 2])\n",
"predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n",
"metric.update_state(targets, predictions)\n",
"current_result = metric.result()\n",
"print(f\"result: {current_result:.2f}\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"values = ops.array([0, 1, 2, 3, 4])\n",
"mean_tracker = keras.metrics.Mean()\n",
"for value in values:\n",
" mean_tracker.update_state(value)\n",
"print(f\"Mean of values: {mean_tracker.result():.2f}\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"metric = keras.metrics.SparseCategoricalAccuracy()\n",
"targets = ops.array([0, 1, 2])\n",
"predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n",
"\n",
"metric_variables = metric.variables\n",
"metric_variables = metric.stateless_update_state(\n",
" metric_variables, targets, predictions\n",
")\n",
"current_result = metric.stateless_result(metric_variables)\n",
"print(f\"result: {current_result:.2f}\")\n",
"\n",
"metric_variables = metric.stateless_reset_state()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using fit() with a custom training loop"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Customizing fit() with TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"import keras\n",
"import tensorflow as tf\n",
"from keras import layers\n",
"\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"loss_tracker = keras.metrics.Mean(name=\"loss\")\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def train_step(self, data):\n",
"        inputs, targets = data\n",
"        with tf.GradientTape() as tape:\n",
"            predictions = self(inputs, training=True)\n",
"            loss = loss_fn(targets, predictions)\n",
"        gradients = tape.gradient(loss, self.trainable_weights)\n",
"        self.optimizer.apply(gradients, self.trainable_weights)\n",
"\n",
"        loss_tracker.update_state(loss)\n",
"        return {\"loss\": loss_tracker.result()}\n",
"\n",
"    @property\n",
"    def metrics(self):\n",
"        return [loss_tracker]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"def get_custom_model():\n",
" inputs = keras.Input(shape=(28 * 28,))\n",
" features = layers.Dense(512, activation=\"relu\")(inputs)\n",
" features = layers.Dropout(0.5)(features)\n",
" outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
" model = CustomModel(inputs, outputs)\n",
" model.compile(optimizer=keras.optimizers.Adam())\n",
" return model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"model = get_custom_model()\n",
"model.fit(train_images, train_labels, epochs=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Customizing fit() with PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"import keras\n",
"import torch\n",
"from keras import layers\n",
"\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"loss_tracker = keras.metrics.Mean(name=\"loss\")\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def train_step(self, data):\n",
"        inputs, targets = data\n",
"        # Clear gradients left over from the previous step so they do not accumulate.\n",
"        self.zero_grad()\n",
"        predictions = self(inputs, training=True)\n",
"        loss = loss_fn(targets, predictions)\n",
"\n",
"        loss.backward()\n",
"        trainable_weights = [v for v in self.trainable_weights]\n",
"        gradients = [v.value.grad for v in trainable_weights]\n",
"\n",
"        with torch.no_grad():\n",
"            self.optimizer.apply(gradients, trainable_weights)\n",
"\n",
"        loss_tracker.update_state(loss)\n",
"        return {\"loss\": loss_tracker.result()}\n",
"\n",
"    @property\n",
"    def metrics(self):\n",
"        return [loss_tracker]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"def get_custom_model():\n",
" inputs = keras.Input(shape=(28 * 28,))\n",
" features = layers.Dense(512, activation=\"relu\")(inputs)\n",
" features = layers.Dropout(0.5)(features)\n",
" outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
" model = CustomModel(inputs, outputs)\n",
" model.compile(optimizer=keras.optimizers.Adam())\n",
" return model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"model = get_custom_model()\n",
"model.fit(train_images, train_labels, epochs=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Customizing fit() with JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"import jax\n",
"import keras\n",
"from keras import layers\n",
"\n",
"loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def compute_loss_and_updates(\n",
"        self,\n",
"        trainable_variables,\n",
"        non_trainable_variables,\n",
"        inputs,\n",
"        targets,\n",
"        training=False,\n",
"    ):\n",
"        predictions, non_trainable_variables = self.stateless_call(\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            inputs,\n",
"            training=training,\n",
"        )\n",
"        loss = loss_fn(targets, predictions)\n",
"        return loss, non_trainable_variables\n",
"\n",
"    def train_step(self, state, data):\n",
"        (\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            optimizer_variables,\n",
"            metrics_variables,\n",
"        ) = state\n",
"        inputs, targets = data\n",
"\n",
"        grad_fn = jax.value_and_grad(\n",
"            self.compute_loss_and_updates, has_aux=True\n",
"        )\n",
"\n",
"        (loss, non_trainable_variables), grads = grad_fn(\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            inputs,\n",
"            targets,\n",
"            training=True,\n",
"        )\n",
"\n",
"        (\n",
"            trainable_variables,\n",
"            optimizer_variables,\n",
"        ) = self.optimizer.stateless_apply(\n",
"            optimizer_variables, grads, trainable_variables\n",
"        )\n",
"\n",
"        logs = {\"loss\": loss}\n",
"        state = (\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            optimizer_variables,\n",
"            metrics_variables,\n",
"        )\n",
"        return logs, state"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"def get_custom_model():\n",
" inputs = keras.Input(shape=(28 * 28,))\n",
" features = layers.Dense(512, activation=\"relu\")(inputs)\n",
" features = layers.Dropout(0.5)(features)\n",
" outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
" model = CustomModel(inputs, outputs)\n",
" model.compile(optimizer=keras.optimizers.Adam())\n",
" return model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"model = get_custom_model()\n",
"model.fit(train_images, train_labels, epochs=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Handling metrics in a custom train_step()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### train_step() metrics handling with TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"import keras\n",
"import tensorflow as tf\n",
"from keras import layers\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def train_step(self, data):\n",
"        inputs, targets = data\n",
"        with tf.GradientTape() as tape:\n",
"            predictions = self(inputs, training=True)\n",
"            loss = self.compute_loss(y=targets, y_pred=predictions)\n",
"\n",
"        gradients = tape.gradient(loss, self.trainable_weights)\n",
"        self.optimizer.apply(gradients, self.trainable_weights)\n",
"\n",
"        for metric in self.metrics:\n",
"            if metric.name == \"loss\":\n",
"                metric.update_state(loss)\n",
"            else:\n",
"                metric.update_state(targets, predictions)\n",
"\n",
"        return {m.name: m.result() for m in self.metrics}"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend tensorflow\n",
"def get_custom_model():\n",
" inputs = keras.Input(shape=(28 * 28,))\n",
" features = layers.Dense(512, activation=\"relu\")(inputs)\n",
" features = layers.Dropout(0.5)(features)\n",
" outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
" model = CustomModel(inputs, outputs)\n",
" model.compile(\n",
" optimizer=keras.optimizers.Adam(),\n",
" loss=keras.losses.SparseCategoricalCrossentropy(),\n",
" metrics=[keras.metrics.SparseCategoricalAccuracy()],\n",
" )\n",
" return model\n",
"\n",
"model = get_custom_model()\n",
"model.fit(train_images, train_labels, epochs=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### train_step() metrics handling with PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"import keras\n",
"import torch\n",
"from keras import layers\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def train_step(self, data):\n",
"        inputs, targets = data\n",
"        # Clear gradients left over from the previous step so they do not accumulate.\n",
"        self.zero_grad()\n",
"        predictions = self(inputs, training=True)\n",
"        loss = self.compute_loss(y=targets, y_pred=predictions)\n",
"\n",
"        loss.backward()\n",
"        trainable_weights = [v for v in self.trainable_weights]\n",
"        gradients = [v.value.grad for v in trainable_weights]\n",
"\n",
"        with torch.no_grad():\n",
"            self.optimizer.apply(gradients, trainable_weights)\n",
"\n",
"        for metric in self.metrics:\n",
"            if metric.name == \"loss\":\n",
"                metric.update_state(loss)\n",
"            else:\n",
"                metric.update_state(targets, predictions)\n",
"\n",
"        return {m.name: m.result() for m in self.metrics}"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend torch\n",
"def get_custom_model():\n",
" inputs = keras.Input(shape=(28 * 28,))\n",
" features = layers.Dense(512, activation=\"relu\")(inputs)\n",
" features = layers.Dropout(0.5)(features)\n",
" outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
" model = CustomModel(inputs, outputs)\n",
" model.compile(\n",
" optimizer=keras.optimizers.Adam(),\n",
" loss=keras.losses.SparseCategoricalCrossentropy(),\n",
" metrics=[keras.metrics.SparseCategoricalAccuracy()],\n",
" )\n",
" return model\n",
"\n",
"model = get_custom_model()\n",
"model.fit(train_images, train_labels, epochs=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### train_step() metrics handling with JAX"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"%%backend jax\n",
"import jax\n",
"import keras\n",
"from keras import layers\n",
"\n",
"class CustomModel(keras.Model):\n",
"    def compute_loss_and_updates(\n",
"        self,\n",
"        trainable_variables,\n",
"        non_trainable_variables,\n",
"        inputs,\n",
"        targets,\n",
"        training=False,\n",
"    ):\n",
"        predictions, non_trainable_variables = self.stateless_call(\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            inputs,\n",
"            training=training,\n",
"        )\n",
"        loss = self.compute_loss(y=targets, y_pred=predictions)\n",
"        return loss, (predictions, non_trainable_variables)\n",
"\n",
"    def train_step(self, state, data):\n",
"        (\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            optimizer_variables,\n",
"            metrics_variables,\n",
"        ) = state\n",
"        inputs, targets = data\n",
"\n",
"        grad_fn = jax.value_and_grad(\n",
"            self.compute_loss_and_updates, has_aux=True\n",
"        )\n",
"\n",
"        (loss, (predictions, non_trainable_variables)), grads = grad_fn(\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            inputs,\n",
"            targets,\n",
"            training=True,\n",
"        )\n",
"        (\n",
"            trainable_variables,\n",
"            optimizer_variables,\n",
"        ) = self.optimizer.stateless_apply(\n",
"            optimizer_variables, grads, trainable_variables\n",
"        )\n",
"\n",
"        new_metrics_vars = []\n",
"        logs = {}\n",
"        for metric in self.metrics:\n",
"            num_prev = len(new_metrics_vars)\n",
"            num_current = len(metric.variables)\n",
"            current_vars = metrics_variables[num_prev : num_prev + num_current]\n",
"            if metric.name == \"loss\":\n",
"                current_vars = metric.stateless_update_state(current_vars, loss)\n",
"            else:\n",
"                current_vars = metric.stateless_update_state(\n",
"                    current_vars, targets, predictions\n",
"                )\n",
"            logs[metric.name] = metric.stateless_result(current_vars)\n",
"            new_metrics_vars += current_vars\n",
"\n",
"        state = (\n",
"            trainable_variables,\n",
"            non_trainable_variables,\n",
"            optimizer_variables,\n",
"            new_metrics_vars,\n",
"        )\n",
"        return logs, state"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "chapter07_deep-dive-keras",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: chapter08_image-classification.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!pip install keras keras-hub --upgrade -q"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KERAS_BACKEND\"] = \"jax\""
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab_type": "code"
},
"outputs": [],
"source": [
"# @title\n",
"import os\n",
"from IPython.core.magic import register_cell_magic\n",
"\n",
"@register_cell_magic\n",
"def backend(line, cell):\n",
"    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
"    if current == required:\n",
"        get_ipython().run_cell(cell)\n",
"    else:\n",
"        print(\n",
"            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
"            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
"        )"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Image classification"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Introduction to ConvNets"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"inputs = keras.Input(shape=(28, 28, 1))\n",
"x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"outputs = layers.Dense(10, activation=\"softmax\")(x)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.summary(line_length=80)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
"train_images = train_images.reshape((60000, 28, 28, 1))\n",
"train_images = train_images.astype(\"float32\") / 255\n",
"test_images = test_images.reshape((10000, 28, 28, 1))\n",
"test_images = test_images.astype(\"float32\") / 255\n",
"model.compile(\n",
" optimizer=\"adam\",\n",
" loss=\"sparse_categorical_crossentropy\",\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model.fit(train_images, train_labels, epochs=5, batch_size=64)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
"print(f\"Test accuracy: {test_acc:.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The convolution operation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Understanding border effects and padding"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Understanding convolution strides"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The max-pooling operation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(28, 28, 1))\n",
"x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n",
"x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"outputs = layers.Dense(10, activation=\"softmax\")(x)\n",
"model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model_no_max_pool.summary(line_length=80)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Training a ConvNet from scratch on a small dataset"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### The relevance of deep learning for small-data problems"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Downloading the data"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import kagglehub\n",
"\n",
"kagglehub.login()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"download_path = kagglehub.competition_download(\"dogs-vs-cats\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import zipfile\n",
"\n",
"with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n",
" zip_ref.extractall(\".\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os, shutil, pathlib\n",
"\n",
"original_dir = pathlib.Path(\"train\")\n",
"new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n",
"\n",
"def make_subset(subset_name, start_index, end_index):\n",
"    for category in (\"cat\", \"dog\"):\n",
"        # Named target_dir to avoid shadowing the built-in dir().\n",
"        target_dir = new_base_dir / subset_name / category\n",
"        # exist_ok=True lets the cell be rerun without raising FileExistsError.\n",
"        os.makedirs(target_dir, exist_ok=True)\n",
"        fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n",
"        for fname in fnames:\n",
"            shutil.copyfile(src=original_dir / fname, dst=target_dir / fname)\n",
"\n",
"make_subset(\"train\", start_index=0, end_index=1000)\n",
"make_subset(\"validation\", start_index=1000, end_index=1500)\n",
"make_subset(\"test\", start_index=1500, end_index=2500)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Building your model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras\n",
"from keras import layers\n",
"\n",
"inputs = keras.Input(shape=(180, 180, 3))\n",
"x = layers.Rescaling(1.0 / 255)(inputs)\n",
"x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.summary(line_length=80)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
" loss=\"binary_crossentropy\",\n",
" optimizer=\"adam\",\n",
" metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Data preprocessing"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.utils import image_dataset_from_directory\n",
"\n",
"batch_size = 64\n",
"image_size = (180, 180)\n",
"train_dataset = image_dataset_from_directory(\n",
" new_base_dir / \"train\", image_size=image_size, batch_size=batch_size\n",
")\n",
"validation_dataset = image_dataset_from_directory(\n",
" new_base_dir / \"validation\", image_size=image_size, batch_size=batch_size\n",
")\n",
"test_dataset = image_dataset_from_directory(\n",
" new_base_dir / \"test\", image_size=image_size, batch_size=batch_size\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Understanding TensorFlow Dataset objects"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import tensorflow as tf\n",
"\n",
"random_numbers = np.random.normal(size=(1000, 16))\n",
"dataset = tf.data.Dataset.from_tensor_slices(random_numbers)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"for i, element in enumerate(dataset):\n",
"    print(element.shape)\n",
"    if i >= 2:\n",
"        break"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"batched_dataset = dataset.batch(32)\n",
"for i, element in enumerate(batched_dataset):\n",
"    print(element.shape)\n",
"    if i >= 2:\n",
"        break"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"reshaped_dataset = dataset.map(\n",
"    lambda x: tf.reshape(x, (4, 4)),\n",
"    num_parallel_calls=8)\n",
"for i, element in enumerate(reshaped_dataset):\n",
"    print(element.shape)\n",
"    if i >= 2:\n",
"        break"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Fitting the model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"for data_batch, labels_batch in train_dataset:\n",
" print(\"data batch shape:\", data_batch.shape)\n",
" print(\"labels batch shape:\", labels_batch.shape)\n",
" break"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"callbacks = [\n",
" keras.callbacks.ModelCheckpoint(\n",
" filepath=\"convnet_from_scratch.keras\",\n",
" save_best_only=True,\n",
" monitor=\"val_loss\",\n",
" )\n",
"]\n",
"history = model.fit(\n",
" train_dataset,\n",
" epochs=50,\n",
" validation_data=validation_dataset,\n",
" callbacks=callbacks,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"accuracy = history.history[\"accuracy\"]\n",
"val_accuracy = history.history[\"val_accuracy\"]\n",
"loss = history.history[\"loss\"]\n",
"val_loss = history.history[\"val_loss\"]\n",
"epochs = range(1, len(accuracy) + 1)\n",
"\n",
"plt.plot(epochs, accuracy, \"r--\", label=\"Training accuracy\")\n",
"plt.plot(epochs, val_accuracy, \"b\", label=\"Validation accuracy\")\n",
"plt.title(\"Training and validation accuracy\")\n",
"plt.legend()\n",
"plt.figure()\n",
"\n",
"plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
"plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
"plt.title(\"Training and validation loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_model = keras.models.load_model(\"convnet_from_scratch.keras\")\n",
"test_loss, test_acc = test_model.evaluate(test_dataset)\n",
"print(f\"Test accuracy: {test_acc:.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Using data augmentation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"data_augmentation_layers = [\n",
"    layers.RandomFlip(\"horizontal\"),\n",
"    layers.RandomRotation(0.1),\n",
"    layers.RandomZoom(0.2),\n",
"]\n",
"\n",
"def data_augmentation(images, targets):\n",
"    for layer in data_augmentation_layers:\n",
"        images = layer(images)\n",
"    return images, targets\n",
"\n",
"augmented_train_dataset = train_dataset.map(\n",
"    data_augmentation, num_parallel_calls=8\n",
")\n",
"augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"plt.figure(figsize=(10, 10))\n",
"for image_batch, _ in train_dataset.take(1):\n",
"    image = image_batch[0]\n",
"    for i in range(9):\n",
"        ax = plt.subplot(3, 3, i + 1)\n",
"        augmented_image, _ = data_augmentation(image, None)\n",
"        augmented_image = keras.ops.convert_to_numpy(augmented_image)\n",
"        plt.imshow(augmented_image.astype(\"uint8\"))\n",
"        plt.axis(\"off\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(180, 180, 3))\n",
"x = layers.Rescaling(1.0 / 255)(inputs)\n",
"x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.MaxPooling2D(pool_size=2)(x)\n",
"x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"x = layers.Dropout(0.25)(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)\n",
"\n",
"model.compile(\n",
"    loss=\"binary_crossentropy\",\n",
"    optimizer=\"adam\",\n",
"    metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"callbacks = [\n",
"    keras.callbacks.ModelCheckpoint(\n",
"        filepath=\"convnet_from_scratch_with_augmentation.keras\",\n",
"        save_best_only=True,\n",
"        monitor=\"val_loss\",\n",
"    )\n",
"]\n",
"history = model.fit(\n",
"    augmented_train_dataset,\n",
"    epochs=100,\n",
"    validation_data=validation_dataset,\n",
"    callbacks=callbacks,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_model = keras.models.load_model(\n",
"    \"convnet_from_scratch_with_augmentation.keras\"\n",
")\n",
"test_loss, test_acc = test_model.evaluate(test_dataset)\n",
"print(f\"Test accuracy: {test_acc:.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Using a pretrained model"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Feature extraction with a pretrained model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras_hub\n",
"\n",
"conv_base = keras_hub.models.Backbone.from_preset(\"xception_41_imagenet\")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"preprocessor = keras_hub.layers.ImageConverter.from_preset(\n",
"    \"xception_41_imagenet\",\n",
"    image_size=(180, 180),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Fast feature extraction without data augmentation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def get_features_and_labels(dataset):\n",
"    all_features = []\n",
"    all_labels = []\n",
"    for images, labels in dataset:\n",
"        preprocessed_images = preprocessor(images)\n",
"        features = conv_base.predict(preprocessed_images, verbose=0)\n",
"        all_features.append(features)\n",
"        all_labels.append(labels)\n",
"    return np.concatenate(all_features), np.concatenate(all_labels)\n",
"\n",
"train_features, train_labels = get_features_and_labels(train_dataset)\n",
"val_features, val_labels = get_features_and_labels(validation_dataset)\n",
"test_features, test_labels = get_features_and_labels(test_dataset)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"train_features.shape"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(6, 6, 2048))\n",
"x = layers.GlobalAveragePooling2D()(inputs)\n",
"x = layers.Dense(256, activation=\"relu\")(x)\n",
"x = layers.Dropout(0.25)(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs, outputs)\n",
"model.compile(\n",
"    loss=\"binary_crossentropy\",\n",
"    optimizer=\"adam\",\n",
"    metrics=[\"accuracy\"],\n",
")\n",
"\n",
"callbacks = [\n",
"    keras.callbacks.ModelCheckpoint(\n",
"        filepath=\"feature_extraction.keras\",\n",
"        save_best_only=True,\n",
"        monitor=\"val_loss\",\n",
"    )\n",
"]\n",
"history = model.fit(\n",
"    train_features,\n",
"    train_labels,\n",
"    epochs=10,\n",
"    validation_data=(val_features, val_labels),\n",
"    callbacks=callbacks,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"acc = history.history[\"accuracy\"]\n",
"val_acc = history.history[\"val_accuracy\"]\n",
"loss = history.history[\"loss\"]\n",
"val_loss = history.history[\"val_loss\"]\n",
"epochs = range(1, len(acc) + 1)\n",
"plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n",
"plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n",
"plt.title(\"Training and validation accuracy\")\n",
"plt.legend()\n",
"plt.figure()\n",
"plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
"plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
"plt.title(\"Training and validation loss\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_model = keras.models.load_model(\"feature_extraction.keras\")\n",
"test_loss, test_acc = test_model.evaluate(test_features, test_labels)\n",
"print(f\"Test accuracy: {test_acc:.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"##### Feature extraction together with data augmentation"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import keras_hub\n",
"\n",
"conv_base = keras_hub.models.Backbone.from_preset(\n",
"    \"xception_41_imagenet\",\n",
"    trainable=False,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"conv_base.trainable = True\n",
"len(conv_base.trainable_weights)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"conv_base.trainable = False\n",
"len(conv_base.trainable_weights)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(180, 180, 3))\n",
"x = preprocessor(inputs)\n",
"x = conv_base(x)\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"x = layers.Dense(256)(x)\n",
"x = layers.Dropout(0.25)(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs, outputs)\n",
"model.compile(\n",
"    loss=\"binary_crossentropy\",\n",
"    optimizer=\"adam\",\n",
"    metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"callbacks = [\n",
"    keras.callbacks.ModelCheckpoint(\n",
"        filepath=\"feature_extraction_with_data_augmentation.keras\",\n",
"        save_best_only=True,\n",
"        monitor=\"val_loss\",\n",
"    )\n",
"]\n",
"history = model.fit(\n",
"    augmented_train_dataset,\n",
"    epochs=30,\n",
"    validation_data=validation_dataset,\n",
"    callbacks=callbacks,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"test_model = keras.models.load_model(\n",
"    \"feature_extraction_with_data_augmentation.keras\"\n",
")\n",
"test_loss, test_acc = test_model.evaluate(test_dataset)\n",
"print(f\"Test accuracy: {test_acc:.3f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Fine-tuning a pretrained model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(\n",
"    loss=\"binary_crossentropy\",\n",
"    optimizer=keras.optimizers.Adam(learning_rate=1e-5),\n",
"    metrics=[\"accuracy\"],\n",
")\n",
"\n",
"callbacks = [\n",
"    keras.callbacks.ModelCheckpoint(\n",
"        filepath=\"fine_tuning.keras\",\n",
"        save_best_only=True,\n",
"        monitor=\"val_loss\",\n",
"    )\n",
"]\n",
"history = model.fit(\n",
"    augmented_train_dataset,\n",
"    epochs=30,\n",
"    validation_data=validation_dataset,\n",
" callb
Condensed preview of 59 files, each showing path, character count, and a content snippet (the full structured content is 10,772K chars).
[
{
"path": "LICENSE",
"chars": 1081,
"preview": "MIT License\n\nCopyright (c) 2017-present François Chollet\n\nPermission is hereby granted, free of charge, to any person ob"
},
{
"path": "README.md",
"chars": 6095,
"preview": "# Companion notebooks for Deep Learning with Python\n\nThis repository contains Jupyter notebooks implementing the code sa"
},
{
"path": "chapter02_mathematical-building-blocks.ipynb",
"chars": 30007,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter03_introduction-to-ml-frameworks.ipynb",
"chars": 36224,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter04_classification-and-regression.ipynb",
"chars": 28353,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter05_fundamentals-of-ml.ipynb",
"chars": 23753,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter07_deep-dive-keras.ipynb",
"chars": 48132,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter08_image-classification.ipynb",
"chars": 26220,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter09_convnet-architecture-patterns.ipynb",
"chars": 10471,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter10_interpreting-what-convnets-learn.ipynb",
"chars": 21336,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter11_image-segmentation.ipynb",
"chars": 17878,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter12_object-detection.ipynb",
"chars": 19577,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter13_timeseries-forecasting.ipynb",
"chars": 18428,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter14_text-classification.ipynb",
"chars": 35739,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter15_language-models-and-the-transformer.ipynb",
"chars": 32307,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter16_text-generation.ipynb",
"chars": 29083,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter17_image-generation.ipynb",
"chars": 26342,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "chapter18_best-practices-for-the-real-world.ipynb",
"chars": 13502,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "first_edition/2.1-a-first-look-at-a-neural-network.ipynb",
"chars": 13940,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/3.5-classifying-movie-reviews.ipynb",
"chars": 69517,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 20,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\""
},
{
"path": "first_edition/3.6-classifying-newswires.ipynb",
"chars": 63700,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/3.7-predicting-house-prices.ipynb",
"chars": 70240,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/4.4-overfitting-and-underfitting.ipynb",
"chars": 106100,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/5.1-introduction-to-convnets.ipynb",
"chars": 11100,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/5.2-using-convnets-with-small-datasets.ipynb",
"chars": 431140,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/5.3-using-a-pretrained-convnet.ipynb",
"chars": 233147,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/5.4-visualizing-what-convnets-learn.ipynb",
"chars": 7004823,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/6.1-one-hot-encoding-of-words-or-characters.ipynb",
"chars": 8792,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/6.1-using-word-embeddings.ipynb",
"chars": 94317,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/6.2-understanding-recurrent-neural-networks.ipynb",
"chars": 84619,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/6.3-advanced-usage-of-recurrent-neural-networks.ipynb",
"chars": 204230,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/6.4-sequence-processing-with-convnets.ipynb",
"chars": 94418,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/8.1-text-generation-with-lstm.ipynb",
"chars": 160140,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 19,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\""
},
{
"path": "first_edition/8.2-deep-dream.ipynb",
"chars": 201125,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/8.3-neural-style-transfer.ipynb",
"chars": 415051,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "first_edition/8.4-generating-images-with-vaes.ipynb",
"chars": 283828,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 2,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "first_edition/8.5-introduction-to-gans.ipynb",
"chars": 147649,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"name\":"
},
{
"path": "second_edition/README.md",
"chars": 4551,
"preview": "# Second edition notebooks\n\nThese are the notebooks for the second edition of the book, originally published in 2021. Th"
},
{
"path": "second_edition/chapter02_mathematical-building-blocks.ipynb",
"chars": 30097,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter03_introduction-to-keras-and-tf.ipynb",
"chars": 20418,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter04_getting-started-with-neural-networks.ipynb",
"chars": 29579,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter05_fundamentals-of-ml.ipynb",
"chars": 18665,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter07_working-with-keras.ipynb",
"chars": 36694,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb",
"chars": 29330,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter09_part01_image-segmentation.ipynb",
"chars": 8540,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb",
"chars": 8900,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb",
"chars": 19114,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter10_dl-for-timeseries.ipynb",
"chars": 20952,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter11_part01_introduction.ipynb",
"chars": 18710,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter11_part02_sequence-models.ipynb",
"chars": 12860,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter11_part03_transformer.ipynb",
"chars": 12412,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb",
"chars": 18936,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter12_part01_text-generation.ipynb",
"chars": 14357,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter12_part02_deep-dream.ipynb",
"chars": 7944,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter12_part03_neural-style-transfer.ipynb",
"chars": 9673,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter12_part04_variational-autoencoders.ipynb",
"chars": 9571,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter12_part05_gans.ipynb",
"chars": 11570,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter13_best-practices-for-the-real-world.ipynb",
"chars": 10593,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
},
{
"path": "second_edition/chapter14_conclusions.ipynb",
"chars": 13113,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\"\n },\n \"source\": [\n \"This i"
}
]
About this extraction
This page contains the full source code of the fchollet/deep-learning-with-python-notebooks GitHub repository, extracted and formatted as plain text. The extraction includes 59 files (10.0 MB), approximately 2.6M tokens.