Full Code of fchollet/deep-learning-with-python-notebooks for AI

master fbf7f1bf2041 cached

59 files

10.0 MB

2.6M tokens

1 requests

Copy disabled (too large) Download .txt

Showing preview only (10,501K chars total). Download the full file to get everything.

Repository: fchollet/deep-learning-with-python-notebooks
Branch: master
Commit: fbf7f1bf2041
Files: 59
Total size: 10.0 MB

Directory structure:
gitextract_laoqm0uq/

├── LICENSE
├── README.md
├── chapter02_mathematical-building-blocks.ipynb
├── chapter03_introduction-to-ml-frameworks.ipynb
├── chapter04_classification-and-regression.ipynb
├── chapter05_fundamentals-of-ml.ipynb
├── chapter07_deep-dive-keras.ipynb
├── chapter08_image-classification.ipynb
├── chapter09_convnet-architecture-patterns.ipynb
├── chapter10_interpreting-what-convnets-learn.ipynb
├── chapter11_image-segmentation.ipynb
├── chapter12_object-detection.ipynb
├── chapter13_timeseries-forecasting.ipynb
├── chapter14_text-classification.ipynb
├── chapter15_language-models-and-the-transformer.ipynb
├── chapter16_text-generation.ipynb
├── chapter17_image-generation.ipynb
├── chapter18_best-practices-for-the-real-world.ipynb
├── first_edition/
│   ├── 2.1-a-first-look-at-a-neural-network.ipynb
│   ├── 3.5-classifying-movie-reviews.ipynb
│   ├── 3.6-classifying-newswires.ipynb
│   ├── 3.7-predicting-house-prices.ipynb
│   ├── 4.4-overfitting-and-underfitting.ipynb
│   ├── 5.1-introduction-to-convnets.ipynb
│   ├── 5.2-using-convnets-with-small-datasets.ipynb
│   ├── 5.3-using-a-pretrained-convnet.ipynb
│   ├── 5.4-visualizing-what-convnets-learn.ipynb
│   ├── 6.1-one-hot-encoding-of-words-or-characters.ipynb
│   ├── 6.1-using-word-embeddings.ipynb
│   ├── 6.2-understanding-recurrent-neural-networks.ipynb
│   ├── 6.3-advanced-usage-of-recurrent-neural-networks.ipynb
│   ├── 6.4-sequence-processing-with-convnets.ipynb
│   ├── 8.1-text-generation-with-lstm.ipynb
│   ├── 8.2-deep-dream.ipynb
│   ├── 8.3-neural-style-transfer.ipynb
│   ├── 8.4-generating-images-with-vaes.ipynb
│   └── 8.5-introduction-to-gans.ipynb
└── second_edition/
    ├── README.md
    ├── chapter02_mathematical-building-blocks.ipynb
    ├── chapter03_introduction-to-keras-and-tf.ipynb
    ├── chapter04_getting-started-with-neural-networks.ipynb
    ├── chapter05_fundamentals-of-ml.ipynb
    ├── chapter07_working-with-keras.ipynb
    ├── chapter08_intro-to-dl-for-computer-vision.ipynb
    ├── chapter09_part01_image-segmentation.ipynb
    ├── chapter09_part02_modern-convnet-architecture-patterns.ipynb
    ├── chapter09_part03_interpreting-what-convnets-learn.ipynb
    ├── chapter10_dl-for-timeseries.ipynb
    ├── chapter11_part01_introduction.ipynb
    ├── chapter11_part02_sequence-models.ipynb
    ├── chapter11_part03_transformer.ipynb
    ├── chapter11_part04_sequence-to-sequence-learning.ipynb
    ├── chapter12_part01_text-generation.ipynb
    ├── chapter12_part02_deep-dream.ipynb
    ├── chapter12_part03_neural-style-transfer.ipynb
    ├── chapter12_part04_variational-autoencoders.ipynb
    ├── chapter12_part05_gans.ipynb
    ├── chapter13_best-practices-for-the-real-world.ipynb
    └── chapter14_conclusions.ipynb

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2017-present François Chollet

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# Companion notebooks for Deep Learning with Python

This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, third edition (2025)](https://www.manning.com/books/deep-learning-with-python-third-edition?a_aid=keras&a_bid=76564dff)
by Francois Chollet and Matthew Watson. In addition, you will also find the legacy notebooks for the [second edition (2021)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff)
and the [first edition (2017)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff).

For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode.
**If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.**

## Running the code

We recommend running these notebooks on [Colab](https://colab.google), which
provides a hosted runtime with all the dependencies you will need. You can also,
run these notebooks locally, either by setting up your own Jupyter environment,
or using Colab's instructions for
[running locally](https://research.google.com/colaboratory/local-runtimes.html).

By default, all notebooks will run on Colab's free tier GPU runtime, which
is sufficient to run all code in this book. Chapter 8-18 chapters will benefit
from a faster GPU if you have a Colab Pro subscription. You can change your
runtime type using **Runtime -> Change runtime type** in Colab's dropdown menus.

## Choosing a backend

The code for third edition is written using Keras 3. As such, it can be run with
JAX, TensorFlow or PyTorch as a backend. To set the backend, update the backend
in the cell at the top of the colab that looks like this:

```python
import os
os.environ["KERAS_BACKEND"] = "jax"
```

This must be done only once per session before importing Keras. If you are
in the middle running a notebook, you will need to restart the notebook session
and rerun all relevant notebook cells. This can be done in using
**Runtime -> Restart Session** in Colab's dropdown menus.

## Using Kaggle data

This book uses datasets and model weights provided by Kaggle, an online Machine
Learning community and platform. You will need to create a Kaggle login to run
Kaggle code in this book; instructions are given in Chapter 8.

For chapters that need Kaggle data, you can login to Kaggle once per session
when you hit the notebook cell with `kagglehub.login()`. Alternately,
you can set up your Kaggle login information once as Colab secrets:

 * Go to https://www.kaggle.com/ and sign in.
 * Go to https://www.kaggle.com/settings and generate a Kaggle API key.
 * Open the secrets tab in Colab by clicking the key icon on the left.
 * Add two secrets, `KAGGLE_USERNAME` and `KAGGLE_KEY` with the username and key
   you just created.

Following this approach you will only need to copy your Kaggle secret key once,
though you will need to allow each notebook to access your secrets when running
the relevant Kaggle code.

## Table of contents

* [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb)
* [Chapter 3: Introduction to TensorFlow, PyTorch, JAX, and Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-ml-frameworks.ipynb)
* [Chapter 4: Classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_classification-and-regression.ipynb)
* [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb)
* [Chapter 7: A deep dive on Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_deep-dive-keras.ipynb)
* [Chapter 8: Image Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_image-classification.ipynb)
* [Chapter 9: Convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_convnet-architecture-patterns.ipynb)
* [Chapter 10: Interpreting what ConvNets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_interpreting-what-convnets-learn.ipynb)
* [Chapter 11: Image Segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_image-segmentation.ipynb)
* [Chapter 12: Object Detection](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_object-detection.ipynb)
* [Chapter 13: Timeseries Forecasting](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_timeseries-forecasting.ipynb)
* [Chapter 14: Text Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_text-classification.ipynb)
* [Chapter 15: Language Models and the Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter15_language-models-and-the-transformer.ipynb)
* [Chapter 16: Text Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter16_text-generation.ipynb)
* [Chapter 17: Image Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter17_image-generation.ipynb)
* [Chapter 18: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter18_best-practices-for-the-real-world.ipynb)


================================================
FILE: chapter02_mathematical-building-blocks.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"tensorflow\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## The mathematical building blocks of neural networks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### A first look at a neural network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import mnist\n",
    "\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "len(train_labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_images.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "len(test_labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "test_images = test_images.reshape((10000, 28 * 28))\n",
    "test_images = test_images.astype(\"float32\") / 255"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.fit(train_images, train_labels, epochs=5, batch_size=128)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_digits = test_images[0:10]\n",
    "predictions = model.predict(test_digits)\n",
    "predictions[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions[0].argmax()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions[0][7]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_labels[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
    "print(f\"test_acc: {test_acc}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Data representations for neural networks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Scalars (rank-0 tensors)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "x = np.array(12)\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Vectors (rank-1 tensors)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.array([12, 3, 6, 14, 7])\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Matrices (rank-2 tensors)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.array([[5, 78, 2, 34, 0],\n",
    "              [6, 79, 3, 35, 1],\n",
    "              [7, 80, 4, 36, 2]])\n",
    "x.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Rank-3 tensors and higher-rank tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.array([[[5, 78, 2, 34, 0],\n",
    "               [6, 79, 3, 35, 1],\n",
    "               [7, 80, 4, 36, 2]],\n",
    "              [[5, 78, 2, 34, 0],\n",
    "               [6, 79, 3, 35, 1],\n",
    "               [7, 80, 4, 36, 2]],\n",
    "              [[5, 78, 2, 34, 0],\n",
    "               [6, 79, 3, 35, 1],\n",
    "               [7, 80, 4, 36, 2]]])\n",
    "x.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Key attributes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import mnist\n",
    "\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images.ndim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "digit = train_images[4]\n",
    "plt.imshow(digit, cmap=plt.cm.binary)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_labels[4]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Manipulating tensors in NumPy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_slice = train_images[10:100]\n",
    "my_slice.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_slice = train_images[10:100, :, :]\n",
    "my_slice.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_slice = train_images[10:100, 0:28, 0:28]\n",
    "my_slice.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_slice = train_images[:, 14:, 14:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_slice = train_images[:, 7:-7, 7:-7]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The notion of data batches"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "batch = train_images[:128]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "batch = train_images[128:256]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "n = 3\n",
    "batch = train_images[128 * n : 128 * (n + 1)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Real-world examples of data tensors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Vector data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Timeseries data or sequence data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Image data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Video data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### The gears of neural networks: Tensor operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Element-wise operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_relu(x):\n",
    "    assert len(x.shape) == 2\n",
    "    x = x.copy()\n",
    "    for i in range(x.shape[0]):\n",
    "        for j in range(x.shape[1]):\n",
    "            x[i, j] = max(x[i, j], 0)\n",
    "    return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_add(x, y):\n",
    "    assert len(x.shape) == 2\n",
    "    assert x.shape == y.shape\n",
    "    x = x.copy()\n",
    "    for i in range(x.shape[0]):\n",
    "        for j in range(x.shape[1]):\n",
    "            x[i, j] += y[i, j]\n",
    "    return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "x = np.random.random((20, 100))\n",
    "y = np.random.random((20, 100))\n",
    "\n",
    "t0 = time.time()\n",
    "for _ in range(1000):\n",
    "    z = x + y\n",
    "    z = np.maximum(z, 0.0)\n",
    "print(\"Took: {0:.2f} s\".format(time.time() - t0))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "t0 = time.time()\n",
    "for _ in range(1000):\n",
    "    z = naive_add(x, y)\n",
    "    z = naive_relu(z)\n",
    "print(\"Took: {0:.2f} s\".format(time.time() - t0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Broadcasting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "X = np.random.random((32, 10))\n",
    "y = np.random.random((10,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "y = np.expand_dims(y, axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "Y = np.tile(y, (32, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_add_matrix_and_vector(x, y):\n",
    "    assert len(x.shape) == 2\n",
    "    assert len(y.shape) == 1\n",
    "    assert x.shape[1] == y.shape[0]\n",
    "    x = x.copy()\n",
    "    for i in range(x.shape[0]):\n",
    "        for j in range(x.shape[1]):\n",
    "            x[i, j] += y[j]\n",
    "    return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "x = np.random.random((64, 3, 32, 10))\n",
    "y = np.random.random((32, 10))\n",
    "z = np.maximum(x, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Tensor product"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.random.random((32,))\n",
    "y = np.random.random((32,))\n",
    "\n",
    "z = np.matmul(x, y)\n",
    "z = x @ y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_vector_product(x, y):\n",
    "    assert len(x.shape) == 1\n",
    "    assert len(y.shape) == 1\n",
    "    assert x.shape[0] == y.shape[0]\n",
    "    z = 0.0\n",
    "    for i in range(x.shape[0]):\n",
    "        z += x[i] * y[i]\n",
    "    return z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_matrix_vector_product(x, y):\n",
    "    assert len(x.shape) == 2\n",
    "    assert len(y.shape) == 1\n",
    "    assert x.shape[1] == y.shape[0]\n",
    "    z = np.zeros(x.shape[0])\n",
    "    for i in range(x.shape[0]):\n",
    "        for j in range(x.shape[1]):\n",
    "            z[i] += x[i, j] * y[j]\n",
    "    return z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_matrix_vector_product(x, y):\n",
    "    z = np.zeros(x.shape[0])\n",
    "    for i in range(x.shape[0]):\n",
    "        z[i] = naive_vector_product(x[i, :], y)\n",
    "    return z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def naive_matrix_product(x, y):\n",
    "    assert len(x.shape) == 2\n",
    "    assert len(y.shape) == 2\n",
    "    assert x.shape[1] == y.shape[0]\n",
    "    z = np.zeros((x.shape[0], y.shape[1]))\n",
    "    for i in range(x.shape[0]):\n",
    "        for j in range(y.shape[1]):\n",
    "            row_x = x[i, :]\n",
    "            column_y = y[:, j]\n",
    "            z[i, j] = naive_vector_product(row_x, column_y)\n",
    "    return z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Tensor reshaping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_images = train_images.reshape((60000, 28 * 28))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.array([[0., 1.],\n",
    "              [2., 3.],\n",
    "              [4., 5.]])\n",
    "x.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = x.reshape((6, 1))\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = x.reshape((2, 3))\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.zeros((300, 20))\n",
    "x = np.transpose(x)\n",
    "x.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Geometric interpretation of tensor operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### A geometric interpretation of deep learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### The engine of neural networks: Gradient-based optimization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### What's a derivative?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Derivative of a tensor operation: The gradient"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Stochastic gradient descent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Chaining derivatives: The Backpropagation algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The chain rule"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Automatic differentiation with computation graphs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Looking back at our first example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "test_images = test_images.reshape((10000, 28 * 28))\n",
    "test_images = test_images.astype(\"float32\") / 255"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=5,\n",
    "    batch_size=128,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Reimplementing our first example from scratch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A simple Dense class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import ops\n",
    "\n",
    "class NaiveDense:\n",
    "    def __init__(self, input_size, output_size, activation=None):\n",
    "        self.activation = activation\n",
    "        self.W = keras.Variable(\n",
    "            shape=(input_size, output_size), initializer=\"uniform\"\n",
    "        )\n",
    "        self.b = keras.Variable(shape=(output_size,), initializer=\"zeros\")\n",
    "\n",
    "    def __call__(self, inputs):\n",
    "        x = ops.matmul(inputs, self.W)\n",
    "        x = x + self.b\n",
    "        if self.activation is not None:\n",
    "            x = self.activation(x)\n",
    "        return x\n",
    "\n",
    "    @property\n",
    "    def weights(self):\n",
    "        return [self.W, self.b]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A simple Sequential class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "class NaiveSequential:\n",
    "    def __init__(self, layers):\n",
    "        self.layers = layers\n",
    "\n",
    "    def __call__(self, inputs):\n",
    "        x = inputs\n",
    "        for layer in self.layers:\n",
    "            x = layer(x)\n",
    "        return x\n",
    "\n",
    "    @property\n",
    "    def weights(self):\n",
    "        weights = []\n",
    "        for layer in self.layers:\n",
    "            weights += layer.weights\n",
    "        return weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = NaiveSequential(\n",
    "    [\n",
    "        NaiveDense(input_size=28 * 28, output_size=512, activation=ops.relu),\n",
    "        NaiveDense(input_size=512, output_size=10, activation=ops.softmax),\n",
    "    ]\n",
    ")\n",
    "assert len(model.weights) == 4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A batch generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "class BatchGenerator:\n",
    "    def __init__(self, images, labels, batch_size=128):\n",
    "        assert len(images) == len(labels)\n",
    "        self.index = 0\n",
    "        self.images = images\n",
    "        self.labels = labels\n",
    "        self.batch_size = batch_size\n",
    "        self.num_batches = math.ceil(len(images) / batch_size)\n",
    "\n",
    "    def next(self):\n",
    "        images = self.images[self.index : self.index + self.batch_size]\n",
    "        labels = self.labels[self.index : self.index + self.batch_size]\n",
    "        self.index += self.batch_size\n",
    "        return images, labels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Running one training step"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The weight update step"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "learning_rate = 1e-3\n",
    "\n",
    "def update_weights(gradients, weights):\n",
    "    for g, w in zip(gradients, weights):\n",
    "        w.assign(w - g * learning_rate)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import optimizers\n",
    "\n",
    "optimizer = optimizers.SGD(learning_rate=1e-3)\n",
    "\n",
    "def update_weights(gradients, weights):\n",
    "    optimizer.apply_gradients(zip(gradients, weights))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Gradient computation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "import tensorflow as tf\n",
    "\n",
    "x = tf.zeros(shape=())\n",
    "with tf.GradientTape() as tape:\n",
    "    y = 2 * x + 3\n",
    "grad_of_y_wrt_x = tape.gradient(y, x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "def one_training_step(model, images_batch, labels_batch):\n",
    "    with tf.GradientTape() as tape:\n",
    "        predictions = model(images_batch)\n",
    "        loss = ops.sparse_categorical_crossentropy(labels_batch, predictions)\n",
    "        average_loss = ops.mean(loss)\n",
    "    gradients = tape.gradient(average_loss, model.weights)\n",
    "    update_weights(gradients, model.weights)\n",
    "    return average_loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The full training loop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "def fit(model, images, labels, epochs, batch_size=128):\n",
    "    for epoch_counter in range(epochs):\n",
    "        print(f\"Epoch {epoch_counter}\")\n",
    "        batch_generator = BatchGenerator(images, labels)\n",
    "        for batch_counter in range(batch_generator.num_batches):\n",
    "            images_batch, labels_batch = batch_generator.next()\n",
    "            loss = one_training_step(model, images_batch, labels_batch)\n",
    "            if batch_counter % 100 == 0:\n",
    "                print(f\"loss at batch {batch_counter}: {loss:.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "from keras.datasets import mnist\n",
    "\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
    "\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "test_images = test_images.reshape((10000, 28 * 28))\n",
    "test_images = test_images.astype(\"float32\") / 255\n",
    "\n",
    "fit(model, train_images, train_labels, epochs=10, batch_size=128)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Evaluating the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "predictions = model(test_images)\n",
    "predicted_labels = ops.argmax(predictions, axis=1)\n",
    "matches = predicted_labels == test_labels\n",
    "f\"accuracy: {ops.mean(matches):.2f}\""
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "chapter02_mathematical-building-blocks",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

================================================
FILE: chapter03_introduction-to-ml-frameworks.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Introduction to TensorFlow, PyTorch, JAX, and Keras"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### A brief history of deep learning frameworks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### How these frameworks relate to each other"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Introduction to TensorFlow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### First steps with TensorFlow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensors and variables in TensorFlow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Constant tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "tf.ones(shape=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "tf.zeros(shape=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "tf.constant([1, 2, 3], dtype=\"float32\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Random tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Tensor assignment and the Variable class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "x = np.ones(shape=(2, 2))\n",
    "x[0, 0] = 0.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))\n",
    "print(v)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "v.assign(tf.ones((3, 1)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "v[0, 0].assign(3.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "v.assign_add(tf.ones((3, 1)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensor operations: Doing math in TensorFlow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "a = tf.ones((2, 2))\n",
    "b = tf.square(a)\n",
    "c = tf.sqrt(a)\n",
    "d = b + c\n",
    "e = tf.matmul(a, b)\n",
    "f = tf.concat((a, b), axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def dense(inputs, W, b):\n",
    "    return tf.nn.relu(tf.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Gradients in TensorFlow: A second look at the GradientTape API"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_var = tf.Variable(initial_value=3.0)\n",
    "with tf.GradientTape() as tape:\n",
    "    result = tf.square(input_var)\n",
    "gradient = tape.gradient(result, input_var)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_const = tf.constant(3.0)\n",
    "with tf.GradientTape() as tape:\n",
    "    tape.watch(input_const)\n",
    "    result = tf.square(input_const)\n",
    "gradient = tape.gradient(result, input_const)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "time = tf.Variable(0.0)\n",
    "with tf.GradientTape() as outer_tape:\n",
    "    with tf.GradientTape() as inner_tape:\n",
    "        position = 4.9 * time**2\n",
    "    speed = inner_tape.gradient(position, time)\n",
    "acceleration = outer_tape.gradient(speed, time)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Making TensorFlow functions fast using compilation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "@tf.function\n",
    "def dense(inputs, W, b):\n",
    "    return tf.nn.relu(tf.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "@tf.function(jit_compile=True)\n",
    "def dense(inputs, W, b):\n",
    "    return tf.nn.relu(tf.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### An end-to-end example: A linear classifier in pure TensorFlow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "num_samples_per_class = 1000\n",
    "negative_samples = np.random.multivariate_normal(\n",
    "    mean=[0, 3], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n",
    ")\n",
    "positive_samples = np.random.multivariate_normal(\n",
    "    mean=[3, 0], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "targets = np.vstack(\n",
    "    (\n",
    "        np.zeros((num_samples_per_class, 1), dtype=\"float32\"),\n",
    "        np.ones((num_samples_per_class, 1), dtype=\"float32\"),\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_dim = 2\n",
    "output_dim = 1\n",
    "W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))\n",
    "b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def model(inputs, W, b):\n",
    "    return tf.matmul(inputs, W) + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def mean_squared_error(targets, predictions):\n",
    "    per_sample_losses = tf.square(targets - predictions)\n",
    "    return tf.reduce_mean(per_sample_losses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "learning_rate = 0.1\n",
    "\n",
    "@tf.function(jit_compile=True)\n",
    "def training_step(inputs, targets, W, b):\n",
    "    with tf.GradientTape() as tape:\n",
    "        predictions = model(inputs, W, b)\n",
    "        loss = mean_squared_error(predictions, targets)\n",
    "    grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])\n",
    "    W.assign_sub(grad_loss_wrt_W * learning_rate)\n",
    "    b.assign_sub(grad_loss_wrt_b * learning_rate)\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "for step in range(40):\n",
    "    loss = training_step(inputs, targets, W, b)\n",
    "    print(f\"Loss at step {step}: {loss:.4f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions = model(inputs, W, b)\n",
    "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = np.linspace(-1, 4, 100)\n",
    "y = -W[0] / W[1] * x + (0.5 - b) / W[1]\n",
    "plt.plot(x, y, \"-r\")\n",
    "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### What makes the TensorFlow approach unique"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Introduction to PyTorch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### First steps with PyTorch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensors and parameters in PyTorch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Constant tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "torch.ones(size=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "torch.zeros(size=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "torch.tensor([1, 2, 3], dtype=torch.float32)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Random tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "torch.normal(\n",
    "mean=torch.zeros(size=(3, 1)),\n",
    "std=torch.ones(size=(3, 1)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "torch.rand(3, 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Tensor assignment and the Parameter class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = torch.zeros(size=(2, 1))\n",
    "x[0, 0] = 1.\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = torch.zeros(size=(2, 1))\n",
    "p = torch.nn.parameter.Parameter(data=x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensor operations: Doing math in PyTorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "a = torch.ones((2, 2))\n",
    "b = torch.square(a)\n",
    "c = torch.sqrt(a)\n",
    "d = b + c\n",
    "e = torch.matmul(a, b)\n",
    "f = torch.cat((a, b), dim=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def dense(inputs, W, b):\n",
    "    return torch.nn.relu(torch.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Computing gradients with PyTorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_var = torch.tensor(3.0, requires_grad=True)\n",
    "result = torch.square(input_var)\n",
    "result.backward()\n",
    "gradient = input_var.grad\n",
    "gradient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "result = torch.square(input_var)\n",
    "result.backward()\n",
    "input_var.grad"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_var.grad = None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### An end-to-end example: A linear classifier in pure PyTorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_dim = 2\n",
    "output_dim = 1\n",
    "\n",
    "W = torch.rand(input_dim, output_dim, requires_grad=True)\n",
    "b = torch.zeros(output_dim, requires_grad=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def model(inputs, W, b):\n",
    "    return torch.matmul(inputs, W) + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def mean_squared_error(targets, predictions):\n",
    "    per_sample_losses = torch.square(targets - predictions)\n",
    "    return torch.mean(per_sample_losses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "learning_rate = 0.1\n",
    "\n",
    "def training_step(inputs, targets, W, b):\n",
    "    predictions = model(inputs)\n",
    "    loss = mean_squared_error(targets, predictions)\n",
    "    loss.backward()\n",
    "    grad_loss_wrt_W, grad_loss_wrt_b = W.grad, b.grad\n",
    "    with torch.no_grad():\n",
    "        W -= grad_loss_wrt_W * learning_rate\n",
    "        b -= grad_loss_wrt_b * learning_rate\n",
    "    W.grad = None\n",
    "    b.grad = None\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Packaging state and computation with the Module class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "class LinearModel(torch.nn.Module):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.W = torch.nn.Parameter(torch.rand(input_dim, output_dim))\n",
    "        self.b = torch.nn.Parameter(torch.zeros(output_dim))\n",
    "\n",
    "    def forward(self, inputs):\n",
    "        return torch.matmul(inputs, self.W) + self.b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = LinearModel()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "torch_inputs = torch.tensor(inputs)\n",
    "output = model(torch_inputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def training_step(inputs, targets):\n",
    "    predictions = model(inputs)\n",
    "    loss = mean_squared_error(targets, predictions)\n",
    "    loss.backward()\n",
    "    optimizer.step()\n",
    "    model.zero_grad()\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Making PyTorch modules fast using compilation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "compiled_model = torch.compile(model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "@torch.compile\n",
    "def dense(inputs, W, b):\n",
    "    return torch.nn.relu(torch.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### What makes the PyTorch approach unique"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Introduction to JAX"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### First steps with JAX"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Tensors in JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from jax import numpy as jnp\n",
    "jnp.ones(shape=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "jnp.zeros(shape=(2, 1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "jnp.array([1, 2, 3], dtype=\"float32\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Random number generation in JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "np.random.normal(size=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "np.random.normal(size=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def apply_noise(x, seed):\n",
    "    np.random.seed(seed)\n",
    "    x = x * np.random.normal((3,))\n",
    "    return x\n",
    "\n",
    "seed = 1337\n",
    "y = apply_noise(x, seed)\n",
    "seed += 1\n",
    "z = apply_noise(x, seed)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import jax\n",
    "\n",
    "seed_key = jax.random.key(1337)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "seed_key = jax.random.key(0)\n",
    "jax.random.normal(seed_key, shape=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "seed_key = jax.random.key(123)\n",
    "jax.random.normal(seed_key, shape=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "jax.random.normal(seed_key, shape=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "seed_key = jax.random.key(123)\n",
    "jax.random.normal(seed_key, shape=(3,))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "new_seed_key = jax.random.split(seed_key, num=1)[0]\n",
    "jax.random.normal(new_seed_key, shape=(3,))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensor assignment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x = jnp.array([1, 2, 3], dtype=\"float32\")\n",
    "new_x = x.at[0].set(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Tensor operations: Doing math in JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "a = jnp.ones((2, 2))\n",
    "b = jnp.square(a)\n",
    "c = jnp.sqrt(a)\n",
    "d = b + c\n",
    "e = jnp.matmul(a, b)\n",
    "e *= d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def dense(inputs, W, b):\n",
    "    return jax.nn.relu(jnp.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Computing gradients with JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def compute_loss(input_var):\n",
    "    return jnp.square(input_var)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "grad_fn = jax.grad(compute_loss)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_var = jnp.array(3.0)\n",
    "grad_of_loss_wrt_input_var = grad_fn(input_var)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### JAX gradient-computation best practices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Returning the loss value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "grad_fn = jax.value_and_grad(compute_loss)\n",
    "output, grad_of_loss_wrt_input_var = grad_fn(input_var)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Getting gradients for a complex function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Returning auxiliary outputs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Making JAX functions fast with @jax.jit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "@jax.jit\n",
    "def dense(inputs, W, b):\n",
    "    return jax.nn.relu(jnp.matmul(inputs, W) + b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### An end-to-end example: A linear classifier in pure JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def model(inputs, W, b):\n",
    "    return jnp.matmul(inputs, W) + b\n",
    "\n",
    "def mean_squared_error(targets, predictions):\n",
    "    per_sample_losses = jnp.square(targets - predictions)\n",
    "    return jnp.mean(per_sample_losses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def compute_loss(state, inputs, targets):\n",
    "    W, b = state\n",
    "    predictions = model(inputs, W, b)\n",
    "    loss = mean_squared_error(targets, predictions)\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "grad_fn = jax.value_and_grad(compute_loss)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "learning_rate = 0.1\n",
    "\n",
    "@jax.jit\n",
    "def training_step(inputs, targets, W, b):\n",
    "    loss, grads = grad_fn((W, b), inputs, targets)\n",
    "    grad_wrt_W, grad_wrt_b = grads\n",
    "    W = W - grad_wrt_W * learning_rate\n",
    "    b = b - grad_wrt_b * learning_rate\n",
    "    return loss, W, b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "input_dim = 2\n",
    "output_dim = 1\n",
    "\n",
    "W = jax.numpy.array(np.random.uniform(size=(input_dim, output_dim)))\n",
    "b = jax.numpy.array(np.zeros(shape=(output_dim,)))\n",
    "state = (W, b)\n",
    "for step in range(40):\n",
    "    loss, W, b = training_step(inputs, targets, W, b)\n",
    "    print(f\"Loss at step {step}: {loss:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### What makes the JAX approach unique"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Introduction to Keras"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### First steps with Keras"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Picking a backend framework"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n",
    "\n",
    "import keras"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Layers: The building blocks of deep learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The base `Layer` class in Keras"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "\n",
    "class SimpleDense(keras.Layer):\n",
    "    def __init__(self, units, activation=None):\n",
    "        super().__init__()\n",
    "        self.units = units\n",
    "        self.activation = activation\n",
    "\n",
    "    def build(self, input_shape):\n",
    "        batch_dim, input_dim = input_shape\n",
    "        self.W = self.add_weight(\n",
    "            shape=(input_dim, self.units), initializer=\"random_normal\"\n",
    "        )\n",
    "        self.b = self.add_weight(shape=(self.units,), initializer=\"zeros\")\n",
    "\n",
    "    def call(self, inputs):\n",
    "        y = keras.ops.matmul(inputs, self.W) + self.b\n",
    "        if self.activation is not None:\n",
    "            y = self.activation(y)\n",
    "        return y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "my_dense = SimpleDense(units=32, activation=keras.ops.relu)\n",
    "input_tensor = keras.ops.ones(shape=(2, 784))\n",
    "output_tensor = my_dense(input_tensor)\n",
    "print(output_tensor.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Automatic shape inference: Building layers on the fly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import layers\n",
    "\n",
    "layer = layers.Dense(32, activation=\"relu\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import models\n",
    "from keras import layers\n",
    "\n",
    "model = models.Sequential(\n",
    "    [\n",
    "        layers.Dense(32, activation=\"relu\"),\n",
    "        layers.Dense(32),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        SimpleDense(32, activation=\"relu\"),\n",
    "        SimpleDense(64, activation=\"relu\"),\n",
    "        SimpleDense(32, activation=\"relu\"),\n",
    "        SimpleDense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### From layers to models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The \"compile\" step: Configuring the learning process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([keras.layers.Dense(1)])\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"mean_squared_error\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=keras.optimizers.RMSprop(),\n",
    "    loss=keras.losses.MeanSquaredError(),\n",
    "    metrics=[keras.metrics.BinaryAccuracy()],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Picking a loss function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Understanding the fit method"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history = model.fit(\n",
    "    inputs,\n",
    "    targets,\n",
    "    epochs=5,\n",
    "    batch_size=128,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history.history"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Monitoring loss and metrics on validation data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([keras.layers.Dense(1)])\n",
    "model.compile(\n",
    "    optimizer=keras.optimizers.RMSprop(learning_rate=0.1),\n",
    "    loss=keras.losses.MeanSquaredError(),\n",
    "    metrics=[keras.metrics.BinaryAccuracy()],\n",
    ")\n",
    "\n",
    "indices_permutation = np.random.permutation(len(inputs))\n",
    "shuffled_inputs = inputs[indices_permutation]\n",
    "shuffled_targets = targets[indices_permutation]\n",
    "\n",
    "num_validation_samples = int(0.3 * len(inputs))\n",
    "val_inputs = shuffled_inputs[:num_validation_samples]\n",
    "val_targets = shuffled_targets[:num_validation_samples]\n",
    "training_inputs = shuffled_inputs[num_validation_samples:]\n",
    "training_targets = shuffled_targets[num_validation_samples:]\n",
    "model.fit(\n",
    "    training_inputs,\n",
    "    training_targets,\n",
    "    epochs=5,\n",
    "    batch_size=16,\n",
    "    validation_data=(val_inputs, val_targets),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Inference: Using a model after training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions = model.predict(val_inputs, batch_size=128)\n",
    "print(predictions[:10])"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "chapter03_introduction-to-ml-frameworks",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

================================================
FILE: chapter04_classification-and-regression.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Classification and regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Classifying movie reviews: A binary classification example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The IMDb dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import imdb\n",
    "\n",
    "(train_data, train_labels), (test_data, test_labels) = imdb.load_data(\n",
    "    num_words=10000\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_data[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_labels[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "max([max(sequence) for sequence in train_data])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "word_index = imdb.get_word_index()\n",
    "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n",
    "decoded_review = \" \".join(\n",
    "    [reverse_word_index.get(i - 3, \"?\") for i in train_data[0]]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "decoded_review[:100]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Preparing the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def multi_hot_encode(sequences, num_classes):\n",
    "    results = np.zeros((len(sequences), num_classes))\n",
    "    for i, sequence in enumerate(sequences):\n",
    "        results[i][sequence] = 1.0\n",
    "    return results\n",
    "\n",
    "x_train = multi_hot_encode(train_data, num_classes=10000)\n",
    "x_test = multi_hot_encode(test_data, num_classes=10000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x_train[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "y_train = train_labels.astype(\"float32\")\n",
    "y_test = test_labels.astype(\"float32\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Building your model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Validating your approach"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x_val = x_train[:10000]\n",
    "partial_x_train = x_train[10000:]\n",
    "y_val = y_train[:10000]\n",
    "partial_y_train = y_train[10000:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history = model.fit(\n",
    "    partial_x_train,\n",
    "    partial_y_train,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_data=(x_val, y_val),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history = model.fit(\n",
    "    x_train,\n",
    "    y_train,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history_dict = history.history\n",
    "history_dict.keys()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "history_dict = history.history\n",
    "loss_values = history_dict[\"loss\"]\n",
    "val_loss_values = history_dict[\"val_loss\"]\n",
    "epochs = range(1, len(loss_values) + 1)\n",
    "plt.plot(epochs, loss_values, \"r--\", label=\"Training loss\")\n",
    "plt.plot(epochs, val_loss_values, \"b\", label=\"Validation loss\")\n",
    "plt.title(\"[IMDB] Training and validation loss\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "plt.clf()\n",
    "acc = history_dict[\"accuracy\"]\n",
    "val_acc = history_dict[\"val_accuracy\"]\n",
    "plt.plot(epochs, acc, \"r--\", label=\"Training acc\")\n",
    "plt.plot(epochs, val_acc, \"b\", label=\"Validation acc\")\n",
    "plt.title(\"[IMDB] Training and validation accuracy\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Accuracy\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(x_train, y_train, epochs=4, batch_size=512)\n",
    "results = model.evaluate(x_test, y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using a trained model to generate predictions on new data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.predict(x_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Further experiments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Wrapping up"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Classifying newswires: A multiclass classification example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The Reuters dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import reuters\n",
    "\n",
    "(train_data, train_labels), (test_data, test_labels) = reuters.load_data(\n",
    "    num_words=10000\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "len(train_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "len(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_data[10]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "word_index = reuters.get_word_index()\n",
    "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n",
    "decoded_newswire = \" \".join(\n",
    "    [reverse_word_index.get(i - 3, \"?\") for i in train_data[10]]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_labels[10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Preparing the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x_train = multi_hot_encode(train_data, num_classes=10000)\n",
    "x_test = multi_hot_encode(test_data, num_classes=10000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def one_hot_encode(labels, num_classes=46):\n",
    "    results = np.zeros((len(labels), num_classes))\n",
    "    for i, label in enumerate(labels):\n",
    "        results[i, label] = 1.0\n",
    "    return results\n",
    "\n",
    "y_train = one_hot_encode(train_labels)\n",
    "y_test = one_hot_encode(test_labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.utils import to_categorical\n",
    "\n",
    "y_train = to_categorical(train_labels)\n",
    "y_test = to_categorical(test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Building your model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(46, activation=\"softmax\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "top_3_accuracy = keras.metrics.TopKCategoricalAccuracy(\n",
    "    k=3, name=\"top_3_accuracy\"\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\", top_3_accuracy],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Validating your approach"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "x_val = x_train[:1000]\n",
    "partial_x_train = x_train[1000:]\n",
    "y_val = y_train[:1000]\n",
    "partial_y_train = y_train[1000:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "history = model.fit(\n",
    "    partial_x_train,\n",
    "    partial_y_train,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_data=(x_val, y_val),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "loss = history.history[\"loss\"]\n",
    "val_loss = history.history[\"val_loss\"]\n",
    "epochs = range(1, len(loss) + 1)\n",
    "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
    "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
    "plt.title(\"Training and validation loss\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "plt.clf()\n",
    "acc = history.history[\"accuracy\"]\n",
    "val_acc = history.history[\"val_accuracy\"]\n",
    "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n",
    "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n",
    "plt.title(\"Training and validation accuracy\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Accuracy\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "plt.clf()\n",
    "acc = history.history[\"top_3_accuracy\"]\n",
    "val_acc = history.history[\"val_top_3_accuracy\"]\n",
    "plt.plot(epochs, acc, \"r--\", label=\"Training top-3 accuracy\")\n",
    "plt.plot(epochs, val_acc, \"b\", label=\"Validation top-3 accuracy\")\n",
    "plt.title(\"Training and validation top-3 accuracy\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Top-3 accuracy\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(46, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    x_train,\n",
    "    y_train,\n",
    "    epochs=9,\n",
    "    batch_size=512,\n",
    ")\n",
    "results = model.evaluate(x_test, y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import copy\n",
    "test_labels_copy = copy.copy(test_labels)\n",
    "np.random.shuffle(test_labels_copy)\n",
    "hits_array = np.array(test_labels == test_labels_copy)\n",
    "hits_array.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Generating predictions on new data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions = model.predict(x_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions[0].shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "np.sum(predictions[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "np.argmax(predictions[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### A different way to handle the labels and the loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "y_train = train_labels\n",
    "y_test = test_labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The importance of having sufficiently large intermediate layers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(4, activation=\"relu\"),\n",
    "        layers.Dense(46, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    partial_x_train,\n",
    "    partial_y_train,\n",
    "    epochs=20,\n",
    "    batch_size=128,\n",
    "    validation_data=(x_val, y_val),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Further experiments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Wrapping up"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Predicting house prices: A regression example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The California Housing Price dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import california_housing\n",
    "\n",
    "(train_data, train_targets), (test_data, test_targets) = (\n",
    "    california_housing.load_data(version=\"small\")\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_targets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Preparing the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "mean = train_data.mean(axis=0)\n",
    "std = train_data.std(axis=0)\n",
    "x_train = (train_data - mean) / std\n",
    "x_test = (test_data - mean) / std"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "y_train = train_targets / 100000\n",
    "y_test = test_targets / 100000"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Building your model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def get_model():\n",
    "    model = keras.Sequential(\n",
    "        [\n",
    "            layers.Dense(64, activation=\"relu\"),\n",
    "            layers.Dense(64, activation=\"relu\"),\n",
    "            layers.Dense(1),\n",
    "        ]\n",
    "    )\n",
    "    model.compile(\n",
    "        optimizer=\"adam\",\n",
    "        loss=\"mean_squared_error\",\n",
    "        metrics=[\"mean_absolute_error\"],\n",
    "    )\n",
    "    return model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Validating your approach using K-fold validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "k = 4\n",
    "num_val_samples = len(x_train) // k\n",
    "num_epochs = 50\n",
    "all_scores = []\n",
    "for i in range(k):\n",
    "    print(f\"Processing fold #{i + 1}\")\n",
    "    fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
    "    fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
    "    fold_x_train = np.concatenate(\n",
    "        [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n",
    "        axis=0,\n",
    "    )\n",
    "    fold_y_train = np.concatenate(\n",
    "        [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n",
    "        axis=0,\n",
    "    )\n",
    "    model = get_model()\n",
    "    model.fit(\n",
    "        fold_x_train,\n",
    "        fold_y_train,\n",
    "        epochs=num_epochs,\n",
    "        batch_size=16,\n",
    "        verbose=0,\n",
    "    )\n",
    "    scores = model.evaluate(fold_x_val, fold_y_val, verbose=0)\n",
    "    val_loss, val_mae = scores\n",
    "    all_scores.append(val_mae)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "[round(value, 3) for value in all_scores]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "round(np.mean(all_scores), 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "k = 4\n",
    "num_val_samples = len(x_train) // k\n",
    "num_epochs = 200\n",
    "all_mae_histories = []\n",
    "for i in range(k):\n",
    "    print(f\"Processing fold #{i + 1}\")\n",
    "    fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
    "    fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n",
    "    fold_x_train = np.concatenate(\n",
    "        [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n",
    "        axis=0,\n",
    "    )\n",
    "    fold_y_train = np.concatenate(\n",
    "        [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n",
    "        axis=0,\n",
    "    )\n",
    "    model = get_model()\n",
    "    history = model.fit(\n",
    "        fold_x_train,\n",
    "        fold_y_train,\n",
    "        validation_data=(fold_x_val, fold_y_val),\n",
    "        epochs=num_epochs,\n",
    "        batch_size=16,\n",
    "        verbose=0,\n",
    "    )\n",
    "    mae_history = history.history[\"val_mean_absolute_error\"]\n",
    "    all_mae_histories.append(mae_history)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "average_mae_history = [\n",
    "    np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "epochs = range(1, len(average_mae_history) + 1)\n",
    "plt.plot(epochs, average_mae_history)\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Validation MAE\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "truncated_mae_history = average_mae_history[10:]\n",
    "epochs = range(10, len(truncated_mae_history) + 10)\n",
    "plt.plot(epochs, truncated_mae_history)\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Validation MAE\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = get_model()\n",
    "model.fit(x_train, y_train, epochs=130, batch_size=16, verbose=0)\n",
    "test_mean_squared_error, test_mean_absolute_error = model.evaluate(\n",
    "    x_test, y_test\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "round(test_mean_absolute_error, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Generating predictions on new data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "predictions = model.predict(x_test)\n",
    "predictions[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Wrapping up"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "chapter04_classification-and-regression",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

================================================
FILE: chapter05_fundamentals-of-ml.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Fundamentals of machine learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Generalization: The goal of machine learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Underfitting and overfitting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Noisy training data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Ambiguous features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Rare features and spurious correlations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import mnist\n",
    "import numpy as np\n",
    "\n",
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "train_images_with_noise_channels = np.concatenate(\n",
    "    [train_images, np.random.random((len(train_images), 784))], axis=1\n",
    ")\n",
    "\n",
    "train_images_with_zeros_channels = np.concatenate(\n",
    "    [train_images, np.zeros((len(train_images), 784))], axis=1\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "def get_model():\n",
    "    model = keras.Sequential(\n",
    "        [\n",
    "            layers.Dense(512, activation=\"relu\"),\n",
    "            layers.Dense(10, activation=\"softmax\"),\n",
    "        ]\n",
    "    )\n",
    "    model.compile(\n",
    "        optimizer=\"adam\",\n",
    "        loss=\"sparse_categorical_crossentropy\",\n",
    "        metrics=[\"accuracy\"],\n",
    "    )\n",
    "    return model\n",
    "\n",
    "model = get_model()\n",
    "history_noise = model.fit(\n",
    "    train_images_with_noise_channels,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2,\n",
    ")\n",
    "\n",
    "model = get_model()\n",
    "history_zeros = model.fit(\n",
    "    train_images_with_zeros_channels,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "val_acc_noise = history_noise.history[\"val_accuracy\"]\n",
    "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n",
    "epochs = range(1, 11)\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    val_acc_noise,\n",
    "    \"b-\",\n",
    "    label=\"Validation accuracy with noise channels\",\n",
    ")\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    val_acc_zeros,\n",
    "    \"r--\",\n",
    "    label=\"Validation accuracy with zeros channels\",\n",
    ")\n",
    "plt.title(\"Effect of noise channels on validation accuracy\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.xticks(epochs)\n",
    "plt.ylabel(\"Accuracy\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The nature of generalization in deep learning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "random_train_labels = train_labels[:]\n",
    "np.random.shuffle(random_train_labels)\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    random_train_labels,\n",
    "    epochs=100,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The manifold hypothesis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Interpolation as a source of generalization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Why deep learning works"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Training data is paramount"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Evaluating machine-learning models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Training, validation, and test sets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Simple hold-out validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### K-fold validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Iterated K-fold validation with shuffling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Beating a common-sense baseline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Things to keep in mind about model evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Improving model fit"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Tuning key gradient descent parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "(train_images, train_labels), _ = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28 * 28))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=keras.optimizers.RMSprop(learning_rate=1.0),\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=keras.optimizers.RMSprop(learning_rate=1e-2),\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using better architecture priors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Increasing model capacity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_small_model = model.fit(\n",
    "    train_images, train_labels, epochs=20, batch_size=128, validation_split=0.2\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "val_loss = history_small_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
    "plt.title(\"Validation loss for a model with insufficient capacity\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(128, activation=\"relu\"),\n",
    "        layers.Dense(128, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_large_model = model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=128,\n",
    "    validation_split=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "val_loss = history_large_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
    "plt.title(\"Validation loss for a model with appropriate capacity\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(2048, activation=\"relu\"),\n",
    "        layers.Dense(2048, activation=\"relu\"),\n",
    "        layers.Dense(2048, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_very_large_model = model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=32,\n",
    "    validation_split=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "val_loss = history_very_large_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n",
    "plt.title(\"Validation loss for a model with too much capacity\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Improving generalization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Dataset curation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Feature engineering"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using early stopping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Regularizing your model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Reducing the network's size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import imdb\n",
    "\n",
    "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n",
    "\n",
    "def vectorize_sequences(sequences, dimension=10000):\n",
    "    results = np.zeros((len(sequences), dimension))\n",
    "    for i, sequence in enumerate(sequences):\n",
    "        results[i, sequence] = 1.0\n",
    "    return results\n",
    "\n",
    "train_data = vectorize_sequences(train_data)\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_original = model.fit(\n",
    "    train_data,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.4,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(4, activation=\"relu\"),\n",
    "        layers.Dense(4, activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_smaller_model = model.fit(\n",
    "    train_data,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.4,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "original_val_loss = history_original.history[\"val_loss\"]\n",
    "smaller_model_val_loss = history_smaller_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    original_val_loss,\n",
    "    \"r--\",\n",
    "    label=\"Validation loss of original model\",\n",
    ")\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    smaller_model_val_loss,\n",
    "    \"b-\",\n",
    "    label=\"Validation loss of smaller model\",\n",
    ")\n",
    "plt.title(\"Original model vs. smaller model (IMDB review classification)\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.xticks(epochs)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(512, activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_larger_model = model.fit(\n",
    "    train_data,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.4,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "original_val_loss = history_original.history[\"val_loss\"]\n",
    "larger_model_val_loss = history_larger_model.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    original_val_loss,\n",
    "    \"r--\",\n",
    "    label=\"Validation loss of original model\",\n",
    ")\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    larger_model_val_loss,\n",
    "    \"b-\",\n",
    "    label=\"Validation loss of larger model\",\n",
    ")\n",
    "plt.title(\"Original model vs. larger model (IMDB review classification)\")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.xticks(epochs)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Adding weight regularization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.regularizers import l2\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n",
    "        layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_l2_reg = model.fit(\n",
    "    train_data,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.4,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "original_val_loss = history_original.history[\"val_loss\"]\n",
    "l2_val_loss = history_l2_reg.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    original_val_loss,\n",
    "    \"r--\",\n",
    "    label=\"Validation loss of original model\",\n",
    ")\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    l2_val_loss,\n",
    "    \"b-\",\n",
    "    label=\"Validation loss of L2-regularized model\",\n",
    ")\n",
    "plt.title(\n",
    "    \"Original model vs. L2-regularized model (IMDB review classification)\"\n",
    ")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.xticks(epochs)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import regularizers\n",
    "\n",
    "regularizers.l1(0.001)\n",
    "regularizers.l1_l2(l1=0.001, l2=0.001)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Adding dropout"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dropout(0.5),\n",
    "        layers.Dense(16, activation=\"relu\"),\n",
    "        layers.Dropout(0.5),\n",
    "        layers.Dense(1, activation=\"sigmoid\"),\n",
    "    ]\n",
    ")\n",
    "model.compile(\n",
    "    optimizer=\"rmsprop\",\n",
    "    loss=\"binary_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "history_dropout = model.fit(\n",
    "    train_data,\n",
    "    train_labels,\n",
    "    epochs=20,\n",
    "    batch_size=512,\n",
    "    validation_split=0.4,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "original_val_loss = history_original.history[\"val_loss\"]\n",
    "dropout_val_loss = history_dropout.history[\"val_loss\"]\n",
    "epochs = range(1, 21)\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    original_val_loss,\n",
    "    \"r--\",\n",
    "    label=\"Validation loss of original model\",\n",
    ")\n",
    "plt.plot(\n",
    "    epochs,\n",
    "    dropout_val_loss,\n",
    "    \"b-\",\n",
    "    label=\"Validation loss of dropout-regularized model\",\n",
    ")\n",
    "plt.title(\n",
    "    \"Original model vs. dropout-regularized model (IMDB review classification)\"\n",
    ")\n",
    "plt.xlabel(\"Epochs\")\n",
    "plt.ylabel(\"Loss\")\n",
    "plt.xticks(epochs)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "chapter05_fundamentals-of-ml",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

================================================
FILE: chapter07_deep-dive-keras.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## A deep dive on Keras"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### A spectrum of workflows"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Different ways to build Keras models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The Sequential model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "model = keras.Sequential(\n",
    "    [\n",
    "        layers.Dense(64, activation=\"relu\"),\n",
    "        layers.Dense(10, activation=\"softmax\"),\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential()\n",
    "model.add(layers.Dense(64, activation=\"relu\"))\n",
    "model.add(layers.Dense(10, activation=\"softmax\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.build(input_shape=(None, 3))\n",
    "model.weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential(name=\"my_example_model\")\n",
    "model.add(layers.Dense(64, activation=\"relu\", name=\"my_first_layer\"))\n",
    "model.add(layers.Dense(10, activation=\"softmax\", name=\"my_last_layer\"))\n",
    "model.build((None, 3))\n",
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.Sequential()\n",
    "model.add(keras.Input(shape=(3,)))\n",
    "model.add(layers.Dense(64, activation=\"relu\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.add(layers.Dense(10, activation=\"softmax\"))\n",
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The Functional API"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A simple example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(3,), name=\"my_input\")\n",
    "features = layers.Dense(64, activation=\"relu\")(inputs)\n",
    "outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(3,), name=\"my_input\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "features = layers.Dense(64, activation=\"relu\")(inputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "features.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Multi-input, multi-output models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "vocabulary_size = 10000\n",
    "num_tags = 100\n",
    "num_departments = 4\n",
    "\n",
    "title = keras.Input(shape=(vocabulary_size,), name=\"title\")\n",
    "text_body = keras.Input(shape=(vocabulary_size,), name=\"text_body\")\n",
    "tags = keras.Input(shape=(num_tags,), name=\"tags\")\n",
    "\n",
    "features = layers.Concatenate()([title, text_body, tags])\n",
    "features = layers.Dense(64, activation=\"relu\", name=\"dense_features\")(features)\n",
    "\n",
    "priority = layers.Dense(1, activation=\"sigmoid\", name=\"priority\")(features)\n",
    "department = layers.Dense(\n",
    "    num_departments, activation=\"softmax\", name=\"department\"\n",
    ")(features)\n",
    "\n",
    "model = keras.Model(\n",
    "    inputs=[title, text_body, tags],\n",
    "    outputs=[priority, department],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Training a multi-input, multi-output model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "num_samples = 1280\n",
    "\n",
    "title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n",
    "text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n",
    "tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))\n",
    "\n",
    "priority_data = np.random.random(size=(num_samples, 1))\n",
    "department_data = np.random.randint(0, num_departments, size=(num_samples, 1))\n",
    "\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n",
    "    metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n",
    ")\n",
    "model.fit(\n",
    "    [title_data, text_body_data, tags_data],\n",
    "    [priority_data, department_data],\n",
    "    epochs=1,\n",
    ")\n",
    "model.evaluate(\n",
    "    [title_data, text_body_data, tags_data], [priority_data, department_data]\n",
    ")\n",
    "priority_preds, department_preds = model.predict(\n",
    "    [title_data, text_body_data, tags_data]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss={\n",
    "        \"priority\": \"mean_squared_error\",\n",
    "        \"department\": \"sparse_categorical_crossentropy\",\n",
    "    },\n",
    "    metrics={\n",
    "        \"priority\": [\"mean_absolute_error\"],\n",
    "        \"department\": [\"accuracy\"],\n",
    "    },\n",
    ")\n",
    "model.fit(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
    "    {\"priority\": priority_data, \"department\": department_data},\n",
    "    epochs=1,\n",
    ")\n",
    "model.evaluate(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
    "    {\"priority\": priority_data, \"department\": department_data},\n",
    ")\n",
    "priority_preds, department_preds = model.predict(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The power of the Functional API: Access to layer connectivity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Plotting layer connectivity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "keras.utils.plot_model(model, \"ticket_classifier.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "keras.utils.plot_model(\n",
    "    model,\n",
    "    \"ticket_classifier_with_shape_info.png\",\n",
    "    show_shapes=True,\n",
    "    show_layer_names=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "###### Feature extraction with a Functional model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.layers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.layers[3].input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.layers[3].output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "features = model.layers[4].output\n",
    "difficulty = layers.Dense(3, activation=\"softmax\", name=\"difficulty\")(features)\n",
    "\n",
    "new_model = keras.Model(\n",
    "    inputs=[title, text_body, tags], outputs=[priority, department, difficulty]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "keras.utils.plot_model(\n",
    "    new_model,\n",
    "    \"updated_ticket_classifier.png\",\n",
    "    show_shapes=True,\n",
    "    show_layer_names=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Subclassing the Model class"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Rewriting our previous example as a subclassed model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "class CustomerTicketModel(keras.Model):\n",
    "    def __init__(self, num_departments):\n",
    "        super().__init__()\n",
    "        self.concat_layer = layers.Concatenate()\n",
    "        self.mixing_layer = layers.Dense(64, activation=\"relu\")\n",
    "        self.priority_scorer = layers.Dense(1, activation=\"sigmoid\")\n",
    "        self.department_classifier = layers.Dense(\n",
    "            num_departments, activation=\"softmax\"\n",
    "        )\n",
    "\n",
    "    def call(self, inputs):\n",
    "        title = inputs[\"title\"]\n",
    "        text_body = inputs[\"text_body\"]\n",
    "        tags = inputs[\"tags\"]\n",
    "\n",
    "        features = self.concat_layer([title, text_body, tags])\n",
    "        features = self.mixing_layer(features)\n",
    "        priority = self.priority_scorer(features)\n",
    "        department = self.department_classifier(features)\n",
    "        return priority, department"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = CustomerTicketModel(num_departments=4)\n",
    "\n",
    "priority, department = model(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n",
    "    metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n",
    ")\n",
    "model.fit(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
    "    [priority_data, department_data],\n",
    "    epochs=1,\n",
    ")\n",
    "model.evaluate(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n",
    "    [priority_data, department_data],\n",
    ")\n",
    "priority_preds, department_preds = model.predict(\n",
    "    {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Beware: What subclassed models don't support"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Mixing and matching different components"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "class Classifier(keras.Model):\n",
    "    def __init__(self, num_classes=2):\n",
    "        super().__init__()\n",
    "        if num_classes == 2:\n",
    "            num_units = 1\n",
    "            activation = \"sigmoid\"\n",
    "        else:\n",
    "            num_units = num_classes\n",
    "            activation = \"softmax\"\n",
    "        self.dense = layers.Dense(num_units, activation=activation)\n",
    "\n",
    "    def call(self, inputs):\n",
    "        return self.dense(inputs)\n",
    "\n",
    "inputs = keras.Input(shape=(3,))\n",
    "features = layers.Dense(64, activation=\"relu\")(inputs)\n",
    "outputs = Classifier(num_classes=10)(features)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(64,))\n",
    "outputs = layers.Dense(1, activation=\"sigmoid\")(inputs)\n",
    "binary_classifier = keras.Model(inputs=inputs, outputs=outputs)\n",
    "\n",
    "class MyModel(keras.Model):\n",
    "    def __init__(self, num_classes=2):\n",
    "        super().__init__()\n",
    "        self.dense = layers.Dense(64, activation=\"relu\")\n",
    "        self.classifier = binary_classifier\n",
    "\n",
    "    def call(self, inputs):\n",
    "        features = self.dense(inputs)\n",
    "        return self.classifier(features)\n",
    "\n",
    "model = MyModel()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Remember: Use the right tool for the job"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Using built-in training and evaluation loops"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import mnist\n",
    "\n",
    "def get_mnist_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = keras.Model(inputs, outputs)\n",
    "    return model\n",
    "\n",
    "(images, labels), (test_images, test_labels) = mnist.load_data()\n",
    "images = images.reshape((60000, 28 * 28)).astype(\"float32\") / 255\n",
    "test_images = test_images.reshape((10000, 28 * 28)).astype(\"float32\") / 255\n",
    "train_images, val_images = images[10000:], images[:10000]\n",
    "train_labels, val_labels = labels[10000:], labels[:10000]\n",
    "\n",
    "model = get_mnist_model()\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=3,\n",
    "    validation_data=(val_images, val_labels),\n",
    ")\n",
    "test_metrics = model.evaluate(test_images, test_labels)\n",
    "predictions = model.predict(test_images)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Writing your own metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import ops\n",
    "\n",
    "class RootMeanSquaredError(keras.metrics.Metric):\n",
    "    def __init__(self, name=\"rmse\", **kwargs):\n",
    "        super().__init__(name=name, **kwargs)\n",
    "        self.mse_sum = self.add_weight(name=\"mse_sum\", initializer=\"zeros\")\n",
    "        self.total_samples = self.add_weight(\n",
    "            name=\"total_samples\", initializer=\"zeros\"\n",
    "        )\n",
    "\n",
    "    def update_state(self, y_true, y_pred, sample_weight=None):\n",
    "        y_true = ops.one_hot(y_true, num_classes=ops.shape(y_pred)[1])\n",
    "        mse = ops.sum(ops.square(y_true - y_pred))\n",
    "        self.mse_sum.assign_add(mse)\n",
    "        num_samples = ops.shape(y_pred)[0]\n",
    "        self.total_samples.assign_add(num_samples)\n",
    "\n",
    "    def result(self):\n",
    "        return ops.sqrt(self.mse_sum / self.total_samples)\n",
    "\n",
    "    def reset_state(self):\n",
    "        self.mse_sum.assign(0.)\n",
    "        self.total_samples.assign(0.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = get_mnist_model()\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\", RootMeanSquaredError()],\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=3,\n",
    "    validation_data=(val_images, val_labels),\n",
    ")\n",
    "test_metrics = model.evaluate(test_images, test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using callbacks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### The EarlyStopping and ModelCheckpoint callbacks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "callbacks_list = [\n",
    "    keras.callbacks.EarlyStopping(\n",
    "        monitor=\"accuracy\",\n",
    "        patience=1,\n",
    "    ),\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"checkpoint_path.keras\",\n",
    "        monitor=\"val_loss\",\n",
    "        save_best_only=True,\n",
    "    ),\n",
    "]\n",
    "model = get_mnist_model()\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    callbacks=callbacks_list,\n",
    "    validation_data=(val_images, val_labels),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = keras.models.load_model(\"checkpoint_path.keras\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Writing your own callbacks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from matplotlib import pyplot as plt\n",
    "\n",
    "class LossHistory(keras.callbacks.Callback):\n",
    "    def on_train_begin(self, logs):\n",
    "        self.per_batch_losses = []\n",
    "\n",
    "    def on_batch_end(self, batch, logs):\n",
    "        self.per_batch_losses.append(logs.get(\"loss\"))\n",
    "\n",
    "    def on_epoch_end(self, epoch, logs):\n",
    "        plt.clf()\n",
    "        plt.plot(\n",
    "            range(len(self.per_batch_losses)),\n",
    "            self.per_batch_losses,\n",
    "            label=\"Training loss for each batch\",\n",
    "        )\n",
    "        plt.xlabel(f\"Batch (epoch {epoch})\")\n",
    "        plt.ylabel(\"Loss\")\n",
    "        plt.legend()\n",
    "        plt.savefig(f\"plot_at_epoch_{epoch}\", dpi=300)\n",
    "        self.per_batch_losses = []"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = get_mnist_model()\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    callbacks=[LossHistory()],\n",
    "    validation_data=(val_images, val_labels),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Monitoring and visualization with TensorBoard"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model = get_mnist_model()\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "\n",
    "tensorboard = keras.callbacks.TensorBoard(\n",
    "    log_dir=\"/full_path_to_your_log_dir\",\n",
    ")\n",
    "model.fit(\n",
    "    train_images,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    validation_data=(val_images, val_labels),\n",
    "    callbacks=[tensorboard],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%load_ext tensorboard\n",
    "%tensorboard --logdir /full_path_to_your_log_dir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Writing your own training and evaluation loops"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Training vs. inference"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Writing custom training step functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A TensorFlow training step function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "import tensorflow as tf\n",
    "\n",
    "model = get_mnist_model()\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "optimizer = keras.optimizers.Adam()\n",
    "\n",
    "def train_step(inputs, targets):\n",
    "    with tf.GradientTape() as tape:\n",
    "        predictions = model(inputs, training=True)\n",
    "        loss = loss_fn(targets, predictions)\n",
    "    gradients = tape.gradient(loss, model.trainable_weights)\n",
    "    optimizer.apply(gradients, model.trainable_weights)\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "batch_size = 32\n",
    "inputs = train_images[:batch_size]\n",
    "targets = train_labels[:batch_size]\n",
    "loss = train_step(inputs, targets)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A PyTorch training step function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "import torch\n",
    "\n",
    "model = get_mnist_model()\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "optimizer = keras.optimizers.Adam()\n",
    "\n",
    "def train_step(inputs, targets):\n",
    "    predictions = model(inputs, training=True)\n",
    "    loss = loss_fn(targets, predictions)\n",
    "    loss.backward()\n",
    "    gradients = [weight.value.grad for weight in model.trainable_weights]\n",
    "    with torch.no_grad():\n",
    "        optimizer.apply(gradients, model.trainable_weights)\n",
    "    model.zero_grad()\n",
    "    return loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "batch_size = 32\n",
    "inputs = train_images[:batch_size]\n",
    "targets = train_labels[:batch_size]\n",
    "loss = train_step(inputs, targets)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### A JAX training step function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "model = get_mnist_model()\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "\n",
    "def compute_loss_and_updates(\n",
    "    trainable_variables, non_trainable_variables, inputs, targets\n",
    "):\n",
    "    outputs, non_trainable_variables = model.stateless_call(\n",
    "        trainable_variables, non_trainable_variables, inputs, training=True\n",
    "    )\n",
    "    loss = loss_fn(targets, outputs)\n",
    "    return loss, non_trainable_variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "import jax\n",
    "\n",
    "grad_fn = jax.value_and_grad(compute_loss_and_updates, has_aux=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "optimizer = keras.optimizers.Adam()\n",
    "optimizer.build(model.trainable_variables)\n",
    "\n",
    "def train_step(state, inputs, targets):\n",
    "    (trainable_variables, non_trainable_variables, optimizer_variables) = state\n",
    "    (loss, non_trainable_variables), grads = grad_fn(\n",
    "        trainable_variables, non_trainable_variables, inputs, targets\n",
    "    )\n",
    "    trainable_variables, optimizer_variables = optimizer.stateless_apply(\n",
    "        optimizer_variables, grads, trainable_variables\n",
    "    )\n",
    "    return loss, (\n",
    "        trainable_variables,\n",
    "        non_trainable_variables,\n",
    "        optimizer_variables,\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "batch_size = 32\n",
    "inputs = train_images[:batch_size]\n",
    "targets = train_labels[:batch_size]\n",
    "\n",
    "trainable_variables = [v.value for v in model.trainable_variables]\n",
    "non_trainable_variables = [v.value for v in model.non_trainable_variables]\n",
    "optimizer_variables = [v.value for v in optimizer.variables]\n",
    "\n",
    "state = (trainable_variables, non_trainable_variables, optimizer_variables)\n",
    "loss, state = train_step(state, inputs, targets)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Low-level usage of metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras import ops\n",
    "\n",
    "metric = keras.metrics.SparseCategoricalAccuracy()\n",
    "targets = ops.array([0, 1, 2])\n",
    "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n",
    "metric.update_state(targets, predictions)\n",
    "current_result = metric.result()\n",
    "print(f\"result: {current_result:.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "values = ops.array([0, 1, 2, 3, 4])\n",
    "mean_tracker = keras.metrics.Mean()\n",
    "for value in values:\n",
    "    mean_tracker.update_state(value)\n",
    "print(f\"Mean of values: {mean_tracker.result():.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "metric = keras.metrics.SparseCategoricalAccuracy()\n",
    "targets = ops.array([0, 1, 2])\n",
    "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n",
    "\n",
    "metric_variables = metric.variables\n",
    "metric_variables = metric.stateless_update_state(\n",
    "    metric_variables, targets, predictions\n",
    ")\n",
    "current_result = metric.stateless_result(metric_variables)\n",
    "print(f\"result: {current_result:.2f}\")\n",
    "\n",
    "metric_variables = metric.stateless_reset_state()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using fit() with a custom training loop"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Customizing fit() with TensorFlow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "loss_tracker = keras.metrics.Mean(name=\"loss\")\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def train_step(self, data):\n",
    "        inputs, targets = data\n",
    "        with tf.GradientTape() as tape:\n",
    "            predictions = self(inputs, training=True)\n",
    "            loss = loss_fn(targets, predictions)\n",
    "        gradients = tape.gradient(loss, self.trainable_weights)\n",
    "        self.optimizer.apply(gradients, self.trainable_weights)\n",
    "\n",
    "        loss_tracker.update_state(loss)\n",
    "        return {\"loss\": loss_tracker.result()}\n",
    "\n",
    "    @property\n",
    "    def metrics(self):\n",
    "        return [loss_tracker]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "def get_custom_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = CustomModel(inputs, outputs)\n",
    "    model.compile(optimizer=keras.optimizers.Adam())\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "model = get_custom_model()\n",
    "model.fit(train_images, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Customizing fit() with PyTorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "loss_tracker = keras.metrics.Mean(name=\"loss\")\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def train_step(self, data):\n",
    "        inputs, targets = data\n",
    "        predictions = self(inputs, training=True)\n",
    "        loss = loss_fn(targets, predictions)\n",
    "\n",
    "        loss.backward()\n",
    "        trainable_weights = [v for v in self.trainable_weights]\n",
    "        gradients = [v.value.grad for v in trainable_weights]\n",
    "\n",
    "        with torch.no_grad():\n",
    "            self.optimizer.apply(gradients, trainable_weights)\n",
    "\n",
    "        loss_tracker.update_state(loss)\n",
    "        return {\"loss\": loss_tracker.result()}\n",
    "\n",
    "    @property\n",
    "    def metrics(self):\n",
    "        return [loss_tracker]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "def get_custom_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = CustomModel(inputs, outputs)\n",
    "    model.compile(optimizer=keras.optimizers.Adam())\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "model = get_custom_model()\n",
    "model.fit(train_images, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Customizing fit() with JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def compute_loss_and_updates(\n",
    "        self,\n",
    "        trainable_variables,\n",
    "        non_trainable_variables,\n",
    "        inputs,\n",
    "        targets,\n",
    "        training=False,\n",
    "    ):\n",
    "        predictions, non_trainable_variables = self.stateless_call(\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            inputs,\n",
    "            training=training,\n",
    "        )\n",
    "        loss = loss_fn(targets, predictions)\n",
    "        return loss, non_trainable_variables\n",
    "\n",
    "    def train_step(self, state, data):\n",
    "        (\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            optimizer_variables,\n",
    "            metrics_variables,\n",
    "        ) = state\n",
    "        inputs, targets = data\n",
    "\n",
    "        grad_fn = jax.value_and_grad(\n",
    "            self.compute_loss_and_updates, has_aux=True\n",
    "        )\n",
    "\n",
    "        (loss, non_trainable_variables), grads = grad_fn(\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            inputs,\n",
    "            targets,\n",
    "            training=True,\n",
    "        )\n",
    "\n",
    "        (\n",
    "            trainable_variables,\n",
    "            optimizer_variables,\n",
    "        ) = self.optimizer.stateless_apply(\n",
    "            optimizer_variables, grads, trainable_variables\n",
    "        )\n",
    "\n",
    "        logs = {\"loss\": loss}\n",
    "        state = (\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            optimizer_variables,\n",
    "            metrics_variables,\n",
    "        )\n",
    "        return logs, state"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "def get_custom_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = CustomModel(inputs, outputs)\n",
    "    model.compile(optimizer=keras.optimizers.Adam())\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "model = get_custom_model()\n",
    "model.fit(train_images, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Handling metrics in a custom train_step()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### train_step() metrics handling with TensorFlow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def train_step(self, data):\n",
    "        inputs, targets = data\n",
    "        with tf.GradientTape() as tape:\n",
    "            predictions = self(inputs, training=True)\n",
    "            loss = self.compute_loss(y=targets, y_pred=predictions)\n",
    "\n",
    "        gradients = tape.gradient(loss, self.trainable_weights)\n",
    "        self.optimizer.apply(gradients, self.trainable_weights)\n",
    "\n",
    "        for metric in self.metrics:\n",
    "            if metric.name == \"loss\":\n",
    "                metric.update_state(loss)\n",
    "            else:\n",
    "                metric.update_state(targets, predictions)\n",
    "\n",
    "        return {m.name: m.result() for m in self.metrics}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend tensorflow\n",
    "def get_custom_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = CustomModel(inputs, outputs)\n",
    "    model.compile(\n",
    "        optimizer=keras.optimizers.Adam(),\n",
    "        loss=keras.losses.SparseCategoricalCrossentropy(),\n",
    "        metrics=[keras.metrics.SparseCategoricalAccuracy()],\n",
    "    )\n",
    "    return model\n",
    "\n",
    "model = get_custom_model()\n",
    "model.fit(train_images, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### train_step() metrics handling with PyTorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def train_step(self, data):\n",
    "        inputs, targets = data\n",
    "        predictions = self(inputs, training=True)\n",
    "        loss = self.compute_loss(y=targets, y_pred=predictions)\n",
    "\n",
    "        loss.backward()\n",
    "        trainable_weights = [v for v in self.trainable_weights]\n",
    "        gradients = [v.value.grad for v in trainable_weights]\n",
    "\n",
    "        with torch.no_grad():\n",
    "            self.optimizer.apply(gradients, trainable_weights)\n",
    "\n",
    "        for metric in self.metrics:\n",
    "            if metric.name == \"loss\":\n",
    "                metric.update_state(loss)\n",
    "            else:\n",
    "                metric.update_state(targets, predictions)\n",
    "\n",
    "        return {m.name: m.result() for m in self.metrics}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend torch\n",
    "def get_custom_model():\n",
    "    inputs = keras.Input(shape=(28 * 28,))\n",
    "    features = layers.Dense(512, activation=\"relu\")(inputs)\n",
    "    features = layers.Dropout(0.5)(features)\n",
    "    outputs = layers.Dense(10, activation=\"softmax\")(features)\n",
    "    model = CustomModel(inputs, outputs)\n",
    "    model.compile(\n",
    "        optimizer=keras.optimizers.Adam(),\n",
    "        loss=keras.losses.SparseCategoricalCrossentropy(),\n",
    "        metrics=[keras.metrics.SparseCategoricalAccuracy()],\n",
    "    )\n",
    "    return model\n",
    "\n",
    "model = get_custom_model()\n",
    "model.fit(train_images, train_labels, epochs=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### train_step() metrics handling with JAX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "%%backend jax\n",
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "class CustomModel(keras.Model):\n",
    "    def compute_loss_and_updates(\n",
    "        self,\n",
    "        trainable_variables,\n",
    "        non_trainable_variables,\n",
    "        inputs,\n",
    "        targets,\n",
    "        training=False,\n",
    "    ):\n",
    "        predictions, non_trainable_variables = self.stateless_call(\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            inputs,\n",
    "            training=training,\n",
    "        )\n",
    "        loss = self.compute_loss(y=targets, y_pred=predictions)\n",
    "        return loss, (predictions, non_trainable_variables)\n",
    "\n",
    "    def train_step(self, state, data):\n",
    "        (\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            optimizer_variables,\n",
    "            metrics_variables,\n",
    "        ) = state\n",
    "        inputs, targets = data\n",
    "\n",
    "        grad_fn = jax.value_and_grad(\n",
    "            self.compute_loss_and_updates, has_aux=True\n",
    "        )\n",
    "\n",
    "        (loss, (predictions, non_trainable_variables)), grads = grad_fn(\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            inputs,\n",
    "            targets,\n",
    "            training=True,\n",
    "        )\n",
    "        (\n",
    "            trainable_variables,\n",
    "            optimizer_variables,\n",
    "        ) = self.optimizer.stateless_apply(\n",
    "            optimizer_variables, grads, trainable_variables\n",
    "        )\n",
    "\n",
    "        new_metrics_vars = []\n",
    "        logs = {}\n",
    "        for metric in self.metrics:\n",
    "            num_prev = len(new_metrics_vars)\n",
    "            num_current = len(metric.variables)\n",
    "            current_vars = metrics_variables[num_prev : num_prev + num_current]\n",
    "            if metric.name == \"loss\":\n",
    "                current_vars = metric.stateless_update_state(current_vars, loss)\n",
    "            else:\n",
    "                current_vars = metric.stateless_update_state(\n",
    "                    current_vars, targets, predictions\n",
    "                )\n",
    "            logs[metric.name] = metric.stateless_result(current_vars)\n",
    "            new_metrics_vars += current_vars\n",
    "\n",
    "        state = (\n",
    "            trainable_variables,\n",
    "            non_trainable_variables,\n",
    "            optimizer_variables,\n",
    "            new_metrics_vars,\n",
    "        )\n",
    "        return logs, state"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "chapter07_deep-dive-keras",
   "private_outputs": false,
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

================================================
FILE: chapter08_image-classification.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "!pip install keras keras-hub --upgrade -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"KERAS_BACKEND\"] = \"jax\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "cellView": "form",
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "# @title\n",
    "import os\n",
    "from IPython.core.magic import register_cell_magic\n",
    "\n",
    "@register_cell_magic\n",
    "def backend(line, cell):\n",
    "    current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n",
    "    if current == required:\n",
    "        get_ipython().run_cell(cell)\n",
    "    else:\n",
    "        print(\n",
    "            f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n",
    "            f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Image classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Introduction to ConvNets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "inputs = keras.Input(shape=(28, 28, 1))\n",
    "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.GlobalAveragePooling2D()(x)\n",
    "outputs = layers.Dense(10, activation=\"softmax\")(x)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.datasets import mnist\n",
    "\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
    "train_images = train_images.reshape((60000, 28, 28, 1))\n",
    "train_images = train_images.astype(\"float32\") / 255\n",
    "test_images = test_images.reshape((10000, 28, 28, 1))\n",
    "test_images = test_images.astype(\"float32\") / 255\n",
    "model.compile(\n",
    "    optimizer=\"adam\",\n",
    "    loss=\"sparse_categorical_crossentropy\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "model.fit(train_images, train_labels, epochs=5, batch_size=64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
    "print(f\"Test accuracy: {test_acc:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The convolution operation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Understanding border effects and padding"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Understanding convolution strides"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The max-pooling operation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(28, 28, 1))\n",
    "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n",
    "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.GlobalAveragePooling2D()(x)\n",
    "outputs = layers.Dense(10, activation=\"softmax\")(x)\n",
    "model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model_no_max_pool.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Training a ConvNet from scratch on a small dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### The relevance of deep learning for small-data problems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Downloading the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import kagglehub\n",
    "\n",
    "kagglehub.login()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "download_path = kagglehub.competition_download(\"dogs-vs-cats\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import zipfile\n",
    "\n",
    "with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n",
    "    zip_ref.extractall(\".\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import os, shutil, pathlib\n",
    "\n",
    "original_dir = pathlib.Path(\"train\")\n",
    "new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n",
    "\n",
    "def make_subset(subset_name, start_index, end_index):\n",
    "    for category in (\"cat\", \"dog\"):\n",
    "        dir = new_base_dir / subset_name / category\n",
    "        os.makedirs(dir)\n",
    "        fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n",
    "        for fname in fnames:\n",
    "            shutil.copyfile(src=original_dir / fname, dst=dir / fname)\n",
    "\n",
    "make_subset(\"train\", start_index=0, end_index=1000)\n",
    "make_subset(\"validation\", start_index=1000, end_index=1500)\n",
    "make_subset(\"test\", start_index=1500, end_index=2500)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Building your model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras import layers\n",
    "\n",
    "inputs = keras.Input(shape=(180, 180, 3))\n",
    "x = layers.Rescaling(1.0 / 255)(inputs)\n",
    "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.GlobalAveragePooling2D()(x)\n",
    "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.summary(line_length=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    loss=\"binary_crossentropy\",\n",
    "    optimizer=\"adam\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Data preprocessing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "from keras.utils import image_dataset_from_directory\n",
    "\n",
    "batch_size = 64\n",
    "image_size = (180, 180)\n",
    "train_dataset = image_dataset_from_directory(\n",
    "    new_base_dir / \"train\", image_size=image_size, batch_size=batch_size\n",
    ")\n",
    "validation_dataset = image_dataset_from_directory(\n",
    "    new_base_dir / \"validation\", image_size=image_size, batch_size=batch_size\n",
    ")\n",
    "test_dataset = image_dataset_from_directory(\n",
    "    new_base_dir / \"test\", image_size=image_size, batch_size=batch_size\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Understanding TensorFlow Dataset objects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "\n",
    "random_numbers = np.random.normal(size=(1000, 16))\n",
    "dataset = tf.data.Dataset.from_tensor_slices(random_numbers)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "for i, element in enumerate(dataset):\n",
    "    print(element.shape)\n",
    "    if i >= 2:\n",
    "        break"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "batched_dataset = dataset.batch(32)\n",
    "for i, element in enumerate(batched_dataset):\n",
    "    print(element.shape)\n",
    "    if i >= 2:\n",
    "        break"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "reshaped_dataset = dataset.map(\n",
    "    lambda x: tf.reshape(x, (4, 4)),\n",
    "    num_parallel_calls=8)\n",
    "for i, element in enumerate(reshaped_dataset):\n",
    "    print(element.shape)\n",
    "    if i >= 2:\n",
    "        break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Fitting the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "for data_batch, labels_batch in train_dataset:\n",
    "    print(\"data batch shape:\", data_batch.shape)\n",
    "    print(\"labels batch shape:\", labels_batch.shape)\n",
    "    break"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "callbacks = [\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"convnet_from_scratch.keras\",\n",
    "        save_best_only=True,\n",
    "        monitor=\"val_loss\",\n",
    "    )\n",
    "]\n",
    "history = model.fit(\n",
    "    train_dataset,\n",
    "    epochs=50,\n",
    "    validation_data=validation_dataset,\n",
    "    callbacks=callbacks,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "accuracy = history.history[\"accuracy\"]\n",
    "val_accuracy = history.history[\"val_accuracy\"]\n",
    "loss = history.history[\"loss\"]\n",
    "val_loss = history.history[\"val_loss\"]\n",
    "epochs = range(1, len(accuracy) + 1)\n",
    "\n",
    "plt.plot(epochs, accuracy, \"r--\", label=\"Training accuracy\")\n",
    "plt.plot(epochs, val_accuracy, \"b\", label=\"Validation accuracy\")\n",
    "plt.title(\"Training and validation accuracy\")\n",
    "plt.legend()\n",
    "plt.figure()\n",
    "\n",
    "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
    "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
    "plt.title(\"Training and validation loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_model = keras.models.load_model(\"convnet_from_scratch.keras\")\n",
    "test_loss, test_acc = test_model.evaluate(test_dataset)\n",
    "print(f\"Test accuracy: {test_acc:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Using data augmentation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "data_augmentation_layers = [\n",
    "    layers.RandomFlip(\"horizontal\"),\n",
    "    layers.RandomRotation(0.1),\n",
    "    layers.RandomZoom(0.2),\n",
    "]\n",
    "\n",
    "def data_augmentation(images, targets):\n",
    "    for layer in data_augmentation_layers:\n",
    "        images = layer(images)\n",
    "    return images, targets\n",
    "\n",
    "augmented_train_dataset = train_dataset.map(\n",
    "    data_augmentation, num_parallel_calls=8\n",
    ")\n",
    "augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 10))\n",
    "for image_batch, _ in train_dataset.take(1):\n",
    "    image = image_batch[0]\n",
    "    for i in range(9):\n",
    "        ax = plt.subplot(3, 3, i + 1)\n",
    "        augmented_image, _ = data_augmentation(image, None)\n",
    "        augmented_image = keras.ops.convert_to_numpy(augmented_image)\n",
    "        plt.imshow(augmented_image.astype(\"uint8\"))\n",
    "        plt.axis(\"off\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(180, 180, 3))\n",
    "x = layers.Rescaling(1.0 / 255)(inputs)\n",
    "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.MaxPooling2D(pool_size=2)(x)\n",
    "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n",
    "x = layers.GlobalAveragePooling2D()(x)\n",
    "x = layers.Dropout(0.25)(x)\n",
    "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
    "model = keras.Model(inputs=inputs, outputs=outputs)\n",
    "\n",
    "model.compile(\n",
    "    loss=\"binary_crossentropy\",\n",
    "    optimizer=\"adam\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "callbacks = [\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"convnet_from_scratch_with_augmentation.keras\",\n",
    "        save_best_only=True,\n",
    "        monitor=\"val_loss\",\n",
    "    )\n",
    "]\n",
    "history = model.fit(\n",
    "    augmented_train_dataset,\n",
    "    epochs=100,\n",
    "    validation_data=validation_dataset,\n",
    "    callbacks=callbacks,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_model = keras.models.load_model(\n",
    "    \"convnet_from_scratch_with_augmentation.keras\"\n",
    ")\n",
    "test_loss, test_acc = test_model.evaluate(test_dataset)\n",
    "print(f\"Test accuracy: {test_acc:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Using a pretrained model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Feature extraction with a pretrained model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras_hub\n",
    "\n",
    "conv_base = keras_hub.models.Backbone.from_preset(\"xception_41_imagenet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "preprocessor = keras_hub.layers.ImageConverter.from_preset(\n",
    "    \"xception_41_imagenet\",\n",
    "    image_size=(180, 180),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Fast feature extraction without data augmentation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "def get_features_and_labels(dataset):\n",
    "    all_features = []\n",
    "    all_labels = []\n",
    "    for images, labels in dataset:\n",
    "        preprocessed_images = preprocessor(images)\n",
    "        features = conv_base.predict(preprocessed_images, verbose=0)\n",
    "        all_features.append(features)\n",
    "        all_labels.append(labels)\n",
    "    return np.concatenate(all_features), np.concatenate(all_labels)\n",
    "\n",
    "train_features, train_labels = get_features_and_labels(train_dataset)\n",
    "val_features, val_labels = get_features_and_labels(validation_dataset)\n",
    "test_features, test_labels = get_features_and_labels(test_dataset)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "train_features.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(6, 6, 2048))\n",
    "x = layers.GlobalAveragePooling2D()(inputs)\n",
    "x = layers.Dense(256, activation=\"relu\")(x)\n",
    "x = layers.Dropout(0.25)(x)\n",
    "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
    "model = keras.Model(inputs, outputs)\n",
    "model.compile(\n",
    "    loss=\"binary_crossentropy\",\n",
    "    optimizer=\"adam\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "\n",
    "callbacks = [\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"feature_extraction.keras\",\n",
    "        save_best_only=True,\n",
    "        monitor=\"val_loss\",\n",
    "    )\n",
    "]\n",
    "history = model.fit(\n",
    "    train_features,\n",
    "    train_labels,\n",
    "    epochs=10,\n",
    "    validation_data=(val_features, val_labels),\n",
    "    callbacks=callbacks,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "acc = history.history[\"accuracy\"]\n",
    "val_acc = history.history[\"val_accuracy\"]\n",
    "loss = history.history[\"loss\"]\n",
    "val_loss = history.history[\"val_loss\"]\n",
    "epochs = range(1, len(acc) + 1)\n",
    "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n",
    "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n",
    "plt.title(\"Training and validation accuracy\")\n",
    "plt.legend()\n",
    "plt.figure()\n",
    "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n",
    "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
    "plt.title(\"Training and validation loss\")\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_model = keras.models.load_model(\"feature_extraction.keras\")\n",
    "test_loss, test_acc = test_model.evaluate(test_features, test_labels)\n",
    "print(f\"Test accuracy: {test_acc:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "##### Feature extraction together with data augmentation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "import keras_hub\n",
    "\n",
    "conv_base = keras_hub.models.Backbone.from_preset(\n",
    "    \"xception_41_imagenet\",\n",
    "    trainable=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "conv_base.trainable = True\n",
    "len(conv_base.trainable_weights)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "conv_base.trainable = False\n",
    "len(conv_base.trainable_weights)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "inputs = keras.Input(shape=(180, 180, 3))\n",
    "x = preprocessor(inputs)\n",
    "x = conv_base(x)\n",
    "x = layers.GlobalAveragePooling2D()(x)\n",
    "x = layers.Dense(256)(x)\n",
    "x = layers.Dropout(0.25)(x)\n",
    "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
    "model = keras.Model(inputs, outputs)\n",
    "model.compile(\n",
    "    loss=\"binary_crossentropy\",\n",
    "    optimizer=\"adam\",\n",
    "    metrics=[\"accuracy\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "callbacks = [\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"feature_extraction_with_data_augmentation.keras\",\n",
    "        save_best_only=True,\n",
    "        monitor=\"val_loss\",\n",
    "    )\n",
    "]\n",
    "history = model.fit(\n",
    "    augmented_train_dataset,\n",
    "    epochs=30,\n",
    "    validation_data=validation_dataset,\n",
    "    callbacks=callbacks,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "test_model = keras.models.load_model(\n",
    "    \"feature_extraction_with_data_augmentation.keras\"\n",
    ")\n",
    "test_loss, test_acc = test_model.evaluate(test_dataset)\n",
    "print(f\"Test accuracy: {test_acc:.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### Fine-tuning a pretrained model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab_type": "code"
   },
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    loss=\"binary_crossentropy\",\n",
    "    optimizer=keras.optimizers.Adam(learning_rate=1e-5),\n",
    "    metrics=[\"accuracy\"],\n",
    ")\n",
    "\n",
    "callbacks = [\n",
    "    keras.callbacks.ModelCheckpoint(\n",
    "        filepath=\"fine_tuning.keras\",\n",
    "        save_best_only=True,\n",
    "        monitor=\"val_loss\",\n",
    "    )\n",
    "]\n",
    "history = model.fit(\n",
    "    augmented_train_dataset,\n",
    "    epochs=30,\n",
    "    validation_data=validation_dataset,\n",
    "    callb

Download .txt

gitextract_laoqm0uq/

├── LICENSE
├── README.md
├── chapter02_mathematical-building-blocks.ipynb
├── chapter03_introduction-to-ml-frameworks.ipynb
├── chapter04_classification-and-regression.ipynb
├── chapter05_fundamentals-of-ml.ipynb
├── chapter07_deep-dive-keras.ipynb
├── chapter08_image-classification.ipynb
├── chapter09_convnet-architecture-patterns.ipynb
├── chapter10_interpreting-what-convnets-learn.ipynb
├── chapter11_image-segmentation.ipynb
├── chapter12_object-detection.ipynb
├── chapter13_timeseries-forecasting.ipynb
├── chapter14_text-classification.ipynb
├── chapter15_language-models-and-the-transformer.ipynb
├── chapter16_text-generation.ipynb
├── chapter17_image-generation.ipynb
├── chapter18_best-practices-for-the-real-world.ipynb
├── first_edition/
│   ├── 2.1-a-first-look-at-a-neural-network.ipynb
│   ├── 3.5-classifying-movie-reviews.ipynb
│   ├── 3.6-classifying-newswires.ipynb
│   ├── 3.7-predicting-house-prices.ipynb
│   ├── 4.4-overfitting-and-underfitting.ipynb
│   ├── 5.1-introduction-to-convnets.ipynb
│   ├── 5.2-using-convnets-with-small-datasets.ipynb
│   ├── 5.3-using-a-pretrained-convnet.ipynb
│   ├── 5.4-visualizing-what-convnets-learn.ipynb
│   ├── 6.1-one-hot-encoding-of-words-or-characters.ipynb
│   ├── 6.1-using-word-embeddings.ipynb
│   ├── 6.2-understanding-recurrent-neural-networks.ipynb
│   ├── 6.3-advanced-usage-of-recurrent-neural-networks.ipynb
│   ├── 6.4-sequence-processing-with-convnets.ipynb
│   ├── 8.1-text-generation-with-lstm.ipynb
│   ├── 8.2-deep-dream.ipynb
│   ├── 8.3-neural-style-transfer.ipynb
│   ├── 8.4-generating-images-with-vaes.ipynb
│   └── 8.5-introduction-to-gans.ipynb
└── second_edition/
    ├── README.md
    ├── chapter02_mathematical-building-blocks.ipynb
    ├── chapter03_introduction-to-keras-and-tf.ipynb
    ├── chapter04_getting-started-with-neural-networks.ipynb
    ├── chapter05_fundamentals-of-ml.ipynb
    ├── chapter07_working-with-keras.ipynb
    ├── chapter08_intro-to-dl-for-computer-vision.ipynb
    ├── chapter09_part01_image-segmentation.ipynb
    ├── chapter09_part02_modern-convnet-architecture-patterns.ipynb
    ├── chapter09_part03_interpreting-what-convnets-learn.ipynb
    ├── chapter10_dl-for-timeseries.ipynb
    ├── chapter11_part01_introduction.ipynb
    ├── chapter11_part02_sequence-models.ipynb
    ├── chapter11_part03_transformer.ipynb
    ├── chapter11_part04_sequence-to-sequence-learning.ipynb
    ├── chapter12_part01_text-generation.ipynb
    ├── chapter12_part02_deep-dream.ipynb
    ├── chapter12_part03_neural-style-transfer.ipynb
    ├── chapter12_part04_variational-autoencoders.ipynb
    ├── chapter12_part05_gans.ipynb
    ├── chapter13_best-practices-for-the-real-world.ipynb
    └── chapter14_conclusions.ipynb

Copy disabled (too large) Download .json

Condensed preview — 59 files, each showing path, character count, and a content snippet. Download the .json file for the full structured content (10,772K chars).

[
  {
    "path": "LICENSE",
    "chars": 1081,
    "preview": "MIT License\n\nCopyright (c) 2017-present François Chollet\n\nPermission is hereby granted, free of charge, to any person ob"
  },
  {
    "path": "README.md",
    "chars": 6095,
    "preview": "# Companion notebooks for Deep Learning with Python\n\nThis repository contains Jupyter notebooks implementing the code sa"
  },
  {
    "path": "chapter02_mathematical-building-blocks.ipynb",
    "chars": 30007,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter03_introduction-to-ml-frameworks.ipynb",
    "chars": 36224,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter04_classification-and-regression.ipynb",
    "chars": 28353,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter05_fundamentals-of-ml.ipynb",
    "chars": 23753,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter07_deep-dive-keras.ipynb",
    "chars": 48132,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter08_image-classification.ipynb",
    "chars": 26220,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter09_convnet-architecture-patterns.ipynb",
    "chars": 10471,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter10_interpreting-what-convnets-learn.ipynb",
    "chars": 21336,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter11_image-segmentation.ipynb",
    "chars": 17878,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter12_object-detection.ipynb",
    "chars": 19577,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter13_timeseries-forecasting.ipynb",
    "chars": 18428,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter14_text-classification.ipynb",
    "chars": 35739,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter15_language-models-and-the-transformer.ipynb",
    "chars": 32307,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter16_text-generation.ipynb",
    "chars": 29083,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter17_image-generation.ipynb",
    "chars": 26342,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "chapter18_best-practices-for-the-real-world.ipynb",
    "chars": 13502,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "first_edition/2.1-a-first-look-at-a-neural-network.ipynb",
    "chars": 13940,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/3.5-classifying-movie-reviews.ipynb",
    "chars": 69517,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 20,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\""
  },
  {
    "path": "first_edition/3.6-classifying-newswires.ipynb",
    "chars": 63700,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/3.7-predicting-house-prices.ipynb",
    "chars": 70240,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/4.4-overfitting-and-underfitting.ipynb",
    "chars": 106100,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/5.1-introduction-to-convnets.ipynb",
    "chars": 11100,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/5.2-using-convnets-with-small-datasets.ipynb",
    "chars": 431140,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/5.3-using-a-pretrained-convnet.ipynb",
    "chars": 233147,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/5.4-visualizing-what-convnets-learn.ipynb",
    "chars": 7004823,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/6.1-one-hot-encoding-of-words-or-characters.ipynb",
    "chars": 8792,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/6.1-using-word-embeddings.ipynb",
    "chars": 94317,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/6.2-understanding-recurrent-neural-networks.ipynb",
    "chars": 84619,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/6.3-advanced-usage-of-recurrent-neural-networks.ipynb",
    "chars": 204230,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/6.4-sequence-processing-with-convnets.ipynb",
    "chars": 94418,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/8.1-text-generation-with-lstm.ipynb",
    "chars": 160140,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 19,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\""
  },
  {
    "path": "first_edition/8.2-deep-dream.ipynb",
    "chars": 201125,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/8.3-neural-style-transfer.ipynb",
    "chars": 415051,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "first_edition/8.4-generating-images-with-vaes.ipynb",
    "chars": 283828,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\":"
  },
  {
    "path": "first_edition/8.5-introduction-to-gans.ipynb",
    "chars": 147649,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\":"
  },
  {
    "path": "second_edition/README.md",
    "chars": 4551,
    "preview": "# Second edition notebooks\n\nThese are the notebooks for the second edition of the book, originally published in 2021. Th"
  },
  {
    "path": "second_edition/chapter02_mathematical-building-blocks.ipynb",
    "chars": 30097,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter03_introduction-to-keras-and-tf.ipynb",
    "chars": 20418,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter04_getting-started-with-neural-networks.ipynb",
    "chars": 29579,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter05_fundamentals-of-ml.ipynb",
    "chars": 18665,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter07_working-with-keras.ipynb",
    "chars": 36694,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb",
    "chars": 29330,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter09_part01_image-segmentation.ipynb",
    "chars": 8540,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb",
    "chars": 8900,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb",
    "chars": 19114,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter10_dl-for-timeseries.ipynb",
    "chars": 20952,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter11_part01_introduction.ipynb",
    "chars": 18710,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter11_part02_sequence-models.ipynb",
    "chars": 12860,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter11_part03_transformer.ipynb",
    "chars": 12412,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb",
    "chars": 18936,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter12_part01_text-generation.ipynb",
    "chars": 14357,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter12_part02_deep-dream.ipynb",
    "chars": 7944,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter12_part03_neural-style-transfer.ipynb",
    "chars": 9673,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter12_part04_variational-autoencoders.ipynb",
    "chars": 9571,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter12_part05_gans.ipynb",
    "chars": 11570,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter13_best-practices-for-the-real-world.ipynb",
    "chars": 10593,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  },
  {
    "path": "second_edition/chapter14_conclusions.ipynb",
    "chars": 13113,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"colab_type\": \"text\"\n   },\n   \"source\": [\n    \"This i"
  }
]

About this extraction

This page contains the full source code of the fchollet/deep-learning-with-python-notebooks GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 59 files (10.0 MB), approximately 2.6M tokens. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo