Repository: bastings/annotated_encoder_decoder Branch: master Commit: 622c65f2d880 Files: 6 Total size: 209.3 KB Directory structure: gitextract_t4r2cdzr/ ├── .gitignore ├── LICENSE ├── README.md ├── _config.yml ├── annotated_encoder_decoder.ipynb └── index.md ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. *.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover .hypothesis/ .pytest_cache/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder target/ # Jupyter Notebook .ipynb_checkpoints # pyenv .python-version # celery beat schedule file celerybeat-schedule # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ ================================================ FILE: LICENSE ================================================ MIT License Copyright (c) 2018 Joost Bastings Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software 
without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================ # The Annotated Encoder Decoder with Attention Read the [blog post](https://bastings.github.io/annotated_encoder_decoder/) or simply run the jupyter notebook from this repository. ================================================ FILE: _config.yml ================================================ title: The Annotated Encoder Decoder description: A PyTorch tutorial implementing Bahdanau et al. 
(2015) google_analytics: UA-126252625-1 show_downloads: true theme: jekyll-theme-cayman kramdown: math_engine: mathjax syntax_highlighter: rouge gems: - jekyll-mentions ================================================ FILE: annotated_encoder_decoder.ipynb ================================================ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Annotated Encoder-Decoder with Attention\n", "\n", "Recently, Alexander Rush wrote a blog post called [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html), describing the Transformer model from the paper [Attention is All You Need](https://arxiv.org/abs/1706.03762). This post can be seen as a **prequel** to that: *we will implement an Encoder-Decoder with Attention* using (Gated) Recurrent Neural Networks, very closely following the original attention-based neural machine translation paper [\"Neural Machine Translation by Jointly Learning to Align and Translate\"](https://arxiv.org/abs/1409.0473) of Bahdanau et al. (2015). \n", "\n", "The idea is that going through both blog posts will make you familiar with two very influential sequence-to-sequence architectures. If you have any comments or suggestions, please let me know: [@BastingsJasmijn](https://twitter.com/BastingsJasmijn)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Architecture\n", "\n", "We will model the probability $p(Y\\mid X)$ of a target sequence $Y=(y_1, \\dots, y_{N})$ given a source sequence $X=(x_1, \\dots, x_M)$ directly with a neural network: an Encoder-Decoder.\n", "\n", "\n", "\n", "#### Encoder \n", "\n", "The encoder reads in the source sentence (*at the bottom of the figure*) and produces a sequence of hidden states $\\mathbf{h}_1, \\dots, \\mathbf{h}_M$, one for each source word. 
These states should capture the meaning of a word in the context of the given sentence.\n", "\n", "We will use a bi-directional recurrent neural network (Bi-RNN) as the encoder; a Bi-GRU in particular.\n", "\n", "First of all we **embed** the source words. \n", "We simply look up the **word embedding** for each word in a (randomly initialized) lookup table.\n", "We will denote the word embedding for word $i$ in a given sentence with $\\mathbf{x}_i$.\n", "By embedding words, our model may exploit the fact that certain words (e.g. *cat* and *dog*) are semantically similar, and can be processed in a similar way.\n", "\n", "Now, how do we get hidden states $\\mathbf{h}_1, \\dots, \\mathbf{h}_M$? A forward GRU reads the source sentence left-to-right, while a backward GRU reads it right-to-left.\n", "Each of them follows a simple recursive formula: \n", "$$\\mathbf{h}_j = \\text{GRU}( \\mathbf{x}_j , \\mathbf{h}_{j - 1} )$$\n", "i.e. we obtain the next state from the previous state and the current input word embedding.\n", "\n", "The hidden state of the forward GRU at time step $j$ will know what words **precede** the word at that time step, but it doesn't know what words will follow. In contrast, the backward GRU will only know what words **follow** the word at time step $j$. By **concatenating** those two hidden states (*shown in blue in the figure*), we get $\\mathbf{h}_j$, which captures word $j$ in its full sentence context.\n", "\n", "\n", "#### Decoder \n", "\n", "The decoder (*at the top of the figure*) is a GRU with hidden state $\\mathbf{s_i}$. 
It follows a similar formula to the encoder, but takes one extra input $\\mathbf{c}_{i}$ (*shown in yellow*).\n", "\n", "$$\\mathbf{s}_{i} = f( \\mathbf{s}_{i - 1}, \\mathbf{y}_{i - 1}, \\mathbf{c}_i )$$\n", "\n", "Here, $\\mathbf{y}_{i - 1}$ is the previously generated target word (*not shown*).\n", "\n", "At each time step, an **attention mechanism** dynamically selects the part of the source sentence that is most relevant for predicting the current target word. It does so by comparing the last decoder state with each source hidden state. The result is a context vector $\\mathbf{c}_i$ (*shown in yellow*).\n", "The attention mechanism is explained in more detail later.\n", "\n", "After computing the decoder state $\\mathbf{s}_i$, a non-linear function $g$ (which applies a [softmax](https://en.wikipedia.org/wiki/Softmax_function)) gives us the probability of the target word $y_i$ for this time step:\n", "\n", "$$ p(y_i \\mid y_{<i}, X) = g( \\mathbf{s}_i, \\mathbf{c}_i, \\mathbf{y}_{i - 1} )$$\n", "\n", "This tutorial requires **PyTorch >= 0.4.1** and was tested with **Python 3.6**. \n", "\n", "Make sure you have those versions, and install the packages below if you don't have them yet."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "#!pip install torch numpy matplotlib sacrebleu" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CUDA: True\n", "cuda:0\n" ] } ], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import math, copy, time\n", "import matplotlib.pyplot as plt\n", "from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence\n", "from IPython.core.debugger import set_trace\n", "\n", "# we will use CUDA if it is available\n", "USE_CUDA = torch.cuda.is_available()\n", "DEVICE = torch.device('cuda:0' if USE_CUDA else 'cpu')\n", "print(\"CUDA:\", USE_CUDA)\n", "print(DEVICE)\n", "\n", "seed = 42\n", "np.random.seed(seed)\n", "torch.manual_seed(seed)\n", "torch.cuda.manual_seed(seed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Let's start coding!\n", "\n", "## Model class\n", "\n", "Our base model class `EncoderDecoder` is very similar to the one in *The Annotated Transformer*.\n", "\n", "One difference is that our encoder also returns its final states (`encoder_final` below), which are used to initialize the decoder RNN. We also provide the sequence lengths as the RNNs require those." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "class EncoderDecoder(nn.Module):\n", " \"\"\"\n", " A standard Encoder-Decoder architecture. 
Base for this and many \n", " other models.\n", " \"\"\"\n", " def __init__(self, encoder, decoder, src_embed, trg_embed, generator):\n", " super(EncoderDecoder, self).__init__()\n", " self.encoder = encoder\n", " self.decoder = decoder\n", " self.src_embed = src_embed\n", " self.trg_embed = trg_embed\n", " self.generator = generator\n", " \n", " def forward(self, src, trg, src_mask, trg_mask, src_lengths, trg_lengths):\n", " \"\"\"Take in and process masked src and target sequences.\"\"\"\n", " encoder_hidden, encoder_final = self.encode(src, src_mask, src_lengths)\n", " return self.decode(encoder_hidden, encoder_final, src_mask, trg, trg_mask)\n", " \n", " def encode(self, src, src_mask, src_lengths):\n", " return self.encoder(self.src_embed(src), src_mask, src_lengths)\n", " \n", " def decode(self, encoder_hidden, encoder_final, src_mask, trg, trg_mask,\n", " decoder_hidden=None):\n", " return self.decoder(self.trg_embed(trg), encoder_hidden, encoder_final,\n", " src_mask, trg_mask, hidden=decoder_hidden)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To keep things easy we also keep the `Generator` class the same. \n", "It simply projects the pre-output layer ($x$ in the `forward` function below) to obtain the output layer, so that the final dimension is the target vocabulary size." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "class Generator(nn.Module):\n", " \"\"\"Define standard linear + softmax generation step.\"\"\"\n", " def __init__(self, hidden_size, vocab_size):\n", " super(Generator, self).__init__()\n", " self.proj = nn.Linear(hidden_size, vocab_size, bias=False)\n", "\n", " def forward(self, x):\n", " return F.log_softmax(self.proj(x), dim=-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Encoder\n", "\n", "Our encoder is a bi-directional GRU. 
\n", "\n", "Because we want to process multiple sentences at the same time for speed reasons (it is more effcient on GPU), we need to support **mini-batches**. Sentences in a mini-batch may have different lengths, which means that the RNN needs to unroll further for certain sentences while it might already have finished for others:\n", "\n", "```\n", "Example: mini-batch with 3 source sentences of different lengths (7, 5, and 3).\n", "End-of-sequence is marked with a \"3\" here, and padding positions with \"1\".\n", "\n", "+---------------+\n", "| 4 5 9 8 7 8 3 |\n", "+---------------+\n", "| 5 4 8 7 3 1 1 |\n", "+---------------+\n", "| 5 8 3 1 1 1 1 |\n", "+---------------+\n", "```\n", "You can see that, when computing hidden states for this mini-batch, for sentence #2 and #3 we will need to stop updating the hidden state after we have encountered \"3\". We don't want to incorporate the padding values (1s).\n", "\n", "Luckily, PyTorch has convenient helper functions called `pack_padded_sequence` and `pad_packed_sequence`.\n", "These functions take care of masking and padding, so that the resulting word representations are simply zeros after a sentence stops.\n", "\n", "The code below reads in a source sentence (a sequence of word embeddings) and produces the hidden states.\n", "It also returns a final vector, a summary of the complete sentence, by concatenating the first and the last hidden states (they have both seen the whole sentence, each in a different direction). We will use the final vector to initialize the decoder." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "class Encoder(nn.Module):\n", " \"\"\"Encodes a sequence of word embeddings\"\"\"\n", " def __init__(self, input_size, hidden_size, num_layers=1, dropout=0.):\n", " super(Encoder, self).__init__()\n", " self.num_layers = num_layers\n", " self.rnn = nn.GRU(input_size, hidden_size, num_layers, \n", " batch_first=True, bidirectional=True, dropout=dropout)\n", " \n", " def forward(self, x, mask, lengths):\n", " \"\"\"\n", " Applies a bidirectional GRU to sequence of embeddings x.\n", " The input mini-batch x needs to be sorted by length.\n", " x should have dimensions [batch, time, dim].\n", " \"\"\"\n", " packed = pack_padded_sequence(x, lengths, batch_first=True)\n", " output, final = self.rnn(packed)\n", " output, _ = pad_packed_sequence(output, batch_first=True)\n", "\n", " # we need to manually concatenate the final states for both directions\n", " fwd_final = final[0:final.size(0):2]\n", " bwd_final = final[1:final.size(0):2]\n", " final = torch.cat([fwd_final, bwd_final], dim=2) # [num_layers, batch, 2*dim]\n", "\n", " return output, final" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Decoder\n", "\n", "The decoder is a conditional GRU. Rather than starting with an empty state like the encoder, its initial hidden state results from a projection of the encoder final vector. \n", "\n", "#### Training\n", "In `forward` you can find a for-loop that computes the decoder hidden states one time step at a time. \n", "Note that, during training, we know exactly what the target words should be! (They are in `trg_embed`.) This means that we are not even checking here what the prediction is! We simply feed the correct previous target word embedding to the GRU at each time step. This is called teacher forcing.\n", "\n", "The `forward` function returns all decoder hidden states and pre-output vectors. 
Elsewhere these are used to compute the loss, after which the parameters are updated.\n", "\n", "#### Prediction\n", "At prediction time, the `forward` function is used for a single time step at a time. After predicting a word from the returned pre-output vector, we can call it again, supplying it the word embedding of the previously predicted word and the last state." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "class Decoder(nn.Module):\n", " \"\"\"A conditional RNN decoder with attention.\"\"\"\n", " \n", " def __init__(self, emb_size, hidden_size, attention, num_layers=1, dropout=0.5,\n", " bridge=True):\n", " super(Decoder, self).__init__()\n", " \n", " self.hidden_size = hidden_size\n", " self.num_layers = num_layers\n", " self.attention = attention\n", " self.dropout = dropout\n", " \n", " self.rnn = nn.GRU(emb_size + 2*hidden_size, hidden_size, num_layers,\n", " batch_first=True, dropout=dropout)\n", " \n", " # to initialize from the final encoder state\n", " self.bridge = nn.Linear(2*hidden_size, hidden_size, bias=True) if bridge else None\n", "\n", " self.dropout_layer = nn.Dropout(p=dropout)\n", " self.pre_output_layer = nn.Linear(hidden_size + 2*hidden_size + emb_size,\n", " hidden_size, bias=False)\n", " \n", " def forward_step(self, prev_embed, encoder_hidden, src_mask, proj_key, hidden):\n", " \"\"\"Perform a single decoder step (1 word)\"\"\"\n", "\n", " # compute context vector using attention mechanism\n", " query = hidden[-1].unsqueeze(1) # [#layers, B, D] -> [B, 1, D]\n", " context, attn_probs = self.attention(\n", " query=query, proj_key=proj_key,\n", " value=encoder_hidden, mask=src_mask)\n", "\n", " # update rnn hidden state\n", " rnn_input = torch.cat([prev_embed, context], dim=2)\n", " output, hidden = self.rnn(rnn_input, hidden)\n", " \n", " pre_output = torch.cat([prev_embed, output, context], dim=2)\n", " pre_output = self.dropout_layer(pre_output)\n", " pre_output = 
self.pre_output_layer(pre_output)\n", "\n", " return output, hidden, pre_output\n", " \n", " def forward(self, trg_embed, encoder_hidden, encoder_final, \n", " src_mask, trg_mask, hidden=None, max_len=None):\n", " \"\"\"Unroll the decoder one step at a time.\"\"\"\n", " \n", " # the maximum number of steps to unroll the RNN\n", " if max_len is None:\n", " max_len = trg_mask.size(-1)\n", "\n", " # initialize decoder hidden state\n", " if hidden is None:\n", " hidden = self.init_hidden(encoder_final)\n", " \n", " # pre-compute projected encoder hidden states\n", " # (the \"keys\" for the attention mechanism)\n", " # this is only done for efficiency\n", " proj_key = self.attention.key_layer(encoder_hidden)\n", " \n", " # here we store all intermediate hidden states and pre-output vectors\n", " decoder_states = []\n", " pre_output_vectors = []\n", " \n", " # unroll the decoder RNN for max_len steps\n", " for i in range(max_len):\n", " prev_embed = trg_embed[:, i].unsqueeze(1)\n", " output, hidden, pre_output = self.forward_step(\n", " prev_embed, encoder_hidden, src_mask, proj_key, hidden)\n", " decoder_states.append(output)\n", " pre_output_vectors.append(pre_output)\n", "\n", " decoder_states = torch.cat(decoder_states, dim=1)\n", " pre_output_vectors = torch.cat(pre_output_vectors, dim=1)\n", " return decoder_states, hidden, pre_output_vectors # [B, N, D]\n", "\n", " def init_hidden(self, encoder_final):\n", " \"\"\"Returns the initial decoder state,\n", " conditioned on the final encoder state.\"\"\"\n", "\n", " if encoder_final is None:\n", " return None # start with zeros\n", "\n", " return torch.tanh(self.bridge(encoder_final)) \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Attention \n", "\n", "At every time step, the decoder has access to *all* source word representations $\\mathbf{h}_1, \\dots, \\mathbf{h}_M$. 
\n", "An attention mechanism allows the model to focus on the currently most relevant part of the source sentence.\n", "The state of the decoder is represented by GRU hidden state $\\mathbf{s}_i$.\n", "So if we want to know which source word representation(s) $\\mathbf{h}_j$ are most relevant, we will need to define a function that takes those two things as input.\n", "\n", "Here we use the MLP-based, additive attention that was used in Bahdanau et al.:\n", "\n", "\n", "\n", "\n", "We apply an MLP with tanh-activation to both the current decoder state $\\bf s_i$ (the *query*) and each encoder state $\\bf h_j$ (the *key*), and then project this to a single value (i.e. a scalar) to get the *attention energy* $e_{ij}$. \n", "\n", "Once all energies are computed, they are normalized by a softmax so that they sum to one: \n", "\n", "$$ \\alpha_{ij} = \\text{softmax}(\\mathbf{e}_i)[j] $$\n", "\n", "$$\\sum_j \\alpha_{ij} = 1.0$$ \n", "\n", "The context vector for time step $i$ is then a weighted sum of the encoder hidden states (the *values*):\n", "$$\\mathbf{c}_i = \\sum_j \\alpha_{ij} \\mathbf{h}_j$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "class BahdanauAttention(nn.Module):\n", " \"\"\"Implements Bahdanau (MLP) attention\"\"\"\n", " \n", " def __init__(self, hidden_size, key_size=None, query_size=None):\n", " super(BahdanauAttention, self).__init__()\n", " \n", " # We assume a bi-directional encoder so key_size is 2*hidden_size\n", " key_size = 2 * hidden_size if key_size is None else key_size\n", " query_size = hidden_size if query_size is None else query_size\n", "\n", " self.key_layer = nn.Linear(key_size, hidden_size, bias=False)\n", " self.query_layer = nn.Linear(query_size, hidden_size, bias=False)\n", " self.energy_layer = nn.Linear(hidden_size, 1, bias=False)\n", " \n", " # to store attention scores\n", " self.alphas = None\n", " \n", " def forward(self, query=None, proj_key=None, value=None, 
mask=None):\n", " assert mask is not None, \"mask is required\"\n", "\n", " # We first project the query (the decoder state).\n", " # The projected keys (the encoder states) were already pre-computed.\n", " query = self.query_layer(query)\n", " \n", " # Calculate scores.\n", " scores = self.energy_layer(torch.tanh(query + proj_key))\n", " scores = scores.squeeze(2).unsqueeze(1)\n", " \n", " # Mask out invalid positions.\n", " # The mask marks valid positions, so we fill positions where mask == 0 with -inf.\n", " scores.data.masked_fill_(mask == 0, -float('inf'))\n", " \n", " # Turn scores to probabilities.\n", " alphas = F.softmax(scores, dim=-1)\n", " self.alphas = alphas \n", " \n", " # The context vector is the weighted sum of the values.\n", " context = torch.bmm(alphas, value)\n", " \n", " # context shape: [B, 1, 2D], alphas shape: [B, 1, M]\n", " return context, alphas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Embeddings and Softmax \n", "We use learned embeddings to convert the input tokens and output tokens to vectors of dimension `emb_size`.\n", "\n", "We will simply use PyTorch's [nn.Embedding](https://pytorch.org/docs/stable/nn.html?highlight=embedding#torch.nn.Embedding) class." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Full Model\n", "\n", "Here we define a function from hyperparameters to a full model. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def make_model(src_vocab, tgt_vocab, emb_size=256, hidden_size=512, num_layers=1, dropout=0.1):\n", " \"Helper: Construct a model from hyperparameters.\"\n", "\n", " attention = BahdanauAttention(hidden_size)\n", "\n", " model = EncoderDecoder(\n", " Encoder(emb_size, hidden_size, num_layers=num_layers, dropout=dropout),\n", " Decoder(emb_size, hidden_size, attention, num_layers=num_layers, dropout=dropout),\n", " nn.Embedding(src_vocab, emb_size),\n", " nn.Embedding(tgt_vocab, emb_size),\n", " Generator(hidden_size, tgt_vocab))\n", "\n", " return model.cuda() if USE_CUDA else model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Training\n", "\n", "This section describes the training regime for our models." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We stop for a quick interlude to introduce some of the tools \n", "needed to train a standard encoder decoder model. First we define a batch object that holds the src and target sentences for training, as well as their lengths and masks. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Batches and Masking" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "class Batch:\n", " \"\"\"Object for holding a batch of data with mask during training.\n", " Input is a batch from a torch text iterator.\n", " \"\"\"\n", " def __init__(self, src, trg, pad_index=0):\n", " \n", " src, src_lengths = src\n", " \n", " self.src = src\n", " self.src_lengths = src_lengths\n", " self.src_mask = (src != pad_index).unsqueeze(-2)\n", " self.nseqs = src.size(0)\n", " \n", " self.trg = None\n", " self.trg_y = None\n", " self.trg_mask = None\n", " self.trg_lengths = None\n", " self.ntokens = None\n", "\n", " if trg is not None:\n", " trg, trg_lengths = trg\n", " self.trg = trg[:, :-1]\n", " self.trg_lengths = trg_lengths\n", " self.trg_y = trg[:, 1:]\n", " self.trg_mask = (self.trg_y != pad_index)\n", " self.ntokens = (self.trg_y != pad_index).data.sum().item()\n", " \n", " if USE_CUDA:\n", " self.src = self.src.cuda()\n", " self.src_mask = self.src_mask.cuda()\n", "\n", " if trg is not None:\n", " self.trg = self.trg.cuda()\n", " self.trg_y = self.trg_y.cuda()\n", " self.trg_mask = self.trg_mask.cuda()\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training Loop\n", "The code below trains the model for 1 epoch (=1 pass through the training data)." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def run_epoch(data_iter, model, loss_compute, print_every=50):\n", " \"\"\"Standard Training and Logging Function\"\"\"\n", "\n", " start = time.time()\n", " total_tokens = 0\n", " total_loss = 0\n", " print_tokens = 0\n", "\n", " for i, batch in enumerate(data_iter, 1):\n", " \n", " out, _, pre_output = model.forward(batch.src, batch.trg,\n", " batch.src_mask, batch.trg_mask,\n", " batch.src_lengths, batch.trg_lengths)\n", " loss = loss_compute(pre_output, batch.trg_y, batch.nseqs)\n", " total_loss += loss\n", " total_tokens += batch.ntokens\n", " print_tokens += batch.ntokens\n", " \n", " if model.training and i % print_every == 0:\n", " elapsed = time.time() - start\n", " print(\"Epoch Step: %d Loss: %f Tokens per Sec: %f\" %\n", " (i, loss / batch.nseqs, print_tokens / elapsed))\n", " start = time.time()\n", " print_tokens = 0\n", "\n", " return math.exp(total_loss / float(total_tokens))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training Data and Batching\n", "\n", "We will use torchtext for batching. This is discussed in more detail below. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optimizer\n", "\n", "We will use the [Adam optimizer](https://arxiv.org/abs/1412.6980) with default settings ($\\beta_1=0.9$, $\\beta_2=0.999$ and $\\epsilon=10^{-8}$).\n", "\n", "We will use $0.0003$ as the learning rate here, but for different problems another learning rate may be more appropriate. You will have to tune that." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# A First Example\n", "\n", "We can begin by trying out a simple copy-task. Given a random set of input symbols from a small vocabulary, the goal is to generate back those same symbols. 
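Before generating data, one note on the metric: `run_epoch` above returns a perplexity, the exponential of the average per-token loss. A quick numeric sketch with made-up numbers (standard library only):

```python
import math

# toy numbers: summed negative log-likelihood over an epoch, and the token count
total_loss = 13.86   # roughly 1.386 nats per token
total_tokens = 10

perplexity = math.exp(total_loss / total_tokens)
print(round(perplexity, 2))  # ~4: as uncertain as a uniform choice over 4 symbols
```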
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Synthetic Data" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def data_gen(num_words=11, batch_size=16, num_batches=100, length=10, pad_index=0, sos_index=1):\n", " \"\"\"Generate random data for a src-tgt copy task.\"\"\"\n", " for i in range(num_batches):\n", " data = torch.from_numpy(\n", " np.random.randint(1, num_words, size=(batch_size, length)))\n", " data[:, 0] = sos_index\n", " data = data.cuda() if USE_CUDA else data\n", " src = data[:, 1:]\n", " trg = data\n", " src_lengths = [length-1] * batch_size\n", " trg_lengths = [length] * batch_size\n", " yield Batch((src, src_lengths), (trg, trg_lengths), pad_index=pad_index)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loss Computation" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "class SimpleLossCompute:\n", " \"\"\"A simple loss compute and train function.\"\"\"\n", "\n", " def __init__(self, generator, criterion, opt=None):\n", " self.generator = generator\n", " self.criterion = criterion\n", " self.opt = opt\n", "\n", " def __call__(self, x, y, norm):\n", " x = self.generator(x)\n", " loss = self.criterion(x.contiguous().view(-1, x.size(-1)),\n", " y.contiguous().view(-1))\n", " loss = loss / norm\n", "\n", " if self.opt is not None:\n", " loss.backward() \n", " self.opt.step()\n", " self.opt.zero_grad()\n", "\n", " return loss.data.item() * norm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Printing examples\n", "\n", "To monitor progress during training, we will translate a few examples.\n", "\n", "We use greedy decoding for simplicity; that is, at each time step, starting at the first token, we choose the one with that maximum probability, and we never revisit that choice. 
" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def greedy_decode(model, src, src_mask, src_lengths, max_len=100, sos_index=1, eos_index=None):\n", " \"\"\"Greedily decode a sentence.\"\"\"\n", "\n", " with torch.no_grad():\n", " encoder_hidden, encoder_final = model.encode(src, src_mask, src_lengths)\n", " prev_y = torch.ones(1, 1).fill_(sos_index).type_as(src)\n", " trg_mask = torch.ones_like(prev_y)\n", "\n", " output = []\n", " attention_scores = []\n", " hidden = None\n", "\n", " for i in range(max_len):\n", " with torch.no_grad():\n", " out, hidden, pre_output = model.decode(\n", " encoder_hidden, encoder_final, src_mask,\n", " prev_y, trg_mask, hidden)\n", "\n", " # we predict from the pre-output layer, which is\n", " # a combination of Decoder state, prev emb, and context\n", " prob = model.generator(pre_output[:, -1])\n", "\n", " _, next_word = torch.max(prob, dim=1)\n", " next_word = next_word.data.item()\n", " output.append(next_word)\n", " prev_y = torch.ones(1, 1).type_as(src).fill_(next_word)\n", " attention_scores.append(model.decoder.attention.alphas.cpu().numpy())\n", " \n", " output = np.array(output)\n", " \n", " # cut off everything starting from \n", " # (only when eos_index provided)\n", " if eos_index is not None:\n", " first_eos = np.where(output==eos_index)[0]\n", " if len(first_eos) > 0:\n", " output = output[:first_eos[0]] \n", " \n", " return output, np.concatenate(attention_scores, axis=1)\n", " \n", "\n", "def lookup_words(x, vocab=None):\n", " if vocab is not None:\n", " x = [vocab.itos[i] for i in x]\n", "\n", " return [str(t) for t in x]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def print_examples(example_iter, model, n=2, max_len=100, \n", " sos_index=1, \n", " src_eos_index=None, \n", " trg_eos_index=None, \n", " src_vocab=None, trg_vocab=None):\n", " \"\"\"Prints N examples. 
Assumes batch size of 1.\"\"\"\n", "\n", " model.eval()\n", " count = 0\n", " print()\n", " \n", " if src_vocab is not None and trg_vocab is not None:\n", " src_eos_index = src_vocab.stoi[EOS_TOKEN]\n", " trg_sos_index = trg_vocab.stoi[SOS_TOKEN]\n", " trg_eos_index = trg_vocab.stoi[EOS_TOKEN]\n", " else:\n", " src_eos_index = None\n", " trg_sos_index = 1\n", " trg_eos_index = None\n", " \n", " for i, batch in enumerate(example_iter):\n", " \n", " src = batch.src.cpu().numpy()[0, :]\n", " trg = batch.trg_y.cpu().numpy()[0, :]\n", "\n", " # remove </s> (if it is there)\n", " src = src[:-1] if src[-1] == src_eos_index else src\n", " trg = trg[:-1] if trg[-1] == trg_eos_index else trg \n", " \n", " result, _ = greedy_decode(\n", " model, batch.src, batch.src_mask, batch.src_lengths,\n", " max_len=max_len, sos_index=trg_sos_index, eos_index=trg_eos_index)\n", " print(\"Example #%d\" % (i+1))\n", " print(\"Src : \", \" \".join(lookup_words(src, vocab=src_vocab)))\n", " print(\"Trg : \", \" \".join(lookup_words(trg, vocab=trg_vocab)))\n", " print(\"Pred: \", \" \".join(lookup_words(result, vocab=trg_vocab)))\n", " print()\n", " \n", " count += 1\n", " if count == n:\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training the copy task" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": false }, "outputs": [], "source": [ "def train_copy_task():\n", " \"\"\"Train the simple copy task.\"\"\"\n", " num_words = 11\n", " criterion = nn.NLLLoss(reduction=\"sum\", ignore_index=0)\n", " model = make_model(num_words, num_words, emb_size=32, hidden_size=64)\n", " optim = torch.optim.Adam(model.parameters(), lr=0.0003)\n", " eval_data = list(data_gen(num_words=num_words, batch_size=1, num_batches=100))\n", " \n", " dev_perplexities = []\n", " \n", " if USE_CUDA:\n", " model.cuda()\n", "\n", " for epoch in range(10):\n", " \n", " print(\"Epoch %d\" % epoch)\n", "\n", " # train\n", " model.train()\n", " data = 
data_gen(num_words=num_words, batch_size=32, num_batches=100)\n", " run_epoch(data, model,\n", " SimpleLossCompute(model.generator, criterion, optim))\n", "\n", " # evaluate\n", " model.eval()\n", " with torch.no_grad(): \n", " perplexity = run_epoch(eval_data, model,\n", " SimpleLossCompute(model.generator, criterion, None))\n", " print(\"Evaluation perplexity: %f\" % perplexity)\n", " dev_perplexities.append(perplexity)\n", " print_examples(eval_data, model, n=2, max_len=9)\n", " \n", " return dev_perplexities" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1\n", " \"num_layers={}\".format(dropout, num_layers))\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0\n", "Epoch Step: 50 Loss: 19.887581 Tokens per Sec: 7748.957397\n", "Epoch Step: 100 Loss: 17.856726 Tokens per Sec: 7925.338918\n", "Evaluation perplexity: 7.172198\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 8 3 7 5 8 3 7 5 8\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 8 8 8 8 8 8 8\n", "\n", "Epoch 1\n", "Epoch Step: 50 Loss: 15.715487 Tokens per Sec: 8662.903188\n", "Epoch Step: 100 Loss: 12.368280 Tokens per Sec: 7860.172940\n", "Evaluation perplexity: 3.709498\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 7 5 10 8 7 5 7\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 5 6 2 6 8 2 5\n", "\n", "Epoch 2\n", "Epoch Step: 50 Loss: 9.246480 Tokens per Sec: 7971.095313\n", "Epoch Step: 100 Loss: 7.701921 Tokens per Sec: 7876.198908\n", "Evaluation 
perplexity: 2.303158\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 7 3 10 5 8 7 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 5 6 2 6 8 5 2\n", "\n", "Epoch 3\n", "Epoch Step: 50 Loss: 6.166847 Tokens per Sec: 8069.631171\n", "Epoch Step: 100 Loss: 5.673258 Tokens per Sec: 7855.858586\n", "Evaluation perplexity: 1.775795\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 7 5 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 8\n", "\n", "Epoch 4\n", "Epoch Step: 50 Loss: 4.830031 Tokens per Sec: 8094.515152\n", "Epoch Step: 100 Loss: 4.152125 Tokens per Sec: 7999.315744\n", "Evaluation perplexity: 1.572305\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 5 7 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n", "Epoch 5\n", "Epoch Step: 50 Loss: 3.638369 Tokens per Sec: 8112.868501\n", "Epoch Step: 100 Loss: 3.784709 Tokens per Sec: 7843.288141\n", "Evaluation perplexity: 1.433951\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 7 5 3 10 7 8 7\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n", "Epoch 6\n", "Epoch Step: 50 Loss: 2.802792 Tokens per Sec: 8128.952327\n", "Epoch Step: 100 Loss: 2.403310 Tokens per Sec: 7893.746819\n", "Evaluation perplexity: 1.284198\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 5 7 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n", "Epoch 7\n", "Epoch Step: 50 Loss: 2.174423 Tokens per Sec: 8181.341663\n", "Epoch Step: 100 Loss: 1.838792 Tokens per Sec: 
7833.160747\n", "Evaluation perplexity: 1.173110\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 5 7 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n", "Epoch 8\n", "Epoch Step: 50 Loss: 1.226522 Tokens per Sec: 8267.548130\n", "Epoch Step: 100 Loss: 1.090876 Tokens per Sec: 7842.856308\n", "Evaluation perplexity: 1.123090\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 5 7 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n", "Epoch 9\n", "Epoch Step: 50 Loss: 1.216270 Tokens per Sec: 8181.132215\n", "Epoch Step: 100 Loss: 0.636999 Tokens per Sec: 7866.309111\n", "Evaluation perplexity: 1.088564\n", "\n", "Example #1\n", "Src : 4 8 5 7 10 3 7 8 5\n", "Trg : 4 8 5 7 10 3 7 8 5\n", "Pred: 4 8 5 7 10 3 7 8 5\n", "\n", "Example #2\n", "Src : 8 8 3 6 5 2 8 6 2\n", "Trg : 8 8 3 6 5 2 8 6 2\n", "Pred: 8 8 3 6 5 2 8 6 2\n", "\n" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYcAAAElCAYAAAAPyi6bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xl8XHW9//HXJ5ksTdqkbdp0L91bFilgla0spYiKG4qIiF5xwQVZVFx+3quIuFwVxAXtlUVEUUQU1KtXBSmUrWwtlNLaBbpDl3RNmqTZP78/zpl0moXMpMmczMz7+XjMI5kzc875zKSd95zv93y/x9wdERGRRHlRFyAiIgOPwkFERDpROIiISCcKBxER6UThICIinSgcRESkE4WDdGJm15qZJ9x2m9njZva2iOpxM/tqP237WjNrSbg/NFx2bH/sL926+Fsm3n4TYV2LzOzBqPYvPYtFXYAMWK3A3PD3UcDVwN/M7Bx3/1d0ZfW524B/JNwfCnwdeBlYHklFfS/xb5loZ7oLkcyhcJBuuftT8d/N7GFgM3AlcFjhYGZF7t54mOX1CXd/BXgl6joORzLvZ+LfUiQZalaSpLh7DbAWmBJfZmZDzOyHZrbFzBrNbI2ZXZq4XrzZxsxmm9kjZlYPfD98zM3sGjP7jpntMLM6M7vXzEb1VI+Zvd7M/mFm1eF695vZ0QmPHxfWdE2H9e4zs1fNrCKxvvD3ScCG8Kl3JjS/nGlmfzazZ7qo46zwOSe/Rq2LzOxBM/uAma01swYze8bM3tjFcy80syVmdsDMdpnZL8xsWMLjk8L9XWpmN5lZFbCjp/erJ2a20cxuM7PPmtnmcP8LzWx6h+cVmdl3w795k5m9ZGZXm5l1eN6YcHvbwr/DOjP7Thf7fZuZLTezejNbamanHe5rkT7i7rrpdsgNuBZo6bAsBmwDHgjvFwCLCT6YLgPOJvjQbwU+1WFbrQTNNJ8H5gEnho85wbf2fwFvBz4G7AIWd9i3A19NuD8HOBCu927gncDjwG5gbMLzvgg0J+zv40Ab8KauXitQFG7PCZqWTgpvZcBbw+XHdqjtd8CLPbyfi8L37mXgonAfS4FqYGTC8z4T1rcAeDPwYeBV4AkgL3zOpLCOrcBvw7rO6+lvGf79Ot4s4Xkbw7/Fc2F9FxEE5QagKOF5vweagP8EzgF+ENbz7YTnVITb2xb+2zgLuAS4rcN7shV4AXg/cC7wLLAPGBr1/wHdXOGgW+dbFx8oY4Gfhx8Cl4bP+Y/wg+wNHda9NfxQyEvYlgMf7WI/8Q+5xA+fd4bLz+nwvMRweAhYBsQSlpURBMv3EpblAQ8DLwHHA7XAjV291oT78Q/fD3Z4Xl74QXlTwrIKoAG4sof3cxEdggWoJAi474T3BxOExU0d1j01XPetHep7JIW/pXdzSwzxjWE9lQnLjg2f94nw/uvC+1/ssI+bw3WHhve/RRDKs3p4TxqAiQnLTgi3/76o/w/o5mpWkm7lE/wHbyb49vpB4Fp3vzV8/M0EzUzPm1ksfgPuB0YD0zps73+72c9f/dD28r8CjQTf2Dsxs0HA6QTfYEnYbz3wJNDeLOHubQQhNiJ8bD3wlZ5femfhtm4FPhjWQLhtB+5MYhNr3b29g9vdq4DHOPg6TyYIuN91eD+fBvYnvq7QX1MovxV4Qxe3ezs879GwrniNywmCNV7j6eHPuzqs9zugGIg3k50NPO7uq3uoa6W7b068H/6c2MN6kgbqkJbutBJ8KDiwF9js7i0Jj1cCMwnCoysVCb+3ufuubp5XlXjH3d3MdgJjunn+cILg+k5462hth+1tMbNHgHcBN/vhdYT/guCb+HsJAuHjwL3uvjeJdau6WLaDoIkMgvcTgiakrlR0uJ9SP4O7L0niad3VGP9bDEtYlmh7h8crCEKtJ4e8b+7eGHZdFCexrvQzhYN0q4cPlD3AGoIjiq4kfmt8rXnhKxPvhB2bIwm
aprqyj6A560bCo4cOGjps7wKCYHgO+IaZ3efu3W37Nbn7DjP7M3Cpma0HjiJoU09GZRfLRnHwde4Jf15E0DfRUcfTTvtjrv3uanwp/H1vwrJXE54zOvwZfw27gHF9Xp2klZqVpLfuB44Adrv7ki5utUlu5x1mVpR4n6BjuMtTL929juDb9eu62e+K+HPNbBxBe/gvCTrCa4FfdjyzpoP4kUV3315vJmji+S5BU9EjPb9EAGYkDqwzs8pwO/HX+URY3+RuXtemJPdzOE4P64rXeCwwPaHGR8Of7++w3oUEofxseP9BYK6ZzejHWqWf6chBeutO4KPAw2Z2A0F7cSkwCzjJ3d+b5HbaCAbX/ZjgG+n3gCfd/YHXWOfzwCNm9jfgVwTNIaMIOm9fcvefhgHwK4Jvu1e6e62ZfYigI/QK4CfdbHsHwTfgi8xsDUFH6xp33x8+/hDBN+m5wJeSfI0QNL3cZ2ZfC7f5NYKzfn4IwanCZvZl4EdmNgZ4gKAfZSLBWUE/dffFKezvEGbWVR/OfndfmXB/N/BPM/smQUB/h6Cj+o6wxhfN7B7gO2ZWCCwJa/skQcf6vnA7PwQ+BCwys+sIjjDHA6e5+yd6+xokvRQO0ivu3mxm5wD/BXyW4ENsH8EHwd0pbOoWgm/ptxOcsfNPemiqcfcl4YfdtQRnUZUSfPg+RXB6J8DngDMJPpBqw/UeM7PvAd8zswfd/d9dbLvNzC4h+GB8IKxtHkGoxPtE/hRu/1cpvM6VBH0W1xG8Vy8QnJHV3lzk7gvM7BWCU3A/Gi7eQvBNfAO9l0/QId/R0xza8f9PYAVBcI4kOJr5tLs3JTznP8LXcBlBIG8K670x4XXsMbNTCN7D6wg62l8htX8XEjFz12VCJRpm5sDX3P1bUdeSCjN7EVjt7hck+fxFBKfLnt2vhR0GM9sIPOjuH4+6FhkYdOQgkoSwX+QEgsF6xwBqHpGspnAQSc4YghHhe4AvuXtXzTQiWUPNSiIi0olOZRURkU4UDiIi0knG9jmMGDHCJ02aFHUZIiIZZenSpbvcfWRPz8vYcJg0aRJLliQzXYyIiMSZWVKj7dWsJCIinSgcRESkE4WDiIh0onAQEZFOFA4iItKJwkFERDrJqXBwdx5Zu5Nr/rKCF7bs63kFEZEclbHjHHrDzLjurytZt7OOQQX5zJ4wNOqSREQGpJw6cgCYf+QoAB5cldL12UVEckruhcOs4BK563bWsXFXXcTViIgMTGkLBzNbaWa1CbcDZuZmdkK6agB4/RHDKB9UAOjoQUSkO2kLB3c/2t0Hx28E15z9t7s/l64aAGL5ecybGcw5tXBVVTp3LSKSMSJpVjKzGMEF1G9Ocb0KM5thZjNaWlp6vf94v8OzG/dQfaC519sREclWUfU5nAeUA79Ocb0rgDXAmqqq3n/rP2PmSGJ5RktbcGqriIgcKqpw+CTwe3dPdbDBTcBMYGZlZWWvd15WXMAbJw8HYKH6HUREOkl7OJjZVGA+8PNU13X33e6+1t3XxmKHN0Qj3rS0aM1OWlrbDmtbIiLZJoojh08CL7j70xHsu93ZRwZHHtUHmlm6aW+UpYiIDDhpDQczKwQuoRdHDX3tiIpSplUOBmDhap21JCKSKN1HDu8BioHfpnm/XZofHj1ovIOIyKHSGg7ufre7l7l7bTr3252zw36H9Tvr2KDR0iIi7XJu+oxEJ0wcxrCSYLS0zloSETkop8MhP8+YN1NNSyIiHeV0OEDiaOm9VNdrtLSICCgcOH3GCAryjdY2Z9FanbUkIgIKB4YUF3Di5ApAE/GJiMTlfDjAwVNaF62polmjpUVEFA4A82cF/Q41DS0s2ajR0iIiCgdgYkUJ0+OjpXXWkoiIwiEuftaSptIQEVE4tItPxLdhVx3rdg6IAdwiIpFROISOnziM4aWFgJqWREQUDqH8POPM8NrSD+qUVhHJcQqHBPGJ+JZu2su++qaIqxERiY7CIcFp0w+Olta1pUU
klykcEgwpLuCkKcFoaTUtiUguUzh0MH+WRkuLiCgcOoiPd9jf0MKzG/dEXI2ISDQUDh1MGF7CzFFDAE3EJyK5S+HQhfhEfAtX7cDdI65GRCT9FA5diDctbdxdz7qdura0iOQehUMXjpswlAqNlhaRHKZw6EJ+njFvVrxpSf0OIpJ70h4OZna2mT1lZrVmtsvMFqS7hmTEJ+JbsmkPe+s0WlpEcktaw8HMzgT+CNwAVADjgdvSWUOy5k4fSWF+Hm2Ori0tIjkn3UcO/w383N3/6O6N7t7g7s+luYakDC6KceKU4YBGS4tI7klbOJhZKfBGIGZmz4VNSovMbE4K26gwsxlmNqOlpaX/ig3FJ+J7dM1Omlo0WlpEckc6jxyGhfu7CLgEGAs8APzdzIYmuY0rgDXAmqqq/v82Hx/vsL9Ro6VFJLekMxz2hz9/6e7L3b2JoJmpADglyW3cBMwEZlZWVvZDiYcaP6yEWaOD0dIP6pRWEckhaQsHd68GNgIdhxx7F8u628Zud1/r7mtjsVgfV9i1g6OlqzRaWkRyRro7pBcAHzGzo8wsBnwRaAQWp7mOpMVHS2/eU8/LVbq2tIjkhvR8/T7oBmAI8BBQDDwPvDU8qhiQjhs/lBGDC9lV28TC1VVMDyflExHJZmk9cvDANe4+2t2Huvs8d1+WzhpSlZdnzJt5cCI+EZFcoOkzkjA/4drSGi0tIrlA4ZCE06aPaB8t/fAaDYgTkeyncEhCaVGMk6cG15bWRHwikgsUDkmKT8T3yFqNlhaR7KdwSNJZYb9DbWMLz2zQaGkRyW4KhySNGzqII8eUARotLSLZT+GQgnjT0sLVura0iGQ3hUMK4qe0btlzgJc0WlpEspjCIQXHjitn5JAiQE1LIpLdFA4pyMszzpqpa0uLSPZTOKTorLDf4bnNe9ld2xhxNSIi/UPhkKLTpo+gMJaHOzy8ZmfU5YiI9AuFQ4pKCmOc0j5aWv0OIpKdFA69ED9r6dG1O2lsaY24GhGRvqdw6IX5s4J+h7qmVp5er9HSIpJ9FA69MHboII4KR0s/tFpnLYlI9lE49FJ8tPSDqzRaWkSyj8Khl+L9Dq/sPcDaHRotLSLZReHQS6/TaGkRyWIKh17Ky7P2jmmd0ioi2UbhcBjiTUvPb9nHLo2WFpEsonA4DHOnjaAoPlpaZy2JSBZROByGQYX5nDptBKCJ+EQku6QUDmb2rJldamaDU92Rmd1hZs1mVptwuyzV7Qw088NTWh97SaOlRSR7pHrk8DBwHbDNzH5hZiemuP6v3H1wwm1BiusPOPNnBf0OdU2tPKXR0iKSJVIKB3f/EjAB+DAwGnjCzF40syvNbFh/FJjIzCrMbIaZzWhpaenv3SVldHkxx4wLRkvrrCURyRYp9zm4e4u73+fubwOOAO4Dvge8ama/NbM3vMbq55vZHjNba2bX96J56gpgDbCmqmrgtPHHjx4WrqrSaGkRyQq97pA2s6nA5cAngAPAbUAxwdHENV2schMwCxgBvBs4A7g1xd3eBMwEZlZWVvay8r53dnhK66v7DrB6+/6IqxEROXypdkgXmdnFZvYwwTf404AvA2Pd/Up3Px84D7i647ruvtTdd7h7m7uvBD4HvNfMipLdv7vvdve17r42FoulUnq/OnpsGZXhaGk1LYlINkj1yGE7wbf3F4HZ7j7X3X/t7g0Jz1kMJNMz2xb+tBRrGHDy8qz9rKUHdUqriGSBVMPhc8C48ChhZVdPcPd97j6543Ize7+ZDQ1/nw78APjfDsGSseL9Di+8so+d+zVaWkQyW6rhcDrQqT3HzErN7PYe1v0UsN7M6oAHgKeAj6S4/wHrVI2WFpEskmo4fBgY1MXyQcB/vNaK7n6muw9391J3n+zun3f3mhT3P2ANKsxnbny09Gr1O4hIZks1HAw45FxNMzNgLrCzr4rKVPGJ+B57aRcNzRotLSKZK6lwMLM2M2slCIbtZtYavwEtwL3Ab/qxzowQ75S
ub2rlqfW7I65GRKT3kj0f9CKCo4a7CPoOqhMeawI2uPuyPq4t44wqK+Z148p58dVqFq6q4syZA2cshohIKpIKB3f/PYCZbQOecPeBMXfFADT/yMowHHZw3buOJmh1ExHJLD02K5lZ4tffVcBwM6vs6tZ/ZWaO+GjprdUNrNqm0dIikpmSOXLYZmZj3L2KYBBcV5MHxTuq8/uyuEx09NgyRpcVs72mgYWrdnDU2LKoSxIRSVky4XAWB0c8n0XX4SAhM+OsIyu56+nNPLi6iivmT4+6JBGRlPUYDu7+SMLvi/q1mixxdhgOL2zZR9X+BiqHFEddkohISlKdeO8L3SwvNrOb+6akzHfK1BEUFwRvrUZLi0gmSnUQ3FfM7B9mNjK+wMyOBZ4DzuzLwjJZcUE+c6cFb5Em4hORTJRqOBwPDAaWm9mbzexK4GngGeCEvi4uk50dDoh7XKOlRSQDpXqZ0M0EF+n5A/B34AbgY+5+ibvX9UN9GeusWUE4HGhu5cl1Gi0tIpmlN1eCmwdcQNCUVAdcbGYj+rSqLFBZVszs8eUAPKgLAIlIhkm1Q/q7wD8ILu95EnAcMBR40czO6fvyMttZ4TUeHlqta0uLSGZJ9cjhYuBN7n6Nu7e6+yaCS4XeCvy1z6vLcPGJ+LZVN7Bya9bMTi4iOSDVcJidOO4BILwm9DXA2X1XVnY4emwZY8qDMQ4LddaSiGSQVDuk9wCYWYWZnWhmRQmPPdbXxWU6M2vvmH5IFwASkQySap/DYDP7HcGFfRYD48LlN5vZ1/uhvowXn4jvhVeqqarJistli0gOSLVZ6b+BKcCJwIGE5X8D3t1XRWWTk6dWMKggmI/wIY2WFpEMkWo4vBO4yt2f5dAJ+FYRhIZ0UFyQz9zpwZm+Gi0tIpki1XAYCXTVeD6IYNpu6UL7aOmXd2q0tIhkhFTDYTldn5V0MfDs4ZeTneaFndINzW0sXrcr4mpERHqWajhcC/zQzL5BcGGfi8zsN8DnwseSYmZ5ZrbYzNzMxqdYQ8apHFLM7AlDATUtiUhmSPVU1n8C5xHMr9QG/BdwBPBWd380hU19DqhPZd+Z7uz4Ka2rNFpaRAa+lOdWcvcH3f1Mdx/s7iXufpq7P5Ts+mY2A7gM6PLaENlqfnhK6/YajZYWkYGvNxPv9ZqZ5QG3EwTDvl6sX2FmM8xsRktLS5/X15+OHDOEseFoaU3EJyIDXY/hYGYHzKw+mVsS+7sK2O7uf+plvVcAa4A1VVWZ1XZvZu1HD5pKQ0QGuh6vIQ18mkPHNPSKmU0DrgbmHMZmbgLuAqisrFxzuDWl2/wjK7nzqU28+Go1O2oaGFWma0uLyMDUYzi4+x19tK+5BOMkVpgZHDxqWW5mX3X3BUnUshvYDTBnzuFkTDROmlJBSWE+9U2tLFxVxQdOnBh1SSIiXepVn4OZnWJmHw9vpyS52j3AVIJrQBwHnBsuPwf4dW/qyDTFBfmcFo6WXqh+BxEZwJJpVmpnZhMIPuRP5GCH8lAzewa4wN23dLeuu9eTcPqqmcX3vd3da1OqOoPNP3IU96/cweMv7+JAUyuDCvOjLklEpJNUjxxuJQiUo919uLsPB44mmDrj1lQ25O4b3d3c/ZUUa8ho82ZWYgaNLW088bJGS4vIwJRqOJwBfNrdV8UXhL9fDpzel4Vlq5FDipg9PhgtvVDXeBCRASrVcNgGdDXAoBXQ+ZlJik/Et3BVFW1tGi0tIgNPquFwDfAjM2s/zSb8/Qbga31ZWDaLj3eo2t/Ic5v3RlyNiEhnqYbDfxGMU1hvZq+Y2SvAeuCNwFfM7N/xW18Xmk1mjR7ClJGlAHz53uXUN2XWaG8RyX4pna0E3N0vVeQYM+P75x/Lhbc8xbqddVzzl5XccMHsqMsSEWmXdDiYWT7wMLDc3VOeF0kONWfScD7/phlcf/8a/rj0FU6ZWsF7Tsj62ctFJEMk3azk7q3Av4Bh/VdObvn0GVPbB8V99c8rWLczZ4Z
7iMgAl2qfwypAX2/7SF6eceP7jmPkkCLqm1q5/K7ndRlRERkQUg2HLwDXm9kbw2YmOUwjhxTxowuPwwxWbavh2/+3queVRET6Warh8FeCs5WeBBp6MWW3dOHUaSO4fN40AO58ahP/eHFbxBWJSK5L9WylT/VLFcJV86fz9Po9PLNxD1+6dznHjCtnwvCSqMsSkRxlmXo94zlz5viSJUuiLqNPbas+wLk/foy99c3MnjCUP3zyZApjab1Yn4hkOTNb6u49XvMg5U8eM6s0s6vN7H/MbES47FQzm9ybQuWgMeWD2sc7vLBlHzc8kHHXMxKRLJFSOJjZ8cBq4CPAx4Cy8KE3Ad/q29Jy0/wjR/HxuUHO3vLoeh5erSmrRCT9Uj1y+AFwi7sfAzQmLL8fOLXPqspxX3rLLGaPLwfg8/csY3t1Q8QViUiuSTUcTgBu62L5VmDU4ZcjAIWxPG666ASGFMXYW9/MVXc/T6tmbxWRNEo1HFqA0i6WTwX2HH45EjexooTvnn8sAE9v2MNPFr4UcUUikktSDYd/Al80Mwvvu5kNA64jGAMhfehtx47h4hOD2dF/8tBLLF6nK8eJSHr0ZoT064F1QDFwL7ABGAr8Z9+WJgBfe/tRzBo9BHf47N3L2FXb2PNKIiKHKdVw2Au8geBI4WbgKeBqYI67q1mpHxQX5PPTD5zAoIJ8qvY38vl7XtDV40Sk3yUVDmY23Mz+CtQC1cClwA3ufpm7/8LddTpNP5pWOZhvnncMAI+u3cktj62PuCIRyXbJHjl8GzgR+DrwRYIzk37eX0VJZ+99/Xjec/w4AK6/fw1LN+nyoiLSf5INh7cCH3P377j7jcA7gbPNLNW5meQwfPO8Y5gyopTWNufK3z1PdX1z1CWJSJZKNhzGAUvjd9z930ATMDaVnZnZt81sg5nVmFmVmf3RzCamso1cVloU46cfOIHCWB6v7jvAF//4Apk6N5aIDGzJhkM+0PFramu4PBV3Ase5exkwCdiMrkudkqPGlvG1tx8FwAP/3sGvn9wUcUUiko1SaRb6nZk1JdwvBn6ZeB0Hdz/3tTbg7qsT7hrQBsxMtgAzqwAqAGbPnp3salnngydOZPHLu/jHiu18+/9W8fojhnHMuPKoyxKRLJLskcOvgC3AjoTbbwjGOCQu65GZfcDMqgnOfLoKuDaFeq8A1gBrqqpyd0I6M+O75x/L+GGDaGpt4/K7nqO2sSXqskQki0R2PQczG00ws+sT7r4oyXUSjxzWLFu2rP8KzADLtuzjvf+zmJY2513HjQ0vN2o9rygiOavfrufQV9x9O3Ar8DczG57kOrvdfa27r43FdKLUcROG8uW3zALgL8u28oclr0RckYhki6gvMxYjmMgvpbOe5KCPzZ3MvJkjAbjmf1ewdsf+iCsSkWyQtnAwszwzu9zMKsP744GfARsJLiAkvZCXZ/zgfccxqqyIhuag/+FAU2vUZYlIhkv3kcO5wAozqwOeBuqBs91dvamHYXhpIT9+//HkGazdUcs3/roy6pJEJMOlLRzcvc3dz3X3Sncvdfdx7n6xu69LVw3Z7KQpFVw1fwYAdz+7hb8sezXiikQkk0Xd5yB96PKzpnHylAoA/vO+F9m4qy7iikQkUykcskh+nvHj9x9HRWkhdU2tXP6752hsUf+DiKRO4ZBlKsuKufHC4wBY8WoN3/2H+vpFJHUKhyx0xoyRfOqMqQD88omNPLBye8QViUimUThkqavPmcEJE4cC8MU/LufVfQcirkhEMonCIUsV5Ofxk4uOp6w4RvWBZq783fM0t7ZFXZaIZAiFQxYbP6yE6y8IZq9dumkvP/zX2ogrEpFMoXDIcm8+ejSXnDIJgAWL1vHo2p3RFiQiGUHhkAO+cu4sjh5bBsDn71lGVU1DxBWJyECncMgBRbF8fvqBEygtzGdXbROf/f0yWtt0eVER6Z7CIUdMHlHKd97zOgAWr9vNgodfjrgiERnIFA455F3HjePCORM
A+OGDa3l6/e6IKxKRgUrhkGOufefRTK8cTJvDVXcvY09dU88riUjOUTjkmEGF+fzs4hMoLshje00DX/jDC0R1qVgRGbgUDjloxqghXPuOowF4aHUVtz22IeKKRGSgUTjkqAvfMIF3zg6uzvrtv6/ik3cuYdW2moirEpGBQuGQo8yMb7/7GI4ZF4x/uH/lDt7648e47LdLWbNd16EWyXUKhxw2pLiAP192Kje+bzZHVJQA8PcXt/OWHz/K5Xc9x8tVCgmRXGWZ2hk5Z84cX7JkSdRlZI2W1jbue/5VfrLwJV7ZG8zgagbvmj2WK+dPZ8rIwRFXKCJ9wcyWuvucHp+ncJBEza1t3Lv0FW566OX2ab7zDM47fhxXnjWdSSNKI65QRA6HwkEOS1NLG/cs2cLPHn6ZbdXBXEz5ecb5J4zjirOmM2F4ScQVikhvKBykTzS2tPL7Z4OQ2FHTCEAsz7hgzng+M28a44cpJEQyyYALBzP7HvB2YAJQC/wf8GV339Ob7Skc0quhuZW7nt7MgkXr2FUbhERBvnHhGybwmXnTGFM+KOIKRSQZyYZDOs9WagU+CFQAs4HxwB1p3L8chuKCfD46dzKPfWkeX33bkVSUFtLc6vzmqc2c8f1FfP0vK9ihqcBFskZkzUpm9hbgHncv6836OnKIVn1TC3c+uYmfP7KOvfXNABTG8rj4xIl8+sypVA4pjrhCEenKgGtW6rRjs+uBk9z9tBTWqSA48mD27Nlrli1b1l/lSZJqG1v41eKN3PLoeqoPBCFRXJDHh046gk+eMZURg4sirlBEEg3ocDCz8wmalM5w9+dSWO9a4OsAY8aMYevWrf1Sn6Ruf0MzdzyxkVsfW09NQwsAgwry+fApk/jE6VMYXloYcYUiAgM4HMzsAuBm4Hx3fzjFdXXkMMBVH2jm9sc3cPvjG9jfGIREaWE+l5w6iUtPm8LQEoWESJQGZDiY2UeAHwDvcPcnDmdb6nMY2Krrm7nt8fXc/vgG6ppaARhcFOOjcyfzsbmTKR9UEHGFIrlpwIWDmV1J0CT0Fnd/9nC3p3DIDHvrmrj1sfXcsXgj9WFIDCmO8fG5U/jI3EmUFSskRNJpIIaDAy1AY+Jyd+++V0jCAAANN0lEQVTVpD0Kh8yyu7aRWx5dz6+f3MSB5iAkygcVcOlpk7nk1MkMLopFXKFIbhhw4dDXFA6Zaef+Rm5+ZB13PrWJxpY2AIaVFPCJ06dy8UkTdSQh0s8UDjKgVdU0sGDROu56ZjNNYUgATBlRytHjyjlmbBnHjCvn6LFl6sQW6UMKB8kI26sbWLDoZe5+ZgtNrW1dPmf8sEEcM7acY8aVhcFRzsghGj8h0hsKB8ko1fXNLH91HyterWHF1mpWvFrNpt313T5/VFkRx4wtP+QoY0x5MWaWxqpFMo/CQTJe9YFm/r21hpVhWKzYWsO6nbV09092eGkhR4dBET/SmDi8RIEhkkDhIFmprrGF1dtrgiOMMDBe2rGflrau/x0PKY61B0XQh1HO5BGl5OcpMCQ3KRwkZzQ0t7J2x/5DmqRWb9vfbR9GSWE+R4052OF9zLhyplUOpiBfl1SX7JdsOOjkcsl4xQX5HDt+KMeOH9q+rLm1jZd21LJiazUrwyOMf2+t4UBzK/VNrSzZtJclm/a2P78wlseRo4dw9Lhypo4czMThJUwcXsKE4YMoKdR/E8k9OnKQnNHa5mzYVZvQJFXNyldr2ueA6s6IwUVMHD4oITCCnxMrShg1pJg8NVFJBlGzkkgS2tqcLXvr25ukVm6tYfPuOl7Ze6DbfoxEhfl5jE8IjsTwmDC8RCO/ZcBRs5JIEvLyjCMqSjmiopS3HTumfXlLaxvbaxrYvKeeLXvq2bynns17DrTf31PXBEBTaxvrd9axfmddl9uvKC08eKSRGB4VJYwuK1bHuAxYCgeRLsTy8xg/rITxw0pgaufH9zc0syUhLDaHty176tmyt57m1uCoY3ddE7vrmli2ZV+
nbRTm5zF+2KBDwmNCQl/HEE0lIhFSOIj0wpDiAo4aW8BRYztf5ba1zdkRHnV0FR67ahOOOnbVsX5X10cdIwYXMik8qpk8ooRJI0qZVFHKpBGlaq6Sfqd/YSJ9LD/PGDt0EGOHDuKkKRWdHq9rbGHL3no27+4iPPYeaJ9raldtE7tqmw45qypu5JAiJlWUtIfF5PbgKNHZVdIn9K9IJM1Ki2LMGl3GrNGdjzra2pyq/Y1s3F3Hpt11bNhVz8ZddWzcHdwamoPg2Lm/kZ37G3l2Y+fgqBxSFARGe3AERx1HDC9lUGF+v78+yQ4KB5EBJC/PGF1ezOjy4k5HHfHg2BAPi1117b9v2l3fPgV61f5GqvY38syGPZ22P7qsmEkjShKONErDpqsSigsUHHKQwkEkQyQGx8lTOwfH9pqGIDDag6Oejbvr2Ly7vn20+PaaBrbXNPDU+kODwwzGlBUHYTGi9JAmq9HlxQwpimmOqhyjcQ4iWa61zdlWfYCNu+rbgyMeIlv2HDyz6rWUFOYHwVQW3sKQGlVWzJhwecXgIp2amwE0zkFEgKCDPH5a7tzpIw55rKW1jW3VDe3NUxvC4Ni4O+gojw8ErG9qfc3xHPH9VA4pag+R9uBICJFRZcVqvsoQCgeRHBbLz2NCOL7idEYe8lg8OHbUNLT/3F7dwLaaBnZUB81TO2oa2o88giOU4LmvZWhJwcGjj4QQGRXeH1NeTPmgAjVjRUzhICJdSgyO7rS1OXvqm9heHQRHPDASw2R7TQP7Gw7OX7Wvvpl99c2s3r6/2+0WxfIONmOVF1NRWsTgonxKi2KUFMUoLQx+Ly2MUVKUz+CiGCWF+e33i2I6OjlcCgcR6bW8PGPE4CJGDC7imHHl3T6vrrElCI7wyKKrENlZ29h+IafGljY27a5/zasBvpaCfKOk8GCIHBoo+ZQUxQ4JlNKiGKVF+cE6RfFl8fvBOrEcm9Jd4SAi/a60KMbUkYOZOnJwt89pbm1j5/7GQ0JkR3h21Z66JuoaW6hvaqWuqYW6xlbqGlvaT9/tvC2n+kAz1Qea++w1FMbyKCnMpyiWR3FB1z+Lulme6s/4dopieZE1rykcRGRAKMjPax9ZnqyW1jbqmlqpTwiMuqYW6hsPhkh9Uwu18WAJfwb3D65zMHRauj17q6mlrX30ejp1FR6jyor5zcdP7Nf9pjUczOz9wGeA2UCJuyucRKTXYvl5lA/Ko3xQ301S2NTS1ilQ6hqDQGlsaaWxuY2G+M/mVhpbkvvZ1MXyZKaFb2xp63SEVNvDNUj6Qro/nPcCC4BBwC1p3reISI8KY3kUxgoZWlLY7/tqaW1LOlwaW9poDH8Wxfq//yOt4eDu9wOY2Zm9Wd/MKoAKgNmzZ/ddYSIiEYjl5xHLz6N0AM6ym2nd71cAa4A1VVVVUdciIpK1Mi0cbgJmAjMrKyujrkVEJGtlVDi4+253X+vua2OxgXcYJiKSLTIqHEREJD3SfSprPlAAFIb3i8OHGj1Tp4cVEclC6T5y+BBwALgfyA9/PwAckeY6RETkNaQ1HNz9Dne3Lm4b01mHiIi8toy92I+Z7QQ29WLVfGAUsANo7dOiMpPej0Pp/ThI78WhsuX9OMLdR/b0pIwNh94ysxkEYyVmuvvaqOuJmt6PQ+n9OEjvxaFy7f3Q2UoiItKJwkFERDrJxXDYDXwj/Cl6PzrS+3GQ3otD5dT7kXN9DiIi0rNcPHIQEZEeKBxERKQThYOIiHSicBARkU4UDiIi0onCQUREOlE4iIhIJwoHERHpJKfCwczyzex6M9tpZvvN7F4zGxF1XVEws++Z2UozqzGzrWZ2q5kNj7quqJlZnpktNjM3s/FR1xMlMzvbzJ4ys1oz22VmC6KuKSpmNtrMfh9+duw1s4fMbHbUdfWnnAoH4P8B7wJOBOL/8e+MrpxItQIfBCqA2QTvxx1RFjRAfA6oj7qIqJnZmcAfgRsI/o2MB26LsqaILQCGAzMIpu1eAvz
NzCzSqvpRTk2fYWabgOvc/Rfh/anAy8Akd+/NtSGyhpm9BbjH3cuiriUq4ZTM/wDOB54HJrj7K9FWFQ0zexJ4xN3/X9S1DARmthz4qbvfEt6fCawGRrr7rkiL6yc5c+RgZkOBicDS+DJ3XwfUEHxzznXzgReiLiIqZpYH3A58AdgXcTmRMrNS4I1AzMyeC5uUFpnZnKhri9D1wPlmNtLMioFPAI9nazBADoUDMCT8Wd1h+T4gZ78tA5jZ+cCngKuiriVCVwHb3f1PURcyAAwj+Gy4CLgEGAs8APw9/JKVi54guBJcFVALvAe4NNKK+lkuhcP+8Gd5h+VDCY4ecpKZXQDcCrzT3Z+Lup4omNk04Grg8qhrGSDi/1d+6e7L3b0J+G+gADglurKiER5VPgisJfj8KAG+DTxmZqOirK0/5Uw4uPs+YDNwQnyZmU0hOGpYHlVdUTKzjwA3A+9w94ejridCc4GRwAoz2wXEQ3K5mV0WXVnRcPdqYCPQsUPSu1iWC4YDk4Gb3L3G3Zvc/TaCz8+Toy2t/+RMOIRuAb5sZpPNrAz4HnC/u2+Mtqz0M7MrCc5EebO7PxF1PRG7B5gKHBfezg2XnwP8OqqiIrYA+IiZHWVmMeCLQCOwONqy0i/sV1gLXGZmpWYWM7OPEjRVZ+0Xy1jUBaTZdwnaU58FioB/EZzOmYt+DLQADyeejefugyOrKCLuXk/C6avhhyEEfRC10VQVuRsIPvweAooJzt56a3hUkYvOI+iU3kTQvPYycIG7r4+0qn6UU6eyiohIcnKtWUlERJKgcBARkU4UDiIi0onCQUREOlE4iIhIJwoHERHpROEgMgCY2SVm1hB1HSJxCgfJeWZ2R3hxn463nJyuWwRyb4S0SHceBj7QYVlrFIWIDAQ6chAJNLn79g63nQBmttHMrjOz28PLqu40s28mXgXMzMrN7BfhtQ8azOwJMztkUjYzmx5emnavmdWb2fNmNq/Dc04zs2Xh48+Y2fHpefkih1I4iCTnswSz+s4huCDQ1cCnEx7/JXAGcCHwemAdcH98SmczG0NwTYASgon9jgW+1WEfBeGyz4Tb2AfcHU4ZLZJWmltJcp6Z3UEwAWPHDuE/ufuHzGwjsMHd5yWs833gPe4+zcymE8za+SZ3fzB8PD4526/d/Wtm9i3gI8A0dz/QRQ2XEATMbHdfHi47FXgcXcZWIqA+B5HAYuCjHZYlzsj6ZIfHngC+EF4y8kiC6xw8Hn/Q3ZvD6zAfFS46geCykp2CIUELsCLh/tbw5yiC2UBF0kbhIBKod/eXI66h1d3bEu7HD+vVrCRpp390Isk5qcP9UwiamhqAfwNGcEU5oL1Z6WRgZbjoOeBUMxuUhlpFDpvCQSRQaGajO94SHp9jZl8zsxlm9iGC603/ECA84rgP+LmZnWVmRwG/ILiw1M/C9RcQXDTnPjM72cymmNm7Op6tJDJQqFlJJDAP2NZxYXgEAPAjYBqwFGgiuJLegoSnfhS4EfgDUBo+783uvgPA3bea2Vzg+8D9QD6wmuCsJ5EBR2crifQgPFvp5+7+3ahrEUkXNSuJiEgnCgcREelEzUoiItKJjhxERKQThYOIiHSicBARkU4UDiIi0onCQUREOlE4iIhIJ/8fDb9RGSoVbuYAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# train the copy task\n", "dev_perplexities = train_copy_task()\n", "\n", "def plot_perplexity(perplexities):\n", " \"\"\"plot perplexities\"\"\"\n", " plt.title(\"Perplexity per Epoch\")\n", " plt.xlabel(\"Epoch\")\n", " plt.ylabel(\"Perplexity\")\n", " plt.plot(perplexities)\n", " \n", "plot_perplexity(dev_perplexities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that the model managed to correctly 'translate' the two examples in the end.\n", "\n", "Moreover, the perplexity of the development data nicely went down towards 1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# A Real World Example\n", "\n", "Now we consider a real-world example using the IWSLT German-English Translation task. \n", "This task is much smaller than usual, but it illustrates the whole system. \n", "\n", "The cell below installs torch text and spacy. This might take a while." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "#!pip install git+git://github.com/pytorch/text spacy \n", "#!python -m spacy download en\n", "#!python -m spacy download de" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Loading\n", "\n", "We will load the dataset using torchtext and spacy for tokenization.\n", "\n", "This cell might take a while to run the first time, as it will download and tokenize the IWSLT data.\n", "\n", "For speed we only include short sentences, and we include a word in the vocabulary only if it occurs at least 5 times. In this case we also lowercase the data.\n", "\n", "If you have **issues** with torch text in the cell below (e.g. an `ascii` error), try running `export LC_ALL=\"en_US.UTF-8\"` before you start `jupyter notebook`." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# For data loading.\n", "from torchtext import data, datasets\n", "\n", "if True:\n", " import spacy\n", " spacy_de = spacy.load('de')\n", " spacy_en = spacy.load('en')\n", "\n", " def tokenize_de(text):\n", " return [tok.text for tok in spacy_de.tokenizer(text)]\n", "\n", " def tokenize_en(text):\n", " return [tok.text for tok in spacy_en.tokenizer(text)]\n", "\n", " UNK_TOKEN = \"\"\n", " PAD_TOKEN = \"\" \n", " SOS_TOKEN = \"\"\n", " EOS_TOKEN = \"\"\n", " LOWER = True\n", " \n", " # we include lengths to provide to the RNNs\n", " SRC = data.Field(tokenize=tokenize_de, \n", " batch_first=True, lower=LOWER, include_lengths=True,\n", " unk_token=UNK_TOKEN, pad_token=PAD_TOKEN, init_token=None, eos_token=EOS_TOKEN)\n", " TRG = data.Field(tokenize=tokenize_en, \n", " batch_first=True, lower=LOWER, include_lengths=True,\n", " unk_token=UNK_TOKEN, pad_token=PAD_TOKEN, init_token=SOS_TOKEN, eos_token=EOS_TOKEN)\n", "\n", " MAX_LEN = 25 # NOTE: we filter out a lot of sentences for speed\n", " train_data, valid_data, test_data = datasets.IWSLT.splits(\n", " exts=('.de', '.en'), fields=(SRC, TRG), \n", " filter_pred=lambda x: len(vars(x)['src']) <= MAX_LEN and \n", " len(vars(x)['trg']) <= MAX_LEN)\n", " MIN_FREQ = 5 # NOTE: we limit the vocabulary to frequent words for speed\n", " SRC.build_vocab(train_data.src, min_freq=MIN_FREQ)\n", " TRG.build_vocab(train_data.trg, min_freq=MIN_FREQ)\n", " \n", " PAD_INDEX = TRG.vocab.stoi[PAD_TOKEN]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's look at the data\n", "\n", "It never hurts to look at your data and some statistics." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data set sizes (number of sentence pairs):\n", "train 143116\n", "valid 690\n", "test 963 \n", "\n", "First training example:\n", "src: david gallo : das ist bill lange . ich bin dave gallo .\n", "trg: david gallo : this is bill lange . i 'm dave gallo . \n", "\n", "Most common words (src):\n", " . 138325\n", " , 105944\n", " und 41839\n", " die 40809\n", " das 33324\n", " sie 33035\n", " ich 31153\n", " ist 31035\n", " es 27449\n", " wir 25817 \n", "\n", "Most common words (trg):\n", " . 137259\n", " , 91619\n", " the 73344\n", " and 50273\n", " to 42798\n", " a 39573\n", " of 39496\n", " i 33524\n", " it 32921\n", " that 32643 \n", "\n", "First 10 words (src):\n", "00 \n", "01 \n", "02 \n", "03 .\n", "04 ,\n", "05 und\n", "06 die\n", "07 das\n", "08 sie\n", "09 ich \n", "\n", "First 10 words (trg):\n", "00 \n", "01 \n", "02 \n", "03 \n", "04 .\n", "05 ,\n", "06 the\n", "07 and\n", "08 to\n", "09 a \n", "\n", "Number of German words (types): 15761\n", "Number of English words (types): 13003 \n", "\n" ] } ], "source": [ "def print_data_info(train_data, valid_data, test_data, src_field, trg_field):\n", " \"\"\" This prints some useful stuff about our data sets. 
\"\"\"\n", "\n", " print(\"Data set sizes (number of sentence pairs):\")\n", " print('train', len(train_data))\n", " print('valid', len(valid_data))\n", " print('test', len(test_data), \"\\n\")\n", "\n", " print(\"First training example:\")\n", " print(\"src:\", \" \".join(vars(train_data[0])['src']))\n", " print(\"trg:\", \" \".join(vars(train_data[0])['trg']), \"\\n\")\n", "\n", " print(\"Most common words (src):\")\n", " print(\"\\n\".join([\"%10s %10d\" % x for x in src_field.vocab.freqs.most_common(10)]), \"\\n\")\n", " print(\"Most common words (trg):\")\n", " print(\"\\n\".join([\"%10s %10d\" % x for x in trg_field.vocab.freqs.most_common(10)]), \"\\n\")\n", "\n", " print(\"First 10 words (src):\")\n", " print(\"\\n\".join(\n", " '%02d %s' % (i, t) for i, t in enumerate(src_field.vocab.itos[:10])), \"\\n\")\n", " print(\"First 10 words (trg):\")\n", " print(\"\\n\".join(\n", " '%02d %s' % (i, t) for i, t in enumerate(trg_field.vocab.itos[:10])), \"\\n\")\n", "\n", " print(\"Number of German words (types):\", len(src_field.vocab))\n", " print(\"Number of English words (types):\", len(trg_field.vocab), \"\\n\")\n", " \n", " \n", "print_data_info(train_data, valid_data, test_data, SRC, TRG)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterators\n", "Batching matters a ton for speed. We will use torch text's BucketIterator here to get batches containing sentences of (almost) the same length.\n", "\n", "#### Note on sorting batches for RNNs in PyTorch\n", "\n", "For effiency reasons, PyTorch RNNs require that batches have been sorted by length, with the longest sentence in the batch first. For training, we simply sort each batch. \n", "For validation, we would run into trouble if we want to compare our translations with some external file that was not sorted. Therefore we simply set the validation batch size to 1, so that we can keep it in the original order." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "train_iter = data.BucketIterator(train_data, batch_size=64, train=True, \n", " sort_within_batch=True, \n", " sort_key=lambda x: (len(x.src), len(x.trg)), repeat=False,\n", " device=DEVICE)\n", "valid_iter = data.Iterator(valid_data, batch_size=1, train=False, sort=False, repeat=False, \n", " device=DEVICE)\n", "\n", "\n", "def rebatch(pad_idx, batch):\n", " \"\"\"Wrap torchtext batch into our own Batch class for pre-processing\"\"\"\n", " return Batch(batch.src, batch.trg, pad_idx)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training the System\n", "\n", "Now we train the model. \n", "\n", "On a Titan X GPU, this runs at ~18,000 tokens per second with a batch size of 64." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "def train(model, num_epochs=10, lr=0.0003, print_every=100):\n", " \"\"\"Train a model on IWSLT\"\"\"\n", " \n", " if USE_CUDA:\n", " model.cuda()\n", "\n", " # optionally add label smoothing; see the Annotated Transformer\n", " criterion = nn.NLLLoss(reduction=\"sum\", ignore_index=PAD_INDEX)\n", " optim = torch.optim.Adam(model.parameters(), lr=lr)\n", " \n", " dev_perplexities = []\n", "\n", " for epoch in range(num_epochs):\n", " \n", " print(\"Epoch\", epoch)\n", " model.train()\n", " train_perplexity = run_epoch((rebatch(PAD_INDEX, b) for b in train_iter), \n", " model,\n", " SimpleLossCompute(model.generator, criterion, optim),\n", " print_every=print_every)\n", " \n", " model.eval()\n", " with torch.no_grad():\n", " print_examples((rebatch(PAD_INDEX, x) for x in valid_iter), \n", " model, n=3, src_vocab=SRC.vocab, trg_vocab=TRG.vocab) \n", "\n", " dev_perplexity = run_epoch((rebatch(PAD_INDEX, b) for b in valid_iter), \n", " model, \n", " SimpleLossCompute(model.generator, criterion, None))\n", " print(\"Validation perplexity: %f\" % dev_perplexity)\n", " 
dev_perplexities.append(dev_perplexity)\n", " \n", " return dev_perplexities\n", " " ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1\n", " \"num_layers={}\".format(dropout, num_layers))\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch Step: 100 Loss: 22.353386 Tokens per Sec: 16007.731248\n", "Epoch Step: 200 Loss: 34.410126 Tokens per Sec: 16368.906298\n", "Epoch Step: 300 Loss: 44.763870 Tokens per Sec: 16586.324787\n", "Epoch Step: 400 Loss: 57.584606 Tokens per Sec: 16717.486756\n", "Epoch Step: 500 Loss: 40.508701 Tokens per Sec: 16486.886104\n", "Epoch Step: 600 Loss: 51.919121 Tokens per Sec: 16529.862635\n", "Epoch Step: 700 Loss: 82.279633 Tokens per Sec: 16973.462052\n", "Epoch Step: 800 Loss: 35.026432 Tokens per Sec: 16724.939524\n", "Epoch Step: 900 Loss: 63.407204 Tokens per Sec: 16606.524355\n", "Epoch Step: 1000 Loss: 37.909828 Tokens per Sec: 19105.497130\n", "Epoch Step: 1100 Loss: 90.584244 Tokens per Sec: 19643.264684\n", "Epoch Step: 1200 Loss: 84.000832 Tokens per Sec: 19468.084935\n", "Epoch Step: 1300 Loss: 54.331242 Tokens per Sec: 19679.282614\n", "Epoch Step: 1400 Loss: 49.921040 Tokens per Sec: 19629.820942\n", "Epoch Step: 1500 Loss: 21.851797 Tokens per Sec: 19565.639729\n", "Epoch Step: 1600 Loss: 55.154270 Tokens per Sec: 19515.738007\n", "Epoch Step: 1700 Loss: 40.758137 Tokens per Sec: 19486.791554\n", "Epoch Step: 1800 Loss: 50.094219 Tokens per Sec: 19761.236905\n", "Epoch Step: 1900 Loss: 90.545143 Tokens per Sec: 19447.650965\n", "Epoch Step: 2000 Loss: 22.882494 Tokens per Sec: 
19539.331538\n", "Epoch Step: 2100 Loss: 99.448174 Tokens per Sec: 19278.704892\n", "Epoch Step: 2200 Loss: 16.793839 Tokens per Sec: 19183.702688\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was born years old , i was a of the of the .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father was on his , the of the .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he was very interested in the way , what was pretty much more , and then it was the .\n", "\n", "Validation perplexity: 31.839708\n", "Epoch 1\n", "Epoch Step: 100 Loss: 4.451122 Tokens per Sec: 19110.156367\n", "Epoch Step: 200 Loss: 11.262838 Tokens per Sec: 19538.253630\n", "Epoch Step: 300 Loss: 55.240711 Tokens per Sec: 19584.509548\n", "Epoch Step: 400 Loss: 54.733456 Tokens per Sec: 19787.183104\n", "Epoch Step: 500 Loss: 38.923244 Tokens per Sec: 19385.772613\n", "Epoch Step: 600 Loss: 63.162933 Tokens per Sec: 19013.165752\n", "Epoch Step: 700 Loss: 47.323864 Tokens per Sec: 18863.104141\n", "Epoch Step: 800 Loss: 43.414978 Tokens per Sec: 19258.337491\n", "Epoch Step: 900 Loss: 87.750214 Tokens per Sec: 19179.949782\n", "Epoch Step: 1000 Loss: 39.787056 Tokens per Sec: 19110.748464\n", "Epoch Step: 1100 Loss: 78.177170 Tokens per Sec: 19272.044197\n", "Epoch Step: 1200 Loss: 37.122997 Tokens per Sec: 19194.535740\n", "Epoch Step: 1300 Loss: 26.103378 Tokens per Sec: 19337.967366\n", "Epoch Step: 1400 Loss: 78.804855 Tokens per Sec: 19018.413406\n", "Epoch Step: 1500 
Loss: 61.593956 Tokens per Sec: 19259.272095\n", "Epoch Step: 1600 Loss: 81.611786 Tokens per Sec: 19259.527179\n", "Epoch Step: 1700 Loss: 28.692696 Tokens per Sec: 19230.891840\n", "Epoch Step: 1800 Loss: 84.163223 Tokens per Sec: 19071.272023\n", "Epoch Step: 1900 Loss: 36.782116 Tokens per Sec: 19209.383788\n", "Epoch Step: 2000 Loss: 56.666332 Tokens per Sec: 19127.522297\n", "Epoch Step: 2100 Loss: 5.576357 Tokens per Sec: 18957.458966\n", "Epoch Step: 2200 Loss: 38.791512 Tokens per Sec: 19166.811446\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was a of the .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father was on his , in the little , the of the .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he saw very happy , what was pretty much , and it was the of the .\n", "\n", "Validation perplexity: 19.906190\n", "Epoch 2\n", "Epoch Step: 100 Loss: 58.981544 Tokens per Sec: 19121.747106\n", "Epoch Step: 200 Loss: 34.874680 Tokens per Sec: 19689.768904\n", "Epoch Step: 300 Loss: 27.895102 Tokens per Sec: 19751.401628\n", "Epoch Step: 400 Loss: 52.931011 Tokens per Sec: 16369.447354\n", "Epoch Step: 500 Loss: 77.191933 Tokens per Sec: 16337.808093\n", "Epoch Step: 600 Loss: 65.645668 Tokens per Sec: 16307.871308\n", "Epoch Step: 700 Loss: 7.141161 Tokens per Sec: 16420.432824\n", "Epoch Step: 800 Loss: 76.990250 Tokens per Sec: 17512.558218\n", "Epoch Step: 900 Loss: 43.835995 Tokens per Sec: 16399.672659\n", 
"Epoch Step: 1000 Loss: 68.026192 Tokens per Sec: 16598.504664\n", "Epoch Step: 1100 Loss: 23.746111 Tokens per Sec: 16368.137311\n", "Epoch Step: 1200 Loss: 42.117832 Tokens per Sec: 16324.872475\n", "Epoch Step: 1300 Loss: 47.894409 Tokens per Sec: 16532.223380\n", "Epoch Step: 1400 Loss: 43.772861 Tokens per Sec: 16472.315811\n", "Epoch Step: 1500 Loss: 60.978756 Tokens per Sec: 16368.088307\n", "Epoch Step: 1600 Loss: 59.143227 Tokens per Sec: 16553.220745\n", "Epoch Step: 1700 Loss: 34.091373 Tokens per Sec: 16557.579342\n", "Epoch Step: 1800 Loss: 11.551711 Tokens per Sec: 16639.281663\n", "Epoch Step: 1900 Loss: 40.060520 Tokens per Sec: 16666.679672\n", "Epoch Step: 2000 Loss: 21.947863 Tokens per Sec: 16403.240568\n", "Epoch Step: 2100 Loss: 12.891315 Tokens per Sec: 16656.630033\n", "Epoch Step: 2200 Loss: 12.300262 Tokens per Sec: 16592.045153\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was a of the of the .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father was on his little , , the of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he looked very happy to what was pretty much more , because it was the of the .\n", "\n", "Validation perplexity: 15.555337\n", "Epoch 3\n", "Epoch Step: 100 Loss: 36.178066 Tokens per Sec: 16064.364293\n", "Epoch Step: 200 Loss: 20.046204 Tokens per Sec: 16557.065342\n", "Epoch Step: 300 Loss: 53.514584 Tokens per Sec: 16375.767859\n", "Epoch Step: 400 Loss: 
29.280447 Tokens per Sec: 16687.195842\n", "Epoch Step: 500 Loss: 64.491814 Tokens per Sec: 16491.438857\n", "Epoch Step: 600 Loss: 62.286755 Tokens per Sec: 16443.863308\n", "Epoch Step: 700 Loss: 60.861393 Tokens per Sec: 16303.304238\n", "Epoch Step: 800 Loss: 25.101744 Tokens per Sec: 16437.206262\n", "Epoch Step: 900 Loss: 41.884624 Tokens per Sec: 16712.862598\n", "Epoch Step: 1000 Loss: 65.880905 Tokens per Sec: 16406.042864\n", "Epoch Step: 1100 Loss: 34.799385 Tokens per Sec: 16257.804744\n", "Epoch Step: 1200 Loss: 57.244125 Tokens per Sec: 16403.685499\n", "Epoch Step: 1300 Loss: 6.766514 Tokens per Sec: 16262.412676\n", "Epoch Step: 1400 Loss: 31.528254 Tokens per Sec: 16723.894609\n", "Epoch Step: 1500 Loss: 4.534189 Tokens per Sec: 16512.533272\n", "Epoch Step: 1600 Loss: 50.852787 Tokens per Sec: 16820.837828\n", "Epoch Step: 1700 Loss: 30.657820 Tokens per Sec: 16574.791159\n", "Epoch Step: 1800 Loss: 75.787910 Tokens per Sec: 16441.350335\n", "Epoch Step: 1900 Loss: 23.563347 Tokens per Sec: 16836.284727\n", "Epoch Step: 2000 Loss: 10.594786 Tokens per Sec: 16522.362683\n", "Epoch Step: 2100 Loss: 40.561062 Tokens per Sec: 16508.617285\n", "Epoch Step: 2200 Loss: 15.348518 Tokens per Sec: 16624.360367\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 11 years old , i was a of the joy .\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father was on his little , , , the of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was 
unusual then , because the news mostly depressed him .\n", "Pred: he saw very happy , what was pretty much , because it was the .\n", "\n", "Validation perplexity: 13.563748\n", "Epoch 4\n", "Epoch Step: 100 Loss: 9.601490 Tokens per Sec: 16309.901017\n", "Epoch Step: 200 Loss: 13.329712 Tokens per Sec: 16693.352689\n", "Epoch Step: 300 Loss: 61.213333 Tokens per Sec: 16774.275779\n", "Epoch Step: 400 Loss: 37.759483 Tokens per Sec: 16628.037095\n", "Epoch Step: 500 Loss: 35.616104 Tokens per Sec: 16677.874896\n", "Epoch Step: 600 Loss: 58.753849 Tokens per Sec: 16452.736708\n", "Epoch Step: 700 Loss: 11.741160 Tokens per Sec: 16615.759446\n", "Epoch Step: 800 Loss: 24.230316 Tokens per Sec: 16804.673563\n", "Epoch Step: 900 Loss: 27.786499 Tokens per Sec: 16373.396939\n", "Epoch Step: 1000 Loss: 65.063515 Tokens per Sec: 16520.381173\n", "Epoch Step: 1100 Loss: 34.756481 Tokens per Sec: 16492.656502\n", "Epoch Step: 1200 Loss: 43.993877 Tokens per Sec: 17075.912389\n", "Epoch Step: 1300 Loss: 36.514729 Tokens per Sec: 16812.641454\n", "Epoch Step: 1400 Loss: 58.995735 Tokens per Sec: 16535.979640\n", "Epoch Step: 1500 Loss: 29.516464 Tokens per Sec: 16500.141569\n", "Epoch Step: 1600 Loss: 10.143467 Tokens per Sec: 16613.933279\n", "Epoch Step: 1700 Loss: 53.287037 Tokens per Sec: 16756.922926\n", "Epoch Step: 1800 Loss: 24.687494 Tokens per Sec: 16477.783348\n", "Epoch Step: 1900 Loss: 21.578268 Tokens per Sec: 16808.344988\n", "Epoch Step: 2000 Loss: 60.965946 Tokens per Sec: 16651.623717\n", "Epoch Step: 2100 Loss: 18.895075 Tokens per Sec: 16636.292649\n", "Epoch Step: 2200 Loss: 53.253704 Tokens per Sec: 16642.799323\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was a of the joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , 
grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my dad listened on his little , radio the bbc of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he saw a happy very happy , which was pretty much , because he was the most famous .\n", "\n", "Validation perplexity: 12.664111\n", "Epoch 5\n", "Epoch Step: 100 Loss: 21.919912 Tokens per Sec: 16266.471497\n", "Epoch Step: 200 Loss: 31.320656 Tokens per Sec: 16527.955427\n", "Epoch Step: 300 Loss: 40.778984 Tokens per Sec: 16517.710752\n", "Epoch Step: 400 Loss: 63.466324 Tokens per Sec: 16770.294841\n", "Epoch Step: 500 Loss: 49.329956 Tokens per Sec: 16694.936223\n", "Epoch Step: 600 Loss: 52.290169 Tokens per Sec: 16755.442966\n", "Epoch Step: 700 Loss: 51.911785 Tokens per Sec: 16768.565847\n", "Epoch Step: 800 Loss: 25.005857 Tokens per Sec: 16813.186507\n", "Epoch Step: 900 Loss: 50.679825 Tokens per Sec: 17109.031968\n", "Epoch Step: 1000 Loss: 13.069316 Tokens per Sec: 16692.984251\n", "Epoch Step: 1100 Loss: 12.595688 Tokens per Sec: 16546.293379\n", "Epoch Step: 1200 Loss: 46.846031 Tokens per Sec: 16491.379305\n", "Epoch Step: 1300 Loss: 30.238283 Tokens per Sec: 16558.196936\n", "Epoch Step: 1400 Loss: 23.865877 Tokens per Sec: 16556.353749\n", "Epoch Step: 1500 Loss: 42.451859 Tokens per Sec: 16784.645679\n", "Epoch Step: 1600 Loss: 37.048477 Tokens per Sec: 16651.129133\n", "Epoch Step: 1700 Loss: 17.043219 Tokens per Sec: 16655.630464\n", "Epoch Step: 1800 Loss: 17.227308 Tokens per Sec: 16688.568658\n", "Epoch Step: 1900 Loss: 23.672441 Tokens per Sec: 16609.439477\n", "Epoch Step: 2000 Loss: 19.385946 Tokens per Sec: 16586.442474\n", "Epoch Step: 2100 Loss: 25.717686 Tokens per Sec: 16879.694187\n", "Epoch Step: 2200 Loss: 
22.427767 Tokens per Sec: 16844.504307\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was by the morning of joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father listened on his little , gray radio waves the bbc of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he saw a very happy ending , which was pretty unusual , since then they were .\n", "\n", "Validation perplexity: 12.246438\n", "Epoch 6\n", "Epoch Step: 100 Loss: 19.048712 Tokens per Sec: 19024.102757\n", "Epoch Step: 200 Loss: 31.636736 Tokens per Sec: 19387.779254\n", "Epoch Step: 300 Loss: 15.952754 Tokens per Sec: 19559.196457\n", "Epoch Step: 400 Loss: 24.849632 Tokens per Sec: 18968.450791\n", "Epoch Step: 500 Loss: 47.227837 Tokens per Sec: 19009.957585\n", "Epoch Step: 600 Loss: 8.887992 Tokens per Sec: 19024.581918\n", "Epoch Step: 700 Loss: 58.158920 Tokens per Sec: 16834.343585\n", "Epoch Step: 800 Loss: 32.257362 Tokens per Sec: 16725.454783\n", "Epoch Step: 900 Loss: 5.977044 Tokens per Sec: 16398.470679\n", "Epoch Step: 1000 Loss: 51.871101 Tokens per Sec: 16302.492231\n", "Epoch Step: 1100 Loss: 44.715164 Tokens per Sec: 16505.477988\n", "Epoch Step: 1200 Loss: 4.128096 Tokens per Sec: 19255.909773\n", "Epoch Step: 1300 Loss: 53.065189 Tokens per Sec: 19016.853318\n", "Epoch Step: 1400 Loss: 23.775473 Tokens per Sec: 18877.681861\n", "Epoch Step: 1500 Loss: 15.587101 Tokens per Sec: 18916.694718\n", "Epoch Step: 1600 Loss: 
59.449795 Tokens per Sec: 19166.565245\n", "Epoch Step: 1700 Loss: 48.393402 Tokens per Sec: 18836.264938\n", "Epoch Step: 1800 Loss: 45.651253 Tokens per Sec: 18823.983316\n", "Epoch Step: 1900 Loss: 51.898994 Tokens per Sec: 19015.027947\n", "Epoch Step: 2000 Loss: 16.392334 Tokens per Sec: 19180.065119\n", "Epoch Step: 2100 Loss: 20.312500 Tokens per Sec: 19059.061076\n", "Epoch Step: 2200 Loss: 41.126842 Tokens per Sec: 19110.648056\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 , i was a of the joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father listened to his little , radio shack the of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he looked very happy , which was pretty unusual , and then they had the news .\n", "\n", "Validation perplexity: 12.045694\n", "Epoch 7\n", "Epoch Step: 100 Loss: 22.484320 Tokens per Sec: 19136.387726\n", "Epoch Step: 200 Loss: 54.793003 Tokens per Sec: 19562.003455\n", "Epoch Step: 300 Loss: 52.516510 Tokens per Sec: 19494.585192\n", "Epoch Step: 400 Loss: 25.631699 Tokens per Sec: 19127.415568\n", "Epoch Step: 500 Loss: 15.818419 Tokens per Sec: 18909.082434\n", "Epoch Step: 600 Loss: 40.660767 Tokens per Sec: 19063.824782\n", "Epoch Step: 700 Loss: 21.253407 Tokens per Sec: 19011.780769\n", "Epoch Step: 800 Loss: 9.494976 Tokens per Sec: 19032.447976\n", "Epoch Step: 900 Loss: 21.503059 Tokens per Sec: 19120.646494\n", "Epoch Step: 1000 Loss: 34.198826 Tokens per Sec: 
18751.274337\n", "Epoch Step: 1100 Loss: 21.471136 Tokens per Sec: 19119.629059\n", "Epoch Step: 1200 Loss: 45.433662 Tokens per Sec: 19158.978952\n", "Epoch Step: 1300 Loss: 48.697639 Tokens per Sec: 18852.568454\n", "Epoch Step: 1400 Loss: 48.406239 Tokens per Sec: 19090.121092\n", "Epoch Step: 1500 Loss: 10.506186 Tokens per Sec: 18996.606224\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch Step: 1600 Loss: 22.061657 Tokens per Sec: 18889.519602\n", "Epoch Step: 1700 Loss: 11.148299 Tokens per Sec: 19179.133196\n", "Epoch Step: 1800 Loss: 16.580446 Tokens per Sec: 19184.709044\n", "Epoch Step: 1900 Loss: 20.219671 Tokens per Sec: 18889.205997\n", "Epoch Step: 2000 Loss: 21.245464 Tokens per Sec: 18869.151894\n", "Epoch Step: 2100 Loss: 29.567142 Tokens per Sec: 18825.496347\n", "Epoch Step: 2200 Loss: 22.790722 Tokens per Sec: 18923.950021\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was a of the joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father listened to his little , radio the of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he looked very happy , which was pretty unusual , because he was going to put him in the .\n", "\n", "Validation perplexity: 11.837098\n", "Epoch 8\n", "Epoch Step: 100 Loss: 49.162842 Tokens per Sec: 19241.082862\n", "Epoch Step: 200 Loss: 35.163906 Tokens per Sec: 19633.028114\n", "Epoch Step: 300 Loss: 10.108455 Tokens per Sec: 17179.927672\n", 
"Epoch Step: 400 Loss: 12.883712 Tokens per Sec: 16510.876579\n", "Epoch Step: 500 Loss: 32.006828 Tokens per Sec: 16459.413702\n", "Epoch Step: 600 Loss: 21.056961 Tokens per Sec: 16640.683528\n", "Epoch Step: 700 Loss: 5.884560 Tokens per Sec: 16567.539919\n", "Epoch Step: 800 Loss: 17.562445 Tokens per Sec: 16529.548052\n", "Epoch Step: 900 Loss: 25.654568 Tokens per Sec: 16629.045928\n", "Epoch Step: 1000 Loss: 30.116678 Tokens per Sec: 16519.515326\n", "Epoch Step: 1100 Loss: 49.594883 Tokens per Sec: 16766.220937\n", "Epoch Step: 1200 Loss: 35.545147 Tokens per Sec: 16729.972737\n", "Epoch Step: 1300 Loss: 12.314122 Tokens per Sec: 16479.824355\n", "Epoch Step: 1400 Loss: 5.982590 Tokens per Sec: 16592.352361\n", "Epoch Step: 1500 Loss: 23.507740 Tokens per Sec: 16396.264595\n", "Epoch Step: 1600 Loss: 36.874157 Tokens per Sec: 16554.722618\n", "Epoch Step: 1700 Loss: 13.514697 Tokens per Sec: 16605.822594\n", "Epoch Step: 1800 Loss: 6.016938 Tokens per Sec: 16390.681327\n", "Epoch Step: 1900 Loss: 44.648132 Tokens per Sec: 16575.965569\n", "Epoch Step: 2000 Loss: 21.025373 Tokens per Sec: 16363.246501\n", "Epoch Step: 2100 Loss: 32.213993 Tokens per Sec: 16395.313089\n", "Epoch Step: 2200 Loss: 29.033810 Tokens per Sec: 16528.855537\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 years old , i was a of the joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father listened to his little , gray radio shack , the radio of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then 
, because the news mostly depressed him .\n", "Pred: he looked very happy , which was pretty unusual , because he was the news of the most famous .\n", "\n", "Validation perplexity: 11.868392\n", "Epoch 9\n", "Epoch Step: 100 Loss: 33.819195 Tokens per Sec: 16155.433696\n", "Epoch Step: 200 Loss: 26.771244 Tokens per Sec: 16447.243194\n", "Epoch Step: 300 Loss: 22.235714 Tokens per Sec: 16557.847083\n", "Epoch Step: 400 Loss: 16.233931 Tokens per Sec: 16802.777289\n", "Epoch Step: 500 Loss: 34.811615 Tokens per Sec: 16637.208199\n", "Epoch Step: 600 Loss: 11.960271 Tokens per Sec: 16478.541533\n", "Epoch Step: 700 Loss: 32.807648 Tokens per Sec: 16526.645827\n", "Epoch Step: 800 Loss: 25.779436 Tokens per Sec: 16572.304586\n", "Epoch Step: 900 Loss: 18.101871 Tokens per Sec: 16472.573763\n", "Epoch Step: 1000 Loss: 34.465992 Tokens per Sec: 16489.131609\n", "Epoch Step: 1100 Loss: 47.311241 Tokens per Sec: 16501.563937\n", "Epoch Step: 1200 Loss: 22.709623 Tokens per Sec: 16416.828638\n", "Epoch Step: 1300 Loss: 45.883862 Tokens per Sec: 16338.132985\n", "Epoch Step: 1400 Loss: 21.321081 Tokens per Sec: 16680.505744\n", "Epoch Step: 1500 Loss: 11.126824 Tokens per Sec: 16636.646687\n", "Epoch Step: 1600 Loss: 32.759712 Tokens per Sec: 16440.968759\n", "Epoch Step: 1700 Loss: 19.354910 Tokens per Sec: 16476.318234\n", "Epoch Step: 1800 Loss: 14.631118 Tokens per Sec: 16490.663260\n", "Epoch Step: 1900 Loss: 2.233373 Tokens per Sec: 16390.177497\n", "Epoch Step: 2000 Loss: 42.503407 Tokens per Sec: 16498.365808\n", "Epoch Step: 2100 Loss: 35.935966 Tokens per Sec: 16257.764127\n", "Epoch Step: 2200 Loss: 37.685387 Tokens per Sec: 16498.916279\n", "\n", "Example #1\n", "Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .\n", "Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .\n", "Pred: when i was 11 , i was a of joy .\n", "\n", "Example #2\n", "Src : mein vater hörte sich auf seinem kleinen , 
grauen radio die der bbc an .\n", "Trg : my father was listening to bbc news on his small , gray radio .\n", "Pred: my father listened to his little , gray radio shack the bbc of the bbc .\n", "\n", "Example #3\n", "Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .\n", "Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .\n", "Pred: he looked very happy , which was pretty unusual since then , they were the .\n", "\n", "Validation perplexity: 11.886973\n" ] } ], "source": [ "model = make_model(len(SRC.vocab), len(TRG.vocab),\n", " emb_size=256, hidden_size=256,\n", " num_layers=1, dropout=0.2)\n", "dev_perplexities = train(model, print_every=100)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAElCAYAAAAcHW5vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xl8XHW9//HXJ1uTpkvapCmlC23pgoAtS1lboEUERQU38IKgoqKALCqoV+8VERVFEEEUQUQRVJYrcO8PZce20IWlLdCytbSldG+atkmbptk/vz/OmXY6TUlOOslJZt7Px+M8MnPmzJlPJu2853y/3/M95u6IiIhEkRN3ASIi0vMoPEREJDKFh4iIRKbwEBGRyBQeIiISmcJDREQiU3hIZGZ2jZl50rLJzGaZ2cdiqsfN7L87ad/XmFlT0v2ScN2Ezni9rtbK3zJ5+WuMdc0ws2fien1pW17cBUiP1QxMCW8PBq4E/mlmp7r70/GVlXZ/BB5Pul8C/AhYCiyMpaL0S/5bJtvY1YVIz6HwkA5z9xcSt81sOrASuBzYp/Aws17uXr+P5aWFu68GVsddx75oz/uZ/LcUaQ81W0lauPtWYAkwOrHOzPqa2a/NbJWZ1ZvZYjO7MPl5iWYhM5toZjPNrBb4ZfiYm9nVZnadmW0ws+1m9pCZDW6rHjM70sweN7Pq8HlPmtkhSY8fFtZ0dcrzHjazNWZWmlxfeHsk8G646b1JzTtTzex/zeylVuo4OdzmuPepdYaZPWNm55rZEjOrM7OXzOzoVrb9nJnNM7MdZlZpZneZ2YCkx0eGr3ehmd1qZhXAhrber7aY2Qoz+6OZfdPMVoav/6yZjU3ZrpeZ/SL8mzeY2TtmdqWZWcp2Q8L9rQv/DsvM7LpWXvdjZrbQzGrNbL6ZnbCvv4ukibtr0RJpAa4BmlLW5QHrgKfC+/nAHIIPrkuAUwhCoRm4KGVfzQTNQN8GpgHHhI85wbf+p4GPA18BKoE5Ka/twH8n3Z8E7Aif9yngDGAWsAnYP2m77wCNSa/3VaAF+HBrvyvQK9yfEzRdHRsu/YCPhusnpNR2H7CojfdzRvjeLQXOCV9jPlANDEra7hthfb
cBpwFfBNYAs4GccJuRYR1rgb+FdX2yrb9l+PdLXSxpuxXh32JBWN85BEH6LtArabsHgAbgB8CpwK/Cen6WtE1puL914b+Nk4EvAX9MeU/WAq8B/wGcDrwMVAElcf8f0OIKDy3Rl1Y+cPYHbg8/JC4Mt/lC+EF3VMpz7ww/NHKS9uXAl1t5ncSHYPKH0xnh+lNTtksOj38DrwJ5Sev6EQTP9UnrcoDpwDvA4UANcFNrv2vS/cSH83kp2+WEH6S3Jq0rBeqAy9t4P2eQEjxAOUEAXhfe70MQJremPHdy+NyPptQ3M8Lf0veyJIf8irCe8qR1E8Ltvhbe/2B4/zspr3FH+NyS8P5PCUL7oDbekzpgRNK6I8L9nx33/wEtrmYr6bBcgg+ARoJvv+cB17j7neHjpxE0Y71iZnmJBXgS2A8Yk7K//7eX13nUd2+vfxSoJ/jGvwczKwJOJPgGTNLr1gJzgZ3NHu7eQhByZeFjy4Hvt/2r7ync153AeWENhPt24N527GKJu+/sgHf3CuB5dv2exxEE4H0p7+eLwLbk3yv0aITym4GjWlkeStnuubCuRI0LCYI3UeOJ4c+/pzzvPqAQSDTDnQLMcve326jrDXdfmXw//DmijedJF1CHuXRUM8GHhgNbgJXu3pT0eDkwniBcWlOadLvF3Sv3sl1F8h13dzPbCAzZy/YDCYLtunBJtSRlf6vMbCZwJnCH71tH/V0E3+Q/SxAYXwUecvct7XhuRSvrNhA0wUHwfkLQRNWa0pT7kfo53H1eOzbbW42Jv8WApHXJ1qc8XkoQem3Z7X1z9/qw66SwHc+VTqbwkA5r4wNnM7CY4IikNcnfOt/vugDlyXfCjtdBBE1frakiaC67ifDoI0Vdyv7OIgiOBcCPzexhd9/bvt+Xu28ws/8FLjSz5cDBBG367VHeyrrB7Po9N4c/zyHoG0mVOqy2M661sLca3wlvb0latyZpm/3Cn4nfoRIYmvbqpEup2Uo6y5PAAcAmd5/XylLTzv18wsx6Jd8n6LhudWipu28n+Hb+wb287uuJbc1sKEF7/J8JOuprgD+njgxKkTgy2du33zsImpB+QdAUNbPtXxGAccknHppZebifxO85O6xv1F5+r/fa+Tr74sSwrkSNE4CxSTU+F/78j5TnfY4gtF8O7z8DTDGzcZ1Yq3QyHXlIZ7kX+DIw3cxuJGivLgYOAo5198+2cz8tBCcf3kLwjfZ6YK67P/U+z/k2MNPM/gn8haC5ZTBB5/I77v7bMCD+QvBt+XJ3rzGz8wk6ai8DfrOXfW8g+AZ9jpktJugIXuzu28LH/03wTXwK8N12/o4QNO08bGY/DPf5Q4JRS7+GYCi0mX0PuNnMhgBPEfTjjCAY1fRbd58T4fV2Y2at9SFtc/c3ku5vAp4ws58QBPh1BB3pd4c1LjKzB4HrzKwAmBfW9nWCjv+qcD+/Bs4HZpjZtQRHqMOAE9z9ax39HaRrKTykU7h7o5mdCvwX8E2CD7kqgg+K+yPs6g8E3/L/RDDi6AnaaApy93nhh+E1BKPAigk+nF8gGL4K8C1gKsEHVk34vOfN7HrgejN7xt3fbGXfLWb2JYIPzqfC2qYRhE6iT+aRcP9/ifB7vkHQZ3ItwXv1GsGIsp3NUe5+m5mtJhhi/OVw9SqCb/Lv0nG5BAMGUr3I7gMTngBeJwjWQQRHQxe7e0PSNl8If4dLCAL7vbDem5J+j81mdjzBe3gtwUCA1UT7dyExM3ddhla6JzNz4Ifu/tO4a4nCzBYBb7v7We3cfgbBcOBTOrWwfWBmK4Bn3P2rcdci3YOOPETSIOyXOYLgZMZDATW/SEZTeIikxxCCM+o3A99199aagUQyhpqtREQkMg3VFRGRyBQeIiISWcb2eZSVlfnIkSPjLkNEpEeZP39+pbsPamu7jA2PkSNHMm9ee6brERGRBDNr12wFarYSEZHIFB4iIhKZwkNERCJTeIiISGQKDxERiUzhISIikWXsUN
2OemNtNU+9sYGhA4o4e9LwuMsREemWdOSR4r6XVnLLs+/w9xdXxl2KiEi3pfBIMXVccJXN11ZXsammvo2tRUSyk8IjxfFjSinIzcEdnn+nMu5yRES6JYVHit4FeRwzeiAA0xdXxFyNiEj3pPBoxdTxQdPVc0s20tyi652IiKRSeLRi6vhgQskttY0sXF0VczUiIt2PwqMVo8uKGT6wCIDpizfGXI2ISPej8GiFmTEtbLqaqX4PEZE9KDz2ItF09drqaio1ZFdEZDcKj704bnQZBXnB2/PcEjVdiYgkU3jsRVFBLseOLgXU7yEikkrh8T6mhU1Xz7+jIbsiIskUHu8jcb5HVW0jr67SkF0RkQSFx/sYVVbMAaW9AZihUVciIjspPNqQGLI7Q/0eIiI7dWl4mNnPzOxdM9tqZhVm9g8zG5H0+BfMbJmZ1ZrZi2Z2ZFfW15qTwn6PRWuqqdhWF3M1IiLdQ1cfedwLHObu/YCRwErgfgAzmwL8HrgYGAA8BDxmZv26uMbdHDe6lF47h+xqll0REeji8HD3t929OrxrQAswPrx/IfCwuz/l7vXADUA98KmurDFVYX4uxx0YDNlVv4eISKDL+zzM7FwzqwZqgCuAa8KHJgLzE9u5uwOvhOvbu+9SMxtnZuOamprSVvPUcUHT1XNLNtLU3JK2/YqI9FRdHh7u/nd37w8MIQiOReFDfYHqlM2rgCjNVpcBi4HFFRXpO0pIDNndWtekIbsiIsQ42srd1wN3Av80s4HANqB/ymYlwNYIu72VoBlsfHl5eVrqBBhZVsyosmJAF4gSEYH4h+rmAcXA/sBrwBGJB8zMgMPC9e3i7pvcfYm7L8nLy0troYmJEjVkV0SkC8PDzHLM7FIzKw/vDwN+B6wA3iY4Cvm0mX3IzAqAK4FC4JGuqvH9JJqu3li7lYqtGrIrItmtq488TgdeN7PtwItALXCKuze5+yzgEoIQqQbOBk539yjNVp3mmFEDKcwP3q4ZmmVXRLJcl4WHu7e4++nuXu7uxe4+1N0/7+7Lkra5x91Hu3uRux/t7vPfb59dqTA/l+MPLANgppquRCTLxd3n0aMk+j2ee0dDdkUkuyk8Ipg6Luj32FbXxIKVGrIrItlL4RHBiNLejB6kIbsiIgqPiDTLroiIwiOyRL/HW+u2sr5aQ3ZFJDspPCI6etRAivJzAZi5RE1XIpKdFB4R9crLZfKYxCy7aroSkeyk8OiAk8J+j1nvVNKoIbsikoUUHh2QmKJ9W30T89/bEnM1IiJdT+HRAcMH9mZMeR9ATVcikp0UHh2UOPrQ1QVFJBspPDpo2kFBv8fb67exrnpHzNWIiHQthUcHTRo5gN4FwZBdNV2JSLZReHRQMGQ3mGVXTVcikm0UHvsgcbb57KWbaGjSkF0RyR4Kj32QuLpgTX0T897bHHM1IiJdR+GxD4aWFDFucDBkVxeIEpFsovDYR4mjD03RLiLZROGxjxL9Hks21LCmSkN2RSQ7KDz20aQDBlK8c8iujj5EJDsoPPZRQV4OU8Ymhuyq30NEsoPCIw0S/R5zllZS39QcczUiIp1P4ZEGiX6P7Q3NzFuhWXZFJPMpPNJgSP8iDtqvL6B+DxHJDgqPNDkpPPqYrn4PEckCCo80mRb2eyytqGH1ltqYqxER6VwKjzQ58oAB9O2VB2jUlYhkPoVHmuTn5miWXRHJGgqPNJp20K5ZdjVkV0QymcIjjU4aF/R77Ghs5qV3NcuuiGQuhUca7de/kA8M6Qeo30NEMpvCI82m7hyyq34PEclcCo80SwzZXb5xO6s2a8iuiGQmhUeaHTGihL6FiSG7OvoQkcyk8EizvNwcTghn2dXZ5iKSqRQenWDnLLvLKqlr1JBdEck8Co9OMHVc0Gle19jCixqyKyIZSOHRCcr7FXLI/okhu+r3EJHMo/DoJIkhuzPV7yEiGUjh0UkS/R7LK7fz3qbtMVcjIpJeXRYeZna9mb1hZlvNbK2Z3WlmA5
Me/5KZtZhZTdJyX1fVl26HDy+hX6Fm2RWRzNSVRx7NwHlAKTARGAbcnbLNcnfvk7Sc04X1pVVebg4njNPZ5iKSmbosPNz9B+7+irs3uvtG4BZgajpfw8xKzWycmY1rampK5647JHG2+dxlmzRkV0QySqTwMLOXzexCM+uThtf+EPBayrrhZrbezFaZ2f1mNiriPi8DFgOLKyri/7Z/UnjkUd/UwgvLN8VcjYhI+kQ98pgOXAusM7O7zOyYjryomX0GuAi4Imn1c8AHgf2Bo4A64GkzK46w61uB8cD48vLyjpSWVoP69uKDQ/sD6vcQkcwSKTzc/bvAcOCLwH7AbDNbZGaXm9mA9uzDzM4C7gTOcPcFSfte7u5L3L3F3dcDFxIEybER6tsU7mNJXl5ehN+s8ySG7Op8DxHJJJH7PNy9yd0fdvePAQcADwPXA2vM7G9mdtTenmtmFwB3AJ9w9+ltvVS4WNQau5NEeKzYVMu7lRqyKyKZocMd5mZ2IHAp8DVgB/BHoJDgaOTqVra/HLgROM3dZ7fy+MfMbJgFBgK/AyqBFzpaY3dw2PABlPTOB3T0ISKZI2qHeS8z+7yZTSfomD4B+B6wv7tf7u6fAT4JXNnK028B+gHTk8/lSHp8KvASUAO8QTCk98PuXrPHnnqQ3BzjhLGJIbvq9xCRzBC1Y2A9QVPSX4FL3f2NVraZA+wxG6C7v2/zk7t/B/hOxHp6hGnjB/Hoa2t5YfkmdjQ0U1SQG3dJIiL7JGqz1beAoeFRRmvBgbtXuXvUIbYZ7cRwyG6DhuyKSIaIGh4n0srRipkVm9mf0lNS5inr04sJw4IhuzrbXEQyQdTw+CJQ1Mr6IuAL+15O5kpMlDhj8UbcPeZqRET2TdTwMII+j10rzAyYAqg3+H0khuyu3FzLcg3ZFZEerl3hEc5220wQHOvNrDmxAE3AQwSd6LIXE4eVMGDnkF3lrIj0bO0dbXUOwVHH3wmmFalOeqwBeNfdX01zbRklN8c4cdwg/u/VtcxYXMFXpmhMgYj0XO0KD3d/AMDM1gGz3T3+KWt7oKnjg/B4cflmahua6F3QPaZQERGJqs1mKzNLnmHwLWCgmZW3tnRemZnhxLGDMIOG5hbmLtOQXRHpudrT57EuKRjWA+taWRLr5X2U9unFhGElgIbsikjP1p52k5PZdcb4yaSMtpJopo0fxGurqnYO2Q0Gq4mI9Cxthoe7z0y6PaNTq8kCU8eXc/Mz77B6yw6WbdzOmPJ0XFdLRKRrRZ0Y8aq9rC80szvSU1JmmzC0P6XFBYBm2RWRnivqSYLfN7PHzWxQYoWZTQAWkObrkWeqnHDILuh8DxHpuaKGx+FAH2ChmZ0WXqPjRYKp1I9Id3GZKnG2+UvvbmZ7vUY9i0jPE/UytCuBk4D/AR4juLjTV9z9S+6uOTfaKXnI7hwN2RWRHqgjVxKcBpxF0FS1Hfi8mZWltaoMN6C4gMOGa8iuiPRcUTvMfwE8DtwJHAscBpQAi8zs1PSXl7mmhbPsztQsuyLSA0U98vg8waVhr3b3Znd/j+BStHcCj6a9ugyW6PdYU7WDpRU9+kq7IpKFoobHxOTzPgDcvcXdrwZOSV9Zme/Q/ftT1icYsqumKxHpaaJ2mG8GMLNSMzvGzHolPfZ8uovLZBqyKyI9WdQ+jz5mdh/BhZ/mAEPD9XeY2Y86ob6Mlri64MsrNlOjIbsi0oNEbbb6OTAaOAbYkbT+n8Cn0lVUtjhxbBk5Bo3NzuyllXGXIyLSblHD4wzgCnd/md0nSHyLIFQkgpLeBRw+YgCgpisR6VmihscgYEMr64sIrjQoEU0bn+j3qNCQXRHpMaKGx0JaH1X1eeDlfS8n+yT6PdZV17Fkg4bsikjPEPU6qNcA/zCzYUAucI6ZfYDgjPMPp7m2rHDwkH6U9elFZU090xdXMH6/vnGXJCLSpqhDdZ8APkkwv1UL8F/AAcBH3f259JeX+XJybOcJg5qiXUR6ishzW7
n7M+4+1d37uHtvdz/B3f/dGcVli0R4zFuxhW11jTFXIyLSto5MjChpdsKYQeTmGE0tGrIrIj1Dm+FhZjvMrLY9S1cUnIn6987niBHBLLsasisiPUF7OswvZvdzOqQTTB1fzssrtjAjnGXXTCOfRaT7ajM83P3uLqgj600dP4gbnlzM+q11vL1+Gx8Y0i/ukkRE9qpDfR5mdryZfTVcjk93Udno4CH9KO8bzDOpWXZFpLuLOjHicDObC8wCfhkus8zsBTMb3hkFZguz5CG76vcQke4t6pHHnQRNXYe4+0B3HwgcQjA1yZ3pLi7bJM42n//eFrZqyK6IdGNRw+Mk4GJ3fyuxIrx9KXBiOgvLRlPGlpGbYzS3OLPe0ZBdEem+oobHOqC1C080A2qo30f9CvM58oDELLt6O0Wk+4oaHlcDN5vZiMSK8PaNwA/TWVi2Su730Cy7ItJdRQ2P/wImAcvNbLWZrQaWA0cD3zezNxNLugvNFtPCfo+KbfW8uW5rzNWIiLQu6qy693f0hczseuDjwHCgBvgX8L3EddHDbb4A/AgYAiwCLnH3+R19zZ7ooP36sl+/QtZvrWPG4o0csn//uEsSEdlDu8PDzHKB6cBCd6/qwGs1A+cBrwMlwD3A3QRXJ8TMpgC/J7ic7UzgCuAxMxvr7lnzFTwxZPf+l1cxY3EF35g2Ju6SRET20O5mK3dvBp4GBnTkhdz9B+7+irs3uvtG4BZgatImFwIPu/tT7l4P3ADUk4XXRk/0eyxYWUV1rYbsikj3E7XP4y1gWJpe+0PAa0n3JwI7m6g86C1+JVzfLmZWambjzGxcU1Nrg8J6hsljysgLh+w+v1QnDIpI9xM1PK4CbjCzo8NmrA4xs88AFxE0TSX0BapTNq0CokzydBmwGFhcUdFzh7r2Lcxn0sjgAO9305exo6E55opERHYXNTweJRhtNReo68iU7GZ2FsHZ6Ge4+4Kkh7YBqb3DJUCU/o5bgfHA+PLy8ghP634uO3ksZvDWuq381yOLNGxXRLqVqKOtLtqXFzOzC4BfAZ9w99kpD78GHJG0rQGHAQ+3d//uvgnYBDBp0qR9KTV2k8eUcdWp47nhycU8/MoaJg4v4YvHj4y7LBERIGJ4uPtfOvpCZnY5wTDc09z95VY2uRN4wsz+AjwPXA4UAo909DV7uotPOpDXVlXx1Jsb+Mk/3+SQ/fsxaeTAuMsSEYk+JbuZlZvZlWb2ezMrC9dNNrNRbTz1FoL+i+lmVpNYEg+6+yzgEoIQqQbOBk7PpmG6qXJyjF+dPZHRZcU0tTiX/G0BFVvr4i5LRCTylOyHA28DFwBfYVdn9oeBn77fc93d3D3f3fskLynb3OPuo929yN2PzrYTBFvTtzCfO84/kt4FuVRsq+cbf19AY3NL3GWJSJaLeuTxK+AP7n4owTkYCU8Ck9NWlexm7OC+3PDZYMTyyyu28LN/vdXGM0REOlfU8DgC+GMr69cCg/e9HNmbj00YwtdOHA3A3XNW8Mgrq2OuSESyWdTwaAKKW1l/ILC5lfWSRt89bTzHjS4F4PsPL+LNtVnbHSQiMYsaHk8A3wmH0QK4mQ0AriU4B0Q6UV5uDreeezhD+hdS19jCRX+dr+lLRCQWHTnD/EhgGcEw2oeAdwlO5vtBekuT1pT16cXvzzuSgtwcVm6u5ZsPvEJLi04gFJGuFTU8tgBHERxp3AG8AFwJTEqeWl0612HDS/jxmYcAMH3xRm559p2YKxKRbNOukwTNbCDwF+AjBIHzAvB5d1/ReaXJ+znn6BG8urKKB+at4pZn32HCsP586AMasyAiXaO9Rx4/A44hOEP8OwQjq27vrKKkfX585iFMGBZMB/bNB15lReX2mCsSkWzR3vD4KPAVd7/O3W8iuIDTKWYWdW4sSaPC/Fx+f96RDCwuYFtdExf9dT61DT13KnoR6TnaGx5D2f1aG28CDcD+nVGUtN/QkiJuPedwcgzeXr+N7z+sGXhFpPO1Nz
xygdQxoc3heonZ5DFlfPcjBwHwf6+u5c+zV8RbkIhkvCjNTveZWUPS/ULgz8nX8XD309NWmUTy9RNH89qqKh5/fT3XPfYWhw7tz9GjNAOviHSO9h55/AVYBWxIWv5KcI5H8jqJiZlxw1kTOXDQrhl4N2gGXhHpJJap7eOTJk3yefPmxV1Gl1taUcMnfzebmvomjhhRwv1fO46CvMgz74tIljKz+e7e5tX09KmSYcaU9+HGsyYAsGBlFT/915sxVyQimUjhkYE+cugQLjrpQADumfseD83XDLwikl4Kjwx11anjmDKmDIAfPLKI19dUx1yRiGQShUeGysvN4TfnHM7QkiLqm1q4+G/zqaptaPuJIiLtoPDIYAOLC/j9eUdQkJfDqs07uPz+V2nWDLwikgYKjww3YVgJPz3zUACeW7KRm59ZEnNFIpIJFB5Z4OyjhnPO0SMAuPXfS3n6TZ2SIyL7RuGRJa4542AmDi8B4NsPvMq7moFXRPaBwiNL9MrL5fbzjqC0uIBt9U18/d55bK/XDLwi0jEKjywypH8Rt54bzMC7ZEMN33tooWbgFZEOUXhkmeMPLOP7H/0AAP9cuI67Zr0bc0Ui0hMpPLLQV08YxccmDAHg54+/zdxlm2KuSER6GoVHFjIzfvmZCYwt70Nzi3PZfQtYV70j7rJEpAdReGSp4l553H7+kfTtlUdlTQMX/3UB9U3NcZclIj2EwiOLHTioD786eyIAr66q4tpHNQOviLSPwiPLnXrIflw6bQwAf3txJQ/OWxVzRSLSEyg8hG99eBwnjhsEwH//7+ssWq0ZeEXk/Sk8hNwc45bPHcawAUU0NLVw0V/ns2W7ZuAVkb1TeAgAA4oLuP28I+mVl8Oaqh1cfv8rmoFXRPZK4SE7HTq0Pz/71AcBeP6dSn711OKYKxKR7krhIbv57JHDOO/YYAbe22Ys48k31sdckYh0RwoP2cPVHz+Ew0cEM/Be+eBrLNtYE3NFItLdKDxkDwV5Ofz+80dS1qeAmvomvn7vfCq21cVdloh0IwoPadV+/Qv57blHkJtjLK2o4eQbZ/KH55bR0NQSd2ki0g0oPGSvjh1dyk1nT6RvYR419U1c99jbfOTm55j+dkXcpYlIzLo0PMzsP8zseTPbamZNKY9NNTM3s5qkZU5X1id7OvOwocy4airnHD0cM1heuZ0L7n6ZC/78EsvVFyKStbr6yGMLcBvwzb083uzufZKW47uwNtmL0j69+PmnJ/DopVM4auQAAKYv3shpNz/HdY+9xba6xpgrFJGu1qXh4e5Puvt9wPKufF1Jj0OH9ufBrx/Hb845nCH9C2lsdv7w3HKm3TiTB+etokUnFYpkje7W55FrZqvMbL2Z/cvMJkZ5spmVmtk4MxvX1KTrc3cGM+OMifvz7JUncfnJYyjIy6Gypp7v/mMhn7ptNgtWbom7RBHpAt0pPN4GDgNGAQcBC4F/m9n+EfZxGbAYWFxRoU7dztS7II9vnzqeZ799Eh89dD8AXltdzadvm8O3H3iVDVs1tFckk5l71zc1mNlU4Bl3z2tju3eAX7j7Xe3cbylQCjBx4sTFr7766r6WKu00Z2klP370TRZv2AZA74JcLj15DF+ZMopeebkxVyci7WVm8919Ulvbdacjj9a0ANbejd19k7svcfcleXnvm0uSZsePKeNfl0/h2jMPoX9RPrUNzfzyicWc+uvnePrNDcTxJUVEOk9XD9XNNbNCoCC8XxguZmYnm9kYM8sxsz5mdg0wGHiyK2uUjsvLzeELx41kxlVTOf/YA8gxeG9TLRfeM48v/OklllZsi7tEEUmTrj7yOB/YQRAIueHtHcABwETgWWAbwWisY4EPu7subdfDDCgu4CefPJR/XX4Cx44eCASz9J528/P8+NE3qN6hob0iPV0sfR5dYdKkST5v3ry4y8h67s7jr6/nZ/96izVVOwAYWFzAVaeO53NHDSc3p92tkiLSBTKlz0P5PCIUAAAMbE
lEQVR6ODPj9A8O4dkrT+Jbp4yjMD+Hzdsb+MEjizjjt7N4ecXmuEsUkQ5QeEiXKMzP5YpTxvLslVP5+IQhALyxditn3T6Xy+57hbXhUYmI9AwKD+lSQ0uK+O25R/Dg14/j4CH9AHj0tbV86Fcz+c2z71DX2BxzhSLSHgoPicXRowby6GVTuO5TH2RgcQE7Gpu56eklnHLTTB5ftE5De0W6OYWHxCY3xzj3mBFMv3IqF0weSW6OsXrLDi7+2wLOvfNF3l6/Ne4SRWQvFB4Su/698/nRJw7hiStO4ISxZQDMXb6J0295nqv/73WqahtirlBEUik8pNsYO7gv93z5aP5w/pGMGNibFod75r7H1BtncO/cFTQ16yqGIt2FzvOQbqmusZm7Zr3L76YvpbYh6EQfVVbMtPHlTB5TyjGjS+nTS1PQiKRbe8/zUHhIt7a+uo7rn3ibR15Zs9v6vBxj4vASJh9YyuQxZRw+YgAFeTqQFtlXCg+FR0ZZuLqKxxatZ/bSSl5fW03qP9ui/FyOGjWQKWNKOf7AMg4e0o8cnb0uEpnCQ+GRsapqG3hh+SZmLa1kztJNLK/cvsc2A3rnc/yBZRw/ppTJB5ZxQGlvzBQmIm1ReCg8ssbaqh3MXlrJnGWbmL20kopt9XtsM7SkiMljgiau4w4spbxvYQyVinR/Cg+FR1Zyd5ZW1DB7aSWzl23ihWWb2Fa/5yWJxw/uu/Oo5JjRA+lbmB9DtSLdj8JD4SFAU3MLi9ZU7zwqmffeFhqadh/ym5tjTBzWn8ljysLO9xJd/VCylsJD4SGtqGtsZt6KLcxeVsnspZUsWrNn53thfg5HjRzI5DFlTBmjznfJLgoPhYe0Q3VtI3OXb2LOskpmLa1k+cY9O99Leudz3OjSnUcmI9X5LhlM4aHwkA5YV72DOUs3hX0mlWzYumfne7/CPIYO6M3QkiKGDQiWoSVFDA1/DiwuULhIj6XwUHjIPnJ3lm3cHgTJ0krmLt/Etro9O99TFeXn7gySoUnhEvzsTXnfXmoGk25L4aHwkDRrbnHeWFvNso01rN68gzVVwbJ6S/AztSN+bwpycxhSUhiES0kRwwb03hk2wwYUsV//QvJzdba8xKO94aHJgUTaKTfHmDCshAnDSvZ4rKXFqdxez5otu8JkzZZEuNSyZssOtodzdDU0t/Deplre21Tb6uvkGOzXrzDl6KX3bk1jhfkaDSbxUniIpEFOjlHet5DyvoUcPmLAHo+7O9U7Glm9R7jU7rxfVdsIQIvD2uo61lbX8TJbWn29sj4FDCwuoG9hPn0L85J+5tEv6XbfXrs/3q8onz698shVs5nsI4WHSBcwM0p6F1DSu4BDh/ZvdZua+ibWJh2prA4DJhEuG5POnK+saaCypuPXOenTK29XwEQNoMJ8+hQqgLKdwkOkm+jTK49xg/sybnDfVh+va2xmXXXdziOWqtpGttU1sa0u+Lk16fa2+sRjTTS37NmvWVPfRE19E+uqO15vcUEufQvz6Ve0K2z6Fe0KmOTbicf6FebTL7zdKy9Ho9J6MIWHSA9RmJ/LqLJiRpUVt/s57s6OxuadIbM1DJSdIVO3K2S2trIucbuplQDa3tDM9oZmOnq14Pxcaz1kwkDqGwZN33CbXbfDo6BeeZ02aq2lxWlxp8UJfwa3m1scb+22Oy0tjofbO+x8DIKf7uA4LS3BT09e57tvH+xn1zrHIbEuafvE66RuD84JYwdR3InXvFF4iGQwM6N3QR69C/IY3K9jk0G6O3WNLUnhkxo44ZHPjsad67bu2BVGiZ+pGpudTdsb2LS9Y81vZsHRWr/CfIp75eLhh3jiA7y5ZffbiQ/X5A/65qRgSA6MTDDjqqkKDxGJj5lRVJBLUUEu5f06to/mFqemPjz62bHrKCgInF2h1FrobN3RyNa6Rhqbd/9Ud2dniGWrHAv+PjkGhoGx83ZnZ6DCQ0Q6XW6O0b8on/5F+bDnYLQ2uTv1TS1hkOwZLL
X1zZhBjhm5ObbzQ3W322bk5ATb7FqCkXKt3c412/nBnJuz++0cMyzpdk742skf5GbsrMnCdTlG+AFvGLt/8FsOe64L95F47s7HukFfkcJDRLo9M6MwP5fC/I4f/Uh66TRWERGJTOEhIiKRKTxERCQyhYeIiESm8BARkcgUHiIiEpnCQ0REIsvYi0GZ2UbgvQ4+PRcYDGwAmtNWVM+k92J3ej92p/djl0x5Lw5w90FtbZSx4bEvzGwcsBgY7+5L4q4nTnovdqf3Y3d6P3bJtvdCzVYiIhKZwkNERCJTeLRuE/Dj8Ge203uxO70fu9P7sUtWvRfq8xARkch05CEiIpEpPEREJDKFh4iIRKbwEBGRyBQeIiISmcJDREQiU3iIiEhkCg8REYlM4ZHEzHLN7AYz22hm28zsITMri7uuOJjZ9Wb2hpltNbO1ZnanmQ2Mu664mVmOmc0xMzezYXHXEyczO8XMXjCzGjOrNLPb4q4pLma2n5k9EH52bDGzf5vZxLjr6kwKj939J3AmcAyQ+GC4N75yYtUMnAeUAhMJ3o+74yyom/gWUBt3EXEzs6nAP4AbCf6NDAP+GGdNMbsNGAiMI5iWfR7wTzOzWKvqRJqeJImZvQdc6+53hfcPBJYCI929o9cGyQhm9hHgQXfvF3ctcQmn3H4c+AzwCjDc3VfHW1U8zGwuMNPd/zPuWroDM1sI/Nbd/xDeHw+8DQxy98pYi+skOvIImVkJMAKYn1jn7suArQTfvLPdh4DX4i4iLmaWA/wJuAqoirmcWJlZMXA0kGdmC8ImqxlmNinu2mJ0A/AZMxtkZoXA14BZmRocoPBI1jf8WZ2yvgrI2m/bAGb2GeAi4Iq4a4nRFcB6d38k7kK6gQEEnx3nAF8C9geeAh4Lv4Rlo9kEVxKsAGqATwMXxlpRJ1N47LIt/Nk/ZX0JwdFHVjKzs4A7gTPcfUHc9cTBzMYAVwKXxl1LN5H4v/Jnd1/o7g3Az4F84Pj4yopHeFT6DLCE4POjN/Az4HkzGxxnbZ1J4RFy9ypgJXBEYp2ZjSY46lgYV11xMrMLgDuAT7j79LjridEUYBDwuplVAokQXWhml8RXVjzcvRpYAaR2mHor67LBQGAUcKu7b3X3Bnf/I8Hn63HxltZ5FB67+wPwPTMbZWb9gOuBJ919RbxldT0zu5xgJM1p7j477npi9iBwIHBYuJwerj8VuCeuomJ2G3CBmR1sZnnAd4B6YE68ZXW9sF9jCXCJmRWbWZ6ZfZmgKTxjv3jmxV1AN/MLgvbcl4FewNMEw1Wz0S1AEzA9ebShu/eJraKYuHstScNzww9LCPpAauKpKnY3Enw4/hsoJBh99tHwqCQbfZKg0/w9gua7pcBZ7r481qo6kYbqiohIZGq2EhGRyBQeIiISmcJDREQiU3iIiEhkCg8REYlM4SEiIpEpPER6CDP7kpnVxV2HCCg8RNrFzO4OLwCVumTllOwiOsNcpP2mA+emrGuOoxCRuOnIQ6T9Gtx9fcqyEcDMVpjZtWb2p/DSvRvN7CfJV5Izs/5mdld4/Ys6M5ttZrtNnGdmY8PLH28xs1oze8XMpqVsc4KZvRo+/pKZHd41v77ILgoPkfT5JsHMzJMILhp1JXBx0uN/Bk4CPgccCSwDnkxM221mQwiuC9GbYPLFCcBPU14jP1z3jXAfVcD94bTgIl1Gc1uJtIOZ3U0wSWZqh/Uj7n6+ma0A3nX3aUnP+SXwaXcfY2ZjCWZe/bC7PxM+nphA7x53/6GZ/RS4ABjj7jtaqeFLBAE00d0XhusmA7PQpZKli6nPQ6T95gBfTlmXPKvu3JTHZgNXhZcl/QDBtS5mJR5098bwWuAHh6uOILh06R7BkaQJeD3p/trw52CCGV1FuoTCQ6T9at19acw1NLt7S9L9RNOBmq2kS+kfnEj6HJty/3iCpqw64E3ACK5KCOxstjoOeCNctQCYbGZFXVCryD5ReIi0X4GZ7Ze6JD0+ycx+aGbjzO
x8gmue/xogPGJ5GLjdzE42s4OBuwguPva78Pm3EVxY6WEzO87MRpvZmamjrUS6AzVbibTfNGBd6srwCALgZmAMMB9oILga421Jm34ZuAn4H6A43O40d98A4O5rzWwK8EvgSSAXeJtg1JZIt6LRViJpEI62ut3dfxF3LSJdQc1WIiISmcJDREQiU7OViIhEpiMPERGJTOEhIiKRKTxERCQyhYeIiESm8BARkcgUHiIiEtn/B1CUqx3mK+1vAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_perplexity(dev_perplexities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prediction and Evaluation\n", "\n", "Once trained we can use the model to produce a set of translations. \n", "\n", "If we translate the whole validation set, we can use [SacreBLEU](https://github.com/mjpost/sacreBLEU) to get a [BLEU score](https://en.wikipedia.org/wiki/BLEU), which is the most common way to evaluate translations.\n", "\n", "#### Important sidenote\n", "Typically you would use SacreBLEU from the **command line** using the output file and original (possibly tokenized) development reference file. This will give you a nice version string that shows how the BLEU score was calculated; for example, if it was lowercased, if it was tokenized (and how), and what smoothing was used. If you want to learn more about how BLEU scores are (and should be) reported, check out [this paper](https://arxiv.org/abs/1804.08771).\n", "\n", "However, right now our pre-processed data is only in memory, so we'll calculate the BLEU score right from this notebook for demonstration purposes.\n", "\n", "We'll first test the raw BLEU function:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "import sacrebleu" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100.00000000000004\n" ] } ], "source": [ "# this should result in a perfect BLEU of 100%\n", "hypotheses = [\"this is a test\"]\n", "references = [\"this is a test\"]\n", "bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score\n", "print(bleu)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "22.360679774997894\n" ] } ], "source": [ "# here the BLEU score will be lower, because some n-grams won't 
match\n", "hypotheses = [\"this is a test\"]\n", "references = [\"this is a fest\"]\n", "bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score\n", "print(bleu)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we did some filtering for speed, our validation set contains 690 sentences.\n", "The references are the tokenized versions, but they should not contain out-of-vocabulary UNKs that our network might have seen. So we'll take the references straight out of the `valid_data` object:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "690" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(valid_data)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "690\n", "when i was 11 , i remember waking up one morning to the sound of joy in my house .\n" ] } ], "source": [ "references = [\" \".join(example.trg) for example in valid_data]\n", "print(len(references))\n", "print(references[0])" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"i 'm always the one taking the picture .\"" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "references[-2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Now we translate the validation set!**\n", "\n", "This might take a little bit of time.\n", "\n", "Note that `greedy_decode` will cut-off the sentence when it encounters the end-of-sequence symbol, if we provide it the index of that symbol." 
] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "hypotheses = []\n", "alphas = [] # save the last attention scores\n", "for batch in valid_iter:\n", " batch = rebatch(PAD_INDEX, batch)\n", " pred, attention = greedy_decode(\n", " model, batch.src, batch.src_mask, batch.src_lengths, max_len=25,\n", " sos_index=TRG.vocab.stoi[SOS_TOKEN],\n", " eos_index=TRG.vocab.stoi[EOS_TOKEN])\n", " hypotheses.append(pred)\n", " alphas.append(attention)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 70, 11, 24, 1460, 5, 11, 24, 9, 0, 10, 0,\n", " 0, 1806, 4])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# we will still need to convert the indices to actual words!\n", "hypotheses[0]" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['when',\n", " 'i',\n", " 'was',\n", " '11',\n", " ',',\n", " 'i',\n", " 'was',\n", " 'a',\n", " '',\n", " 'of',\n", " '',\n", " '',\n", " 'joy',\n", " '.']" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hypotheses = [lookup_words(x, TRG.vocab) for x in hypotheses]\n", "hypotheses[0]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "690\n", "when i was 11 , i was a of joy .\n" ] } ], "source": [ "# finally, the SacreBLEU raw scorer requires string input, so we convert the lists to strings\n", "hypotheses = [\" \".join(x) for x in hypotheses]\n", "print(len(hypotheses))\n", "print(hypotheses[0])" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "23.4681520210298\n" ] } ], "source": [ "# now we can compute the BLEU score!\n", "bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score\n", "print(bleu)" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "## Attention Visualization\n", "\n", "We can also visualize the attention scores of the decoder." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "def plot_heatmap(src, trg, scores):\n", "\n", " fig, ax = plt.subplots()\n", " heatmap = ax.pcolor(scores, cmap='viridis')\n", "\n", " ax.set_xticklabels(trg, minor=False, rotation='vertical')\n", " ax.set_yticklabels(src, minor=False)\n", "\n", " # put the major ticks at the middle of each cell\n", " # and the x-ticks on top\n", " ax.xaxis.tick_top()\n", " ax.set_xticks(np.arange(scores.shape[1]) + 0.5, minor=False)\n", " ax.set_yticks(np.arange(scores.shape[0]) + 0.5, minor=False)\n", " ax.invert_yaxis()\n", "\n", " plt.colorbar(heatmap)\n", " plt.show()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "src ['\"', 'jetzt', 'kannst', 'du', 'auf', 'eine', 'richtige', 'schule', 'gehen', ',', '\"', 'sagte', 'er', '.', '']\n", "ref ['\"', 'you', 'can', 'go', 'to', 'a', 'real', 'school', 'now', ',', '\"', 'he', 'said', '.', '']\n", "pred ['\"', 'now', 'you', 'can', 'go', 'to', 'a', 'right', 'school', ',', '\"', 'he', 'said', '.', '']\n" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAZUAAAEhCAYAAAC3AD1YAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XmYXFWd//H3J52EAFmAhM1AIAgBRcUdHVwQUXDHUQcVUAZFGR3UnzCKimwiojhuzKgDqFHEFVHABZElCq64sChKhBACkS1AEgJIku7v749zmtwUvdzqul23qvJ5Pc99quou556qdOrU2b5HEYGZmVkVJtSdATMz6x0uVMzMrDIuVMzMrDIuVMzMrDIuVMzMrDIuVMzMrDIuVMzMrDIuVMzMrDIuVMzMrDIT686AWa+SdEaZ8yLibeOdF7N2caFiNn4m1Z0Bs3aTY3+ZmVlVXFMxaxNJAp4JbA8sAa4K/6qzHuOailkbSNoeuBB4HHAXsBXwV+CVEbGkzryZVcmjv8za47PAVcAWEbE9MBP4LfC5WnNlVjHXVMzaQNJdwA4R8VBh3ybA4ojYqr6cmVXLNRWz9vgnMKNh3wxgdQ15MRs3LlTM2uP7wPcl7SNpJ0n7AOcC36s5X2aVcvOXWRtI2hj4DPAmYCPgYeBrwHuKTWJm3c6Filkb5WHFWwJ3ezhx55P0fGCriPhu3XnpFm7+MhuBpA8Ns/8DY0hrcJ7K84Bn5NfW2T4K/K8kf1eW5JqK2QgkrYyI6UPsvzcitmgiHc9T6TKS5gB/BP4OnBgRF9Wcpa7g0tdsCJIeI+kxwARJ2w6+ztvzSX0izfA8le7zBtIAi3PycyvBNRWzIUgaAIb6zyGgH/hwRJzaRHqep9JlJP0JOAq4Pm/bRkSzPyY2OI79VQNJT4+I39edDxvRXFIBcjWwR2H/AKmT/Z9Npjc4T6U40muDnaci6bgy50XESeOdl6FIejywDXB5RISkq4GX4yHgo3KhUo9LJPUDlwOXAJdExI0158kKIuKW/HSzipIcnKfyIWAxsCPwETbcL6nnFp6LNHjhDuAWYAfSF/rPa8jXoDcC5xZG6H0779tQ/71Kc/NXDST1Ac8CXgjsSxoRdAfws4g4vM682aNJejbwdGBacX9EnNJEGp6nMgxJnwKWAR8b/BLPo+tmRcRRNeXpJuCQiPhVfj0LuBl4TETcX0eeuoULlZpJ2gN4NfAeYOOI2KjkdX9n6DZ/ImLeGPIxjUd/af6j2XR6jaQTgA+SmsEeKByKiNhnDOl5nkoDScuAbSJibWHfROCOiJhVQ362Bo6LiHc27P8wcGFEXN3uPHUTFyo1kHQoqYbyQtIvtEvztqDsryBJb27YNRt4K3BWk7+gnw18FXhscTfpS7OvbDq9StIdwKsi4rd156VXSboVeEXxy1rSU0hf4NvVlzMbCxcqNcgji/4OfBj4XkT0V5TuE4FPRsR+TVxzLalf5yzW/yVe7FfYYEm6kzTqZ6DFdOYBpzN0M9rkVtLudrmp693A/7Guv+ltwOnN/EAaT5L2Bvoj4oq689LpXKjUIAcT3DdvuwBXAD8j9an8rYV0JwDLh5qsN8I19wPT3RQzNEkfJQ37PbPFdH4F3AbM59GFd50d0h1B0iHAIcB2wFLg7Ij4Wo35uRg4OSJ+IendwMdIQ8mPi4hP15WvbuBCpWaSNgfeCRwNTCvb5JQn5hVtChxGakZ4QhP3vwT4z1YKs14j6Wes66+aQBqZdANwe/G8iHhxE2muBGZGxJqq8mnjJ88rmh0RayT9GTgCWA78ICJ2rjd3nc1DimsgaRvW1VReCMwCfkVqhirrNtbvqBep6aCxr2U0lwIXSPoiaQTaIyLiG02m1SuubHhdRZPH30ihWZZWkFbXk7RNRNyRnzf+QHpEjYNFJucCZWtSQMkrASR5ouooXFOpgaS1pNFEl5IKkiuanUwnaYeGXfdHxL1jyMvNwxy
KiNip2fRsHUn/Uni5B6nA/wSPLrx/1c58dYJiTLVhohfUOlhE0h9JoXUeC8yLiNdLmgn81REQRuZCpQaSthhLAdDpJE0FXgZsD9wK/CgiVtWbq9ZIet4whx4Gbhn8tT3MtWU69zfIUXaSto+IW/Pzxh9Ij6hrsIikfUmjIh8mjf67TtKbgAMj4mV15KlbuFCpSatfwHm+w3tJw4gH0zgL+HSrI5XGQtLupMEG/awbwdMHvDgi/tzu/FRF0hpSv0oxTH3xP83PgYMiYr3+lg2JpJ2A15P6IN4paVdgYkT8peasNU3S1hFx5zDHJgG4X2xkLlRqUMUXcA73cRjwceAmUjX9fcD8iDi5ibxsDBxL6tvZksKXZzPNX7lz+0rgpBwrSTndvSPihWXT6TR5VNLLSBMgB0OInAxcDPwa+BSwKiIOHCWdzYHVEfFAYd+mwKSIWD5O2R93kl4EnEcKObR3REyXtBdwbES8pIl0dgP25tF/g22N/SVpBfBnclidiLipnffvCRHhrc0bqUA5nnWFukhzVi5tIo0bgd0a9u0KLGoyL18k/Sd6J7AqP/6V9KXQTDr3kL4gi/smAfeM4fOZChxIGhF3IDC1xn+rmxvvT5pncnN+vjVp5vdo6VwBPLNh357Az+v8W6zg8/kDsH9+fl9+3Bi4s4k03kBqZrqq8LiaFBOv3e9nI1LgyLOAO/P/jY8AT6v7s+6WzTWVGki6hxSWYk1h3yTSl9PMkmncC2w9RBp3RnOLRy0FnhsRiyQtj4jNcoTW06OJGkaOlbR/RPy9sG8X4OKImNtEOh3VjJb/reZFxD2FfbOAhRGxRdm5Qfnfa1YUmiZzDLi7m/n36jSDfzP5+SMLlzWziFkesntSRHxH0n0Rsbmkw0g/mt43frkfNV8TgL2AA/I2ETgf+AEp+kXbm5m7gRfpqsdy0pdl0Y7AyibS+BPwXw37jiaNKmvG1IhYlJ+vljQ5Iq4HntFkOl8FfiTpMEn75C+FC0mT/ZrxGdLM6jkR8VxgDvAF0kicOvyIFF34+ZLm5pnV5wI/zMf3JDWLjeafwCYN+zYFur19/lZJ682LyvHsFjeRxhygcQ34r5EmQ9YmIgYi4oqIOCoiHgu8glQj/29Sc6gNpe6q0oa4AccBC0l9Ivvkx78BxzeRxpNYFyr8F/nxDuBJTeblauBx+fkvSJO8Xg/c2mQ6fcAHSJMEH8yPHyB12DaTTmXNaBX9W00FvkwqFAby45fJTWKkdVceVyKd75BWeZyQX4tUgJ5X999ji5/P4flv+WBgBfAaUpPRIU2ksQTYLD//G2nJ5a2AlTW9py2Bo4fYf2jh/8qkduapmzY3f9UgN3u8j/RHOjhyaz5wWhQitY6Sxvak/8SNI8iaqe0g6UBS881Pc6fr94HJwH9ExJeaSGdO8SXrRkg9HMOMphkmnUqa0aqWm0JmActiDM0eedjsZcAUYBGwE6nf4AURsbjCrLadpMOBd5EK2MXAZyLirCau/zJprtZXJJ0EvIVUg/tdRPzbOGR5tPxMIE0u3j8irs37ppEmru4UEcvanadu4kKlRvkPdTqFIapRcgZxHup6MalD8YIYY1DKPFLn9shNYLlfZh4wI5qYlDfMBLZBDwPfIq0dMmKhp7Qi4MHAqazrU3kf8K2IOKGJ/Ig03HqoUW1Nh6yvQh5p93LSe7qdFEy0J9ZSUUVLJ+R/tzfktL5W1+cj6dPAQxHxwfz6IOBN0USw1g1W3VWlDXEDnk1qHuovbAOkKKhl03gscArpF9XtpIB3jx1DXv5M+vXVmPZ1TaZzGGlY6T75+n1IEQPeDuwH/A44o0Q6Exi6Ga2vyfycQmoOPI0UwPG0/PpTJa79U+H530nNO4/amszPyeTRX8CL8ntbRRqAUPvfZM1/yx332ZAWzrup8Pp84M11f97dsLmmUoMqw83nqvrLSF/qLyHNFTmT9Ct41Ka0YriMhv33R8S0oa4ZJp2/Ac+JQtOApC1JzRq75Ql
yv4hR1seQ9NeIeNwQ+6+LiCc2kZ/FpJnQ1xRGFD0LeF9E/Oso174xIr6htFDUUaRmxkf9Yo6IrzaRn1uB3SNipaRfkDqmV5KCeTY7KKJjVPG33KmfjdJCeG8k/YhYAmwXXvVxVC5USpB0OblpJypoOlHF4eYlTQH+jfQFOJe08Nck4C0RcfEo195EatdfUti3A2n+xI5N5GE5aUZ1cXLfVOC2WDfkdNSCarhzBguGJvJTjC21jDT8ur+VdFohaUVEzMgTHv9Bili8ttn8VJCPjvtb7pTPZoh8nUQaqHEN6QfKiD9GLHGU4nLmV5zeb0kTFVsKNy/paaR+g9eTZtV/HjgnIlZJOhj4CmlFyJF8Hzhb0ttJTT275HTOazI7VwDzJR1NGjQwh9QvckXO6xNpCKTY8F4Gh2hOLDwftHNOsxlLJc3JheUi4CW5cGl2CO/vJT0pcodtC+7Js8afAPw2f2lu3GKawCNBQS8jrfUxWhTk+VXcs6CKv+Vx+2xa9E3SnKnHk2piVoILlXKqrs61HG5e0tWkEUTfBl4UEb9vSOfrkj5VIqnjSUNkr2fd+zyXNMO/GW8FvkGagT6YzgJS8wGkvpLDR7j+RflxUuE5pPb5O0jNe834AvA0UrPFp0kT1kR6v824HLhQ0hmkYduPjPwq+2+VfYY0+xzgoPz4PNLn3qqvkkYAXgrsNsq5Lf8tS3pj4WUVSyeM52cDgKQfRZOBICPir/mHyLOAV1WVl17n5q8ScpMBpIiyVTQZtBxuXtIRpFpJJW28uf9jR9Iqh3e3kM5sUu1oaYlfzUNdf3pEHDnW+4+Q7nakuSVN/aKu4t+qkNYuwNqIuDm/nkdat6NtkQKq+Fse4TMpaurzGe/PRtIXIuI/xnDda0nNunVNvu06LlTMzKwyDtNiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaEyRpJmSjpBUqnlf8crDafTXel0Ul6cTvvS2aDUHSa5WzfSmiNBWr+8tjScTnel00l5cTrtS2dD2lxTMTOzyjhMyzD6pm0aE2cNH3U7BgYYWLmKCdOnoglDl81T7hx51dmIAVavfYDJEzclLYsyjDUjL4sSDLB64CEmT9gYjfQ7QRr+2GB+Bh5k8oRNRsxPrBk50G8wwGoeZjIbjZyfUfRiOp2UF6dTXTr3c9+yiNhyrPfZ7wWbxj33llu89Q/XPvzTiNh/rPcab45SPIyJszZn2xNbi2u426ceGP2kEnT7mOM7rm/SpEqSWXt76SXnh9f8Mu9mHeuSOLepxfUaLbu3n9/+dMT16x4xadubZrVyr/HmQsXMrHZBf4/80HKhYmZWswAGKl+2qR49V6hIWsC61e0OjYi9a8uMmVlJA7imYmZmFQiCNW7+6j15gtNMgElztq05N2a2oQig381fnamhuWt+k5cfSV7DfGDlqopyZGY2ul7pU/Hkx/WdDuwK7Dph+tS682JmG4gA+iNKbZ2u52oqrYiIe4B7ADaaW27MuJlZFXqjR8WFiplZ7YJwn4qZmVUjAtb0RpniQsXMrH6in5Fj83ULFypmZjULYMA1ld42ZekaHnfs7S2lsXbOmIOWrqdv1haVpKN7l1eSTt+M6ZWkU4X+FSurSahHJp410sRqgojG2pEjU1vrXFMxM7NKpMmPLlTMzKwCAayJ3pg2WNu7kPQTSe+r6/5mZp0iEP1MKLV1utpyGBEviYhPlDlX0mJJB5dNu9nzzczqNhAqtXU6N3+ZmdWsl/pU6mz+WiDp2Px8jqRzJd0h6XZJZ0ialo9dCMwBzpK0StLFkp6Rnxe3kHTAUOfX9R7NzMoR/TGh1Nbpas+hpCnAZcD1wFzg8cB2wGcBIuIVwBLgrRExNSJeHBFX5edTI2Iq8DHgRuDKoc5vIi8zJc2TNC96JhKPmXW6tPLjhFJbp+uE5q+XA4qI4/LrhyR9GPiVpMMjon+kiyUdArwL2CsilrWYl0dC36/uf6j
FpMzMyokQq6Ov7mxUohMKlbnAHEmNM/MC2AZYOtyFkvYlhavfLyJurCAvpwPfAJjct/ENFaRnZlbKQI/0qXRCoXILsDAidh/hnEe1RUl6EvAd4M0R8dvRzi+jGPp+xuStx5KEmVnTUkd95zdtldEJ7+KHwGRJH5Q0TclsSa8unHMHsMvgC0mzgR8Dx0bE+UOkud75ZmadrT0d9ZL6JJ0m6W5J90v6nqRZI5x/tKSb8rl/l/SO0e5Re6ESEQ8C+5A66P8GrAAuBZ5cOO1k4GBJ90n6CfAiYDbwiYYRYC8f5nwzs47Vxo76Y4BXAXuSBkQBnD3UiZJeCZwIHBQR04A3AadJetFIN6iz+asPWA0QEbcCw05WjIgfk2omRfObPN/MrGP1l5/Y2CdpXuH1Pbnpvoy3ASdFxCKAHNXkRkk7RMQtDefuDFwTEb8BiIhfS7oW2AP42XA3qKVQyXNQdiYNA+5Ia6dP5p4XzGkpjXteWs0IskmTNqoknX/eMbeSdHY7bmHLaQw4unBXqSLasSMdDy8Qa6L01/HWQHEg0YnACaNdJGkz0hy+Pzxy34ibJK0kFRSNhcq3gMMk7QX8GtgLmAdcNNJ92l6oSHoKsICUsQvafX8zs07TZEf9ncDehddlaynT8uOKhv3LgaHWs7gLOBe4nHVdJe+JiD+PdJO2FyoR8SdgRrvva2bWqQI10/zVHxFjaS64Pz82fv9uBgzVdPBh4A2k/u2/kvq9L5D0UER8abib1N5Rb2Zm499RHxHLSdFGnjq4T9JOpFrKtUNc8jTg+xFxfSR/AX4AvGKk+4w5h44EbGZWjQjaFfvrDOD9kuZKmg58HPhpRCwe4txfAgdI2gVA0uOAAyj0yQylEyY/VkbSjsDNwPYRcVu9uTEzKyd11LclTMupwObAVcBGpFFcBwNIOgj4vxxPEeA0UlPZz/JclnuB7+Y0htVThYqZWbdqx4z6HEvx6Lw1HjsHOKfwei1pXssxzdyjknchaRNJ50v6kaTtJF2UZ2yukHSFpKcVzj1B0qWSTpF0V95OLBzfW9JaSQfmmZwrJH2nEApfkj4q6R95ludiSUfmy6/JjzfkyZAfbvJ9rItSPOChqmbWHkG5Bbq6YZGulgsVSdsAPwf+Abwyp/l5YAdSQMg/AudJKg50fx6pw+gx+ZoP5rHQg/qAF5PGTs8DnkKKRAxpNv2bgT3zLM9nAlfmY3vkx11z2PuPNPl2jiSN/75h7T9XNXmpmdnYeTnhZHfSpJjvRsR/RER/RCyJiAsi4sGIeAg4ljThphiLa2FEfDEi1ubZmlcDT29I+5iIWBURd5JGHAweXw1MAXaXNCUi7srDlKtwOrArsOvEKVNHO9fMrBIBDMSEUlunazWH/w48QKqZACBplqSvSVqSZ2remg9tWbju9oZ0HmDdxBxI47DvHup4RCwAPkgqrO7KK0E2FkhjEhH3RMTCiFioCZ3/j2dmvUL0l9w6XavfnMcA15FGB2ye930M2JbUPDUd2D7vr+zTiIgzIuI5pOa1q4Hz8iF3hJhZ1wlgTfSV2jpdq4XKWuAg4M/AAklbkSbSPAjcJ2kqaRx0ZSQ9U9JzJW0EPEyaJTq4OuTdpILFYe/NrGtEyM1fgyJiICIOJ4Wrv4LUFLYVKR7NtcCvWPelX4WppPXrl+V7vBg4MOflIVJogW9KWi7pQxXe18xs3LRp8uO4G/M8lYjYseH1e4H35pfPbjj964XzThgirb0Lzxc05qt4TURcRiHMwBBpnQKcMmLmzcw6SFpPpfP7S8rw5MdhTLz3QTb/TmuDymb+YsvRTyqjr6J21Mn/rCSZgYeqScfGn8PNdwt1RS2kDBcqZmY1S0OKXVMxM7MKtDH217jr2vpWDgcTOYikmVlXa9Ma9ePONRUzs5ql0Pdu/jIzs4r0Sp9K59elMknbSLogRy1eCOxfODZf0lkN53sRMTPrCilKcW9Mfuymmso5pHWU5wAbA+dWfQNJM4GZAFMfiTp
jZja+UpiWzi8wyuiKQkXSbGAfYOeIWAGsyGuwXFzxrY4EjgdYHZ6LYWbtoq6ohZTRLe9iu/x4S2HfzeNwn0dC30/WlHFI3sxsaAOo1NbpuqKmAizNjzsAN+XnOxaO3w/MGnwhaSIp/lhTIuIeUjwxZkyYOZZ8mpk1rZdGf3VFTSUibgMWAJ+QNF3S1sBxhVP+ALxQ0twcvfijwKRHp2Rm1pl6paO+83O4zhuBjUiLfl0BfK1w7BzgAtLSxTeRlipe2piAmVkn6qU16rul+YuIuB14ecPu4jDit+Zt0P+Oe6bMzCoQwNouqIWU0TWFSrsFEP0tLgOz6sFK8sIWm1WSTP/0agYfTNhlx5bTuH/XGa1nBJh+6Q2VpNN/732VpIMq+mIIL2K6oemGpq0yXKiYmdWtS5q2ynChYmZWMy/SZWZmleqVmkpvNOI1kDRZ0rcl3SdpWd35MTMbyeAiXR791bleCzwTmB0RFfWWm5mNj0CsHeiN3/i9WqjsBNzkAsXMukWv9Kl0bNEo6d2S/ibpfklLJH1MUl8+FpKeUzh3b0lr8/P/Ic2231vSKknzm7jnTEnzJM0LD+k0s3YJN3+1w23AS4DFwJOBi/Lz/xvpooj4z9yP8pyI2LfJe66LUoyjFJtZewz2qfSCjq2pRMT3IuLmSP4EnA28cJxvuy5KMY5SbGbt45rKOJP0BuC9pP6RicBk4Dfjec9ilOLpjlJsZm0SiP4e6ajvyHchaXvg68DJwLYRMYMUy2uwmF4FbFq45DHtzaGZWbV6ZT2VjixUgKmkvN0NrJH0LOCQwvE/AG/O81F2JNVozMy6UvRQR31HFioR8VdSh/n5wHLgGOCbhVP+E9gZuBf4DjC/zVk0M6tUhEptna5j+1Qi4iTgpGGO/Zk0ubHofwrHTxi/nJmZVa07aiFldGyhUrsIYu2alpJYW1U49YrS0YRq/mhjo41aTmPGg9tUkBNY+IHdKknnscf8rpJ0YiAqScc2PO2oheS5fqcChwJTgIuBt0fEkOGsJG0FnEZay2oSsAh4aUT8Y7h7dGTzl5nZhiQC+gdUamvRMcCrgD2B7fK+s4c6UdIU4FJgNWmqxWbAQaSBUsNyTcXMrAM0MbKrT9K8wut78nSIMt4GnBQRiwAkvQ+4UdIOEXFLw7lvJhUk74iIwWabv4x2g46tqUj6Yg65YmbW04KmOuq3Bm4obEeWuYekzYA5pNGz6b4RNwErgT2GuOQFwN+B+ZLuyWGz/t9o9+nYmkpEHFF3HszM2qOpjvo7gb0Lr8vWUqblxxUN+5cD04c4fxapYHkP8O/Ak4CLJN0VEecMd5OOLVTMzDYkUX6MR39ELBzDLe7PjzMa9m9Gqq0Mdf7SiPhsfv17SV8n9ckMW6jU2vwlaRNJn5R0s6R7JV0kaed8bL6kswrnhqR3SLoqRy7+jaTdCscnSvqgpIWSlkv6paSn1/G+zMyaNd7zVCJiObAEeOrgPkk7kWop1w5xydWklrlHJTXSferuUzkT2A14FrAN8Fvgh5ImDXP+ocBrSNWyW0kBIAedSCpB9wdmAl8mVdU2L5uZ9ULf49D3ZtYeafTXhFJbi84A3i9prqTpwMeBn0bE4iHOnQ/MlPROSX2S9iCN/jpvpBvUVqhImgW8kTSy4M6IWE0qGLYlDXcbymkRsSQiHia94afntAS8C/iviFgUEf0R8SXgduBlTWTrSHLn12oeHsvbMjMbk4hyW4tOBS4ErgKWAn3AwQCSDpL0yHDhPBrspcBbSc1j5wInRMS3R7pBnX0qc/PjtalMeMQkYPthrrm98PwB1nU8zSLFC7tQUvFjn8S6sdhlnA58A2AyG93QxHVmZi1px+THiOgHjs5b47FzaOgriYgFwFOauUedhcrgmOhdIuLuxoOS9msirWWkQmbfiLhqrBlaL/S9thhrMmZmTQm6I65XGbU1f0XEXaRaweclzYY0jlrSqyVNbTKtAD4LfFLSLjmtqZL2k+Sw+GbW8aL
k1unq7qg/nNSHsUDS/cB1wOsY22c3GNX4fEkrSZN2jqD+92hmNrKAGFCprdPVOk8lIh4Ejs1bo0MbzlXD6wUU8h8Ra4FP5c3MrKv0SvOXJz+ORC1WcqKiYcmt5mMwmcmTK0mnCiv32KqSdCY+UM1/xItu/WMl6bz0CXtXkk7/fY2Tnseoqr9BG3cVjOzqCC5UzMxqNhj7qxe4UDEzq1sALlTMzKwqvdL81XRjfZ51eU3Jc9eL39XEPVZJenaz15mZdadyI7+6YfRX04VKRJwTEUPF3m+apL0lrR3iHlMj4tdV3MPMrCv0yESVppu/JE0qrAJmZmatit7pqB+1piJpsaTjJF2eg40dJenGwvFJOeT8DTkk/U2SXltIYiNJZ+Zw9EslvT1f9xjgJ6SlMVfl7c35WEh6TuEeb8nprpR0tqSvS5pfOD5H0rmS7pB0u6QzJE2jSY5SbGa16ZGaStnmr8OB95ICODYOoD+ZFOXydaS4/M8HigvIvJYUFXMLUhTg/8nrIf8DeAlpwZmpeftq440lPQ/4n5yHLYAfA/9WOD4FuAy4nhSk8vGkIJKfbUyrBEcpNrOaqOTW2coWKmdGxJ9yjK2HBnfmkPPvJIWcvzaS2yKiuODLZRFxQUQMRMR5pKUrn9xEHt8EfDciLouItRHxTdK6K4NeDigijouIhyLiPuDDwEGS+pq4D6QoxbsCu05moyYvNTNrwUDJrcOV7VNZPMz+LYFNWb9m0uj2htfFkPVlzAZ+37DvlsLzucAcScsbzgnSwl9Ly97IUYrNrBYb4DyV4crHu4EHgV1IARybVabcXQrs0LBvDrAoP78FWBgRu4/h/mZmHWGDnadSlJvDPg98QtITlGwn6Uklk7iD1FE/d4RzzgZeK+kFeUnLA0nLDw/6ITA5DxaYlvMwW9Krx/SmzMzqsIF11I/kQ8B3gB8A9wMLgJ3LXBgRC4EvAL/Lo8MOGeKcnwPvJq05fx+pD+UHkHrSc6TjfUgd9H8jDSS4lOb6bczM6hUqt3W4UZu/ImLHhtfzSevDD75eDZyUt8ZrDy2R3juAdzTsawxzfyZw5uBrSb8Gri4cv5W8zrKZWTdSF9RCyuiK2F953stFwGrSOisfquAlAAAR/klEQVRPJ40KG8+bor5mB4+tL/orykqL+XhEfzUZWrPXE1tOY9NbHqggJzD9ijsqSeeln3l+Jensc+WtlaRz2V6zK0mnf+X9laTjEPrjLARdEIKljK4oVIDXAGcBfcCNwKsjYiwDA8zMOpNrKu0TEW+oOw9mZuOqRwqV2tZvbwzFMobrDy2GizEz62o9MvqrK2oqZmY9bQOc/GhmZuOoV0Z/VdL8Jeldkm7OUYqXSjol799R0ndz5ODlkn4paWbh0idJuipf9xtJuxXSXCDp2Ib7DNtkJmlingC5sHCvp1fx/szMxl2PNH+1XKhImgecCrw8IqYBuwMXSNqEFD34LmA3YBZwFGlY8KBDSSO7ZgG3kgI6jtWJwKuA/YGZpMmSF0navIn3si70vYdQmlkbKcptna6KmspaUjzm3SVNjYjlEfEb0sz3jYF3R8SKHGH4NxFRHDh/WkQsiYiHSRMqx1SzyNGS30WKlrwoIvoj4kukYJYvayKpQuj7f44lK2ZmY9MjM+pbLlQiYhFwEGm9k39IulLSi4EdgUUR8ajlgguKEYybjV5cNAuYClyYm76W56jFO5HWVimrEPp+yhizYmbWpLJNX11QU6mkoz6vk3KepMnAEcD5wNuBuZL6IsY0t/x+Ulh94JGVIoezjFQo7RsRV43hXkBD6PsJM0c528ysQl1QYJRRRZ/KrpL2z30oa0gBHQP4Hqn/5NOSZuSO9Gc1sczvH4ADJG2Zr/nocCfmaMmfBT4paZecr6mS9hulMDIz6wgaKLd1uir6VCYDx5GaspaT+jZeExEPkKIHb09aa2UZcBowqWS6nwb+CtxECh75o1HOP55UQzpf0sp8zyOocYKnmVlpbv5KIuI
64F+GObYIGHJdkyEiES8o5iciVgD/2nCZCsfns3605LXAp/JmZtY1umVkVxme/DgMTZjAhOlTW0ojHniomrxsunE16Ww2o5J0Fh3ceuXvsWeXrbCO4t7GVaTHJtauqSSdS55cegT7iCZMqSaitCZUM1qoqojbNoIuGNlVhgsVM7NO4JqKmZlVxc1fZmZWjeiOkV1ldMTIKEknSLqk7nyYmdWmDaO/JPVJOk3S3Tnm4vckzSpx3X/k2IvHjnZuRxQqZmYbvPYMKT6GFCNxT9ZFGzl7pAsk7UCK23hdmRu4UDEz6wBtCij5NuDjOUbiCuB9wP654BjOl4APAfeWuUFlhYqkbSRdKGlFDj//llxd2jEfP1zSn/PxP+X4YA1J6BRJd+XtxIaDT5D001xtWyLpY5Im5WM75nsdIun6XK27WNK2Tb6HdVGK6ZEGTjPrNX2D31N5KxVTStJmwBxStBIAIuImYCWwxzDXvB14ICK+XTZzVdZUziGFZdkeeA5wSCFjhwPvJwWe3JxU6p0naefC9c8DlgCPAV4JfFDSXvn6rYCfA+cBs4FnAy8CPtCQhwNzOrNJccNOavI9rItSPOAoxWbWRuWbv7Ymf0/l7ciSdxgMkbWiYf9yYHrjyZLmAMcC7yj/JqpbpGs7UkiW/4qIlRFxF/CRwinvBk6KiGsiYiAifgxcDry+cM7CiPjiYIh8UmiWwVD4bwKuiYj/i4jVEbEU+FjeX3RiRCyLiJXAN2g+lP66KMUTHKXYzNokmor9dSf5eypvZdehGlx2pHEW9Gak2kqjs4CT8/dtaVUNKZ6dH5cU9t1SeD4X+F9Jn2u4922F18Uw+LB+KPy5wF45nP0gAX0N17QUSr8YpXjGxC2budTMrDXl+0v6I2Jh08lHLJe0BHgq6Uc7knYi1VKuHeKSFwFPkzQYzHcG8AxJ+0XEc4e7T1WFymBJNgdYVHg+6Bbg+Ij47hjTvwW4JCKaWXDLzKwriLZNfjwDeL+ky0k/oD8O/DQiFg9x7vYNr78LXAH890g3qKT5KyJuAxYAp0qaJmlLUlvcoE8DJ0h6spKNJT2nuCb9KL4GPF3SYZKmSJogaSdJ+1eRfzOz2rVnSPGpwIXAVaTKQB9wMICkgySteiQ7EbcVN+BhYGVE3DnSDarsqH8jsAmpSeuXpFIN4OGIOBP4BPAV4D5SM9mHKRkGPyLuAF4AHAAszml8n7Syo5lZdys5nLjV2kxeav3oiJgVEdMi4l8jYlk+dk5EDBtFNyL2joiTR7tHZWFaIuJ20rr0AEjaj1Sy3ZGPfxX46jDXnjDEvr0bXl9PGhU21PWLKYTFz/vmUwiNb2bW0XpkFkNlhYqkJ5M+lutIHesnA9/OqzJ2nejvZ2DFUAMimkujEg9XNLz53vsqSWbe225tOQ1NrCj0fV/jWI2xUUXpTJi9TSXpLH9GNenM+NGfK0mnf9Wq0U+ylvRKQMkqm782J80jWQVcSRpN8O4K0zcz611e+XF9EXE5sPOoJ5qZ2fq6pMAow6Hvzcw6QK80f7lQMTPrBC5UzMysKr2ySJcLFTOzurlPpTflENIzAaayWc25MbMNhWiYaNfFvEjX+taFvseh782sjXpkSLELlfWtC32PQ9+bWfu0aeXHcefmr4Ji6Pvp2qLm3JjZBqULCowyerqmIukESYvrzoeZ2YiaW6Sro/V6TWUOKSS/mVln65GaSq8XKs8BXlh3JszMRtMN/SVl9HShEhHz6s5DR1E1rZ0xUMFff1URnCtSVUTpCRWlc8delSTD1LdXMzT+vrOf2HIaW5z9+wpyAhOmbFRJOmufsksl6QDwi3NbT8OFipmZVcU1FTMzq0bgRbrMzKwawjUVMzOrkguVziRpAevWpj+0ca17M7NOpO5cef1Req5QMTPrOl0S16sMFyoFjlJsZnVxn0qHamjumt/k5UcCxwM4SrGZtVM3hGApo6djf42BoxSbWT16JPR9z9VUWuEoxWZWiy4Ja1+GCxU
zs07gQsXMzKrgyY9mZlYpVRGotQO4UDEzq1uXdMKX4UJlBJWEeO8k0TljFmNt5+SlSmuXLK0knZ2/Wc1AkYe2nl1JOssOWN1yGvu+u/U0AH531JMqSWejqxdXkk5VemVIsQsVM7NO0CO/YV2omJl1AHfUm5lZNQLokYCSXTmjXtJiSQfXnQ8zs6pooNzW6VxTMTOrWS/NUxnXmoqkd0m6WdL9kpZKOiXv/4qkW/P+6yW9seG6l+X9qyT9UNKn8zopSLoQmAOclY9fnPdPlPRBSQslLZf0S0lPH8/3Z2ZWiYjyW4cbt0JF0jzgVODlETEN2B24IB++EngysBlwEjBf0uPzdY8FzgM+ko9/GnjLYLoR8QpgCfDWiJgaES/Oh04EXgXsTwpf/2XgIkmbN5HnmZLmSZoXvbJgtJl1BUW5rdONZ01lLalWt7ukqRGxPCJ+AxARX4qIeyKiPyK+BVwL7J2vewPw24j4ZkSsjYhLgfNHupEkAe8C/isiFuV0vwTcDrysiTwfCdwA3LCah5u4zMysRW2IUiypT9Jpku7OLUXfkzRrmHNfKukyScsk3SfpCknPHe0e41aoRMQi4CDgcOAfkq6U9GJJEySdJOkGSSskLQf2ALbMl84GbmlIrvF1o1nAVODC3PS1PKe7E7BdE9kuhL7fqInLzMxa06aayjGkFp09WffdePYw525O+k7cmfT9/A3gJ5K2H+kG49pRHxHnAedJmgwcQapxvDVvLwauj4gBSb8n1WoAluZjRXMaXje2TS0DHgD2jYirWsivQ9+bWfsF0F+6xOjL3QuD7snfXWW8DTgp/+hH0vuAGyXtEBHr/XiPiHMarv2CpOOBZwC3DneD8exT2VXS/pI2AdYAK0gf3XRS09jdwARJh5FqKoO+Bewp6d9yVe0FwAENyd8B7DL4IiIC+CzwSUm75PtPlbSfpMeM01s0M6tMEzWVrcnN9Hk7slT60makH+h/GNwXETcBK1n/O3i4659IahW6bqTzxrNPZTJwHKlfYzmpz+M1wFeB3wI3kmoljweuGLwoIm4EXkfqeF8BHEWqnhU7OU4GDs7tfD/J+44n1YTOl7QS+DupdtSVc3HMbANTfvTXneRm+rydXvIO0/Ljiob9y0k/9oclaSvge8AnI+LvI507bs1fEXEd8C/DHH7dKNdewLqRYkj6JoV+lYj4MfDjhmvWAp/Km5lZV2miv6Q/IhaO4Rb358cZDfs3I9VWhpRbe34GXAx8YLSbdOTkR0mvJA07XkkavfUaYL+2Z6SDovrahqXvupsqSWfqDZMqSWfe7SP2zZbyy633rCAnMPXaRZWkE2vXVpJOJdoQ+j4ilktaAjwVuBpA0k6kWsq1Q10jaUfgUuD7EXF0mft0ZKECPI80z2QKaU7KERFxeb1ZMjMbHwJUvqO+FWcA75d0OWlQ0seBn0bE4kflSdoNuASYHxHHlr1BR/Y3RMTRETErT258fER8ue48mZmNJ0WU2lp0KnAhcBWpT7sPOBhA0kGSVhXOfT9pisd7cvSSwe2gkW7QqTUVM7MNR5tWfoyIfuDovDUeOwc4p/D634F/b/YeLlTMzGrXHXG9ynChYmbWAbohrlcZPV+oSJoUEWvqzoeZ2Yh6pKbSkR31o5G0iaRP5rD690q6SNLO+dgCSZ+R9IM8CfKoJtJ1lGIza79Io7/KbJ2uKwsV4ExgN+BZwDakGfo/lDQ4KP8w4HOkST6fayJdRyk2s3q0IUpxO3RdoZLDNL8ReEdE3BkRq0khXbYlRd4EODciLovkwSaSd5RiM6tFm4YUj7tu7FOZmx+vTcuoPGISMDjtd/FYEnaUYjOrTRcUGGV0Y6EyGANsl4i4u/GgpLfz6ND4ZmadK+iZb62ua/6KiLtIi8V8XtJsSCGdJb1a0tR6c2dm1jxRrumrG5q/uq5QyQ4ndagvkHQ/Kb7/6+iKbiwzsyEMDJTbOlw3Nn+RO9+PzVujvdubGzOzFvVQ81dXFipmHau
i5RL6H2hm0OIIVlX0TfW7xnWdmrfJ5MkVZATufPNTK0ln6188qkt27Ja3nkQ3NG2V4ULFzKwTuFAxM7NqOKCkmZlVJYAuCMFShgsVM7MO4D4VMzOrjgsVMzOrRAADLlR6jqSZwEyAqcyoOTdmtuHonY76bp1RP14c+t7M6hFRbutwLlTW59D3ZtZ+AfQPlNs6nJu/Chz63szqEZVFY6ibCxUzs07QBU1bZfRs85ekL0r6Sd35MDMb1eDorzJbh+vZmkpEHFF3HszMSuuRmkrPFipVUF9fS9dHf39FOTEbI1XTGNHq/wWo7v/Dyrmjn1PG1NetqSYhgH0rSMOFipmZVSICeuRHqAsVM7NO4JqKmZlVxoWKmZlVoztGdpXRkUOKJU2RtFLStnXnxcxs3AVEDJTaOl1HFCqS+iRtVdj1IuAvEXH7KNdtM745MzNrkx4J01JroSJpT0mfAW4DDi0cOgD4fj5nX0l/yjWXZZIuKZx3nKRFkj4qafcK8jNT0jxJ84LeqIqaWReIgIGBcluHa3uhIunxkj4i6UbgW8A/gZdExCfy8T7gFcAP8iVfAz4HzABmAycXknsn8KZ87FJJ10g6RtIOY8xeIUrxP8eYhJnZGDhKcXMkvU7S1cDPgGnAwRExNyKOiYirC6fuBSyLiIX59WrgscDWEfFwRCwYPDGSKyPiP0kFzlHAzsAfJf1S0guazGYhSvGUsbxNM7MxiYGBUluna2dNZTawE/AX4Brgb8Oc90jTV/YqYBfgOknXS3rPUBdFRH8h7RtJhcNWQ507nIi4JyIWRsRCoWYuNTNrQclaimsq60TEZ4CtgbOAVwJLJF0o6WBJ0wunHsC6pi8i4pqIOJBUQLwd+JikfQaPS9pS0hGSLgf+CjwTOBHYJiK+Pe5vzMysVT0UULKtfSoR8VBEfCciXg1sD5xH6hP5h6TDJO0BTAJ+DyBpsqQ3S5oVEQHcBwwA/fn4ccDNwH7A54FtI+KQiPhxRKxt53szMxurIMVGK7N1utomP0bECuArwFfycOItgdcB5+cCZNCBwH9LmgLcBRwfET/Px34IfDanZWbWncKLdFUqIu4C7pJ0DnB0Yf9q4KUjXPfHNmTPzGzcRRc0bZWh6JCOH0mTgWOAUzqh6UrS3cAtdefDzLrCDhGx5VgvlnQRMKvk6csiYv+x3mu8dUyhYmZm3a8jwrSYmVlvcKFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaVcaFiZmaV+f+MBdiOezSSbgAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# This plots a chosen sentence, for which we saved the attention scores above.\n", "idx = 5\n", "src = valid_data[idx].src + [\"\"]\n", "trg = valid_data[idx].trg + [\"\"]\n", "pred = hypotheses[idx].split() + [\"\"]\n", "pred_att = alphas[idx][0].T[:, :len(pred)]\n", "print(\"src\", src)\n", "print(\"ref\", trg)\n", "print(\"pred\", pred)\n", "plot_heatmap(src, pred, pred_att)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Congratulations! You've finished this notebook.\n", "\n", "What didn't we cover?\n", "\n", "- Subwords / Byte Pair Encoding [[paper]](https://arxiv.org/abs/1508.07909) [[github]](https://github.com/rsennrich/subword-nmt) let you deal with unknown words. \n", "- You can implement a [multiplicative/bilinear attention mechanism](https://arxiv.org/abs/1508.04025) instead of the additive one used here.\n", "- We used greedy decoding here to get translations, but you can get better results with beam search.\n", "- The original model only uses a single dropout layer (in the decoder), but you can experiment with adding more dropout layers, for example on the word embeddings and the source word representations.\n", "- You can experiment with multiple encoder/decoder layers.", "- Experiment with a benchmarked and improved codebase: [Joey NMT](https://github.com/joeynmt/joeynmt)" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "If this was useful to your research, please consider citing:\n", "\n", "> J Bastings. 2018. The Annotated Encoder-Decoder with Attention. 
https://bastings.github.io/annotated_encoder_decoder/\n", "Or use the following `BibTeX`:\n", "```\n", "@misc{bastings2018annotated,\n", " title={The Annotated Encoder-Decoder with Attention},\n", " author={Bastings, J.},\n", " journal={https://bastings.github.io/annotated\\_encoder\\_decoder/},\n", " year={2018}\n", "}```" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }

================================================
FILE: index.md
================================================

# The Annotated Encoder-Decoder with Attention

Recently, Alexander Rush wrote a blog post called [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html), describing the Transformer model from the paper [Attention is All You Need](https://arxiv.org/abs/1706.03762).

This post can be seen as a **prequel** to that: *we will implement an Encoder-Decoder with Attention* using (Gated) Recurrent Neural Networks, very closely following the original attention-based neural machine translation paper ["Neural Machine Translation by Jointly Learning to Align and Translate"](https://arxiv.org/abs/1409.0473) of Bahdanau et al. (2015).

The idea is that going through both blog posts will make you familiar with two very influential sequence-to-sequence architectures. If you have any comments or suggestions, please let me know: [@BastingsJasmijn](https://twitter.com/BastingsJasmijn).
[Click here to open this notebook in Google Colab.](https://colab.research.google.com/github/bastings/annotated_encoder_decoder/blob/master/annotated_encoder_decoder.ipynb)

# Model Architecture

We will model the probability $$p(Y\mid X)$$ of a target sequence $$Y=(y_1, \dots, y_{N})$$ given a source sequence $$X=(x_1, \dots, x_M)$$ directly with a neural network: an Encoder-Decoder.

#### Encoder

The encoder reads in the source sentence (*at the bottom of the figure*) and produces a sequence of hidden states $$\mathbf{h}_1, \dots, \mathbf{h}_M$$, one for each source word. These states should capture the meaning of a word in its context of the given sentence.

We will use a bi-directional recurrent neural network (Bi-RNN) as the encoder; a Bi-GRU in particular.

First of all we **embed** the source words. We simply look up the **word embedding** for each word in a (randomly initialized) lookup table. We will denote the word embedding for word $i$ in a given sentence with $\mathbf{x}_i$. By embedding words, our model may exploit the fact that certain words (e.g. *cat* and *dog*) are semantically similar, and can be processed in a similar way.

Now, how do we get hidden states $$\mathbf{h}_1, \dots, \mathbf{h}_M$$? A forward GRU reads the source sentence left-to-right, while a backward GRU reads it right-to-left. Each of them follows a simple recursive formula:

$$\mathbf{h}_j = \text{GRU}( \mathbf{x}_j , \mathbf{h}_{j - 1} )$$

i.e. we obtain the next state from the previous state and the current input word embedding.

The hidden state of the forward GRU at time step $j$ will know what words **precede** the word at that time step, but it doesn't know what words will follow. In contrast, the backward GRU will only know what words **follow** the word at time step $j$. By **concatenating** those two hidden states (*shown in blue in the figure*), we get $$\mathbf{h}_j$$, which captures word $j$ in its full sentence context.
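To make the shapes concrete, here is a tiny standalone sketch (with made-up sizes, not part of the tutorial's model): PyTorch's bidirectional GRU already returns the forward and backward states concatenated along the last dimension, so each $\mathbf{h}_j$ has twice the per-direction hidden size.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Made-up toy sizes: one sentence of 6 words with 8-dimensional embeddings.
emb_dim, hidden_size = 8, 16
birnn = nn.GRU(emb_dim, hidden_size, batch_first=True, bidirectional=True)

x = torch.randn(1, 6, emb_dim)   # [batch, time, emb_dim]
h, final = birnn(x)

# Each h_j concatenates a forward and a backward state: 2 * hidden_size.
print(h.shape)      # torch.Size([1, 6, 32])
print(final.shape)  # [num_directions, batch, hidden_size] = [2, 1, 16]
```

The first half of each output vector comes from the forward GRU and the second half from the backward GRU, which is exactly the concatenation described above.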
#### Decoder

The decoder (*at the top of the figure*) is a GRU with hidden state $\mathbf{s}_i$. It follows a similar formula to the encoder, but takes one extra input $$\mathbf{c}_{i}$$ (*shown in yellow*).

$$\mathbf{s}_{i} = f( \mathbf{s}_{i - 1}, \mathbf{y}_{i - 1}, \mathbf{c}_i )$$

Here, $$\mathbf{y}_{i - 1}$$ is the previously generated target word (*not shown*).

At each time step, an **attention mechanism** dynamically selects that part of the source sentence that is most relevant for predicting the current target word. It does so by comparing the last decoder state with each source hidden state. The result is a context vector $\mathbf{c}_i$ (*shown in yellow*). The attention mechanism is explained in more detail later.

After computing the decoder state $\mathbf{s}_i$, a non-linear function $g$ (which applies a [softmax](https://en.wikipedia.org/wiki/Softmax_function)) gives us the probability of the target word $y_i$ for this time step:

$$ p(y_i \mid y_{<i}, X) = g( \mathbf{s}_i, \mathbf{y}_{i-1}, \mathbf{c}_i )$$

# Prerequisites

This tutorial requires **PyTorch >= 0.4.1** and was tested with **Python 3.6**. Make sure you have those versions, and install the packages below if you don't have them yet.

```python
#!pip install torch numpy matplotlib sacrebleu
```

```python
%matplotlib inline
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import math, copy, time
import matplotlib.pyplot as plt
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
from IPython.core.debugger import set_trace

# we will use CUDA if it is available
USE_CUDA = torch.cuda.is_available()
DEVICE = torch.device('cuda:0')  # or set to 'cpu'
print("CUDA:", USE_CUDA)
print(DEVICE)

seed = 42
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
```

    CUDA: True
    cuda:0

# Let's start coding!

## Model class

Our base model class `EncoderDecoder` is very similar to the one in *The Annotated Transformer*.
One difference is that our encoder also returns its final states (`encoder_final` below), which are used to initialize the decoder RNN. We also provide the sequence lengths, as the RNNs require those.

```python
class EncoderDecoder(nn.Module):
    """
    A standard Encoder-Decoder architecture. Base for this and many
    other models.
    """
    def __init__(self, encoder, decoder, src_embed, trg_embed, generator):
        super(EncoderDecoder, self).__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.src_embed = src_embed
        self.trg_embed = trg_embed
        self.generator = generator

    def forward(self, src, trg, src_mask, trg_mask, src_lengths, trg_lengths):
        """Take in and process masked src and target sequences."""
        encoder_hidden, encoder_final = self.encode(src, src_mask, src_lengths)
        return self.decode(encoder_hidden, encoder_final, src_mask, trg, trg_mask)

    def encode(self, src, src_mask, src_lengths):
        return self.encoder(self.src_embed(src), src_mask, src_lengths)

    def decode(self, encoder_hidden, encoder_final, src_mask, trg, trg_mask,
               decoder_hidden=None):
        return self.decoder(self.trg_embed(trg), encoder_hidden, encoder_final,
                            src_mask, trg_mask, hidden=decoder_hidden)
```

To keep things easy we also keep the `Generator` class the same. It simply projects the pre-output layer (`x` in the `forward` function below) to obtain the output layer, so that the final dimension is the target vocabulary size.

```python
class Generator(nn.Module):
    """Define standard linear + softmax generation step."""
    def __init__(self, hidden_size, vocab_size):
        super(Generator, self).__init__()
        self.proj = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, x):
        return F.log_softmax(self.proj(x), dim=-1)
```

## Encoder

Our encoder is a bi-directional GRU. Because we want to process multiple sentences at the same time for speed reasons (it is more efficient on GPU), we need to support **mini-batches**.
Sentences in a mini-batch may have different lengths, which means that the RNN needs to unroll further for certain sentences while it might already have finished for others:

```
Example: mini-batch with 3 source sentences of different lengths (7, 5, and 3).
End-of-sequence is marked with a "3" here, and padding positions with "1".

+---------------+
| 4 5 9 8 7 8 3 |
+---------------+
| 5 4 8 7 3 1 1 |
+---------------+
| 5 8 3 1 1 1 1 |
+---------------+
```

You can see that, when computing hidden states for this mini-batch, for sentences #2 and #3 we need to stop updating the hidden state after we have encountered "3". We don't want to incorporate the padding values (1s).

Luckily, PyTorch has convenient helper functions called `pack_padded_sequence` and `pad_packed_sequence`. These functions take care of masking and padding, so that the resulting word representations are simply zeros after a sentence stops.

The code below reads in a source sentence (a sequence of word embeddings) and produces the hidden states. It also returns a final vector, a summary of the complete sentence, by concatenating the first and the last hidden states (they have both seen the whole sentence, each in a different direction). We will use the final vector to initialize the decoder.

```python
class Encoder(nn.Module):
    """Encodes a sequence of word embeddings"""
    def __init__(self, input_size, hidden_size, num_layers=1, dropout=0.):
        super(Encoder, self).__init__()
        self.num_layers = num_layers
        self.rnn = nn.GRU(input_size, hidden_size, num_layers,
                          batch_first=True, bidirectional=True, dropout=dropout)

    def forward(self, x, mask, lengths):
        """
        Applies a bidirectional GRU to sequence of embeddings x.
        The input mini-batch x needs to be sorted by length.
        x should have dimensions [batch, time, dim].
        """
        packed = pack_padded_sequence(x, lengths, batch_first=True)
        output, final = self.rnn(packed)
        output, _ = pad_packed_sequence(output, batch_first=True)

        # we need to manually concatenate the final states for both directions
        fwd_final = final[0:final.size(0):2]
        bwd_final = final[1:final.size(0):2]
        final = torch.cat([fwd_final, bwd_final], dim=2)  # [num_layers, batch, 2*dim]

        return output, final
```

### Decoder

The decoder is a conditional GRU. Rather than starting with an empty state like the encoder, its initial hidden state results from a projection of the encoder final vector.

#### Training

In `forward` you can find a for-loop that computes the decoder hidden states one time step at a time.

Note that, during training, we know exactly what the target words should be! (They are in `trg_embed`.) This means that we are not even checking here what the prediction is! We simply feed the correct previous target word embedding to the GRU at each time step. This is called teacher forcing.

The `forward` function returns all decoder hidden states and pre-output vectors. Elsewhere these are used to compute the loss, after which the parameters are updated.

#### Prediction

At prediction time, the `forward` function is used for a single time step at a time. After predicting a word from the returned pre-output vector, we can call it again, supplying it the word embedding of the previously predicted word and the last state.
```python
class Decoder(nn.Module):
    """A conditional RNN decoder with attention."""

    def __init__(self, emb_size, hidden_size, attention, num_layers=1, dropout=0.5,
                 bridge=True):
        super(Decoder, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.attention = attention
        self.dropout = dropout

        self.rnn = nn.GRU(emb_size + 2*hidden_size, hidden_size, num_layers,
                          batch_first=True, dropout=dropout)

        # to initialize from the final encoder state
        self.bridge = nn.Linear(2*hidden_size, hidden_size, bias=True) if bridge else None

        self.dropout_layer = nn.Dropout(p=dropout)
        self.pre_output_layer = nn.Linear(hidden_size + 2*hidden_size + emb_size,
                                          hidden_size, bias=False)

    def forward_step(self, prev_embed, encoder_hidden, src_mask, proj_key, hidden):
        """Perform a single decoder step (1 word)"""

        # compute context vector using attention mechanism
        query = hidden[-1].unsqueeze(1)  # [#layers, B, D] -> [B, 1, D]
        context, attn_probs = self.attention(
            query=query, proj_key=proj_key,
            value=encoder_hidden, mask=src_mask)

        # update rnn hidden state
        rnn_input = torch.cat([prev_embed, context], dim=2)
        output, hidden = self.rnn(rnn_input, hidden)

        pre_output = torch.cat([prev_embed, output, context], dim=2)
        pre_output = self.dropout_layer(pre_output)
        pre_output = self.pre_output_layer(pre_output)

        return output, hidden, pre_output

    def forward(self, trg_embed, encoder_hidden, encoder_final,
                src_mask, trg_mask, hidden=None, max_len=None):
        """Unroll the decoder one step at a time."""

        # the maximum number of steps to unroll the RNN
        if max_len is None:
            max_len = trg_mask.size(-1)

        # initialize decoder hidden state
        if hidden is None:
            hidden = self.init_hidden(encoder_final)

        # pre-compute projected encoder hidden states
        # (the "keys" for the attention mechanism)
        # this is only done for efficiency
        proj_key = self.attention.key_layer(encoder_hidden)

        # here we store all intermediate hidden states and pre-output vectors
        decoder_states = []
        pre_output_vectors = []

        # unroll the decoder RNN for max_len steps
        for i in range(max_len):
            prev_embed = trg_embed[:, i].unsqueeze(1)
            output, hidden, pre_output = self.forward_step(
                prev_embed, encoder_hidden, src_mask, proj_key, hidden)
            decoder_states.append(output)
            pre_output_vectors.append(pre_output)

        decoder_states = torch.cat(decoder_states, dim=1)
        pre_output_vectors = torch.cat(pre_output_vectors, dim=1)
        return decoder_states, hidden, pre_output_vectors  # [B, N, D]

    def init_hidden(self, encoder_final):
        """Returns the initial decoder state,
        conditioned on the final encoder state."""

        if encoder_final is None:
            return None  # start with zeros

        return torch.tanh(self.bridge(encoder_final))
```

### Attention

At every time step, the decoder has access to *all* source word representations $$\mathbf{h}_1, \dots, \mathbf{h}_M$$. An attention mechanism allows the model to focus on the currently most relevant part of the source sentence. The state of the decoder is represented by GRU hidden state $$\mathbf{s}_i$$. So if we want to know which source word representation(s) $$\mathbf{h}_j$$ are most relevant, we need a function that takes those two things as input.

Here we use the MLP-based, additive attention that was used in Bahdanau et al.: we apply an MLP with tanh-activation to both the current decoder state $$\bf s_i$$ (the *query*) and each encoder state $$\bf h_j$$ (the *key*), and then project this to a single value (i.e. a scalar) to get the *attention energy* $$e_{ij}$$.
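In symbols, a sketch of this standard additive formulation (where $$\mathbf{v}$$, $$\mathbf{W}$$ and $$\mathbf{U}$$ are learned parameters, playing the roles of `energy_layer`, `query_layer` and `key_layer` in the code below):

$$ e_{ij} = \mathbf{v}^\top \tanh( \mathbf{W} \mathbf{s}_i + \mathbf{U} \mathbf{h}_j ) $$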
Once all energies are computed, they are normalized by a softmax so that they sum to one:

$$ \alpha_{ij} = \text{softmax}(\mathbf{e}_i)[j] $$

$$\sum_j \alpha_{ij} = 1.0$$

The context vector for time step $i$ is then a weighted sum of the encoder hidden states (the *values*):

$$\mathbf{c}_i = \sum_j \alpha_{ij} \mathbf{h}_j$$

```python
class BahdanauAttention(nn.Module):
    """Implements Bahdanau (MLP) attention"""

    def __init__(self, hidden_size, key_size=None, query_size=None):
        super(BahdanauAttention, self).__init__()

        # We assume a bi-directional encoder so key_size is 2*hidden_size
        key_size = 2 * hidden_size if key_size is None else key_size
        query_size = hidden_size if query_size is None else query_size

        self.key_layer = nn.Linear(key_size, hidden_size, bias=False)
        self.query_layer = nn.Linear(query_size, hidden_size, bias=False)
        self.energy_layer = nn.Linear(hidden_size, 1, bias=False)

        # to store attention scores
        self.alphas = None

    def forward(self, query=None, proj_key=None, value=None, mask=None):
        assert mask is not None, "mask is required"

        # We first project the query (the decoder state).
        # The projected keys (the encoder states) were already pre-computed.
        query = self.query_layer(query)

        # Calculate scores.
        scores = self.energy_layer(torch.tanh(query + proj_key))
        scores = scores.squeeze(2).unsqueeze(1)

        # Mask out invalid positions.
        # The mask marks valid positions, so we invert it using `mask == 0`.
        scores.data.masked_fill_(mask == 0, -float('inf'))

        # Turn scores to probabilities.
        alphas = F.softmax(scores, dim=-1)
        self.alphas = alphas

        # The context vector is the weighted sum of the values.
        context = torch.bmm(alphas, value)

        # context shape: [B, 1, 2D], alphas shape: [B, 1, M]
        return context, alphas
```

## Embeddings and Softmax

We use learned embeddings to convert the input tokens and output tokens to vectors of dimension `emb_size`. We will simply use PyTorch's [nn.Embedding](https://pytorch.org/docs/stable/nn.html?highlight=embedding#torch.nn.Embedding) class.
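As a minimal sketch (with toy sizes, not the real vocabulary), `nn.Embedding` maps a batch of token ids to a batch of vectors:

```python
import torch
import torch.nn as nn

vocab_size, emb_size = 10, 4  # toy sizes for illustration
embed = nn.Embedding(vocab_size, emb_size)

# a mini-batch of 2 sentences with 3 token ids each
tokens = torch.tensor([[1, 5, 2],
                       [4, 4, 0]])
vectors = embed(tokens)  # look up one emb_size-dim vector per id
print(vectors.shape)  # torch.Size([2, 3, 4])
```

Note that the same id always maps to the same vector, so the two occurrences of id 4 above get identical embeddings.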
## Full Model

Here we define a function from hyperparameters to a full model.

```python
def make_model(src_vocab, tgt_vocab, emb_size=256, hidden_size=512,
               num_layers=1, dropout=0.1):
    "Helper: Construct a model from hyperparameters."

    attention = BahdanauAttention(hidden_size)

    model = EncoderDecoder(
        Encoder(emb_size, hidden_size, num_layers=num_layers, dropout=dropout),
        Decoder(emb_size, hidden_size, attention, num_layers=num_layers,
                dropout=dropout),
        nn.Embedding(src_vocab, emb_size),
        nn.Embedding(tgt_vocab, emb_size),
        Generator(hidden_size, tgt_vocab))

    return model.cuda() if USE_CUDA else model
```

# Training

This section describes the training regime for our models.

We stop for a quick interlude to introduce some of the tools needed to train a standard encoder-decoder model. First we define a batch object that holds the src and target sentences for training, as well as their lengths and masks.

## Batches and Masking

```python
class Batch:
    """Object for holding a batch of data with mask during training.
    Input is a batch from a torch text iterator.
    """
    def __init__(self, src, trg, pad_index=0):

        src, src_lengths = src

        self.src = src
        self.src_lengths = src_lengths
        self.src_mask = (src != pad_index).unsqueeze(-2)
        self.nseqs = src.size(0)

        self.trg = None
        self.trg_y = None
        self.trg_mask = None
        self.trg_lengths = None
        self.ntokens = None

        if trg is not None:
            trg, trg_lengths = trg
            self.trg = trg[:, :-1]
            self.trg_lengths = trg_lengths
            self.trg_y = trg[:, 1:]
            self.trg_mask = (self.trg_y != pad_index)
            self.ntokens = (self.trg_y != pad_index).data.sum().item()

        if USE_CUDA:
            self.src = self.src.cuda()
            self.src_mask = self.src_mask.cuda()

            if trg is not None:
                self.trg = self.trg.cuda()
                self.trg_y = self.trg_y.cuda()
                self.trg_mask = self.trg_mask.cuda()
```

## Training Loop

The code below trains the model for 1 epoch (= 1 pass through the training data).
```python
def run_epoch(data_iter, model, loss_compute, print_every=50):
    """Standard Training and Logging Function"""

    start = time.time()
    total_tokens = 0
    total_loss = 0
    print_tokens = 0

    for i, batch in enumerate(data_iter, 1):

        out, _, pre_output = model.forward(batch.src, batch.trg,
                                           batch.src_mask, batch.trg_mask,
                                           batch.src_lengths, batch.trg_lengths)
        loss = loss_compute(pre_output, batch.trg_y, batch.nseqs)
        total_loss += loss
        total_tokens += batch.ntokens
        print_tokens += batch.ntokens

        if model.training and i % print_every == 0:
            elapsed = time.time() - start
            print("Epoch Step: %d Loss: %f Tokens per Sec: %f" %
                  (i, loss / batch.nseqs, print_tokens / elapsed))
            start = time.time()
            print_tokens = 0

    return math.exp(total_loss / float(total_tokens))
```

## Training Data and Batching

We will use torch text for batching. This is discussed in more detail below.

## Optimizer

We will use the [Adam optimizer](https://arxiv.org/abs/1412.6980) with default settings ($$\beta_1=0.9$$, $$\beta_2=0.999$$ and $$\epsilon=10^{-8}$$). We will use 0.0003 as the learning rate here, but for different problems another learning rate may be more appropriate. You will have to tune that.

# A First Example

We can begin by trying out a simple copy-task. Given a random set of input symbols from a small vocabulary, the goal is to generate back those same symbols.
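Throughout, progress is measured with perplexity, which `run_epoch` above returns: the exponential of the average per-token negative log-likelihood. A tiny sketch with made-up numbers:

```python
import math

# hypothetical epoch totals: summed NLL over all tokens, and the token count
total_loss = 1385.0
total_tokens = 1000

# perplexity = exp(average negative log-likelihood per token);
# a perplexity of 1.0 would mean the model predicts every token perfectly
perplexity = math.exp(total_loss / total_tokens)
print(perplexity)
```

Lower is better, and a perplexity close to 1 on the copy task below will indicate the model has essentially solved it.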
## Synthetic Data

```python
def data_gen(num_words=11, batch_size=16, num_batches=100, length=10,
             pad_index=0, sos_index=1):
    """Generate random data for a src-tgt copy task."""
    for i in range(num_batches):
        data = torch.from_numpy(
            np.random.randint(1, num_words, size=(batch_size, length)))
        data[:, 0] = sos_index
        data = data.cuda() if USE_CUDA else data
        src = data[:, 1:]
        trg = data
        src_lengths = [length-1] * batch_size
        trg_lengths = [length] * batch_size
        yield Batch((src, src_lengths), (trg, trg_lengths), pad_index=pad_index)
```

## Loss Computation

```python
class SimpleLossCompute:
    """A simple loss compute and train function."""

    def __init__(self, generator, criterion, opt=None):
        self.generator = generator
        self.criterion = criterion
        self.opt = opt

    def __call__(self, x, y, norm):
        x = self.generator(x)
        loss = self.criterion(x.contiguous().view(-1, x.size(-1)),
                              y.contiguous().view(-1))
        loss = loss / norm

        if self.opt is not None:
            loss.backward()
            self.opt.step()
            self.opt.zero_grad()

        return loss.data.item() * norm
```

### Printing examples

To monitor progress during training, we will translate a few examples.

We use greedy decoding for simplicity; that is, at each time step, starting at the first token, we choose the token with the maximum probability, and we never revisit that choice.
```python
def greedy_decode(model, src, src_mask, src_lengths, max_len=100,
                  sos_index=1, eos_index=None):
    """Greedily decode a sentence."""

    with torch.no_grad():
        encoder_hidden, encoder_final = model.encode(src, src_mask, src_lengths)
        prev_y = torch.ones(1, 1).fill_(sos_index).type_as(src)
        trg_mask = torch.ones_like(prev_y)

    output = []
    attention_scores = []
    hidden = None

    for i in range(max_len):
        with torch.no_grad():
            out, hidden, pre_output = model.decode(
                encoder_hidden, encoder_final, src_mask,
                prev_y, trg_mask, hidden)

            # we predict from the pre-output layer, which is
            # a combination of Decoder state, prev emb, and context
            prob = model.generator(pre_output[:, -1])

        _, next_word = torch.max(prob, dim=1)
        next_word = next_word.data.item()
        output.append(next_word)
        prev_y = torch.ones(1, 1).type_as(src).fill_(next_word)
        attention_scores.append(model.decoder.attention.alphas.cpu().numpy())

    output = np.array(output)

    # cut off everything starting from </s>
    # (only when eos_index provided)
    if eos_index is not None:
        first_eos = np.where(output == eos_index)[0]
        if len(first_eos) > 0:
            output = output[:first_eos[0]]

    return output, np.concatenate(attention_scores, axis=1)


def lookup_words(x, vocab=None):
    if vocab is not None:
        x = [vocab.itos[i] for i in x]

    return [str(t) for t in x]
```

```python
def print_examples(example_iter, model, n=2, max_len=100,
                   sos_index=1,
                   src_eos_index=None,
                   trg_eos_index=None,
                   src_vocab=None, trg_vocab=None):
    """Prints N examples. Assumes batch size of 1."""

    model.eval()
    count = 0
    print()

    if src_vocab is not None and trg_vocab is not None:
        src_eos_index = src_vocab.stoi[EOS_TOKEN]
        trg_sos_index = trg_vocab.stoi[SOS_TOKEN]
        trg_eos_index = trg_vocab.stoi[EOS_TOKEN]
    else:
        src_eos_index = None
        trg_sos_index = 1
        trg_eos_index = None

    for i, batch in enumerate(example_iter):

        src = batch.src.cpu().numpy()[0, :]
        trg = batch.trg_y.cpu().numpy()[0, :]

        # remove </s> (if it is there)
        src = src[:-1] if src[-1] == src_eos_index else src
        trg = trg[:-1] if trg[-1] == trg_eos_index else trg

        result, _ = greedy_decode(
            model, batch.src, batch.src_mask, batch.src_lengths,
            max_len=max_len, sos_index=trg_sos_index, eos_index=trg_eos_index)
        print("Example #%d" % (i+1))
        print("Src : ", " ".join(lookup_words(src, vocab=src_vocab)))
        print("Trg : ", " ".join(lookup_words(trg, vocab=trg_vocab)))
        print("Pred: ", " ".join(lookup_words(result, vocab=trg_vocab)))
        print()

        count += 1
        if count == n:
            break
```

## Training the copy task

```python
def train_copy_task():
    """Train the simple copy task."""
    num_words = 11
    criterion = nn.NLLLoss(reduction="sum", ignore_index=0)
    model = make_model(num_words, num_words, emb_size=32, hidden_size=64)
    optim = torch.optim.Adam(model.parameters(), lr=0.0003)
    eval_data = list(data_gen(num_words=num_words, batch_size=1, num_batches=100))

    dev_perplexities = []

    if USE_CUDA:
        model.cuda()

    for epoch in range(10):

        print("Epoch %d" % epoch)

        # train
        model.train()
        data = data_gen(num_words=num_words, batch_size=32, num_batches=100)
        run_epoch(data, model,
                  SimpleLossCompute(model.generator, criterion, optim))

        # evaluate
        model.eval()
        with torch.no_grad():
            perplexity = run_epoch(eval_data, model,
                                   SimpleLossCompute(model.generator, criterion, None))
            print("Evaluation perplexity: %f" % perplexity)
            dev_perplexities.append(perplexity)
            print_examples(eval_data, model, n=2, max_len=9)

    return dev_perplexities
```

```python
# train the copy task
dev_perplexities = train_copy_task()

def plot_perplexity(perplexities):
"""plot perplexities""" plt.title("Perplexity per Epoch") plt.xlabel("Epoch") plt.ylabel("Perplexity") plt.plot(perplexities) plot_perplexity(dev_perplexities) ``` /home/jb/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1 "num_layers={}".format(dropout, num_layers)) Epoch 0 Epoch Step: 50 Loss: 19.887581 Tokens per Sec: 7748.957397 Epoch Step: 100 Loss: 17.856726 Tokens per Sec: 7925.338918 Evaluation perplexity: 7.172198 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 8 3 7 5 8 3 7 5 8 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 8 8 8 8 8 8 8 Epoch 1 Epoch Step: 50 Loss: 15.715487 Tokens per Sec: 8662.903188 Epoch Step: 100 Loss: 12.368280 Tokens per Sec: 7860.172940 Evaluation perplexity: 3.709498 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 7 5 10 8 7 5 7 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 5 6 2 6 8 2 5 Epoch 2 Epoch Step: 50 Loss: 9.246480 Tokens per Sec: 7971.095313 Epoch Step: 100 Loss: 7.701921 Tokens per Sec: 7876.198908 Evaluation perplexity: 2.303158 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 7 3 10 5 8 7 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 5 6 2 6 8 5 2 Epoch 3 Epoch Step: 50 Loss: 6.166847 Tokens per Sec: 8069.631171 Epoch Step: 100 Loss: 5.673258 Tokens per Sec: 7855.858586 Evaluation perplexity: 1.775795 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 7 5 10 3 7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 8 Epoch 4 Epoch Step: 50 Loss: 4.830031 Tokens per Sec: 8094.515152 Epoch Step: 100 Loss: 4.152125 Tokens per Sec: 7999.315744 Evaluation perplexity: 1.572305 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 5 7 10 3 
7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 Epoch 5 Epoch Step: 50 Loss: 3.638369 Tokens per Sec: 8112.868501 Epoch Step: 100 Loss: 3.784709 Tokens per Sec: 7843.288141 Evaluation perplexity: 1.433951 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 7 5 3 10 7 8 7 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 Epoch 6 Epoch Step: 50 Loss: 2.802792 Tokens per Sec: 8128.952327 Epoch Step: 100 Loss: 2.403310 Tokens per Sec: 7893.746819 Evaluation perplexity: 1.284198 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 5 7 10 3 7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 Epoch 7 Epoch Step: 50 Loss: 2.174423 Tokens per Sec: 8181.341663 Epoch Step: 100 Loss: 1.838792 Tokens per Sec: 7833.160747 Evaluation perplexity: 1.173110 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 5 7 10 3 7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 Epoch 8 Epoch Step: 50 Loss: 1.226522 Tokens per Sec: 8267.548130 Epoch Step: 100 Loss: 1.090876 Tokens per Sec: 7842.856308 Evaluation perplexity: 1.123090 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 5 7 10 3 7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 Epoch 9 Epoch Step: 50 Loss: 1.216270 Tokens per Sec: 8181.132215 Epoch Step: 100 Loss: 0.636999 Tokens per Sec: 7866.309111 Evaluation perplexity: 1.088564 Example #1 Src : 4 8 5 7 10 3 7 8 5 Trg : 4 8 5 7 10 3 7 8 5 Pred: 4 8 5 7 10 3 7 8 5 Example #2 Src : 8 8 3 6 5 2 8 6 2 Trg : 8 8 3 6 5 2 8 6 2 Pred: 8 8 3 6 5 2 8 6 2 ![png](images/output_36_2.png) You can see that the model managed to correctly 'translate' the two examples in the end. Moreover, the perplexity of the development data nicely went down towards 1. 
# A Real World Example

Now we consider a real-world example using the IWSLT German-English Translation task. This task is much smaller than usual, but it illustrates the whole system.

The cell below installs torch text and spacy. This might take a while.

```python
#!pip install git+git://github.com/pytorch/text spacy
#!python -m spacy download en
#!python -m spacy download de
```

## Data Loading

We will load the dataset using torchtext and spacy for tokenization.

This cell might take a while to run the first time, as it will download and tokenize the IWSLT data.

For speed we only include short sentences, and we include a word in the vocabulary only if it occurs at least 5 times. In this case we also lowercase the data.

If you have **issues** with torch text in the cell below (e.g. an `ascii` error), try running `export LC_ALL="en_US.UTF-8"` before you start `jupyter notebook`.

```python
# For data loading.
from torchtext import data, datasets

if True:
    import spacy
    spacy_de = spacy.load('de')
    spacy_en = spacy.load('en')

    def tokenize_de(text):
        return [tok.text for tok in spacy_de.tokenizer(text)]

    def tokenize_en(text):
        return [tok.text for tok in spacy_en.tokenizer(text)]

UNK_TOKEN = "<unk>"
PAD_TOKEN = "<pad>"
SOS_TOKEN = "<s>"
EOS_TOKEN = "</s>"
LOWER = True

# we include lengths to provide to the RNNs
SRC = data.Field(tokenize=tokenize_de,
                 batch_first=True, lower=LOWER, include_lengths=True,
                 unk_token=UNK_TOKEN, pad_token=PAD_TOKEN,
                 init_token=None, eos_token=EOS_TOKEN)
TRG = data.Field(tokenize=tokenize_en,
                 batch_first=True, lower=LOWER, include_lengths=True,
                 unk_token=UNK_TOKEN, pad_token=PAD_TOKEN,
                 init_token=SOS_TOKEN, eos_token=EOS_TOKEN)

MAX_LEN = 25  # NOTE: we filter out a lot of sentences for speed
train_data, valid_data, test_data = datasets.IWSLT.splits(
    exts=('.de', '.en'), fields=(SRC, TRG),
    filter_pred=lambda x: len(vars(x)['src']) <= MAX_LEN
        and len(vars(x)['trg']) <= MAX_LEN)
MIN_FREQ = 5  # NOTE: we limit the vocabulary to frequent words for speed
SRC.build_vocab(train_data.src, min_freq=MIN_FREQ)
TRG.build_vocab(train_data.trg, min_freq=MIN_FREQ)

PAD_INDEX = TRG.vocab.stoi[PAD_TOKEN]
```

### Let's look at the data

It never hurts to look at your data and some statistics.

```python
def print_data_info(train_data, valid_data, test_data, src_field, trg_field):
    """ This prints some useful stuff about our data sets. """

    print("Data set sizes (number of sentence pairs):")
    print('train', len(train_data))
    print('valid', len(valid_data))
    print('test', len(test_data), "\n")

    print("First training example:")
    print("src:", " ".join(vars(train_data[0])['src']))
    print("trg:", " ".join(vars(train_data[0])['trg']), "\n")

    print("Most common words (src):")
    print("\n".join(["%10s %10d" % x for x in src_field.vocab.freqs.most_common(10)]), "\n")
    print("Most common words (trg):")
    print("\n".join(["%10s %10d" % x for x in trg_field.vocab.freqs.most_common(10)]), "\n")

    print("First 10 words (src):")
    print("\n".join(
        '%02d %s' % (i, t) for i, t in enumerate(src_field.vocab.itos[:10])), "\n")
    print("First 10 words (trg):")
    print("\n".join(
        '%02d %s' % (i, t) for i, t in enumerate(trg_field.vocab.itos[:10])), "\n")

    print("Number of German words (types):", len(src_field.vocab))
    print("Number of English words (types):", len(trg_field.vocab), "\n")


print_data_info(train_data, valid_data, test_data, SRC, TRG)
```

    Data set sizes (number of sentence pairs):
    train 143116
    valid 690
    test 963

    First training example:
    src: david gallo : das ist bill lange . ich bin dave gallo .
    trg: david gallo : this is bill lange . i 'm dave gallo .

    Most common words (src):
             .     138325
             ,     105944
           und      41839
           die      40809
           das      33324
           sie      33035
           ich      31153
           ist      31035
            es      27449
           wir      25817

    Most common words (trg):
             .     137259
             ,      91619
           the      73344
           and      50273
            to      42798
             a      39573
            of      39496
             i      33524
            it      32921
          that      32643

    First 10 words (src):
    00 <unk>
    01 <pad>
    02 </s>
    03 .
    04 ,
    05 und
    06 die
    07 das
    08 sie
    09 ich

    First 10 words (trg):
    00 <unk>
    01 <pad>
    02 <s>
    03 </s>
    04 .
    05 ,
    06 the
    07 and
    08 to
    09 a

    Number of German words (types): 15761
    Number of English words (types): 13003

## Iterators

Batching matters a ton for speed. We will use torch text's `BucketIterator` here to get batches containing sentences of (almost) the same length.

#### Note on sorting batches for RNNs in PyTorch

For efficiency reasons, PyTorch RNNs require that batches have been sorted by length, with the longest sentence in the batch first. For training, we simply sort each batch. For validation, we would run into trouble if we wanted to compare our translations with some external file that was not sorted. Therefore we simply set the validation batch size to 1, so that we can keep it in the original order.

```python
train_iter = data.BucketIterator(train_data, batch_size=64, train=True,
                                 sort_within_batch=True,
                                 sort_key=lambda x: (len(x.src), len(x.trg)),
                                 repeat=False, device=DEVICE)
valid_iter = data.Iterator(valid_data, batch_size=1, train=False, sort=False,
                           repeat=False, device=DEVICE)


def rebatch(pad_idx, batch):
    """Wrap torchtext batch into our own Batch class for pre-processing"""
    return Batch(batch.src, batch.trg, pad_idx)
```

## Training the System

Now we train the model.

On a Titan X GPU, this runs at ~18,000 tokens per second with a batch size of 64.
```python
def train(model, num_epochs=10, lr=0.0003, print_every=100):
    """Train a model on IWSLT"""

    if USE_CUDA:
        model.cuda()

    # optionally add label smoothing; see the Annotated Transformer
    criterion = nn.NLLLoss(reduction="sum", ignore_index=PAD_INDEX)
    optim = torch.optim.Adam(model.parameters(), lr=lr)

    dev_perplexities = []

    for epoch in range(num_epochs):

        print("Epoch", epoch)
        model.train()
        train_perplexity = run_epoch((rebatch(PAD_INDEX, b) for b in train_iter),
                                     model,
                                     SimpleLossCompute(model.generator, criterion, optim),
                                     print_every=print_every)

        model.eval()
        with torch.no_grad():
            print_examples((rebatch(PAD_INDEX, x) for x in valid_iter),
                           model, n=3, src_vocab=SRC.vocab, trg_vocab=TRG.vocab)

            dev_perplexity = run_epoch((rebatch(PAD_INDEX, b) for b in valid_iter),
                                       model,
                                       SimpleLossCompute(model.generator, criterion, None))
            print("Validation perplexity: %f" % dev_perplexity)
            dev_perplexities.append(dev_perplexity)

    return dev_perplexities
```

```python
model = make_model(len(SRC.vocab), len(TRG.vocab),
                   emb_size=256, hidden_size=256,
                   num_layers=1, dropout=0.2)
dev_perplexities = train(model, print_every=100)
```

Epoch 0
/home/jb/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1 "num_layers={}".format(dropout, num_layers))
Epoch Step: 100 Loss: 22.353386 Tokens per Sec: 16007.731248
Epoch Step: 200 Loss: 34.410126 Tokens per Sec: 16368.906298
Epoch Step: 300 Loss: 44.763870 Tokens per Sec: 16586.324787
Epoch Step: 400 Loss: 57.584606 Tokens per Sec: 16717.486756
Epoch Step: 500 Loss: 40.508701 Tokens per Sec: 16486.886104
Epoch Step: 600 Loss: 51.919121 Tokens per Sec: 16529.862635
Epoch Step: 700 Loss: 82.279633 Tokens per Sec: 16973.462052
Epoch Step: 800 Loss: 35.026432 Tokens per Sec: 16724.939524
Epoch Step: 900 Loss: 63.407204 Tokens per Sec: 16606.524355
Epoch Step: 1000
Loss: 37.909828 Tokens per Sec: 19105.497130
Epoch Step: 1100 Loss: 90.584244 Tokens per Sec: 19643.264684
...
Epoch Step: 2200 Loss: 16.793839 Tokens per Sec: 19183.702688

Example #1
Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .
Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .
Pred: when i was born years old , i was a of the of the .

Example #2
Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .
Trg : my father was listening to bbc news on his small , gray radio .
Pred: my father was on his , the of the .

Example #3
Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .
Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .
Pred: he was very interested in the way , what was pretty much more , and then it was the .

Validation perplexity: 31.839708

Epoch 1
...
Validation perplexity: 19.906190

Epoch 2
...
Validation perplexity: 15.555337

Epoch 3
...
Validation perplexity: 13.563748

Epoch 4
...
Validation perplexity: 12.664111

Epoch 5
...
Validation perplexity: 12.246438

Epoch 6
...
Validation perplexity: 12.045694

Epoch 7
...
Validation perplexity: 11.837098

Epoch 8
...
Validation perplexity: 11.868392

Epoch 9
Epoch Step: 100 Loss: 33.819195 Tokens per Sec: 16155.433696
...
Epoch Step: 2200 Loss: 37.685387 Tokens per Sec: 16498.916279

Example #1
Src : als ich 11 jahre alt war , wurde ich eines morgens von den heller freude geweckt .
Trg : when i was 11 , i remember waking up one morning to the sound of joy in my house .
Pred: when i was 11 , i was a of joy .

Example #2
Src : mein vater hörte sich auf seinem kleinen , grauen radio die der bbc an .
Trg : my father was listening to bbc news on his small , gray radio .
Pred: my father listened to his little , gray radio shack the bbc of the bbc .

Example #3
Src : er sah sehr glücklich aus , was damals ziemlich ungewöhnlich war , da ihn die nachrichten meistens .
Trg : there was a big smile on his face which was unusual then , because the news mostly depressed him .
Pred: he looked very happy , which was pretty unusual since then , they were the .

Validation perplexity: 11.886973

```python
plot_perplexity(dev_perplexities)
```

![png](images/output_49_0.png)

## Prediction and Evaluation

Once trained, we can use the model to produce a set of translations. If we translate the whole validation set, we can use [SacreBLEU](https://github.com/mjpost/sacreBLEU) to compute a [BLEU score](https://en.wikipedia.org/wiki/BLEU), the most common way to evaluate translations.

#### Important sidenote

Typically you would run SacreBLEU from the **command line**, giving it the output file and the original (possibly tokenized) development reference file. This prints a version string that shows exactly how the BLEU score was calculated: for example, whether the text was lowercased, whether (and how) it was tokenized, and what smoothing was used. If you want to learn more about how BLEU scores are (and should be) reported, check out [this paper](https://arxiv.org/abs/1804.08771).

However, right now our pre-processed data exists only in memory, so we'll calculate the BLEU score right from this notebook for demonstration purposes.
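As a reminder of what BLEU computes: clipped n-gram precisions (up to 4-grams) combined as a geometric mean, multiplied by a brevity penalty. Here is a minimal pure-Python sketch of a sentence-level variant with floor smoothing; this is an illustrative toy, not the sacrebleu implementation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of a given order in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4, floor=0.01):
    """Toy sentence-level BLEU with floor smoothing (illustrative only)."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        # clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(1, sum(hyp_ngrams.values()))
        # replace zero precisions by a small floor, similar in spirit
        # to the smooth value we pass to sacrebleu below
        log_prec += math.log(max(matches / total, floor)) / max_n
    # brevity penalty: punish hypotheses shorter than the reference
    bp = min(1.0, math.exp(1 - len(ref) / max(1, len(hyp))))
    return 100.0 * bp * math.exp(log_prec)
```

For real evaluations you should always rely on sacrebleu itself, which handles multiple references, corpus-level statistics, and tokenization.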
We'll first test the raw BLEU function:

```python
import sacrebleu
```

```python
# this should result in a perfect BLEU of 100%
hypotheses = ["this is a test"]
references = ["this is a test"]
bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score
print(bleu)
```

    100.00000000000004

```python
# here the BLEU score will be lower, because some n-grams won't match
hypotheses = ["this is a test"]
references = ["this is a fest"]
bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score
print(bleu)
```

    22.360679774997894

Since we did some filtering for speed, our validation set contains 690 sentences. The references are the tokenized versions, but they should not contain the out-of-vocabulary UNKs that our network might have seen. So we take the references straight out of the `valid_data` object:

```python
len(valid_data)
```

    690

```python
references = [" ".join(example.trg) for example in valid_data]
print(len(references))
print(references[0])
```

    690
    when i was 11 , i remember waking up one morning to the sound of joy in my house .

```python
references[-2]
```

    "i 'm always the one taking the picture ."

**Now we translate the validation set!** This might take a little bit of time. Note that `greedy_decode` will cut off the sentence when it encounters the end-of-sequence symbol, if we provide it the index of that symbol.

```python
hypotheses = []
alphas = []  # save the last attention scores
for batch in valid_iter:
    batch = rebatch(PAD_INDEX, batch)
    pred, attention = greedy_decode(
        model, batch.src, batch.src_mask, batch.src_lengths, max_len=25,
        sos_index=TRG.vocab.stoi[SOS_TOKEN],
        eos_index=TRG.vocab.stoi[EOS_TOKEN])
    hypotheses.append(pred)
    alphas.append(attention)
```

```python
# we will still need to convert the indices to actual words!
hypotheses[0]
```

    array([  70,   11,   24, 1460,    5,   11,   24,    9,    0,   10,    0,    0, 1806,    4])

```python
hypotheses = [lookup_words(x, TRG.vocab) for x in hypotheses]
hypotheses[0]
```

    ['when', 'i', 'was', '11', ',', 'i', 'was', 'a', '<unk>', 'of', '<unk>', '<unk>', 'joy', '.']

```python
# finally, the SacreBLEU raw scorer requires string input, so we convert the lists to strings
hypotheses = [" ".join(x) for x in hypotheses]
print(len(hypotheses))
print(hypotheses[0])
```

    690
    when i was 11 , i was a <unk> of <unk> <unk> joy .

```python
# now we can compute the BLEU score!
bleu = sacrebleu.raw_corpus_bleu(hypotheses, [references], .01).score
print(bleu)
```

    23.4681520210298

## Attention Visualization

We can also visualize the attention scores of the decoder.

```python
def plot_heatmap(src, trg, scores):
    fig, ax = plt.subplots()
    heatmap = ax.pcolor(scores, cmap='viridis')

    ax.set_xticklabels(trg, minor=False, rotation='vertical')
    ax.set_yticklabels(src, minor=False)

    # put the major ticks at the middle of each cell
    # and the x-ticks on top
    ax.xaxis.tick_top()
    ax.set_xticks(np.arange(scores.shape[1]) + 0.5, minor=False)
    ax.set_yticks(np.arange(scores.shape[0]) + 0.5, minor=False)
    ax.invert_yaxis()

    plt.colorbar(heatmap)
    plt.show()
```

```python
# This plots a chosen sentence, for which we saved the attention scores above.
idx = 5
src = valid_data[idx].src + ["</s>"]
trg = valid_data[idx].trg + ["</s>"]
pred = hypotheses[idx].split() + ["</s>"]
pred_att = alphas[idx][0].T[:, :len(pred)]

print("src", src)
print("ref", trg)
print("pred", pred)

plot_heatmap(src, pred, pred_att)
```

    src ['"', 'jetzt', 'kannst', 'du', 'auf', 'eine', 'richtige', 'schule', 'gehen', ',', '"', 'sagte', 'er', '.', '</s>']
    ref ['"', 'you', 'can', 'go', 'to', 'a', 'real', 'school', 'now', ',', '"', 'he', 'said', '.', '</s>']
    pred ['"', 'now', 'you', 'can', 'go', 'to', 'a', 'right', 'school', ',', '"', 'he', 'said', '.', '</s>']

![png](images/output_66_1.png)

# Congratulations! You've finished this notebook.

What didn't we cover?
- Subwords / Byte Pair Encoding [[paper]](https://arxiv.org/abs/1508.07909) [[github]](https://github.com/rsennrich/subword-nmt) let you deal with unknown words.
- You can implement a [multiplicative/bilinear attention mechanism](https://arxiv.org/abs/1508.04025) instead of the additive one used here.
- We used greedy decoding here to get translations, but you can get better results with beam search.
- The original model only uses a single dropout layer (in the decoder), but you can experiment with adding more dropout layers, for example on the word embeddings and the source word representations.
- You can experiment with multiple encoder/decoder layers.
- Experiment with a benchmarked and improved codebase: [Joey NMT](https://github.com/joeynmt/joeynmt)

If this was useful to your research, please consider citing:

> J. Bastings. 2018. The Annotated Encoder-Decoder with Attention. https://bastings.github.io/annotated_encoder_decoder/

Or use the following BibTeX:

```
@misc{bastings2018annotated,
  title={The Annotated Encoder-Decoder with Attention},
  author={Bastings, J.},
  journal={https://bastings.github.io/annotated\_encoder\_decoder/},
  year={2018}
}
```
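To make the beam-search suggestion above a little more concrete: greedy decoding keeps only the single best token at every step, while beam search keeps the `k` best partial hypotheses and can therefore recover from a locally suboptimal first choice. Here is a minimal, model-agnostic sketch; the `step_fn` interface (a function from a prefix to a dict of next-token log-probabilities) is made up for illustration and is not this notebook's model API.

```python
import math

def beam_search(step_fn, sos, eos, beam_size=3, max_len=10):
    """Minimal beam search. `step_fn(prefix)` returns a dict mapping
    candidate next tokens to log-probabilities (hypothetical interface)."""
    beams = [([sos], 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos:  # finished hypotheses carry over unchanged
                candidates.append((prefix, score))
                continue
            for token, logp in step_fn(prefix).items():
                candidates.append((prefix + [token], score + logp))
        # keep only the beam_size best-scoring prefixes
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(prefix[-1] == eos for prefix, _ in beams):
            break
    return beams[0][0]

# toy next-token distributions: the path "<s> a </s>" has the highest total score
def toy_step(prefix):
    table = {
        ("<s>",): {"a": math.log(0.6), "b": math.log(0.4)},
        ("<s>", "a"): {"</s>": math.log(0.9), "b": math.log(0.1)},
        ("<s>", "b"): {"</s>": math.log(1.0)},
    }
    return table.get(tuple(prefix), {"</s>": 0.0})

best = beam_search(toy_step, "<s>", "</s>", beam_size=2)
```

With `beam_size=1` this reduces to greedy decoding; in practice you would also length-normalize the scores, since longer hypotheses accumulate more negative log-probabilities.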